Exercise

Further Cleaning

What did you notice in the last exercise? Although the three columns have been melted into one, the dataset still has some problems. First, once we know that Elizabeth has brown eyes, it is redundant to record that she does not have blue or black eyes. We therefore want to get rid of every row whose entry in the value column is 0. This is easy to do in pandas with a command of the following form:

df1 = df2[df2.column == value]

where column is the name of the column we are examining and value is the value we want to keep. This step leaves one row per girl, recording only her actual eye color. The value column is now unnecessary, so let's delete it:

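df = df.drop(lst, axis=1)

Here lst is a list of the columns we want to get rid of, and axis=1 specifies that we want to drop columns instead of rows. Note that drop returns a new DataFrame rather than modifying df in place, so the result has to be assigned to a variable to be kept.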
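
To see the two steps working together, here is a minimal sketch on a small made-up dataset. The DataFrame melted, its column names (name, variable, value), and the rows for Elizabeth and Mary are assumptions chosen for illustration; they are not the exercise's actual data.

```python
import pandas as pd

# Hypothetical melted data: one row per (girl, eye color) pair,
# with value = 1 marking her actual eye color.
melted = pd.DataFrame({
    "name": ["Elizabeth", "Elizabeth", "Elizabeth", "Mary", "Mary", "Mary"],
    "variable": ["blue", "black", "brown", "blue", "black", "brown"],
    "value": [0, 0, 1, 1, 0, 0],
})

# Step 1: keep only the rows whose value is 1.
filtered = melted[melted["value"] == 1]

# Step 2: drop the now-redundant value column.
# drop returns a new DataFrame, so the result is assigned.
tidy = filtered.drop(["value"], axis=1).reset_index(drop=True)

print(tidy)
```

The sketch uses bracket notation (melted["value"]) instead of attribute access; both select the column here, but brackets also work for column names that contain spaces or clash with DataFrame attributes.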

Instructions
100 XP
  • Filter the dataset to keep only the rows where value is 1.
  • Delete the value column.