
Dropping rows

When you know that a specific column will be critical to your analysis, and only a small fraction of rows are missing a value in that column, it often makes sense to remove those rows from the dataset.

During this course, the driver_gender column will be critical to many of your analyses. Because only a small fraction of rows are missing driver_gender, we'll drop those rows from the dataset.
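One way to check that only a small fraction of rows is affected is to look at the share of missing values per column before dropping anything. The sketch below uses a small made-up DataFrame named toy in place of the course's ri data; the values and the search_type column are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the course's `ri` DataFrame
toy = pd.DataFrame({
    "driver_gender": ["M", "F", np.nan, "M", "F", "M", "F", "M", "M", "F"],
    "search_type": [np.nan] * 9 + ["Frisk"],
})

# isnull() marks missing cells as True; the mean of a boolean
# column is the share of True values, i.e. the fraction missing
missing_share = toy.isnull().mean()
print(missing_share)
```

In this toy data, one row in ten is missing driver_gender, so dropping those rows loses little. A column that is mostly missing, like the made-up search_type here, would be a poor candidate for row-wise dropping.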

This exercise is part of the course

Analyzing Police Activity with pandas


Exercise instructions

  • Count the number of missing values in each column.
  • Drop all rows that are missing driver_gender by passing the column name to the subset parameter of .dropna().
  • Count the number of missing values in each column again, to verify that none of the remaining rows are missing driver_gender.
  • Examine the DataFrame's .shape to see how many rows and columns remain.
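The steps above can be sketched on a small made-up DataFrame. This is an illustration of the technique rather than the exercise solution: the toy name, its columns, and its values are assumptions, and the sketch reassigns the result of .dropna() instead of using inplace=True, which has the same effect here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real exercise uses a DataFrame named `ri`
toy = pd.DataFrame({
    "stop_date": ["2005-01-04", "2005-01-23", "2005-02-17", "2005-02-20", "2005-03-14"],
    "driver_gender": ["M", np.nan, "F", "M", np.nan],
    "violation": ["Speeding", np.nan, "Equipment", np.nan, "Speeding"],
})

# Step 1: count the missing values in each column
missing_before = toy.isnull().sum()
print(missing_before)

# Step 2: drop only the rows where driver_gender is missing;
# rows missing values in other columns are kept
toy = toy.dropna(subset=["driver_gender"])

# Step 3: count again to confirm driver_gender has no missing values left
missing_after = toy.isnull().sum()
print(missing_after)

# Step 4: .shape is an attribute (no parentheses) holding (rows, columns)
print(toy.shape)
```

Note that subset limits the check to the listed columns, so the row that is missing only violation survives the drop.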

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Count the number of missing values in each column
print(ri.isnull().____)

# Drop all rows that are missing 'driver_gender'
ri.____(subset=[____], inplace=True)

# Count the number of missing values in each column (again)
print(ri.____.____)

# Examine the shape of the DataFrame
print(____)