Dropping rows
When you know that a specific column will be critical to your analysis, and only a small fraction of rows are missing a value in that column, it often makes sense to remove those rows from the dataset.
During this course, the driver_gender column will be critical to many of your analyses. Because only a small fraction of rows are missing driver_gender, we'll drop those rows from the dataset.
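One way to check that "only a small fraction" claim before dropping anything is to look at the share of missing values per column. The sketch below uses a small made-up DataFrame named toy as a stand-in for the course's ri DataFrame; the column values are assumptions chosen purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the course's `ri` DataFrame
toy = pd.DataFrame({
    "stop_date": ["2005-01-04", "2005-01-23", "2005-02-17", "2005-02-20", "2005-03-14"],
    "driver_gender": ["M", "M", np.nan, "F", "M"],
    "violation": ["Speeding", "Equipment", np.nan, "Speeding", "Other"],
})

# isnull() returns a Boolean DataFrame; the mean of each Boolean column
# is the fraction of rows with a missing value in that column
missing_fraction = toy.isnull().mean()
print(missing_fraction)
```

If the fraction for the critical column is small, dropping those rows costs little data; if it is large, dropping them may not be the right choice.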
This exercise is part of the course
Analyzing Police Activity with pandas
Exercise instructions
- Count the number of missing values in each column.
- Drop all rows that are missing driver_gender by passing the column name to the subset parameter of .dropna().
- Count the number of missing values in each column again, to verify that none of the remaining rows are missing driver_gender.
- Examine the DataFrame's .shape to see how many rows and columns remain.
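The four steps above can be sketched on a small made-up DataFrame. The name toy and its values are assumptions for illustration only; the exercise itself works on the preloaded ri DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real exercise uses the preloaded `ri` DataFrame
toy = pd.DataFrame({
    "driver_gender": ["M", np.nan, "F", "M", np.nan, "F"],
    "driver_race": ["White", np.nan, "Black", np.nan, np.nan, "Asian"],
})

# Step 1: count the missing values in each column
missing_before = toy.isnull().sum()
print(missing_before)

# Step 2: drop only the rows where driver_gender is missing;
# rows missing other columns are kept
toy.dropna(subset=["driver_gender"], inplace=True)

# Step 3: count again to confirm driver_gender has no missing values left
missing_after = toy.isnull().sum()
print(missing_after)

# Step 4: check how many rows and columns remain
print(toy.shape)
```

Note that subset restricts the check to the listed columns, so a row with a valid driver_gender but a missing driver_race survives the drop.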
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Count the number of missing values in each column
print(ri.isnull().____)
# Drop all rows that are missing 'driver_gender'
ri.____(subset=[____], inplace=True)
# Count the number of missing values in each column (again)
print(ri.____.____)
# Examine the shape of the DataFrame
print(____)