Exercise

Replacing missing values

Another way of handling missing values is to replace them all with the same value. For numerical variables, one option is to replace missing values with 0, which you'll do here. However, when you replace missing values, you make an assumption about what a missing value means. In this case, you will assume that a missing number sold means that no sales of that avocado type were made that week.
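
As a minimal sketch of what that assumption does to the data, the snippet below fills missing values with 0 in a small made-up DataFrame. The sales table and its numbers are hypothetical and are not taken from avocados_2016; only the column names mirror the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical weekly sales figures; NaN marks a missing value.
sales = pd.DataFrame({
    "small_sold": [100.0, np.nan, 300.0],
    "large_sold": [np.nan, 50.0, 150.0],
})

# Replace every missing value with 0, i.e. assume "no sales that week".
sales_filled = sales.fillna(0)

# .mean() skips NaN by default, so filling with 0 pulls the mean downward.
mean_before = sales["small_sold"].mean()
mean_after = sales_filled["small_sold"].mean()
print(mean_before, mean_after)
```

The filled column has a lower mean than the original because the zeros now count as real observations, which is the kind of shift in the distribution that the histograms in this exercise make visible.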

In this exercise, you'll see how replacing missing values can affect the distribution of a variable using histograms. You can plot histograms for multiple variables at a time as follows:

dogs[["height_cm", "weight_kg"]].hist()

pandas has been imported as pd and matplotlib.pyplot has been imported as plt. The avocados_2016 dataset is available.

Instructions 1/2
  • A list has been created, cols_with_missing, containing the names of columns with missing values: "small_sold", "large_sold", and "xl_sold".
  • Create a histogram of those columns.
  • Show the plot.
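
One way these steps could look is sketched below. Because avocados_2016 is only available inside the exercise environment, the sketch builds a small hypothetical stand-in with the same three column names; the values are invented for illustration. The Agg backend line is an assumption that lets the sketch run without a display and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for avocados_2016 (invented values).
avocados_2016 = pd.DataFrame({
    "small_sold": [1.0, np.nan, 3.0, 4.0],
    "large_sold": [np.nan, 2.0, 2.5, 5.0],
    "xl_sold": [0.5, 0.7, np.nan, 0.9],
})

# Names of the columns that contain missing values.
cols_with_missing = ["small_sold", "large_sold", "xl_sold"]

# Draw one histogram per selected column; missing values are left out of each plot.
axes = avocados_2016[cols_with_missing].hist()

# Show the plot.
plt.show()
```

Selecting with a list of column names returns a DataFrame, so a single .hist() call draws one subplot per column.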