
Replacing missing values

Another way of handling missing values is to replace them all with the same value. For numerical variables, one option is to replace missing values with 0, which is what you'll do here. However, when you replace missing values, you make assumptions about what a missing value means. In this case, you will assume that a missing number sold means that no sales of that avocado type were made that week.
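As a minimal sketch of that assumption, the snippet below fills missing values with 0 using `fillna()` on a small made-up DataFrame. The `toy_sales` name and its numbers are hypothetical and are not taken from `avocados_2016`. Because `mean()` skips missing values, each filled 0 now counts as an observation and pulls the mean down.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the real avocados_2016 dataset
toy_sales = pd.DataFrame({
    "small_sold": [100.0, np.nan, 300.0, np.nan],
    "large_sold": [50.0, 150.0, np.nan, 250.0],
})

# Replace every missing value with 0
toy_filled = toy_sales.fillna(0)

# mean() ignores NaN, so the two results differ
mean_before = toy_sales["small_sold"].mean()  # average of 100 and 300
mean_after = toy_filled["small_sold"].mean()  # average of 100, 0, 300 and 0

# Count of missing values left after filling
remaining_missing = int(toy_filled.isna().sum().sum())

print(mean_before, mean_after, remaining_missing)
```

Whether the lower mean is more accurate depends entirely on whether "missing" really does mean "nothing sold".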

In this exercise, you'll see how replacing missing values can affect the distribution of a variable using histograms. You can plot histograms for multiple variables at a time as follows:

dogs[["height_cm", "weight_kg"]].hist()

pandas has been imported as pd and matplotlib.pyplot has been imported as plt. The avocados_2016 dataset is available.
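To illustrate how filling with 0 can change a distribution, here is a rough sketch that draws histograms before and after the replacement. It uses a hypothetical `toy` DataFrame with invented values rather than `avocados_2016`, and it selects a non-interactive matplotlib backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to view the plots on screen
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data with a few missing values
toy = pd.DataFrame({
    "small_sold": [120.0, np.nan, 340.0, 210.0, np.nan, 180.0],
    "large_sold": [80.0, 95.0, np.nan, 130.0, 110.0, np.nan],
})
cols = ["small_sold", "large_sold"]

# Histograms of the original columns: missing values are dropped before plotting
axes_before = toy[cols].hist()

# Histograms after filling with 0: the filled rows now appear as a bar at 0
toy_filled = toy[cols].fillna(0)
axes_after = toy_filled.hist()

# Show the plots (has no visible effect under the Agg backend)
plt.show()
```

Each call to `.hist()` creates a new figure with one subplot per selected column, so the two figures can be compared side by side.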

This exercise is part of the course

Data Manipulation with pandas


Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# List the columns with missing values
cols_with_missing = ["small_sold", "large_sold", "xl_sold"]

# Create histograms showing the distributions of the columns in cols_with_missing
avocados_2016[____].____

# Show the plot
____