Replacing hidden missing values

In the previous two exercises, you worked on identifying and handling missing values while importing a dataset. In this exercise, you will work on identifying hidden missing values in your data and handling them. You'll use the diabetes dataset, which has already been loaded for you.

The diabetes DataFrame has zeros in the BMI column. But a BMI cannot be 0, so these zeros are hidden missing values and should instead be NaN. In this exercise, you'll learn to identify such discrepancies. You'll perform simple data analysis to catch missing values and replace them. Both numpy and pandas have been imported into your workspace as np and pd respectively.
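
As a rough sketch of the replacement described above, the snippet below swaps zeros in BMI for NaN. The small DataFrame is a hypothetical stand-in for the real diabetes data (its values and the Glucose column are made up for illustration), and boolean indexing with .loc is just one of several ways to do this.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the diabetes DataFrame; values are illustrative only.
diabetes = pd.DataFrame({
    "Glucose": [148.0, 85.0, 183.0, 89.0],
    "BMI": [33.6, 0.0, 23.3, 0.0],
})

# Count the zeros hiding in BMI before replacing them.
n_hidden = int((diabetes["BMI"] == 0).sum())

# Replace zeros in BMI with NaN; other columns are left untouched.
diabetes.loc[diabetes["BMI"] == 0, "BMI"] = np.nan

# After the replacement, pandas recognizes these entries as missing.
n_missing = int(diabetes["BMI"].isnull().sum())
print(n_hidden, n_missing)
```

Once the zeros are NaN, methods such as .isnull() count them as missing, which they did not do while the values were stored as 0.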

Additionally, you can explore the dataset, for example by printing its .head() and .info(), to get more familiar with it.

Describe the basic statistics of diabetes.
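
One possible way to approach this step is with .describe(), sketched here on a hypothetical stand-in for the diabetes DataFrame (the values below are made up for illustration). A minimum of 0 in the BMI summary is the kind of implausible value that hints at hidden missing data.

```python
import pandas as pd

# Hypothetical stand-in for the diabetes DataFrame; values are illustrative only.
diabetes = pd.DataFrame({
    "Glucose": [148.0, 85.0, 183.0, 89.0],
    "BMI": [33.6, 0.0, 23.3, 0.0],
})

# Summary statistics (count, mean, std, min, quartiles, max) per numeric column.
summary = diabetes.describe()
print(summary)

# An implausible minimum is a cue to look for hidden missing values.
bmi_min = summary.loc["min", "BMI"]
print(bmi_min)
```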
