Dealing with Missing Data in Python

Exercise

Generate scatter plot with missingness

In this exercise, you'll create a scatter plot that shows both missing and non-missing values. You'll use the function fill_dummy_values(), which you created in the previous exercise, to fill in dummy values in the DataFrame diabetes_dummy.

The nullity of a column is computed with the .isnull() method, which returns a boolean Series (pd.Series): True where a value is missing and False otherwise.

To color the missing and non-missing values differently, combine the nullity of the two columns you are plotting with the element-wise OR (|) operator. The result is:

  • True → col1, col2, or both are missing.
  • False → neither col1 nor col2 is missing.
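
As a minimal sketch of the OR combination described above, the snippet below uses a small made-up DataFrame. The names df, col1, and col2 and the values are assumptions for illustration only, not the exercise's data.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the two plotted columns (values are made up).
df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0, np.nan],
    "col2": [10.0, 20.0, np.nan, np.nan],
})

# .isnull() gives a boolean Series per column; | combines them element-wise.
nullity = df["col1"].isnull() | df["col2"].isnull()

print(nullity.tolist())  # [False, True, True, True]
```

Only the first row has both values present, so it is the only row marked False.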

The DataFrame diabetes and the function fill_dummy_values() have been loaded for you.

Instructions

  • Use the OR operation (|) to combine the nullity of Skin_Fold and BMI.
  • Fill dummy values in diabetes_dummy using the function fill_dummy_values().
  • Create a scatter plot of 'BMI' versus 'Skin_Fold'. Note that "Y versus X" means Y on the y-axis plotted against X on the x-axis, that is, Y as a function of X.
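
The three steps above might be put together roughly as follows. This is a sketch under stated assumptions: the diabetes DataFrame here is a tiny made-up stand-in, and fill_dummy_values() is a hypothetical reimplementation (the version from the previous exercise may differ) that places dummy values just below each column's observed minimum. The alpha and cmap settings are illustrative choices. Note that the nullity is computed before the dummy values are filled in; afterwards no missing values remain to detect.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd


def fill_dummy_values(df, scaling_factor=0.075, seed=0):
    """Hypothetical stand-in: replace NaNs with random values below each column's minimum."""
    rng = np.random.default_rng(seed)
    df_dummy = df.copy(deep=True)
    for col_name in df_dummy.columns:
        col = df_dummy[col_name]
        col_null = col.isnull()
        if not col_null.any():
            continue  # nothing to fill in this column
        col_range = col.max() - col.min()
        # Shift random numbers from [0, 1) to [-2, -1), then scale and offset
        # so the dummy values sit below the smallest observed value.
        dummy = (rng.random(col_null.sum()) - 2) * scaling_factor * col_range + col.min()
        df_dummy.loc[col_null, col_name] = dummy
    return df_dummy


# Made-up stand-in for the preloaded diabetes DataFrame.
diabetes = pd.DataFrame({
    "Skin_Fold": [35.0, 29.0, np.nan, 23.0, 35.0, np.nan],
    "BMI": [33.6, 26.6, 23.3, np.nan, 43.1, np.nan],
})

# Step 1: combine the nullity of both columns BEFORE filling.
nullity = diabetes["Skin_Fold"].isnull() | diabetes["BMI"].isnull()

# Step 2: fill the missing entries with dummy values.
diabetes_dummy = fill_dummy_values(diabetes)

# Step 3: BMI (y-axis) versus Skin_Fold (x-axis), colored by nullity.
ax = diabetes_dummy.plot(
    x="Skin_Fold", y="BMI", kind="scatter",
    alpha=0.5, c=nullity, cmap="rainbow",
)
```

Rows with a missing value in either column get one color and complete rows get the other, with the dummy points clustered below the real data along the corresponding axis.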