
Generate scatter plot with missingness

In this exercise, you'll create a scatter plot that shows both missing and non-missing values. You will use the function fill_dummy_values(), which you created in the previous exercise, to fill in dummy values in the DataFrame diabetes_dummy.

The nullity of a column is computed with the .isnull() method, which returns a Boolean Series (pd.Series) that is True where a value is missing and False otherwise.

To give missing and non-missing values different colors, combine the nullity of the two columns you are plotting with the OR (|) operator. The result is:

  • True \(\rightarrow\) col1, col2, or both are missing.
  • False \(\rightarrow\) Neither col1 nor col2 is missing.
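As a small illustration of the combination described above, the sketch below builds a toy DataFrame and combines the two nullity Series with |. The column names col1 and col2 and all values are made up for the example; this is not the diabetes dataset.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration (not the diabetes dataset)
df = pd.DataFrame({
    "col1": [1.0, np.nan, 3.0, np.nan],
    "col2": [10.0, 20.0, np.nan, np.nan],
})

# Element-wise OR of the two Boolean nullity Series:
# True wherever at least one of the two columns is missing
nullity = df["col1"].isnull() | df["col2"].isnull()

print(nullity.tolist())  # [False, True, True, True]
```

Only the first row, where both columns hold a value, comes out as False.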

The DataFrame diabetes and the function fill_dummy_values() have been loaded for you to use.
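The definition of fill_dummy_values() is not shown on this page. For readers who want to experiment outside the course environment, the following is a hypothetical stand-in, not the course's implementation. It assumes numeric columns and places random dummy values slightly below each column's observed minimum so that they stay visually separate from real data. The name fill_dummy_values_sketch and the parameters scaling_factor and seed are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd

def fill_dummy_values_sketch(df, scaling_factor=0.075, seed=0):
    """Hypothetical stand-in: replace missing values in numeric columns
    with random dummy values just below the column's observed minimum."""
    rng = np.random.default_rng(seed)
    df_dummy = df.copy(deep=True)  # leave the input DataFrame untouched
    for col_name in df_dummy.columns:
        col = df_dummy[col_name]
        col_null = col.isnull()
        num_nulls = int(col_null.sum())
        if num_nulls == 0:
            continue  # nothing to fill in this column
        col_range = col.max() - col.min()  # NaN values are skipped by default
        # Random values in [-2, -1), scaled and shifted below the minimum
        dummy_values = (rng.random(num_nulls) - 2) * scaling_factor * col_range + col.min()
        df_dummy.loc[col_null, col_name] = dummy_values
    return df_dummy

# Toy data, invented for illustration
toy = pd.DataFrame({"a": [1.0, np.nan, 5.0], "b": [2.0, 4.0, np.nan]})
filled = fill_dummy_values_sketch(toy)
print(filled)
```

Because the dummy values fall below every observed value, points with a missing coordinate would cluster along the lower or left edge of a scatter plot rather than mixing with the real data.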

This exercise is part of the course

Dealing with Missing Data in Python


Exercise instructions

  • Use the OR operation (|) to combine the nullity of Skin_Fold and BMI.
  • Fill dummy values in diabetes_dummy using the function fill_dummy_values().
  • Create a scatter plot of 'BMI' versus 'Skin_Fold'. Note that "Y versus X" means Y is plotted on the y-axis against X on the x-axis, that is, Y as a function of X.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use OR operation to combine Skin_Fold and BMI nullity
nullity = ___

# Fill dummy values in diabetes_dummy
diabetes_dummy = ___

# Create a scatter plot of BMI versus Skin_Fold
diabetes_dummy.plot(x=___, y=___, kind='___', alpha=0.5,
                    # Set color to nullity of BMI and Skin_Fold
                    c=___,
                    cmap='rainbow')

plt.show()