Guess the missingness type
Analyzing the type of missingness helps you decide how best to deal with missing data. The Pima Indians diabetes dataset is well known for containing missing data. The Pima Indians are an ethnic group who are more prone to diabetes. The dataset contains several lab tests conducted with members of this community.
In the video lesson, you learned about the three types of missingness. In this exercise, you'll first visualize the missingness summary and then identify the types of missingness the DataFrame contains.
The DataFrame has already been loaded for you as diabetes.
Note that we've used a proprietary display() function instead of plt.show() to make it easier for you to view the output.
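As a reminder of the three types of missingness mentioned above, they are commonly named MCAR (missing completely at random), MAR (missing at random), and MNAR (missing not at random). The sketch below simulates each one on a hypothetical toy DataFrame. The column names `age` and `glucose`, the thresholds, and the random seed are all assumptions made for illustration; this is not the diabetes data and not the exercise solution.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

# Hypothetical toy data (NOT the diabetes DataFrame from the exercise)
toy = pd.DataFrame({
    "age": rng.integers(20, 80, size=n).astype(float),
    "glucose": rng.normal(120, 30, size=n),
})

# MCAR: every value has the same chance of being missing,
# regardless of any observed or unobserved data
mcar = toy["glucose"].mask(rng.random(n) < 0.2)

# MAR: missingness depends on another observed column (here, age)
mar = toy["glucose"].mask(toy["age"] > 60)

# MNAR: missingness depends on the value that ends up missing
mnar = toy["glucose"].mask(toy["glucose"] > 150)

# Share of missing values under each simulated mechanism
shares = pd.DataFrame({"MCAR": mcar, "MAR": mar, "MNAR": mnar}).isna().mean()
print(shares)
```

In real data the mechanism is not known in advance, which is why the exercise asks you to inspect the missingness pattern visually before guessing the type.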
This exercise is part of the course Dealing with Missing Data in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import missingno as msno
___
# Visualize the missingness summary
___
# Display nullity matrix
display("/usr/local/share/datasets/matrix_diabetes.png")
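Before reading a nullity matrix, it can help to see the same information as numbers. The following is a minimal sketch of a tabular missingness summary using pandas only. The DataFrame `df`, its column names, and its values are a made-up stand-in for `diabetes`, and the sketch deliberately leaves the blanks in the sample code above unfilled.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `diabetes`; columns and values are invented
df = pd.DataFrame({
    "Glucose": [148.0, 85.0, np.nan, 89.0, 137.0],
    "Insulin": [np.nan, np.nan, np.nan, 94.0, 168.0],
    "BMI": [33.6, 26.6, 23.3, np.nan, 43.1],
})

# Count and percentage of missing values per column
summary = pd.DataFrame({
    "n_missing": df.isna().sum(),
    "pct_missing": df.isna().mean() * 100,
})
print(summary)

# Rows where more than one column is missing at once; such overlaps
# are the kind of pattern a nullity matrix makes visible
both_missing = df[df.isna().sum(axis=1) > 1]
print(both_missing)
```

If the missingno package is available, passing a DataFrame like this one to `msno.matrix()` should draw the corresponding nullity matrix, where each missing cell appears as a gap in its column.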