Comparing density plots
The imputations you performed previously can be compared graphically using their density plots. These plots make it easy to find the imputed dataset whose distribution most closely matches that of the original dataset. They also show how an imputation can be biased.
In this exercise, you will compare the density plots of the imputed DataFrames for diabetes that you created earlier.
The DataFrames diabetes_cc, diabetes_mean_imputed, diabetes_knn_imputed and diabetes_mice_imputed have already been loaded for you to use, along with matplotlib.pyplot as plt.
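To make the idea of a biased imputation concrete, here is a minimal sketch using synthetic data rather than the course's diabetes data. The series `values`, the 30% missingness rate and the distribution parameters are all assumptions chosen for illustration. It shows that mean imputation keeps the mean unchanged but shrinks the spread, which is the kind of distortion a density plot makes visible as an artificial spike at the mean.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in for a numeric column such as 'Skin_Fold'
values = pd.Series(rng.normal(loc=29.0, scale=10.0, size=500))

# Remove roughly 30% of the values completely at random
with_missing = values.mask(rng.random(500) < 0.3)

# Mean imputation: every gap receives the same value (the observed mean)
mean_imputed = with_missing.fillna(with_missing.mean())

# pandas skips NaN by default, so this is the spread of the observed values only
std_observed = with_missing.std()
std_imputed = mean_imputed.std()

print(f"std of observed values:    {std_observed:.2f}")
print(f"std after mean imputation: {std_imputed:.2f}")
```

The second number is always smaller than the first, because the imputed points add no deviation from the mean while increasing the sample size.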
This exercise is part of the course Dealing with Missing Data in Python.
Exercise instructions
- Plot a density plot for the 'Skin_Fold' column for each DataFrame.
- Set the labels using the labels list.
- Set the label for the x-axis to 'Skin Fold'.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Plot graphs of imputed DataFrames and the complete case
diabetes_cc['___'].___(___='___', c='red', linewidth=3)
diabetes_mean_imputed['___'].plot(___='___')
diabetes_knn_imputed['___'].plot(___='___')
diabetes_mice_imputed['___'].plot(___='___')
# Create labels for the four DataFrames
labels = ['Baseline (Complete Case)', 'Mean Imputation', 'KNN Imputation', 'MICE Imputation']
plt.legend(___)
# Set the x-label as Skin Fold
plt.xlabel('___')
plt.show()
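For reference, one possible completion of the template is sketched below. It is self-contained, so the four DataFrames are replaced by synthetic stand-ins built with a hypothetical helper, `fake_frame`; in the exercise environment the real DataFrames are already loaded and that part is not needed. It assumes a density plot is requested through the pandas `kind='kde'` option, which relies on SciPy being installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

def fake_frame(loc, scale):
    """Hypothetical helper: a synthetic stand-in for one diabetes DataFrame."""
    return pd.DataFrame({"Skin_Fold": rng.normal(loc, scale, size=300)})

# Synthetic stand-ins (assumption: the real data is preloaded in the exercise)
diabetes_cc = fake_frame(29, 10)
diabetes_mean_imputed = fake_frame(29, 7)
diabetes_knn_imputed = fake_frame(29, 9)
diabetes_mice_imputed = fake_frame(29, 10)

# Plot graphs of imputed DataFrames and the complete case
ax = diabetes_cc['Skin_Fold'].plot(kind='kde', c='red', linewidth=3)
diabetes_mean_imputed['Skin_Fold'].plot(kind='kde')
diabetes_knn_imputed['Skin_Fold'].plot(kind='kde')
diabetes_mice_imputed['Skin_Fold'].plot(kind='kde')

# Create labels for the four DataFrames (order matches the plotting order)
labels = ['Baseline (Complete Case)', 'Mean Imputation', 'KNN Imputation', 'MICE Imputation']
plt.legend(labels)

# Set the x-label as Skin Fold
plt.xlabel('Skin Fold')
plt.show()
```

Because all four calls draw on the same axes, the curves overlap, and the legend entries are matched to the curves by the order in which they were plotted.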