Exercise

Will you delete?

Before deleting rows with missing values, you must consider whether deletion is appropriate. The simplest factor to consider is how much data is missing. More complex reasons for missingness may require domain knowledge. In this exercise, you will identify the reason for missingness and then perform the appropriate deletion.
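As a rough illustration of checking how much data is missing, the sketch below counts missing values per column with pandas. The `toy` DataFrame and its column names are hypothetical stand-ins, not the actual diabetes data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the diabetes DataFrame.
toy = pd.DataFrame({
    "Glucose": [148.0, 85.0, np.nan, 89.0, 137.0],
    "BMI": [33.6, 26.6, 23.3, np.nan, 43.1],
    "Serum_Insulin": [np.nan, np.nan, np.nan, 94.0, 168.0],
})

# Number of missing values in each column.
missing_count = toy.isna().sum()

# Share of missing values in each column, as a percentage.
missing_pct = toy.isna().mean() * 100

summary = pd.DataFrame({"n_missing": missing_count, "pct_missing": missing_pct})
print(summary)
```

A column with only a few missing values is a more plausible candidate for deletion than one where most values are absent, although the size alone does not settle the question.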

You'll first use msno.matrix() and msno.heatmap() to visualize missingness and the correlation between variables with missing data. You will then determine the pattern of missingness. Lastly, you'll delete depending on the type of missingness.
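The workflow described above might look roughly like the following sketch. It assumes the missingno and matplotlib packages are installed for the plotting step. The helper names `plot_missingness` and `drop_rows_missing`, the `toy` DataFrame, and its columns are hypothetical, and plt.show() is used here in place of the exercise's display() function.

```python
import numpy as np
import pandas as pd


def plot_missingness(df):
    """Draw the missingness matrix and the nullity correlation heatmap."""
    # Imported inside the function so the deletion code runs without these packages.
    import matplotlib.pyplot as plt
    import missingno as msno

    msno.matrix(df)   # one row per record; gaps mark missing values
    msno.heatmap(df)  # correlation of missingness between columns
    plt.show()


def drop_rows_missing(df, column):
    """Drop only the rows in which `column` is missing."""
    return df.dropna(subset=[column], how="any")


# Hypothetical toy data standing in for the diabetes DataFrame.
toy = pd.DataFrame({
    "Glucose": [148.0, 85.0, np.nan, 89.0, 137.0],
    "BMI": [33.6, 26.6, 23.3, np.nan, 43.1],
})

# Delete rows where Glucose is missing; missing values in other columns remain.
cleaned = drop_rows_missing(toy, "Glucose")
print(cleaned)
```

Calling plot_missingness(toy) would draw both plots. Restricting dropna to one column through `subset` keeps rows that are missing values elsewhere, which is one way to avoid deleting more data than the missingness pattern justifies.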

The diabetes DataFrame has been loaded for you.

Note that we've used a proprietary display() function instead of plt.show() to make it easier for you to view the output.

Instructions 1/4
  • Visualize the missingness matrix of diabetes.