Missing investors
Dealing with missing data is one of the most common tasks in data science. Missingness comes in several types, and so do the ways of handling it.
You just received a new version of the banking DataFrame containing data on the amount held and invested for new and existing customers. However, there are rows with missing inv_amount values.
You know for a fact that most customers below 25 do not have investment accounts yet, and suspect this could be driving the missingness. The pandas, missingno and matplotlib.pyplot packages have been imported as pd, msno and plt respectively. The banking DataFrame is in your environment.
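The suspicion about age can be checked by splitting the rows on whether inv_amount is missing and comparing the two groups. The following is a minimal sketch, not the exercise solution: the toy data is invented for illustration, and the age column name is an assumption, since only inv_amount is named above.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the banking DataFrame (values invented for illustration).
# The `age` column name is an assumption; only `inv_amount` is named in the exercise.
banking = pd.DataFrame({
    "age": [19, 22, 24, 31, 45, 52, 23, 38],
    "inv_amount": [np.nan, np.nan, np.nan, 1200.0, 5400.0, 8000.0, np.nan, 3100.0],
})

# Split the rows by whether inv_amount is missing
missing_investors = banking[banking["inv_amount"].isna()]
investors = banking[~banking["inv_amount"].isna()]

# Compare the age distribution of the two groups
print(missing_investors["age"].describe())
print(investors["age"].describe())
```

If the rows with missing inv_amount are concentrated among the youngest customers, as they are by construction in this toy data, that is consistent with age driving the missingness.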
This exercise is part of the course Cleaning Data in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print number of missing values in banking
print(____)
# Visualize missingness matrix
____
____
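One possible way to approach the blanks above is sketched here under stated assumptions; it is not necessarily the intended solution. The toy DataFrame and its age column are hypothetical stand-ins for the real banking data, and plot_missingness is a helper name introduced only for this sketch. The plotting imports sit inside the helper so that the counting step runs even where missingno is not installed; in the exercise environment msno and plt are already imported.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the banking DataFrame (values invented for illustration).
# The `age` column name is an assumption, not taken from the exercise text.
banking = pd.DataFrame({
    "age": [19, 22, 24, 31, 45, 52, 23, 38],
    "inv_amount": [np.nan, np.nan, np.nan, 1200.0, 5400.0, 8000.0, np.nan, 3100.0],
})

# Print number of missing values in each column of banking
missing_counts = banking.isna().sum()
print(missing_counts)


def plot_missingness(df):
    """Hypothetical helper: draw the missingness matrix, unsorted and sorted by age."""
    import missingno as msno
    import matplotlib.pyplot as plt

    # Visualize missingness matrix in the original row order
    msno.matrix(df)
    plt.show()

    # Same matrix with rows sorted by age, to see whether gaps cluster at one end
    msno.matrix(df.sort_values(by="age"))
    plt.show()
```

Calling plot_missingness(banking) draws both matrices. If the gaps in inv_amount bunch together at the top of the sorted plot, the missing values belong mostly to the youngest customers.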