Credit Risk Modeling in Python

Exercise

Replacing missing credit data

Now, check for missing data. If loan_status has missing values, those rows cannot be used to predict probability of default, because you would not know whether the loan defaulted. Missing values in person_emp_length are less damaging, but would still cause errors during training.

So, check for missing data in the person_emp_length column and replace any missing values with the median.

The data set cr_loan has been loaded in the workspace.

Instructions

  • Print an array of column names that contain missing data using .isnull().
  • Print the top five rows of the data set that have missing data for person_emp_length.
  • Replace the missing data with the median employment length using .fillna().
  • Create a histogram of the person_emp_length column to check the distribution.
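
The four steps above could be sketched roughly as follows. This is a minimal illustration rather than the course's reference solution. In the exercise cr_loan is already loaded, so the small DataFrame built here is a hypothetical stand-in: only the person_emp_length and loan_status column names come from the exercise text, while the other columns and all the values are made up for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for cr_loan (the real data set is preloaded in the
# exercise workspace). person_age, loan_amnt and all values are invented.
cr_loan = pd.DataFrame({
    "person_age": [22, 25, 31, 40, 28, 35, 45],
    "person_emp_length": [1.0, np.nan, 5.0, 12.0, np.nan, 3.0, 8.0],
    "loan_amnt": [5000, 12000, 8000, 20000, 6500, 15000, 10000],
    "loan_status": [0, 1, 0, 0, 1, 0, 0],
})

# 1. Names of the columns that contain at least one missing value
missing_cols = cr_loan.columns[cr_loan.isnull().any()]
print(missing_cols)

# 2. First five rows where person_emp_length is missing
print(cr_loan[cr_loan["person_emp_length"].isnull()].head())

# 3. Replace missing employment lengths with the column median
#    (.median() skips NaN values by default)
emp_median = cr_loan["person_emp_length"].median()
cr_loan["person_emp_length"] = cr_loan["person_emp_length"].fillna(emp_median)

# 4. Histogram of the imputed column to check its distribution
n, bins, patches = plt.hist(cr_loan["person_emp_length"], bins=10)
plt.xlabel("Person Employment Length")
plt.ylabel("Count")
# Call plt.show() to display the figure in an interactive session.
```

Assigning the result of .fillna() back to the column, instead of passing inplace=True, keeps the update explicit. The median is used here, as the exercise asks, because it is less sensitive to extreme employment lengths than the mean.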