
Removing missing data

You replaced missing data in person_emp_length, but in the previous exercise you saw that loan_int_rate has missing data as well.

Similar to having missing data within loan_status, having missing data within loan_int_rate will make predictions difficult.

Because interest rates are set by your company, having missing data in this column is very strange. It's possible that data ingestion issues created errors, but you cannot know for sure. For now, it's best to .drop() these records before moving forward.

The data set cr_loan has been loaded in the workspace.
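To see which columns contain missing values before deciding what to drop, a per-column null count can help. The sketch below is only an illustration: `toy_loans` is a hypothetical stand-in for cr_loan with invented values, and only the column names are taken from the exercise text.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; the values are invented for illustration.
toy_loans = pd.DataFrame({
    "person_emp_length": [1.0, 5.0, 3.0, 10.0],
    "loan_int_rate": [11.5, np.nan, 7.9, np.nan],
    "loan_status": [0, 1, 0, 0],
})

# Count missing values in every column to see which ones need attention.
null_counts = toy_loans.isnull().sum()
print(null_counts)
```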

This exercise is part of the course Credit Risk Modeling in Python.

Exercise instructions

  • Print the number of records that contain missing data for interest rate.
  • Create an array called indices that holds the index labels of the rows with a missing interest rate.
  • Drop the records with missing interest rate data and save the results to cr_loan_clean.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Print the number of nulls
print(____[____].____().____())

# Store the array of indices
____ = ____[____[____].____].____

# Save the new data without missing data
____ = ____.____(____)
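One possible way to carry out the three steps is sketched below on a small toy frame. This is an assumption-laden illustration rather than the course's reference solution: `toy_loans` and `toy_loans_clean` are hypothetical names with invented values, standing in for cr_loan and cr_loan_clean.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; the values are invented for illustration.
toy_loans = pd.DataFrame({
    "loan_int_rate": [11.5, np.nan, 7.9, np.nan, 13.2],
    "loan_status": [0, 1, 0, 0, 1],
})

# Step 1: count the records with a missing interest rate.
n_missing = toy_loans["loan_int_rate"].isnull().sum()
print(n_missing)

# Step 2: collect the index labels of the rows where the rate is missing.
indices = toy_loans[toy_loans["loan_int_rate"].isnull()].index

# Step 3: drop those rows; drop() returns a new frame and leaves toy_loans unchanged.
toy_loans_clean = toy_loans.drop(indices)
print(toy_loans_clean)
```

Because `drop()` returns a copy by default, the original frame keeps its missing rows, which is why the result is saved under a new name.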