Removing missing data

You replaced missing data in person_emp_length, but in the previous exercise you saw that loan_int_rate has missing data as well.

Similar to having missing data within loan_status, having missing data within loan_int_rate will make predictions difficult.

Because interest rates are set by your company, having missing data in this column is very strange. It's possible that data ingestion issues created errors, but you cannot know for sure. For now, it's best to .drop() these records before moving forward.
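Before dropping anything, it can help to confirm which columns are affected. The following is a minimal sketch, assuming a small made-up DataFrame named `toy_loans` that stands in for `cr_loan`. The values are invented for illustration and are not from the course data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; the values are invented for illustration.
toy_loans = pd.DataFrame({
    "person_emp_length": [3.0, 5.0, 1.0, 8.0],
    "loan_int_rate": [11.5, np.nan, 7.9, np.nan],
    "loan_status": [0, 1, 0, 0],
})

# Count the missing values in each column to see which columns are affected.
null_counts = toy_loans.isnull().sum()
print(null_counts)
```

In this toy frame only `loan_int_rate` has missing values, which mirrors the situation described above.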

The data set cr_loan has been loaded in the workspace.

This exercise is part of the course

Credit Risk Modeling in Python

Instructions

  • Print the number of records that contain missing data for interest rate.
  • Create an array called indices containing the indices of the rows with a missing interest rate.
  • Drop the records with missing interest rate data and save the results to cr_loan_clean.
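The three steps above could look roughly like this on a toy example. This is a sketch under stated assumptions, not the exercise's reference solution: `cr_loan_toy` is a hypothetical frame with invented values standing in for `cr_loan`, and `n_missing` is a helper name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan; the values are invented for illustration.
cr_loan_toy = pd.DataFrame({
    "loan_amnt": [5000, 12000, 8000, 3000, 15000],
    "loan_int_rate": [10.2, np.nan, 13.5, np.nan, 6.9],
})

# Step 1: count the records whose interest rate is missing.
n_missing = cr_loan_toy["loan_int_rate"].isnull().sum()
print(n_missing)

# Step 2: collect the index labels of the rows with a missing interest rate.
indices = cr_loan_toy[cr_loan_toy["loan_int_rate"].isnull()].index

# Step 3: drop those rows and keep the result in a new DataFrame.
cr_loan_clean = cr_loan_toy.drop(indices)
print(cr_loan_clean)
```

`DataFrame.drop` removes rows by index label when given a list-like of labels, so the original frame is left unchanged and the cleaned copy is returned. pandas also offers `dropna(subset=[...])` as a one-step alternative, but the explicit index approach matches the steps listed here.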

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Print the number of nulls
print(____[____].____().____())

# Store the array of indices
____ = ____[____[____].____].____

# Save the new data without missing data
____ = ____.____(____)