
Exercise

Deleting missing data

You saw before that the interest rate (int_rate) in the data set loan_data depends on the customer. Unfortunately, some observations are missing interest rates. You now need to identify how many interest rates are missing and then delete the affected observations.

In this exercise you will use the function which() to create an index of rows that contain an NA. You will then use this index to delete rows with NAs.
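As a quick illustration of the idea, here is a minimal sketch on a made-up vector (the values and the names int_rate_toy, na_positions, and int_rate_complete are hypothetical and not part of the course data). It shows how is.na() flags missing entries, how which() turns those flags into positions, and how negative indexing drops them.

```r
# Toy vector standing in for an interest-rate column (values are made up)
int_rate_toy <- c(10.5, NA, 7.9, NA, 13.2)

# is.na() returns a logical vector; which() gives the positions of the TRUE values
na_positions <- which(is.na(int_rate_toy))
print(na_positions)

# Negative indexing drops the elements at those positions
int_rate_complete <- int_rate_toy[-na_positions]
print(int_rate_complete)
```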

Instructions

  • Take a look at the number of missing values for the variable int_rate using summary().
  • Use which() and is.na() to create an index of the observations without a recorded interest rate. Store the result in the object na_index.
  • Create a new data set called loan_data_delrow_na, which does not contain the observations with missing interest rates.
  • Recall that we made a copy of loan_data called loan_data_delcol_na. Instead of deleting the observations with missing interest rates, delete the entire int_rate column by setting it equal to NULL.
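
The four steps above could be sketched roughly as follows. This is one possible approach, not the official solution: the small loan_data frame built here is a hypothetical stand-in (the loan_amnt column and all values are invented) so that the snippet runs on its own. In the course environment, loan_data and loan_data_delcol_na already exist.

```r
# Hypothetical stand-in for loan_data; the real data set has more rows and columns
loan_data <- data.frame(
  loan_amnt = c(5000, 2400, 10000, 5000, 3000),
  int_rate  = c(10.65, NA, 13.49, NA, 12.69)
)
# Copy of the data, as described in the exercise
loan_data_delcol_na <- loan_data

# Step 1: summary() lists the count of NA's next to the usual quantiles
print(summary(loan_data$int_rate))

# Step 2: row positions where int_rate is missing
na_index <- which(is.na(loan_data$int_rate))

# Step 3: drop those rows with negative indexing
loan_data_delrow_na <- loan_data[-na_index, ]

# Step 4: remove the whole column instead of the rows
loan_data_delcol_na$int_rate <- NULL
```

One caveat with this pattern: if na_index is empty (no missing values), loan_data[-na_index, ] returns zero rows rather than the full data set. Checking length(na_index) > 0 first, or filtering with loan_data[!is.na(loan_data$int_rate), ], avoids that case.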