Deleting missing data

You saw before that the interest rate (int_rate) in the data set loan_data depends on the customer. Unfortunately, some observations are missing interest rates. You now need to identify how many interest rates are missing and then delete the affected observations.

In this exercise you will use the function which() to create an index of rows that contain an NA. You will then use this index to delete rows with NAs.
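As a rough illustration of how which() and is.na() combine, here is a minimal sketch on a small made-up data frame. The object names toy, missing_flag and toy_na_index and all the values are hypothetical and are not part of the course's loan_data.

```r
# Toy data frame with hypothetical values (not the course's loan_data)
toy <- data.frame(
  loan_amnt = c(5000, 2400, 10000, 7000, 3000),
  int_rate  = c(10.65, NA, 13.49, NA, 7.90)
)

# is.na() returns a logical vector with one element per row:
# TRUE where the interest rate is missing, FALSE otherwise
missing_flag <- is.na(toy$int_rate)

# which() turns the TRUE positions into row numbers
toy_na_index <- which(missing_flag)

print(toy_na_index)       # rows 2 and 4 have no interest rate
print(sum(missing_flag))  # 2 missing values in total
```

The index is an ordinary integer vector, so it can be inspected or counted with length() before it is used to subset anything.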

This exercise is part of the course

Credit Risk Modeling in R

Exercise instructions

  • Take a look at the number of missing values for the variable int_rate using summary().
  • Use which() and is.na() to create an index of the observations without a recorded interest rate. Store the result in the object na_index.
  • Create a new data set called loan_data_delrow_na, which does not contain the observations with missing interest rates.
  • Recall that we made a copy of loan_data called loan_data_delcol_na. Instead of deleting the observations with missing interest rates, delete the entire int_rate column by setting it equal to NULL.
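The difference between the two deletion strategies in the steps above can be sketched on a small made-up data frame. This is only an illustration under stated assumptions: toy, toy_na_index, toy_delrow and toy_delcol are hypothetical names and the values are invented, not taken from loan_data.

```r
# Toy data frame with hypothetical values (not the course's loan_data)
toy <- data.frame(
  loan_amnt = c(5000, 2400, 10000, 7000, 3000),
  int_rate  = c(10.65, NA, 13.49, NA, 7.90)
)

# summary() on a numeric column also reports how many NA's it contains
print(summary(toy$int_rate))

# Row numbers of the observations with a missing interest rate
toy_na_index <- which(is.na(toy$int_rate))

# Strategy 1: delete the rows; a negative index drops those positions
toy_delrow <- toy[-toy_na_index, ]

# Strategy 2: work on a copy and delete the whole column instead
toy_delcol <- toy
toy_delcol$int_rate <- NULL

print(dim(toy_delrow))  # fewer rows, same number of columns
print(dim(toy_delcol))  # same number of rows, one column fewer
```

One caveat with negative indexing: if the index is empty because nothing is missing, toy[-toy_na_index, ] returns zero rows rather than the full data frame. Checking length(toy_na_index) > 0 first, or subsetting with the logical condition !is.na(toy$int_rate), avoids that surprise.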

Interactive exercise

Try this exercise by completing this sample code.

# Look at summary of loan_data


# Get indices of missing interest rates: na_index
na_index <- 

# Remove observations with missing interest rates: loan_data_delrow_na
___ <- loan_data[-___, ]

# Make copy of loan_data
loan_data_delcol_na <- loan_data

# Delete interest rate column from loan_data_delcol_na