Predicting the probability of default

In the video, you looked at the predicted probability of default for a single case in the test set. You can also predict the probability for every test set case at once by passing the whole test set to the predict() function.
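
As a minimal sketch of the difference between predicting one case and predicting all cases, the example below fits a small logistic regression on simulated data. Every name in it (toy_data, toy_train, toy_test, toy_case, toy_model, int_rate) is hypothetical and made up for illustration; none of them are the course's objects.

```r
# Simulated data: all names are hypothetical, not the course's objects.
set.seed(42)
n <- 1000
toy_data <- data.frame(int_rate = runif(n, 5, 25))
# Simulated default flag whose probability rises slowly with int_rate.
toy_data$default <- rbinom(n, 1, plogis(-3 + 0.05 * toy_data$int_rate))

# Simple split: first 700 rows for training, the rest for testing.
train_rows <- seq_len(700)
toy_train <- toy_data[train_rows, ]
toy_test <- toy_data[-train_rows, ]

toy_model <- glm(default ~ int_rate, family = "binomial", data = toy_train)

# One case: newdata is a one-row data frame.
toy_case <- toy_test[1, ]
pred_one <- predict(toy_model, newdata = toy_case, type = "response")

# All cases: newdata is the whole test set.
pred_all <- predict(toy_model, newdata = toy_test, type = "response")

# One predicted probability per row of the test set.
length(pred_all) == nrow(toy_test)

# The first element matches the single-case prediction.
all.equal(unname(pred_one), unname(pred_all[1]))
```

With type = "response", predict() returns probabilities between 0 and 1. Without it, predict() on a glm object returns values on the scale of the linear predictor (the log-odds) instead.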

Once you have predictions for all the test set cases, looking at the range of the predicted probabilities gives a first idea of how well the model discriminates. A small range means that the predictions for the test set cases lie close together, so the model might not be very good at separating good customers from bad ones. When the percentage of defaults in the data is low, the predicted probabilities of default will generally be very low as well. It's time to have a look at a first model.
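
To illustrate how the range relates to discrimination, here is a small sketch on simulated data. It assumes two made-up predictors, one only weakly related to default and one strongly related, and compares the range of predicted probabilities from a model built on each. All names (toy, weak, strong, toy_train, toy_test, model_weak, model_strong) are hypothetical.

```r
# Simulated data: all names are hypothetical and for illustration only.
set.seed(1)
n <- 2000
toy <- data.frame(weak = rnorm(n), strong = rnorm(n))
# Defaults depend only slightly on "weak" and heavily on "strong".
toy$default <- rbinom(n, 1, plogis(-2.5 + 0.1 * toy$weak + 1.5 * toy$strong))

toy_train <- toy[1:1400, ]
toy_test <- toy[1401:2000, ]

# Two single-predictor logistic regression models.
model_weak <- glm(default ~ weak, family = "binomial", data = toy_train)
model_strong <- glm(default ~ strong, family = "binomial", data = toy_train)

pred_weak <- predict(model_weak, newdata = toy_test, type = "response")
pred_strong <- predict(model_strong, newdata = toy_test, type = "response")

# range() returns the minimum and maximum predicted probability.
range_weak <- range(pred_weak)
range_strong <- range(pred_strong)

range_weak
range_strong

# Width of each range: the weak model's predictions lie much closer together.
diff(range_weak)
diff(range_strong)
```

In this sketch the model built on the weak predictor assigns nearly the same probability to every case, so its range is narrow. The range is only a first impression of discriminative power, not a full evaluation of the model.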

log_model_small is loaded in the workspace.

This exercise is part of the course

Credit Risk Modeling in R

Exercise instructions

  • The code that predicts test_case in the video has been copied into your workspace. Change the code so that predict() is applied to all cases in test_set, and store the result in the object predictions_all_small.
  • Get an initial idea of how well the model can discriminate by applying range() to predictions_all_small.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Make PD-predictions for all the test set elements using the "log_model_small" logistic regression model
predictions_all_small <- predict(log_model_small, newdata = test_case, type = "response")

# Look at the range of the object "predictions_all_small"