Predicting the probability of default
In the video you looked at the predicted probability of default for one case in the test set. Luckily, you can predict the probability for all the test set cases at once using the predict() function.
After having obtained all the predictions for the test set elements, it is useful to get an initial idea of how good the model is at discriminating by looking at the range of predicted probabilities. A small range means that predictions for the test set cases do not lie far apart, and therefore the model might not be very good at discriminating good from bad customers. With low default percentages, you will notice that in general, very low probabilities of default are predicted. It's time to have a look at a first model.
log_model_small is loaded in the workspace.
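To make the link between the range of predictions and discriminating power concrete, here is a minimal sketch on simulated data. Nothing in it comes from the course data set: the data frame sim, the predictors strong_x and weak_x, the outcome is_default, and both models are hypothetical names invented for illustration. One model uses a predictor that is only weakly related to default, the other a predictor that is strongly related to it.

```r
set.seed(42)
n <- 3000

# Hypothetical predictors: one strongly, one weakly related to default
sim <- data.frame(strong_x = rnorm(n), weak_x = rnorm(n))

# Simulated true probability of default, low on average
true_pd <- plogis(-2.5 + 1.2 * sim$strong_x + 0.1 * sim$weak_x)
sim$is_default <- rbinom(n, size = 1, prob = true_pd)

# Simple split into a training set and a test set
sim_train <- sim[1:2000, ]
sim_test <- sim[2001:3000, ]

# Two small logistic regression models, one predictor each
model_weak <- glm(is_default ~ weak_x, family = "binomial", data = sim_train)
model_strong <- glm(is_default ~ strong_x, family = "binomial", data = sim_train)

# type = "response" returns probabilities rather than log-odds
pred_weak <- predict(model_weak, newdata = sim_test, type = "response")
pred_strong <- predict(model_strong, newdata = sim_test, type = "response")

# Compare the minimum and maximum predicted probabilities of both models
range(pred_weak)
range(pred_strong)
```

Under these assumptions, the model built on the weak predictor should produce predictions that lie close together, while the model built on the strong predictor should spread them over a much wider interval. A wide range is only a first indication, not a full measure of model quality.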
This exercise is part of the course Credit Risk Modeling in R
Exercise instructions
- The code for the prediction of test_case in the video is copied in your workspace. Change the code such that the function predict() is applied to all cases in test_set. You can store them in the object predictions_all_small.
- Get an initial idea of how well the model can discriminate using range().
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Make PD-predictions for all the test set elements using the "log_model_small" logistic regression model
predictions_all_small <- predict(log_model_small, newdata = test_case, type = "response")
# Look at the range of the object "predictions_all_small"
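One possible way to complete the sample code is sketched below. It assumes that log_model_small and test_set exist in the workspace, as the exercise states. So that the snippet also runs outside the course environment, it first builds small stand-in objects when they are missing. These stand-ins (the toy data frame, its age and loan_status columns, and the model fitted on them) are hypothetical and do not reproduce the course data.

```r
# Hypothetical stand-ins, created only if the course workspace objects are absent
if (!exists("test_set") || !exists("log_model_small")) {
  set.seed(1)
  toy <- data.frame(age = sample(20:70, 600, replace = TRUE))
  toy$loan_status <- rbinom(600, size = 1, prob = plogis(-1 - 0.03 * toy$age))
  training_set <- toy[1:400, ]
  test_set <- toy[401:600, ]
  log_model_small <- glm(loan_status ~ age, family = "binomial", data = training_set)
}

# Apply predict() to every row of test_set rather than to the single test_case
predictions_all_small <- predict(log_model_small, newdata = test_set, type = "response")

# Smallest and largest predicted probability of default
range(predictions_all_small)
```

The only change to the prediction call is the newdata argument, which now points at the whole test set. range() returns a vector of length two holding the minimum and maximum prediction.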