Exercise

Comparing link functions for a given cut-off

In this last exercise, you will fit a model using each of the three link functions (logit, probit and cloglog), make predictions for the test set, classify those predictions as default or non-default for a given cut-off, build a confusion matrix, and compute the accuracy and sensitivity of each model at that cut-off. Wow, you've learned a lot so far! Finally, you will identify the model that performs best in terms of accuracy at the given cut-off.

It is important to know that the differences between the models will generally be very small, and again, the results will depend on the chosen cut-off value. The observed outcome (default versus non-default) is stored in true_val in the console.
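
To make the difference between the three links concrete, here is a minimal sketch of their inverse functions, which map a linear predictor to a predicted probability of default. It is written in plain Python using only the standard library, purely as an illustration; the function names are hypothetical and are not part of the exercise code.

```python
import math

def inv_logit(eta):
    # Logit link: p = 1 / (1 + exp(-eta))
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    # Probit link: p = standard normal CDF evaluated at eta
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    # Complementary log-log link: p = 1 - exp(-exp(eta))
    return 1.0 - math.exp(-math.exp(eta))

# Compare the three links on the same linear predictor values.
for eta in (-2.0, 0.0, 1.0):
    print(eta, inv_logit(eta), inv_probit(eta), inv_cloglog(eta))
```

Logit and probit are symmetric around a linear predictor of zero (both give 0.5 there), while cloglog is asymmetric, which is one reason the fitted probabilities can differ slightly between the models.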

Instructions

  • Fit three binomial regression models using the logit, probit and cloglog links respectively (only the logit link gives a logistic regression in the strict sense). Part of the code is given. Use age, emp_cat, ir_cat and loan_amnt as predictors.
  • Make predictions for all models using the test_set.
  • Use a cut-off value of 14% to classify the predictions of each model as default or non-default, so that their performance can be evaluated.
  • Make a confusion matrix for the three models.
  • Lastly, compute the classification accuracy for all three models.
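
The classification and evaluation steps in the instructions above can be sketched as follows. This is a language-neutral illustration in plain Python, not the exercise solution: the helper names and the toy probabilities are made up for the example, and treating a prediction strictly greater than the cut-off as a default is an assumption. Only the name true_val and the 14% cut-off come from the exercise.

```python
def classify(probs, cutoff=0.14):
    # Assumption: probability strictly above the cut-off means default (1).
    return [1 if p > cutoff else 0 for p in probs]

def confusion_matrix(actual, predicted):
    # Count the four cells of a 2x2 confusion matrix.
    cm = {"tn": 0, "fp": 0, "fn": 0, "tp": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            cm["tp"] += 1
        elif a == 1 and p == 0:
            cm["fn"] += 1
        elif a == 0 and p == 1:
            cm["fp"] += 1
        else:
            cm["tn"] += 1
    return cm

def accuracy(cm):
    # Share of all cases classified correctly.
    total = cm["tn"] + cm["fp"] + cm["fn"] + cm["tp"]
    return (cm["tp"] + cm["tn"]) / total

def sensitivity(cm):
    # Share of actual defaults that were classified as defaults.
    positives = cm["tp"] + cm["fn"]
    return cm["tp"] / positives if positives else float("nan")

# Toy values for illustration only, not the course data.
true_val = [0, 0, 1, 0, 1, 0, 0, 1]
pred_probs = [0.05, 0.20, 0.30, 0.10, 0.12, 0.08, 0.16, 0.25]

cm = confusion_matrix(true_val, classify(pred_probs, cutoff=0.14))
print(cm, accuracy(cm), sensitivity(cm))
```

Running the same three functions on the predictions from each of the three models, with the same cut-off, gives directly comparable accuracy and sensitivity figures.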