Threshold selection

You know there is a trade off between metrics like default recall, non-default recall, and model accuracy. One easy way to approximate a good starting threshold value is to look at a plot of all three using matplotlib. With this graph, you can see how each of these metrics look as you change the threshold values and find the point at which the performance of all three is good enough to use for the credit data.

The threshold values thresh, default recall values def_recalls, the non-default recall values nondef_recalls and the accuracy scores accs have been loaded into the workspace. To make the plot easier to read, the array ticks for x-axis tick marks has been loaded as well.

Plot the graph of thresh for the x-axis then def_recalls, non-default recall values, and accuracy scores on each y-axis.

Exploring and Preparing Loan Data

Logistic Regression for Defaults

Gradient Boosted Trees Using XGBoost

Model Evaluation and Implementation

Exercice

Threshold selection

Instructions 1/2