How thresholds affect performance

Setting the threshold to 0.4 shows promising results for model evaluation. Now you can assess the financial impact using the default recall, which is taken from the output of the function precision_recall_fscore_support().

For this, you will estimate the amount of unexpected loss, using the default recall to find the proportion of defaults you did not catch with the new threshold. The result is a dollar amount that tells you how much you would lose if all the undetected defaults were to default at once.

The average loan value, avg_loan_amnt, has been calculated and made available in the workspace along with preds_df and y_test.

Reassign the loan_status values using the threshold 0.4.
Store the number of defaults in preds_df as num_defaults by selecting the second value from the value counts.
Get the default recall rate from the output of precision_recall_fscore_support() and store it as default_recall.
Estimate the unexpected loss from the new default recall by multiplying 1 - default_recall by the average loan amount and the number of default loans.
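
The four steps above can be sketched as follows. This is a minimal illustration and not the exercise's official solution: the data values are made up, the probability column name prob_default is an assumption (the text does not name it), and "greater than 0.4" is assumed as the cutoff rule. In the real workspace, preds_df, y_test and avg_loan_amnt already exist.

```python
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical stand-ins for the workspace objects (made-up values).
# The column name "prob_default" is an assumption, not given in the text.
preds_df = pd.DataFrame(
    {"prob_default": [0.10, 0.45, 0.80, 0.30, 0.55, 0.05, 0.35, 0.20]}
)
y_test = pd.Series([0, 1, 1, 1, 0, 0, 1, 0])
avg_loan_amnt = 10000.0

# Step 1: reassign loan_status using the 0.4 threshold (1 = predicted default).
preds_df["loan_status"] = preds_df["prob_default"].apply(
    lambda p: 1 if p > 0.4 else 0
)

# Step 2: number of predicted defaults. The value counts are indexed by the
# labels 0 and 1, so [1] selects the count for label 1 (default).
num_defaults = preds_df["loan_status"].value_counts()[1]

# Step 3: precision_recall_fscore_support returns
# (precision, recall, fscore, support), each an array with one entry per class.
# Index [1] picks the recall array, and the second [1] picks the default class.
default_recall = precision_recall_fscore_support(
    y_test, preds_df["loan_status"]
)[1][1]

# Step 4: unexpected loss = share of missed defaults * number of defaults
# * average loan amount.
unexpected_loss = (1 - default_recall) * num_defaults * avg_loan_amnt
print(unexpected_loss)
```

With real data, a lower default recall or a larger average loan amount raises the estimate, which is why the choice of threshold has a direct financial interpretation.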
