How thresholds affect performance
Setting the threshold to 0.4 shows promising results for model evaluation. Now you can assess the financial impact using the default recall, which is taken from the output of the function precision_recall_fscore_support().
For this, you will estimate the amount of unexpected loss, using the default recall to find the proportion of defaults you did not catch with the new threshold. The result is a dollar amount that tells you how much you would lose if all the undetected defaults were to default at once.
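The arithmetic behind this estimate can be sketched in a few lines. This is only an illustration: the function name estimate_unexpected_loss is hypothetical, and the input figures are made-up values, not numbers from the course data.

```python
def estimate_unexpected_loss(num_defaults, avg_loan_amnt, default_recall):
    """Rough dollar estimate of losses from defaults the model did not flag.

    Hypothetical helper for illustration; not part of the course workspace.
    """
    if not 0.0 <= default_recall <= 1.0:
        raise ValueError("default_recall must be between 0 and 1")
    # Share of defaults missed at the chosen threshold
    missed_share = 1.0 - default_recall
    return num_defaults * avg_loan_amnt * missed_share


# Made-up inputs: 1,000 default loans, $10,000 average loan, 75% default recall
loss = estimate_unexpected_loss(num_defaults=1000, avg_loan_amnt=10000.0, default_recall=0.75)
print(loss)  # 2500000.0
```

A higher default recall shrinks the missed share, so the estimated loss falls as the model catches more defaults.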
The average loan value, avg_loan_amnt, has been calculated and made available in the workspace along with preds_df and y_test.
This exercise is part of the course
Credit Risk Modeling in Python
Exercise instructions
- Reassign the loan_status values using the threshold 0.4.
- Find the number of defaults in preds_df by selecting the second value from the value counts, and store it as num_defaults.
- Get the default recall rate from the classification report and store it as default_recall.
- Estimate the unexpected loss from the new default recall by multiplying 1 - default_recall by the average loan amount and the number of default loans.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Reassign the values of loan status based on the new threshold
____[____] = ____[____].____(lambda x: 1 if x > ____ else 0)
# Store the number of loan defaults from the prediction data
____ = preds_df[____].____()[1]
# Store the default recall from the classification report
____ = ____(____,preds_df[____])[1][1]
# Calculate the estimated impact of the new default recall rate
print(____ * ____ * (1 - ____))
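For reference, here is one way the steps above might fit together on a small synthetic data set. It is a sketch under stated assumptions rather than the course solution: the probability column name prob_default, the eight sample rows, and the avg_loan_amnt value are all invented for illustration, and only loan_status, preds_df, y_test, and precision_recall_fscore_support() come from the exercise text.

```python
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

# Synthetic stand-ins for the workspace objects (assumed values, not course data)
preds_df = pd.DataFrame(
    {"prob_default": [0.10, 0.45, 0.80, 0.35, 0.60, 0.20, 0.50, 0.05]}  # assumed column name
)
y_test = pd.Series([0, 1, 1, 1, 0, 0, 1, 0])
avg_loan_amnt = 10000.0
threshold = 0.4

# Label a loan as default (1) when its predicted probability exceeds the threshold
preds_df["loan_status"] = preds_df["prob_default"].apply(
    lambda x: 1 if x > threshold else 0
)

# value_counts() is indexed by the labels 0 and 1, so [1] is the count of predicted defaults
num_defaults = preds_df["loan_status"].value_counts()[1]

# The function returns (precision, recall, fbeta_score, support), one entry per class;
# [1][1] picks the recall array, then the entry for class 1 (default)
default_recall = precision_recall_fscore_support(y_test, preds_df["loan_status"])[1][1]

# Estimated dollar impact of the defaults missed at this threshold
unexpected_loss = num_defaults * avg_loan_amnt * (1 - default_recall)
print(unexpected_loss)
```

In this toy data the model flags four loans and catches three of the four true defaults, so a quarter of the flagged amount is counted as unexpected loss.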