Comparing model reports
You've used logistic regression models and gradient boosted trees. It's time to compare these two to see which model will be used to make the final predictions.
One of the easiest first steps for comparing different models' ability to predict the probability of default is to look at their metrics from classification_report(). With this, you can see many different scoring metrics side-by-side for each model. Because the data is normally imbalanced, with few defaults, focus on the metrics for defaults for now.
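To make that side-by-side layout concrete, here is a minimal sketch, assuming scikit-learn is installed; the arrays below are made-up stand-ins for the test labels and two models' predicted loan_status, not the course's data.

# Made-up labels for illustration: mostly non-defaults, as in typical credit data
from sklearn.metrics import classification_report

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]   # stand-in for y_test
pred_a = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]   # hypothetical model A predictions
pred_b = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]   # hypothetical model B predictions

target_names = ['Non-Default', 'Default']
# Each report prints per-class precision, recall, F-1, and support side-by-side
print(classification_report(y_true, pred_a, target_names=target_names))
print(classification_report(y_true, pred_b, target_names=target_names))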
The trained models clf_logistic and clf_gbt have been loaded into the workspace, along with their predictions preds_df_lr and preds_df_gbt. A cutoff of 0.4 was used for each. The test set y_test is also available.
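If it helps to picture where those prediction DataFrames come from, here is a sketch, assuming a scikit-learn classifier and pandas; the toy data, the clf name, and the DataFrame layout are illustrative assumptions, not the course's exact code.

# Toy example of building a predictions DataFrame with a 0.4 cutoff (assumed workflow)
import pandas as pd
from sklearn.linear_model import LogisticRegression

X_train = [[0.1], [0.3], [0.6], [0.9]]   # toy single feature
y_train = [0, 0, 1, 1]                   # 1 = default
clf = LogisticRegression().fit(X_train, y_train)

X_test = [[0.2], [0.8]]
preds_df = pd.DataFrame({'prob_default': clf.predict_proba(X_test)[:, 1]})
# Apply the cutoff: probabilities of 0.4 or above are labeled as defaults
# (whether the boundary itself counts as a default is an assumption here)
preds_df['loan_status'] = (preds_df['prob_default'] >= 0.4).astype(int)
print(preds_df)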
This exercise is part of the course Credit Risk Modeling in Python.
Exercise instructions
- Print the classification_report() for the logistic regression predictions.
- Print the classification_report() for the gradient boosted tree predictions.
- Print the macro average of the F-1 score for the logistic regression using precision_recall_fscore_support() (see the sketch after this list).
- Print the macro average of the F-1 score for the gradient boosted tree using precision_recall_fscore_support().
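As a minimal sketch of why indexing with [2] yields the F-1 score: precision_recall_fscore_support() returns the tuple (precision, recall, fscore, support), so element [2] is the F-1 score, and with average='macro' each entry is a single unweighted mean over both classes (support becomes None). The labels below are made up for illustration.

from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 0, 0, 1, 1, 0, 0, 1]   # made-up test labels
y_pred = [0, 0, 1, 1, 0, 0, 0, 1]   # made-up predictions

scores = precision_recall_fscore_support(y_true, y_pred, average='macro')
print(scores)      # (precision, recall, fscore, support); support is None here
print(scores[2])   # the macro-averaged F-1 score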
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print the logistic regression classification report
target_names = ['Non-Default', 'Default']
print(____(____, ____['loan_status'], target_names=target_names))
# Print the gradient boosted tree classification report
print(____(____, ____['loan_status'], target_names=target_names))
# Print the macro-averaged F-1 score for the logistic regression
print(____(____, ____['loan_status'], average='macro')[2])
# Print the macro-averaged F-1 score for the gradient boosted tree
print(____(____, ____['loan_status'], average='macro')[2])