Undersampled tree performance

You've undersampled the training set and trained a model on the undersampled set.

The performance of the model's predictions impacts not only the predicted probability of default on the test set, but also the scoring of new loan applications as they come in. You also now know that a high recall for defaults is even more important, because a default predicted as a non-default is more costly.

The next crucial step is to compare the new model's performance to that of the original model. The original model's predictions are stored as gbt_preds and the new model's predictions as gbt2_preds. Both are already available in the workspace, along with y_test.

Print the classification_report() for both the old model and the new model, scoring each set of predictions against y_test.
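
As a rough illustration of this step, the sketch below prints both reports with scikit-learn's classification_report(). The small y_test, gbt_preds, and gbt2_preds lists are made-up stand-ins so the snippet runs on its own; in the exercise these objects already exist in the workspace and should not be redefined. The label names in target_names are also an assumption about how the classes are encoded (0 for non-default, 1 for default).

```python
from sklearn.metrics import classification_report, recall_score

# Hypothetical stand-in data (assumption): 0 = non-default, 1 = default.
# In the exercise, y_test, gbt_preds and gbt2_preds are already loaded.
y_test = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
gbt_preds = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]   # original model
gbt2_preds = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]  # model trained on undersampled data

# Assumed class names, listed in label order (0 first, then 1).
target_names = ["Non-Default", "Default"]

# Build and print one report per model so they can be compared side by side.
old_report = classification_report(y_test, gbt_preds, target_names=target_names)
new_report = classification_report(y_test, gbt2_preds, target_names=target_names)
print(old_report)
print(new_report)

# Recall for the default class alone, since that is the metric of interest here.
old_recall = recall_score(y_test, gbt_preds, pos_label=1)
new_recall = recall_score(y_test, gbt2_preds, pos_label=1)
print("Default recall, original model:", old_recall)
print("Default recall, undersampled model:", new_recall)
```

When reading the two reports, the row for the default class is the one to watch: a higher recall there means fewer defaults were predicted as non-defaults, usually at the price of lower precision or lower recall for non-defaults.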
