Default classification reporting

It's time to take a closer look at how the model is evaluated. Setting a threshold on the probability of default turns each predicted probability into a predicted class, which lets you analyze the model's performance with a classification report.

Putting the probabilities in a data frame makes them easier to work with, because you can use the full power of pandas. Apply the threshold to the data, then check the value counts for both classes of loan_status to see how many predictions of each class are produced. These counts help you interpret the scores in the classification report.
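To make the link between predicted class counts and report scores concrete, here is a minimal sketch in plain Python. The probabilities and labels are made-up illustrative values, not taken from cr_loan_prep, and the names prob_default, y_true, and y_pred are hypothetical. It applies a 0.50 threshold, counts the predictions per class, and computes precision and recall for the default class by hand.

```python
from collections import Counter

# Hypothetical values for illustration only; not taken from cr_loan_prep.
prob_default = [0.10, 0.35, 0.62, 0.48, 0.91, 0.20, 0.55, 0.05]
y_true = [0, 1, 1, 0, 1, 0, 0, 0]  # 1 = default, 0 = non-default

# Apply the threshold: anything above 0.50 is predicted as a default.
threshold = 0.50
y_pred = [1 if p > threshold else 0 for p in prob_default]

# How many predictions fall into each class.
pred_counts = Counter(y_pred)
print(pred_counts)

# Counts behind the default-class scores in a classification report.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

With these values, three of the eight loans are predicted as defaults. Two of those three are true defaults, and one of the three true defaults is missed, so precision and recall for the default class are both 2/3.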

The cr_loan_prep data set, the trained logistic regression clf_logistic, the true loan status values y_test, and the predicted probabilities preds are loaded in the workspace.

Create a data frame of just the probabilities of default from preds called preds_df.
Reassign loan_status values based on a threshold of 0.50 for probability of default in preds_df.
Print the value counts of loan_status to see the number of rows in each class.
Print the classification report using y_test and the loan_status column of preds_df.
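The steps above could be sketched as follows. Because the workspace objects are not available outside the exercise, this sketch assumes a synthetic stand-in built with scikit-learn's make_classification in place of cr_loan_prep. The column name prob_default and the labels in target_names are illustrative choices, and the second column of preds is assumed to hold the probability of default, which matches how predict_proba orders classes 0 and 1.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for cr_loan_prep, with about 20% defaults.
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=42
)

# Stand-ins for the workspace objects clf_logistic and preds.
clf_logistic = LogisticRegression(solver="lbfgs").fit(X_train, y_train)
preds = clf_logistic.predict_proba(X_test)

# Keep only the probability of default (second column of preds).
preds_df = pd.DataFrame(preds[:, 1], columns=["prob_default"])

# Reassign loan_status using a threshold of 0.50.
preds_df["loan_status"] = (preds_df["prob_default"] > 0.50).astype(int)

# Number of predicted rows in each class.
print(preds_df["loan_status"].value_counts())

# Precision, recall, and F1 per class.
target_names = ["Non-Default", "Default"]
report = classification_report(
    y_test, preds_df["loan_status"], target_names=target_names
)
print(report)
```

The comparison `> 0.50` followed by `astype(int)` is one way to apply the threshold; using `.apply` with a lambda on the column would give the same labels.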

