Default classification reporting

It's time to take a closer look at how the model is evaluated. Setting a threshold on the probability of default turns the predicted probabilities into class labels, which lets you analyze the model's performance with a classification report.

Creating a data frame of the probabilities makes them easier to work with, because you can use all the power of pandas. Apply the threshold to the data and check the value counts for both classes of loan_status to see how many predictions of each class are produced. These counts help you interpret the scores in the classification report.
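The thresholding step described above can be sketched on a small example. The preds array below is hypothetical stand-in data, assumed to have the same two-column shape as the output of predict_proba (probability of non-default, then probability of default); it is not the exercise's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for preds: columns are [P(non-default), P(default)]
preds = np.array([
    [0.90, 0.10],
    [0.35, 0.65],
    [0.50, 0.50],
    [0.20, 0.80],
    [0.70, 0.30],
])

# Keep only the probability of default (second column)
preds_df = pd.DataFrame(preds[:, 1], columns=["prob_default"])

# Label a loan as default (1) only when the probability exceeds the threshold
threshold = 0.50
preds_df["loan_status"] = (preds_df["prob_default"] > threshold).astype(int)

# Count how many rows fall into each predicted class
counts = preds_df["loan_status"].value_counts()
print(counts)
```

Because the comparison is strict, a probability of exactly 0.50 is labeled non-default. Raising or lowering the threshold shifts rows between the two classes, which is why the counts are worth checking before reading the report.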

The cr_loan_prep data set, the trained logistic regression clf_logistic, the true loan status values y_test, and the predicted probabilities preds are loaded in the workspace.

This exercise is part of the course

Credit Risk Modeling in Python

Instructions

  • Create a data frame of just the probabilities of default from preds called preds_df.
  • Reassign loan_status values based on a threshold of 0.50 for probability of default in preds_df.
  • Print the number of rows for each loan_status using value counts.
  • Print the classification report using y_test and preds_df.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create a dataframe for the probabilities of default
____ = pd.____(____[:,1], columns = ['prob_default'])

# Reassign loan status based on the threshold
____[____] = ____[____].apply(lambda x: 1 if x > ____ else 0)

# Print the row counts for each loan status
print(____[____].____())

# Print the classification report
target_names = ['Non-Default', 'Default']
print(____(____, ____['loan_status'], target_names=target_names))
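For reference, here is a minimal self-contained sketch of the same steps on made-up data, assuming classification_report comes from sklearn.metrics. The y_test and prob_default values are hypothetical and stand in for the workspace objects, so the numbers will not match the exercise.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report

# Hypothetical true labels and predicted probabilities of default
y_test = np.array([0, 0, 1, 1, 0, 1, 0, 0])
prob_default = np.array([0.10, 0.60, 0.80, 0.40, 0.20, 0.90, 0.30, 0.05])

# Data frame of the probabilities, then labels from a 0.50 threshold
preds_df = pd.DataFrame(prob_default, columns=["prob_default"])
preds_df["loan_status"] = preds_df["prob_default"].apply(
    lambda x: 1 if x > 0.50 else 0
)

# Text report with precision, recall, and F1 score per class
target_names = ["Non-Default", "Default"]
print(classification_report(y_test, preds_df["loan_status"],
                            target_names=target_names))

# The same report as a dictionary, convenient for reading single scores
report = classification_report(y_test, preds_df["loan_status"],
                               target_names=target_names, output_dict=True)
print(report["Default"]["recall"])
```

In this toy data, two of the three true defaults are predicted as default, so the default recall is 2/3. Recall on the default class is usually the score to watch in credit risk, since a missed default is typically the costlier error.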