Default classification reporting
It's time to take a closer look at how the model is evaluated. Setting a threshold on the probability of default turns the predicted probabilities into class labels, which lets you analyze the model's performance with a classification report.
Creating a data frame of the probabilities makes them easier to work with, because you can use all the power of pandas. Apply the threshold to the data and check the value counts for both classes of loan_status to see how many predictions of each class are being created. These counts help you interpret the scores in the classification report.
The cr_loan_prep data set, the trained logistic regression clf_logistic, the true loan status values y_test, and the predicted probabilities preds are loaded in the workspace.
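To see why the value counts matter, here is a minimal sketch of how the threshold changes the number of predicted defaults. It assumes the probabilities have the two-column layout that predict_proba returns, with the probability of default in column 1. The array toy_preds, the data frame toy_df, and the helper label_at are hypothetical names with made-up numbers, not part of the course workspace.

```python
import numpy as np
import pandas as pd

# Hypothetical probabilities in an (n_samples, 2) layout:
# column 0 = P(non-default), column 1 = P(default). The numbers are made up.
toy_preds = np.array([
    [0.90, 0.10],
    [0.55, 0.45],
    [0.40, 0.60],
    [0.20, 0.80],
    [0.50, 0.50],
])

# Keep only the probability of default, as the exercise does
toy_df = pd.DataFrame(toy_preds[:, 1], columns=['prob_default'])

def label_at(df, threshold):
    """Return 1 (default) where prob_default is strictly above the threshold, else 0."""
    return (df['prob_default'] > threshold).astype(int)

# Count predicted classes at several thresholds
for threshold in (0.30, 0.50, 0.70):
    counts = label_at(toy_df, threshold).value_counts().sort_index()
    print(threshold, counts.to_dict())
```

Raising the threshold moves rows from the default class to the non-default class, which is likely to shift the precision and recall you later read off the classification report.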
This exercise is part of the course
Credit Risk Modeling in Python
Exercise instructions
- Create a data frame of just the probabilities of default from preds called preds_df.
- Reassign loan_status values based on a threshold of 0.50 for probability of default in preds_df.
- Print the value counts of the number of rows for each loan_status.
- Print the classification report using y_test and preds_df.
Interactive exercise
Complete the sample code to finish this exercise successfully.
# Create a dataframe for the probabilities of default
____ = pd.____(____[:,1], columns = ['prob_default'])
# Reassign loan status based on the threshold
____[____] = ____[____].apply(lambda x: 1 if x > ____ else 0)
# Print the row counts for each loan status
print(____[____].____())
# Print the classification report
target_names = ['Non-Default', 'Default']
print(____(____, ____['loan_status'], target_names=target_names))
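For comparison, the same workflow can be sketched end to end on synthetic data. This is an illustration under stated assumptions, not the course's reference solution: the data comes from scikit-learn's make_classification rather than cr_loan_prep, and the names demo_clf, demo_preds, demo_df, and y_test_demo are hypothetical stand-ins for the workspace objects.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data (assumption): roughly 80% class 0, 20% class 1
X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test_demo = train_test_split(
    X, y, test_size=0.4, random_state=0)

# Fit a logistic regression and get class probabilities for the test set
demo_clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
demo_preds = demo_clf.predict_proba(X_test)

# Data frame of the probabilities for class 1 (treated here as "default")
demo_df = pd.DataFrame(demo_preds[:, 1], columns=['prob_default'])

# Assign a predicted status using a 0.50 threshold
demo_df['loan_status'] = demo_df['prob_default'].apply(lambda p: 1 if p > 0.50 else 0)

# Row counts for each predicted status
print(demo_df['loan_status'].value_counts())

# Precision, recall, and F1 per class
report = classification_report(y_test_demo, demo_df['loan_status'],
                               target_names=['Non-Default', 'Default'])
print(report)
```

Comparing the value counts with the support column of the report shows how many defaults the model predicts versus how many are actually in the test set.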