Column selection and model performance

Creating the training set from different combinations of columns affects the model and the importance values of the columns. Does a different selection of columns also affect the F-1 scores, the combination of the precision and recall, of the model? You can answer this question by training two different models on two different sets of columns, and checking the performance.

Inaccurately predicting defaults as non-default can result in unexpected losses if the probability of default for these loans was very low. You can use the F-1 score for defaults to see how the models will accurately predict the defaults.

The credit data, cr_loan_prep and the two training column sets X and X2 have been loaded in the workspace. The models gbt and gbt2 have already been trained.

Cet exercice fait partie du cours

Credit Risk Modeling in Python

Afficher le cours

Exercice interactif pratique

Essayez cet exercice en complétant cet exemple de code.

# Predict the loan_status using each model
____ = gbt.____(____)
____ = gbt2.____(____)

# Print the classification report of the first model
target_names = ['Non-Default', 'Default']
print(____(____, ____, target_names=target_names))

# Print the classification report of the second model
print(____(____, ____, target_names=target_names))

Modifier et exécuter le code