Exercise

Column importance and default prediction

When using multiple training sets with many different groups of columns, it's important to keep an eye on which columns matter and which do not. Maintaining a set of columns can be expensive or time-consuming, even when some of them have no impact on loan_status.

The X data for this exercise was created with the following code:

X = cr_loan_prep[['person_income','loan_int_rate',
                  'loan_percent_income','loan_amnt',
                  'person_home_ownership_MORTGAGE','loan_grade_F']]

Train an XGBClassifier() model on this data, and check the column importance to see how much each column contributes to predicting loan_status.

The cr_loan_prep data set along with X_train and y_train have been loaded into the workspace.

Instructions

  • Create and train an XGBClassifier() model on the X_train and y_train training sets and store it as clf_gbt.
  • Print the column importances of clf_gbt by using .get_booster() and .get_score().