Cross validating credit models
Credit loans and their data change over time, and new data won't always look like what's been loaded into the current test sets. So, you can use cross-validation to try several smaller training and test sets derived from the original X_train and y_train.
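To make the idea of "several smaller training and test sets" concrete, here is a minimal sketch in plain Python of how k-fold cross-validation partitions row positions. The helper name k_fold_indices is hypothetical and not part of XGBoost. It uses contiguous folds for readability, whereas xgb.cv() shuffles rows by default before splitting.

```python
def k_fold_indices(n_rows, n_folds):
    """Split row positions 0..n_rows-1 into n_folds (train, test) index pairs.

    Hypothetical helper for illustration only: folds are contiguous and
    unshuffled, and any remainder rows go to the earliest folds.
    """
    base, extra = divmod(n_rows, n_folds)
    splits = []
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        # Rows held out for evaluation in this fold
        test_idx = list(range(start, start + size))
        # Every other row is used for training in this fold
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        splits.append((train_idx, test_idx))
        start += size
    return splits


# Ten rows split into five folds: each row is held out exactly once
splits = k_fold_indices(n_rows=10, n_folds=5)
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Each fold trains on four fifths of the rows and evaluates on the remaining fifth, so every row contributes to exactly one test score.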
Use the XGBoost function cv() to perform cross-validation. You will need to set up all the parameters for cv() before calling it on the training data.
The data sets X_train and y_train are loaded in the workspace along with the trained model gbt and the parameter dictionary params, which will print once the exercise loads.
This exercise is part of the course Credit Risk Modeling in Python.
Exercise instructions
- Set the number of folds to 5 and the stopping to 10. Store them as n_folds and early_stopping.
- Create the matrix object DTrain using the training data.
- Use cv() on the parameters, folds, and early stopping objects. Store the results as cv_df.
- Print the contents of cv_df.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Set the values for number of folds and stopping iterations
____ = ____
____ = ____
# Create the DTrain matrix for XGBoost
____ = xgb.____(____, label = ____)
# Create the data frame of cross validations
____ = xgb.cv(____, ____, num_boost_round = 5, nfold=____,
              early_stopping_rounds=____)
# Print the cross validations data frame
____(____)