Train XGBoost models
Any machine learning method can overfit, and this example shows it with XGBoost. Again, you are working with the Store Item Demand Forecasting Challenge data. The train DataFrame is available in your workspace.
First, let's train multiple XGBoost models with different hyperparameter settings using XGBoost's learning API. The only hyperparameter you will change is:
max_depth - maximum depth of a tree. Increasing this value makes the model more complex and more likely to overfit.
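One way to see why depth adds complexity: the trees XGBoost grows are binary, so a tree of depth d can have at most 2^d leaves, and each leaf holds its own prediction value. The short sketch below only does that arithmetic. The helper name max_leaves and the depth values 2, 8, and 15 are illustrative assumptions, not part of the exercise.

```python
# Upper bound on the number of leaves in a binary tree of a given depth.
# `max_leaves` is a hypothetical helper written for this illustration.
def max_leaves(max_depth):
    return 2 ** max_depth

# Illustrative depth values (assumed, not prescribed by the exercise).
for depth in (2, 8, 15):
    print(f"max_depth={depth}: up to {max_leaves(depth)} leaves per tree")
```

The bound grows exponentially, so even a modest increase in max_depth lets each tree carve the training data into far more regions.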
This exercise is part of the course Winning a Kaggle Competition in Python.
Have a go at this exercise by completing this sample code.
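import xgboost as xgb

# Create DMatrix on train data
dtrain = xgb.DMatrix(data=train[['store', 'item']],
                     label=train['sales'])

# Define xgboost parameters
# Note: 'reg:linear' is deprecated in newer XGBoost releases;
# 'reg:squarederror' is the same squared-error objective under its current name.
params = {'objective': 'reg:squarederror',
          '____': ____,
          'verbosity': 0}

# Train xgboost model
xg_depth_2 = xgb.train(params=params, dtrain=dtrain)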