Grid search with XGBoost
Now that you've learned how to tune parameters individually with XGBoost, let's take your parameter tuning to the next level by using scikit-learn's grid search and randomized search capabilities with internal cross-validation, via the GridSearchCV and RandomizedSearchCV classes. You will use these to find the best model exhaustively from a collection of possible parameter values across multiple parameters simultaneously. Let's get to work, starting with GridSearchCV!
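As a quick illustration of the pattern before the exercise (a minimal sketch on hypothetical toy data, not the exercise's dataset): GridSearchCV wraps an estimator and a dictionary of candidate parameter values, fits one model per parameter combination per cross-validation fold, and exposes the best result.

import numpy as np
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Hypothetical toy data, for illustration only
X_toy = np.random.rand(100, 5)
y_toy = np.random.rand(100)

# 2 candidate values x 4 folds = 8 model fits in total
search = GridSearchCV(estimator=xgb.XGBRegressor(),
                      param_grid={'max_depth': [2, 5]},
                      scoring='neg_mean_squared_error',
                      cv=4)
search.fit(X_toy, y_toy)
print(search.best_params_)

The exercise below follows exactly this pattern, with a larger parameter grid.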
Exercise instructions
- Create a parameter grid called gbm_param_grid that contains a list of "colsample_bytree" values (0.3, 0.7), a list with a single value for "n_estimators" (50), and a list of 2 "max_depth" values (2, 5).
- Instantiate an XGBRegressor object called gbm.
- Create a GridSearchCV object called grid_mse, passing in: the parameter grid to param_grid, the XGBRegressor to estimator, "neg_mean_squared_error" to scoring, and 4 to cv. Also specify verbose=1 so you can better understand the output.
- Fit the GridSearchCV object to X and y.
- Print the best parameter values and lowest RMSE, using the .best_params_ and .best_score_ attributes, respectively, of grid_mse.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the parameter grid: gbm_param_grid
gbm_param_grid = {
'____': [____, ____],
'____': [____],
'____': [____, ____]
}
# Instantiate the regressor: gbm
gbm = ____
# Perform grid search: grid_mse
grid_mse = ____
# Fit grid_mse to the data
____
# Print the best parameters and lowest RMSE
print("Best parameters found: ", ____)
print("Lowest RMSE found: ", np.sqrt(np.abs(____)))
This exercise is part of the course
Extreme Gradient Boosting with XGBoost
Learn the fundamentals of gradient boosting and build state-of-the-art machine learning models using XGBoost to solve classification and regression problems.
This chapter will teach you how to make your XGBoost models as performant as possible. You'll learn about the variety of parameters that can be adjusted to alter the behavior of XGBoost and how to tune them efficiently so that you can supercharge the performance of your models.
Exercise 1: Why tune your model?
Exercise 2: When is tuning your model a bad idea?
Exercise 3: Tuning the number of boosting rounds
Exercise 4: Automated boosting round selection using early_stopping
Exercise 5: Overview of XGBoost's hyperparameters
Exercise 6: Tuning eta
Exercise 7: Tuning max_depth
Exercise 8: Tuning colsample_bytree
Exercise 9: Review of grid search and random search
Exercise 10: Grid search with XGBoost
Exercise 11: Random search with XGBoost
Exercise 12: Limits of grid search and random search
Exercise 13: When should you use grid search and random search?