Grid search with XGBoost

Now that you've learned how to tune parameters individually with XGBoost, let's take your parameter tuning to the next level with scikit-learn's GridSearchCV and RandomizedSearchCV classes, which combine a search over parameter settings with internal cross-validation. Grid search exhaustively evaluates every combination of the candidate values you supply across multiple parameters simultaneously, while randomized search samples a fixed number of combinations. Let's get to work, starting with GridSearchCV!
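To make "exhaustive" concrete, here is a minimal sketch (plain Python, no scikit-learn) of the enumeration GridSearchCV performs over the grid you'll build below:

from itertools import product

# The grid from this exercise: 2 x 1 x 2 = 4 parameter combinations
gbm_param_grid = {
    'colsample_bytree': [0.3, 0.7],
    'n_estimators': [50],
    'max_depth': [2, 5]
}

# Grid search cross-validates every one of these combinations in turn
for combo in product(*gbm_param_grid.values()):
    print(dict(zip(gbm_param_grid, combo)))

With cv=4, each of these 4 combinations is cross-validated on 4 folds, so GridSearchCV fits 16 models in total (plus a final refit of the best combination on all of the data).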

Exercise instructions

  • Create a parameter grid called gbm_param_grid that contains a list of "colsample_bytree" values (0.3, 0.7), a list with a single value for "n_estimators" (50), and a list of two "max_depth" values (2, 5).
  • Instantiate an XGBRegressor object called gbm.
  • Create a GridSearchCV object called grid_mse, passing in: the parameter grid to param_grid, the XGBRegressor to estimator, "neg_mean_squared_error" to scoring, and 4 to cv. Also specify verbose=1 so you can better understand the output.
  • Fit the GridSearchCV object to X and y.
  • Print the best parameter values and lowest RMSE, using the .best_params_ and .best_score_ attributes, respectively, of grid_mse. Because the scoring metric is "neg_mean_squared_error", .best_score_ holds a negative MSE; take np.sqrt(np.abs(...)) to convert it to an RMSE.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# If you're working outside the exercise session (where these are
# typically pre-loaded), you'll need the following imports:
import numpy as np
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Create the parameter grid: gbm_param_grid
gbm_param_grid = {
    '____': [____, ____],
    '____': [____],
    '____': [____, ____]
}

# Instantiate the regressor: gbm
gbm = ____

# Perform grid search: grid_mse
grid_mse = ____


# Fit grid_mse to the data
____

# Print the best parameters and lowest RMSE
print("Best parameters found: ", ____)
print("Lowest RMSE found: ", np.sqrt(np.abs(____)))

This exercise is part of the course

Extreme Gradient Boosting with XGBoost


Learn the fundamentals of gradient boosting and build state-of-the-art machine learning models using XGBoost to solve classification and regression problems.

This chapter will teach you how to make your XGBoost models as performant as possible. You'll learn about the variety of parameters that can be adjusted to alter the behavior of XGBoost and how to tune them efficiently so that you can supercharge the performance of your models.

Exercises in this chapter:

Exercise 1: Why tune your model?
Exercise 2: When is tuning your model a bad idea?
Exercise 3: Tuning the number of boosting rounds
Exercise 4: Automated boosting round selection using early_stopping
Exercise 5: Overview of XGBoost's hyperparameters
Exercise 6: Tuning eta
Exercise 7: Tuning max_depth
Exercise 8: Tuning colsample_bytree
Exercise 9: Review of grid search and random search
Exercise 10: Grid search with XGBoost
Exercise 11: Random search with XGBoost
Exercise 12: Limits of grid search and random search
Exercise 13: When should you use grid search and random search?
