Tuning eta
It's time to practice tuning other XGBoost hyperparameters in earnest and observing their effect on model performance! You'll begin by tuning "eta", also known as the learning rate.
The learning rate in XGBoost is a parameter that can range between 0 and 1. It scales the contribution each new tree makes to the ensemble: smaller values of "eta" shrink the feature weights more strongly after each boosting round, making the boosting process more conservative and acting as a form of regularization.
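As a rough illustration of what "eta" does (a sketch with made-up numbers, not part of the exercise), each boosting round adds only an eta-scaled fraction of the new tree's output to the running prediction:
# Illustration only: "eta" shrinks each tree's contribution to the ensemble
eta = 0.1
running_prediction = 0.0
new_tree_output = 2.5  # hypothetical output of the tree fit in this round
running_prediction += eta * new_tree_output
print(running_prediction)  # 0.25 rather than 2.5; the update is shrunk by eta
This is why small learning rates typically need more boosting rounds to reach the same fit.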
This exercise is part of the course "Extreme Gradient Boosting with XGBoost".
Exercise instructions
- Create a list called eta_vals to store the following "eta" values: 0.001, 0.01, and 0.1.
- Iterate over your eta_vals list using a for loop.
- In each iteration of the for loop, set the "eta" key of params to be equal to curr_val. Then, perform 3-fold cross-validation with early stopping (5 rounds), 10 boosting rounds, a metric of "rmse", and a seed of 123. Ensure the output is a DataFrame.
- Append the final round RMSE to the best_rmse list.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create your housing DMatrix: housing_dmatrix
housing_dmatrix = xgb.DMatrix(data=X, label=y)
# Create the parameter dictionary for each tree (boosting round)
params = {"objective":"reg:squarederror", "max_depth":3}
# Create list of eta values and empty list to store final round rmse per xgboost model
____ = [____, ____, ____]
best_rmse = []
# Systematically vary the eta
for curr_val in ____:
params["___"] = curr_val
# Perform cross-validation: cv_results
cv_results = ____
# Append the final round rmse to best_rmse
____.____(____["____"].tail().values[-1])
# Print the resultant DataFrame
print(pd.DataFrame(list(zip(eta_vals, best_rmse)), columns=["eta","best_rmse"]))
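For reference, here is one possible completion of the blanks. It is a sketch, not the official solution: it assumes xgboost is imported as xgb, pandas as pd, and that the housing features X and target y are already loaded in the exercise environment.
# One possible completion (sketch; assumes X, y, xgb, and pd are already available)
housing_dmatrix = xgb.DMatrix(data=X, label=y)
params = {"objective": "reg:squarederror", "max_depth": 3}
eta_vals = [0.001, 0.01, 0.1]
best_rmse = []
for curr_val in eta_vals:
    params["eta"] = curr_val
    # 3-fold CV with early stopping after 5 rounds, 10 boosting rounds, rmse metric
    cv_results = xgb.cv(dtrain=housing_dmatrix, params=params, nfold=3,
                        num_boost_round=10, early_stopping_rounds=5,
                        metrics="rmse", as_pandas=True, seed=123)
    # Keep the final round's test RMSE for this eta
    best_rmse.append(cv_results["test-rmse-mean"].tail().values[-1])
print(pd.DataFrame(list(zip(eta_vals, best_rmse)), columns=["eta", "best_rmse"]))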