Tuning eta
It's time to practice tuning other XGBoost hyperparameters in earnest and observing their effect on model performance! You'll begin by tuning "eta", also known as the learning rate.
The learning rate in XGBoost is a parameter that can range between 0 and 1. After each boosting round, the newly learned feature weights are shrunk by a factor of "eta", so lower values of "eta" shrink each tree's contribution more strongly and make the boosting process more conservative, acting as a form of regularization.
Exercise instructions
- Create a list called eta_vals to store the following "eta" values: 0.001, 0.01, and 0.1.
- Iterate over your eta_vals list using a for loop.
- In each iteration of the for loop, set the "eta" key of params to be equal to curr_val. Then, perform 3-fold cross-validation with early stopping (5 rounds), 10 boosting rounds, a metric of "rmse", and a seed of 123. Ensure the output is a DataFrame.
- Append the final round RMSE to the best_rmse list.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create your housing DMatrix: housing_dmatrix
housing_dmatrix = xgb.DMatrix(data=X, label=y)
# Create the parameter dictionary for each tree (boosting round)
params = {"objective":"reg:squarederror", "max_depth":3}
# Create list of eta values and empty list to store final round rmse per xgboost model
____ = [____, ____, ____]
best_rmse = []
# Systematically vary the eta
for curr_val in ____:
    params["____"] = curr_val
    # Perform cross-validation: cv_results
    cv_results = ____
    # Append the final round rmse to best_rmse
    ____.____(____["____"].tail().values[-1])
# Print the resultant DataFrame
print(pd.DataFrame(list(zip(eta_vals, best_rmse)), columns=["eta","best_rmse"]))
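One way to complete the exercise is sketched below. It assumes, as in the course environment, that the feature matrix X and target y are already loaded, and it fills the blanks using the standard xgb.cv API; treat it as a reference sketch rather than the only valid solution.

import pandas as pd
import xgboost as xgb

# X (features) and y (target) are assumed to be preloaded by the exercise environment
# Create your housing DMatrix: housing_dmatrix
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# Create the parameter dictionary for each tree (boosting round)
params = {"objective": "reg:squarederror", "max_depth": 3}

# Create list of eta values and empty list to store final round rmse per xgboost model
eta_vals = [0.001, 0.01, 0.1]
best_rmse = []

# Systematically vary the eta
for curr_val in eta_vals:
    params["eta"] = curr_val

    # Perform 3-fold cross-validation with early stopping (5 rounds),
    # 10 boosting rounds, the "rmse" metric, and a seed of 123: cv_results
    cv_results = xgb.cv(dtrain=housing_dmatrix, params=params, nfold=3,
                        early_stopping_rounds=5, num_boost_round=10,
                        metrics="rmse", seed=123, as_pandas=True)

    # Append the final round rmse to best_rmse
    best_rmse.append(cv_results["test-rmse-mean"].tail().values[-1])

# Print the resultant DataFrame
print(pd.DataFrame(list(zip(eta_vals, best_rmse)), columns=["eta", "best_rmse"]))

Running this prints one row per "eta" value, so you can compare how the final-round RMSE changes as the learning rate grows.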