Using regularization in XGBoost

Having seen an example of l1 regularization in the video, you'll now vary the l2 regularization penalty - also known as "lambda" - and see its effect on overall model performance on the Ames housing dataset.

Cet exercice fait partie du cours

Extreme Gradient Boosting with XGBoost

Afficher le cours

Instructions

Create your DMatrix from X and y as before.
Create an initial parameter dictionary specifying an "objective" of "reg:squarederror" and "max_depth" of 3.
Use xgb.cv() inside of a for loop and systematically vary the "lambda" value by passing in the current l2 value (reg).
Append the "test-rmse-mean" from the last boosting round for each cross-validated xgboost model.
Hit 'Submit Answer' to view the results. What do you notice?

Exercice interactif pratique

Essayez cet exercice en complétant cet exemple de code.

# Create the DMatrix: housing_dmatrix
housing_dmatrix = xgb.DMatrix(data=X, label=y)

reg_params = [1, 10, 100]

# Create the initial parameter dictionary for varying l2 strength: params
params = {"____":"____","____":____}

# Create an empty list for storing rmses as a function of l2 complexity
rmses_l2 = []

# Iterate over reg_params
for reg in reg_params:

    # Update l2 strength
    params["lambda"] = ____
    
    # Pass this updated param dictionary into cv
    cv_results_rmse = ____.____(dtrain=____, params=____, nfold=2, num_boost_round=5, metrics="rmse", as_pandas=True, seed=123)
    
    # Append best rmse (final round) to rmses_l2
    ____.____(____["____"].tail(1).values[0])

# Look at best rmse per l2 param
print("Best rmse as a function of l2:")
print(pd.DataFrame(list(zip(reg_params, rmses_l2)), columns=["l2", "rmse"]))

Modifier et exécuter le code