1. Why tune your model?

So far, you've learned how to use XGBoost to solve classification and regression problems. Now, you'll learn how to supercharge those models by tuning them. To motivate this chapter on tuning your XGBoost model, let's take a look at two cases: one where we take the simplest XGBoost model possible and compute a cross-validated RMSE, and one where we do the exact same thing with a tuned XGBoost model. What do you think the effect of model tuning on the overall reduction in RMSE will be?

2. Untuned model example

In lines 1-6, we simply load in the necessary libraries and the Ames housing data, and then convert our data into a DMatrix. In line 7, we create the most basic parameter configuration possible, passing in only the objective function we need to create a regression XGBoost model. This parameter configuration will become much more complex as we tune our models. In fact, when performing parameter searches, we will use a dictionary that we typically call a parameter grid, because it will contain ranges of values over which we will search to find an optimal configuration. More on that later. In line 8, we run our cross-validation in XGBoost, passing in the simple parameter grid and telling it to run 4-fold cross-validation and to output the RMSE as an evaluation metric. In line 9, we simply print the final RMSE of the untuned model to screen, which is around 34,600 dollars.
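Here is a minimal sketch of what that untuned setup might look like. The file name and seed are assumptions, and the objective string reflects recent XGBoost versions ("reg:squarederror"; older releases used the now-deprecated "reg:linear"):

```python
import pandas as pd
import xgboost as xgb

# Load the Ames housing data (file name is an assumption)
housing_data = pd.read_csv("ames_housing_trimmed_processed.csv")
X, y = housing_data.iloc[:, :-1], housing_data.iloc[:, -1]

# Convert the data into XGBoost's optimized DMatrix format
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# The simplest possible configuration: only the objective function
untuned_params = {"objective": "reg:squarederror"}

# Run 4-fold cross-validation, reporting RMSE
untuned_cv_results = xgb.cv(dtrain=housing_dmatrix,
                            params=untuned_params,
                            nfold=4,
                            metrics="rmse",
                            as_pandas=True,
                            seed=123)

# Print the final cross-validated RMSE of the untuned model
print("Untuned rmse: %f" % untuned_cv_results["test-rmse-mean"].iloc[-1])
```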

3. Tuned model example

Now let's take a look at a tuned example. Again, in lines 1-6, we load in the necessary libraries and the Ames housing data, and then convert our data into a DMatrix. In line 7, we create a more tuned parameter configuration, setting colsample_bytree, learning_rate, and max_depth to better values. These are a few of the more important XGBoost parameters that can be tuned, and you will learn more about them, and practice tuning them, later in this chapter. In line 8, we run our cross-validation in XGBoost, passing in our tuned parameter grid, setting the number of trees to be constructed to 200, and again running 4-fold cross-validation and outputting the RMSE as an evaluation metric. In line 9, we print the final RMSE of the tuned model to screen, which is around 29,800 dollars. That's an almost 14% reduction in RMSE!
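A matching sketch of the tuned version, reusing housing_dmatrix from the untuned example above. The specific parameter values are illustrative assumptions, not necessarily the exact values used in the video:

```python
# A hand-tuned configuration (values here are illustrative assumptions)
tuned_params = {"objective": "reg:squarederror",
                "colsample_bytree": 0.3,
                "learning_rate": 0.1,
                "max_depth": 5}

# Same 4-fold cross-validation, now building 200 trees
tuned_cv_results = xgb.cv(dtrain=housing_dmatrix,
                          params=tuned_params,
                          num_boost_round=200,
                          nfold=4,
                          metrics="rmse",
                          as_pandas=True,
                          seed=123)

# Print the final cross-validated RMSE of the tuned model
print("Tuned rmse: %f" % tuned_cv_results["test-rmse-mean"].iloc[-1])
```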

4. Let's tune some models!

Now that you've seen the significant improvement in model performance you can get by tuning an XGBoost model, let's have you start doing some tuning yourself!