
Review of grid search and random search

1. Review of grid search and random search

How do we find the optimal values for several hyperparameters simultaneously, leading to the lowest loss possible, when their values interact in non-obvious, non-linear ways? Two common strategies for choosing several hyperparameter values at once are grid search and random search, so it's important that we review them here, see what their advantages and disadvantages are, and look at examples of how both can be used with the XGBoost and scikit-learn packages.

2. Grid search: review

Grid search is a method of exhaustively searching through a collection of possible hyperparameter values. For example, if you have 2 hyperparameters you would like to tune and 4 possible values for each, then a grid search over that parameter space would try all 4 × 4 = 16 possible configurations. In a grid search, you try every parameter configuration, evaluate some metric for that configuration, and pick the configuration that gave you the best value for the metric you were using, which in our case will be the root mean squared error (RMSE).
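To make that combinatorics concrete, here is a minimal sketch (with made-up hyperparameter values) showing how 2 hyperparameters with 4 candidate values each yield 16 configurations:

```python
from itertools import product

# Hypothetical candidate values: 2 hyperparameters, 4 values each
learning_rates = [0.01, 0.05, 0.1, 0.5]
max_depths = [2, 4, 6, 8]

# A grid search enumerates the full Cartesian product of the values
configs = list(product(learning_rates, max_depths))
print(len(configs))  # 4 * 4 = 16 configurations
```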

3. Grid search: example

Let's go over an example of how to grid search over several hyperparameters using XGBoost and scikit-learn. In lines 1-4 we load the necessary libraries, including GridSearchCV from sklearn.model_selection. In lines 5-7 we load our dataset and convert it into a DMatrix. In line 8 we create the grid of hyperparameters we want to search over: 4 different learning rates (or eta values), 3 different subsample values, and a single number of trees. That gives 12 distinct hyperparameter configurations, so 12 different models will be built. In line 9 we create our regressor, and in line 10 we pass the XGBRegressor object, parameter grid, evaluation metric, and number of cross-validation folds to GridSearchCV. We then immediately fit that grid search object in line 11, just as we would with any other scikit-learn estimator. In line 12, having fit the grid search object, we extract the best parameters the grid search found and print them to the screen. In line 13, we get the RMSE that corresponds to those best parameters and see that it's roughly $28,500.
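The slide code itself isn't reproduced in this transcript, so here is a minimal sketch of what the numbered lines describe. The filename, column layout, the exact hyperparameter values, and the 4-fold cross-validation setting are assumptions; RMSE is recovered from scikit-learn's neg_mean_squared_error scorer, one standard way to score regression in GridSearchCV:

```python
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Load the data and split into features/target
# (filename and column layout are hypothetical)
housing_data = pd.read_csv("ames_housing_trimmed_processed.csv")
X, y = housing_data.iloc[:, :-1], housing_data.iloc[:, -1]

# DMatrix created for consistency with earlier lessons;
# GridSearchCV itself fits directly on X and y
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# 4 learning rates x 3 subsample values x 1 n_estimators value
# = 12 configurations
gbm_param_grid = {
    "learning_rate": [0.01, 0.1, 0.5, 0.9],
    "subsample": [0.3, 0.5, 0.9],
    "n_estimators": [200],
}

gbm = xgb.XGBRegressor()
grid_mse = GridSearchCV(
    estimator=gbm,
    param_grid=gbm_param_grid,
    scoring="neg_mean_squared_error",
    cv=4,  # number of folds is an assumption
)
grid_mse.fit(X, y)

print("Best parameters found:", grid_mse.best_params_)
# best_score_ is negative MSE, so negate and take the square root
print("Lowest RMSE found:", np.sqrt(np.abs(grid_mse.best_score_)))
```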

4. Random search: review

Random search is significantly different from grid search in that the number of models you iterate over doesn't grow as you expand the overall hyperparameter space. In random search, you decide up front how many models, or iterations, you want to try before stopping. Each iteration simply draws a random combination of hyperparameter values from the range of allowable hyperparameters, trains a model with the selected values, evaluates that model's performance, and then rinses and repeats. When you've created the number of models you specified initially, you simply pick the best one.
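Written out as a loop, the procedure just described might look like the following sketch. The toy dataset, value ranges, and fixed budget of 25 iterations are all placeholders, not the course's actual code:

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Toy data standing in for the housing set
X, y = make_regression(n_samples=500, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
best_rmse, best_params = float("inf"), None

# Fixed budget of iterations, regardless of how big the search space is
for _ in range(25):
    # Draw one random combination of hyperparameter values
    params = {
        "learning_rate": rng.uniform(0.01, 1.0),
        "subsample": rng.uniform(0.3, 1.0),
    }
    # Train and evaluate a model with the drawn values
    model = xgb.XGBRegressor(n_estimators=200, **params)
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_val, model.predict(X_val)) ** 0.5
    # Keep the best configuration seen so far
    if rmse < best_rmse:
        best_rmse, best_params = rmse, params

print(best_params, best_rmse)
```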

5. Random search: example

To finish this lesson off, let's look at a full random search example. In lines 1-7, we load the necessary modules, this time importing RandomizedSearchCV from sklearn.model_selection, and then load the data and convert it to a DMatrix object as always. In line 8 we create our parameter grid, this time generating a large number of learning rate and subsample values using np.arange. There are 20 values for learning_rate (or eta) and 20 values for subsample, which would mean 400 models to try if we were running a grid search (which we aren't doing here). In line 9 we create our XGBRegressor object, and in line 10 we create our RandomizedSearchCV object, passing in the regressor and the parameter grid we just created. We set the number of iterations for the random search to 25, so we know it will not be able to try all 400 possible parameter configurations. We also specify the evaluation metric we want to use, and that we want to run 4-fold cross-validation on each iteration. In line 11 we fit our RandomizedSearchCV object, which can take a bit of time. Finally, lines 12 and 13 print the best model parameters found and the corresponding best RMSE.
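Again, the slide code isn't reproduced in the transcript; here is a sketch matching the description. The filename, the exact np.arange ranges, and the fixed n_estimators value are assumptions:

```python
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.model_selection import RandomizedSearchCV

# Load and prepare the data (filename is hypothetical)
housing_data = pd.read_csv("ames_housing_trimmed_processed.csv")
X, y = housing_data.iloc[:, :-1], housing_data.iloc[:, -1]
housing_dmatrix = xgb.DMatrix(data=X, label=y)

# 20 learning_rate values and 20 subsample values -> 400 possible combos
gbm_param_grid = {
    "learning_rate": np.arange(0.05, 1.05, 0.05),
    "subsample": np.arange(0.05, 1.05, 0.05),
    "n_estimators": [200],  # holding trees fixed is an assumption
}

gbm = xgb.XGBRegressor()
randomized_mse = RandomizedSearchCV(
    estimator=gbm,
    param_distributions=gbm_param_grid,
    n_iter=25,  # only 25 of the 400 combinations are tried
    scoring="neg_mean_squared_error",
    cv=4,       # 4-fold cross-validation per iteration
    random_state=123,
)
randomized_mse.fit(X, y)

print("Best parameters found:", randomized_mse.best_params_)
print("Lowest RMSE found:", np.sqrt(np.abs(randomized_mse.best_score_)))
```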

6. Let's practice!

Ok, now let's have you practice both grid search and random search in the following exercises.