
glmnet with custom tuning grid

1. glmnet with custom tuning grid

Random forest models are relatively easy to tune, as there's really only one parameter of importance: mtry.

2. Custom tuning glmnet models

glmnet models, on the other hand, have two tuning parameters: alpha (the mixing parameter between ridge and lasso regression) and lambda (the strength of the penalty on the coefficients). However, there's a trick to glmnet models: for a single value of alpha, glmnet fits all values of lambda simultaneously! This is called the sub-model trick, because we can fit a number of different models at once and then explore the results of each sub-model after the fact. We can also exploit this trick to get faster-running grid searches while still exploring fine-grained tuning grids.
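Here is a minimal sketch of the sub-model trick using the glmnet package directly, on simulated data (the data here is purely hypothetical, for illustration only):

library(glmnet)

# Hypothetical predictors and response
set.seed(42)
x <- matrix(rnorm(100 * 10), nrow = 100)
y <- rnorm(100)

# One call with a single alpha fits the entire path of lambda values at once
fit <- glmnet(x, y, alpha = 1)

# Coefficients for any lambda on the path can be extracted after the fact
coef(fit, s = 0.05)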

3. Example: glmnet tuning

With glmnet models, I usually like to explore two values of alpha: 0 and 1, with a wide range of lambdas. caret will use the sub-model trick to collapse the entire tuning grid down to 2 model fits, which will run pretty fast, even for 10 folds of cross-validation. Let's start by making a custom tuning grid, with alphas of 0 and 1 and lambdas between 0 and 0.1. We use the seq function to make a sequence of lambdas, and we use the length argument to determine how many values are in that sequence. Next we fit a glmnet model using the train function with our custom tuning grid, and plot the results.
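A minimal sketch of this workflow, using the built-in mtcars data as a stand-in for the course's dataset (which isn't shown here):

library(caret)

# Custom tuning grid: alpha = 0 (ridge) and 1 (lasso), lambdas between 0 and 0.1
myGrid <- expand.grid(
  alpha  = c(0, 1),
  lambda = seq(0, 0.1, length = 10)
)

set.seed(42)
model <- train(
  mpg ~ ., data = mtcars,
  method = "glmnet",
  tuneGrid = myGrid,
  trControl = trainControl(method = "cv", number = 10)
)

# Cross-validated error for each alpha/lambda combination
plot(model)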

4. Compare models visually

Recall that alpha equals 0 is ridge regression, and alpha equals 1 is lasso regression. In this case, we can see that lasso regression with a small lambda penalty is the best.
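You can also confirm the winning combination numerically rather than reading it off the plot (a sketch, assuming the model object fit above):

# Best alpha/lambda pair found by cross-validation
model$bestTune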

5. Full regularization path

We can also plot the full regularization path for all of the models with alpha = 0. This is a special plot, specific to glmnet models. On the left is the intercept-only model (high value of lambda) and on the right is the full model with no penalty (low value of lambda). The plot shows how the regression coefficients are "shrunk" from right to left as you increase the strength of the penalty on coefficient size, and therefore decrease the complexity of the model. You can also see some lines hitting zero as we increase lambda, which represent those coefficients dropping out of the model.
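A sketch of how to draw this plot, assuming the caret model object fit above. finalModel is the underlying glmnet fit (at the best alpha caret selected), and glmnet's plot method draws the coefficient paths across the full lambda sequence:

# Regularization path; each curve is one coefficient, labeled by variable index
plot(model$finalModel, label = TRUE)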

6. Let’s practice!

Let's explore this tuning grid on some other datasets.