glmnet with custom trainControl and tuning
As you saw in the video, the glmnet model actually fits many models at once (one of the great things about the package). You can exploit this by passing a large number of lambda values, which control the amount of penalization in the model. train() is smart enough to only fit one model per alpha value and pass all of the lambda values at once for simultaneous fitting.
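To make the "many models at once" point concrete, here is a minimal sketch that calls glmnet directly on simulated data. The data, the seed, and the variable names are assumptions for illustration only; this is not the course's overfit dataset, and the block is skipped if the glmnet package is not installed.

```r
# Sketch only: simulated data, not the course's `overfit` dataset.
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(42)
  x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)  # 100 rows, 5 predictors
  y <- rnorm(100)                                     # arbitrary numeric response

  lambdas <- seq(0.0001, 1, length.out = 100)

  # A single call fits the path of lambda values for one alpha (here, lasso).
  fit <- glmnet::glmnet(x, y, alpha = 1, lambda = lambdas)

  n_lambda <- length(fit$lambda)  # number of penalty values actually fitted
  coef_dim <- dim(coef(fit))      # (intercept + predictors) x fitted lambdas
}
```

The coefficient matrix has one column per fitted lambda value, which is why a long lambda sequence costs little extra for a given alpha.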
My favorite tuning grid for glmnet models is:
expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length = 100)
)
This grid explores a large number of lambda values (100, in fact), from a very small one to a very large one. (You could increase the maximum lambda to 10, but in this exercise 1 is a good upper bound.)
If you want to explore fewer models, you can use a shorter lambda sequence. For example, lambda = seq(0.0001, 1, length = 10) would fit 10 models per value of alpha.
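A quick way to check how many candidate models a grid contains is to count its rows. The sketch below uses base R only; the names full_grid and short_grid are illustrative.

```r
# `length` in seq() partially matches the full argument name `length.out`.
full_grid <- expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length.out = 100)
)
short_grid <- expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length.out = 10)
)

n_full <- nrow(full_grid)     # 2 alphas x 100 lambdas
n_short <- nrow(short_grid)   # 2 alphas x 10 lambdas

# Number of lambda values per alpha in the shorter grid
per_alpha <- table(short_grid$alpha)
```

Each row is one (alpha, lambda) combination, so the full grid describes 200 candidate models and the shorter one 20.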
This tuneGrid also covers the two forms of penalized models: ridge regression and lasso regression. alpha = 0 is pure ridge regression, and alpha = 1 is pure lasso regression. You can fit a mixture of the two models (i.e. an elastic net) using an alpha between 0 and 1. For example, alpha = 0.05 would be 95% ridge regression and 5% lasso regression.
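The mixing can be seen in the penalty itself. The sketch below evaluates the elastic net penalty as parameterized in glmnet, lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta))). The function elastic_net_penalty and the coefficient vector beta are hypothetical, written here only for illustration; they are not part of glmnet or caret.

```r
# Hypothetical helper (not part of glmnet or caret): evaluates the
# elastic net penalty for a coefficient vector `beta`.
elastic_net_penalty <- function(beta, alpha, lambda) {
  ridge_part <- (1 - alpha) / 2 * sum(beta^2)  # L2 (ridge) term
  lasso_part <- alpha * sum(abs(beta))         # L1 (lasso) term
  lambda * (ridge_part + lasso_part)
}

beta <- c(1, -2, 0.5)  # made-up coefficients

pure_ridge <- elastic_net_penalty(beta, alpha = 0, lambda = 1)
pure_lasso <- elastic_net_penalty(beta, alpha = 1, lambda = 1)
mixture <- elastic_net_penalty(beta, alpha = 0.05, lambda = 1)
```

With alpha = 0.05 the ridge term gets weight 0.95 and the lasso term gets weight 0.05, which is the sense in which the model is "95% ridge and 5% lasso".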
In this problem you'll just explore the 2 extremes – pure ridge and pure lasso regression – for the purpose of illustrating their differences.
This is a part of the course “Machine Learning with caret in R”
Exercise instructions
- Train a glmnet model on the overfit data such that y is the response variable and all other variables are explanatory variables. Make sure to use your custom trainControl from the previous exercise (myControl). Also, use a custom tuneGrid to explore alpha = 0:1 and 20 values of lambda between 0.0001 and 1 per value of alpha.
- Print model to the console.
- Print the max() of the ROC statistic in model[["results"]]. You can access it using model[["results"]][["ROC"]].
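The access pattern in the last step can be tried on a stand-in object before running the real model. In the sketch below, mock_model is a plain list with made-up numbers, not output from train(); it only mimics the shape of a results data frame with alpha, lambda, and ROC columns.

```r
# Stand-in for a fitted caret object: a plain list with made-up numbers,
# used only to show how the [[ ]] access works.
mock_model <- list(
  results = data.frame(
    alpha = c(0, 0, 1, 1),
    lambda = c(0.0001, 1, 0.0001, 1),
    ROC = c(0.40, 0.45, 0.42, 0.50)
  )
)

roc_values <- mock_model[["results"]][["ROC"]]  # numeric vector of ROC values
best_roc <- max(roc_values)                     # largest ROC in the table

# Row of the results table that achieved the largest ROC
best_row <- mock_model[["results"]][which.max(roc_values), ]
```

Using which.max() alongside max() shows which alpha and lambda combination produced the best value, not just the value itself.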
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Train glmnet with custom trainControl and tuning: model
model <- train(
  ___,
  ___,
  tuneGrid = ___(
    ___,
    ___
  ),
  method = ___,
  trControl = ___
)

# Print model to console

# Print maximum ROC statistic