
# glmnet with custom trainControl and tuning

As you saw in the video, the `glmnet` model actually fits many models at once (one of the great things about the package). You can exploit this by passing a large number of `lambda` values, which control the amount of penalization in the model. `train()` is smart enough to fit only one model per `alpha` value and pass all of the `lambda` values at once for simultaneous fitting.

My favorite tuning grid for `glmnet` models is:

```
expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length = 100)
)
```

This grid explores a large number of `lambda` values (100, in fact), from a very small one to a very large one. (You could increase the maximum `lambda` to 10, but in this exercise 1 is a good upper bound.)

If you want to explore fewer models, you can use a shorter lambda sequence. For example, `lambda = seq(0.0001, 1, length = 10)` would fit 10 models per value of alpha.
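Since `expand.grid()` just builds a data frame with one row per alpha/lambda combination, you can inspect it directly to confirm how many models you're asking for. A minimal sketch (the name `my_grid` is just for illustration):

```
# 2 alpha values x 10 lambda values = 20 rows, one per candidate model
my_grid <- expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length = 10)
)
nrow(my_grid)  # 20
head(my_grid)
```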

You also look at the two forms of penalized models with this `tuneGrid`: ridge regression and lasso regression. `alpha = 0` is pure ridge regression, and `alpha = 1` is pure lasso regression. You can fit a mixture of the two models (i.e. an elastic net) using an `alpha` between 0 and 1. For example, `alpha = 0.05` would be 95% ridge regression and 5% lasso regression.
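If you want to see the two extremes outside of `caret`, here is a minimal sketch using `glmnet` directly. The `x` and `y` objects are made-up data, purely for illustration; `glmnet()` expects a numeric predictor matrix rather than a formula:

```
library(glmnet)

set.seed(42)
x <- matrix(rnorm(100 * 10), ncol = 10)  # 100 rows, 10 predictors
y <- rnorm(100)                          # a continuous response

ridge <- glmnet(x, y, alpha = 0)  # pure ridge: shrinks coefficients toward zero
lasso <- glmnet(x, y, alpha = 1)  # pure lasso: can zero coefficients out entirely

# Each fit contains a whole path of models across many lambda values
plot(lasso, xvar = "lambda", label = TRUE)
```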

In this problem you'll just explore the two extremes (pure ridge and pure lasso regression) for the purpose of illustrating their differences.

## Instructions

- Train a `glmnet` model on the `overfit` data such that `y` is the response variable and all other variables are explanatory variables. Make sure to use your custom `trainControl` from the previous exercise (`myControl`). Also, use a custom `tuneGrid` to explore `alpha = 0:1` and 20 values of `lambda` between 0.0001 and 1 per value of alpha (see the sketch after this list).
- Print `model` to the console.
- Print the `max()` of the ROC statistic in `model[["results"]]`. You can access it using `model[["results"]][["ROC"]]`.
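Putting the pieces together, here is a minimal sketch of the shape of the call. It assumes `overfit` and `myControl` already exist in your workspace, as set up in the previous exercise (in particular, `myControl` needs class probabilities and a ROC-based summary for the last step to work):

```
library(caret)

# Fit glmnet over both alpha extremes, with 20 lambda values per alpha
model <- train(
  y ~ .,
  data = overfit,
  method = "glmnet",
  tuneGrid = expand.grid(
    alpha = 0:1,
    lambda = seq(0.0001, 1, length = 20)
  ),
  trControl = myControl
)

# Print the full tuning results
model

# Highest ROC across all alpha/lambda combinations
max(model[["results"]][["ROC"]])
```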