glmnet with custom trainControl and tuning

As you saw in the video, the glmnet model actually fits many models at once (one of the great things about the package). You can exploit this by passing a large number of lambda values, which control the amount of penalization in the model. train() is smart enough to only fit one model per alpha value and pass all of the lambda values at once for simultaneous fitting.

My favorite tuning grid for glmnet models is:

expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length = 100)
)

This grid explores a large number of lambda values (100, in fact), from a very small one to a very large one. (You could increase the maximum lambda to 10, but in this exercise 1 is a good upper bound.)

If you want to explore fewer models, you can use a shorter lambda sequence. For example, lambda = seq(0.0001, 1, length = 10) would fit 10 models per value of alpha.
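To see exactly how many models each grid implies, you can build both grids in base R and count their rows — a quick sketch that needs no caret at all (the `big_grid` / `small_grid` names are just for illustration):

```r
# Full grid: 2 alpha values crossed with 100 lambda values
big_grid <- expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length = 100)
)

# Smaller grid: 2 alpha values crossed with 10 lambda values
small_grid <- expand.grid(
  alpha = 0:1,
  lambda = seq(0.0001, 1, length = 10)
)

nrow(big_grid)    # 200 rows: one model per alpha/lambda combination
nrow(small_grid)  # 20 rows
```

Since train() fits all the lambdas for a given alpha in one pass, the effective number of glmnet fits is one per alpha, even though the grid has a row per combination.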

This tuneGrid also covers the two classic forms of penalized regression: ridge regression and lasso regression. alpha = 0 is pure ridge regression, and alpha = 1 is pure lasso regression. Any alpha between 0 and 1 fits a mixture of the two (i.e. an elastic net). For example, alpha = 0.05 would be 95% ridge regression and 5% lasso regression.
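To make the mixing concrete, here is a hand-rolled sketch of the elastic net penalty that glmnet combines, with alpha as the mixing weight between the ridge (squared) and lasso (absolute-value) terms. The function name and the example coefficient vector are made up for illustration; a real glmnet fit computes this internally:

```r
# Sketch of the elastic net penalty, per glmnet's documented objective:
# lambda * ((1 - alpha)/2 * sum(beta^2) + alpha * sum(abs(beta)))
elastic_net_penalty <- function(beta, alpha, lambda) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

beta <- c(1, -2, 0.5)  # hypothetical coefficient vector

elastic_net_penalty(beta, alpha = 0, lambda = 0.1)  # pure ridge: 0.2625
elastic_net_penalty(beta, alpha = 1, lambda = 0.1)  # pure lasso: 0.35
```

At alpha = 0 only the squared term contributes (shrinking all coefficients smoothly), while at alpha = 1 only the absolute-value term contributes (driving some coefficients exactly to zero).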

In this problem you'll explore just the two extremes — pure ridge and pure lasso regression — for the purpose of illustrating their differences.

This is part of the course “Machine Learning with caret in R”.

Exercise instructions

  • Train a glmnet model on the overfit data such that y is the response variable and all other variables are explanatory variables. Make sure to use your custom trainControl from the previous exercise (myControl). Also, use a custom tuneGrid to explore alpha = 0:1 and 20 values of lambda between 0.0001 and 1 per value of alpha.
  • Print model to the console.
  • Print the max() of the ROC statistic in model[["results"]]. You can access it using model[["results"]][["ROC"]].
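As a sketch of that last step, here is what extracting the maximum ROC looks like against a mock of the results table — the data frame shape follows caret's `model[["results"]]`, but the ROC numbers below are hypothetical, not from a real fit:

```r
# Mock of model[["results"]]: one row per alpha/lambda combination
# (ROC values are made up for illustration)
results <- data.frame(
  alpha  = c(0, 0, 1, 1),
  lambda = c(0.0001, 0.5, 0.0001, 0.5),
  ROC    = c(0.45, 0.46, 0.44, 0.43)
)

# Same pattern as max(model[["results"]][["ROC"]])
max(results[["ROC"]])  # best ROC across all tuning combinations: 0.46
```

On a trained model you would replace `results` with `model[["results"]]`; the winning alpha/lambda row is the one train() selects as the final model.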

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Train glmnet with custom trainControl and tuning: model
model <- train(
  ___, 
  ___,
  tuneGrid = ___(
    ___,
    ___
  ),
  method = ___,
  trControl = ___
)

# Print model to console


# Print maximum ROC statistic


This course teaches the big ideas in machine learning like how to build and evaluate predictive models.

In this chapter, you will use the train() function to tweak model parameters through cross-validation and grid search.

