1. Learn
  2. /
  3. Courses
  4. /
  5. Machine Learning with caret in R

Exercise

10-fold cross-validation

As you saw in the video, a better approach to validating models is to use multiple systematic test sets, rather than a single random train/test split. Fortunately, the caret package makes this very easy to do:

model <- train(y ~ ., my_data)

caret supports many types of cross-validation, and you can specify which type of cross-validation and the number of cross-validation folds with the trainControl() function, which you pass to the trControl argument in train():

model <- train(
  y ~ ., 
  my_data,
  method = "lm",
  trControl = trainControl(
    method = "cv", 
    number = 10,
    verboseIter = TRUE
  )
)

It's important to note that you pass the method for modeling to the main train() function and the method for cross-validation to the trainControl() function.

Instructions

100 XP
  • Fit a linear regression to model price using all other variables in the diamonds dataset as predictors. Use the train() function and 10-fold cross-validation. (Note that we've taken a subset of the full diamonds dataset to speed up this operation, but it's still named diamonds.)
  • Print the model to the console and examine the results.