
10-fold cross-validation

As you saw in the video, a better approach to validating models is to use multiple systematic test sets, rather than a single random train/test split. Fortunately, the caret package makes this very easy to do:

model <- train(y ~ ., my_data)
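For intuition about what is being automated here, the following is a minimal base R sketch of 10-fold cross-validation done by hand. It is an illustration under stated assumptions, not caret's internal implementation: the built-in mtcars dataset, the formula mpg ~ wt + hp, and the names fold_id, fold_rmse, and cv_rmse are all stand-ins chosen for this example, and caret's own fold assignment may differ from the simple random assignment used here.

```r
# Hand-rolled 10-fold CV sketch (base R only; mtcars is a stand-in dataset)
set.seed(42)
k <- 10
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other k - 1 folds, then score on the held-out fold
fold_rmse <- sapply(seq_len(k), function(i) {
  train_rows <- mtcars[fold_id != i, ]
  test_rows  <- mtcars[fold_id == i, ]
  fit  <- lm(mpg ~ wt + hp, data = train_rows)
  pred <- predict(fit, newdata = test_rows)
  sqrt(mean((test_rows$mpg - pred)^2))
})

# Average the per-fold errors to get one out-of-sample RMSE estimate
cv_rmse <- mean(fold_rmse)
print(cv_rmse)
```

Every row is used for testing exactly once, which is what makes the test sets systematic rather than a single random split.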

caret supports many types of cross-validation. You specify the type and the number of folds with the trainControl() function, then pass the result to the trControl argument of train():

model <- train(
  y ~ ., 
  my_data,
  method = "lm",
  trControl = trainControl(
    method = "cv", 
    number = 10,
    verboseIter = TRUE
  )
)

It's important to note that you pass the method for modeling to the main train() function and the method for cross-validation to the trainControl() function.
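To see the two method arguments side by side, here is a small sketch that fits a model and then checks where each setting ended up on the fitted object. It assumes the caret package is installed and uses the built-in mtcars dataset with an arbitrary formula as a stand-in; the name model_cv is hypothetical and chosen only for this example.

```r
library(caret)

set.seed(42)
model_cv <- train(
  mpg ~ wt + hp,
  data = mtcars,
  method = "lm",        # modeling method: argument of train()
  trControl = trainControl(
    method = "cv",      # resampling method: argument of trainControl()
    number = 10
  )
)

# The modeling method is stored on the fitted object itself
model_cv$method

# The resampling method is stored inside the control settings
model_cv$control$method

# Per-fold performance: one row for each of the 10 held-out folds
model_cv$resample

# Performance averaged across the folds
model_cv$results
```

Swapping the string passed to train() changes the model being fit, while swapping the string passed to trainControl() changes only how that model is validated.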

This exercise is part of the course

Machine Learning with caret in R


Exercise instructions

  • Fit a linear regression to model price using all other variables in the diamonds dataset as predictors. Use the train() function and 10-fold cross-validation. (Note that we've taken a subset of the full diamonds dataset to speed up this operation, but it's still named diamonds.)
  • Print the model to the console and examine the results.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Fit lm model using 10-fold CV: model
model <- train(
  ___, 
  ___,
  method = "lm",
  trControl = trainControl(
    method = "cv", 
    number = ___,
    verboseIter = TRUE
  )
)

# Print model to console