10-fold cross-validation
As you saw in the video, a better approach to validating models is to use multiple systematic test sets, rather than a single random train/test split. Fortunately, the caret package makes this very easy to do:
model <- train(y ~ ., my_data)
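To make "multiple systematic test sets" concrete, here is a minimal sketch of 10-fold cross-validation done by hand in base R. It is only an illustration of the idea, not how caret implements it internally; the built-in mtcars data and the formula mpg ~ wt + hp are stand-ins chosen for this example and do not come from the exercise.

```r
# Manual 10-fold cross-validation for a linear model, base R only.
# mtcars and the formula mpg ~ wt + hp are illustrative assumptions.
set.seed(42)
k <- 10
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other k - 1 folds, predict on the held-out fold
fold_rmse <- sapply(seq_len(k), function(i) {
  train_rows <- mtcars[fold_id != i, ]
  test_rows  <- mtcars[fold_id == i, ]
  fit  <- lm(mpg ~ wt + hp, data = train_rows)
  pred <- predict(fit, newdata = test_rows)
  sqrt(mean((test_rows$mpg - pred)^2))
})

# Average the per-fold errors to get the cross-validated RMSE
cv_rmse <- mean(fold_rmse)
cv_rmse
```

Every row is used as test data exactly once, which is what distinguishes this from a single random train/test split.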
caret supports many types of cross-validation, and you can specify which type of cross-validation and the number of cross-validation folds with the trainControl() function, which you pass to the trControl argument in train():
model <- train(
  y ~ .,
  my_data,
  method = "lm",
  trControl = trainControl(
    method = "cv",
    number = 10,
    verboseIter = TRUE
  )
)
It's important to note that you pass the method for modeling to the main train() function and the method for cross-validation to the trainControl() function.
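The two method arguments can be varied independently. The sketch below keeps the modeling method as "lm" but switches the resampling scheme to repeated cross-validation, which trainControl() supports through method = "repeatedcv" and the repeats argument. The mtcars data, the formula, and the name model_repeated are assumptions made for this example only.

```r
library(caret)

set.seed(42)
# Same model type, different resampling scheme (illustrative data: mtcars)
model_repeated <- train(
  mpg ~ wt + hp,
  data = mtcars,
  method = "lm",               # modeling method: argument of train()
  trControl = trainControl(
    method = "repeatedcv",     # resampling method: argument of trainControl()
    number = 5,                # 5 folds per repeat
    repeats = 3                # the whole 5-fold procedure is run 3 times
  )
)

# Inspect which method ended up where
model_repeated$method
model_repeated$control$method
```

Changing only the trainControl() call leaves the model specification untouched, so the same train() call can be compared across resampling schemes.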
This exercise is part of the course
Machine Learning with caret in R
Exercise instructions
- Fit a linear regression to model price using all other variables in the diamonds dataset as predictors. Use the train() function and 10-fold cross-validation. (Note that we've taken a subset of the full diamonds dataset to speed up this operation, but it's still named diamonds.)
- Print the model to the console and examine the results.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Fit lm model using 10-fold CV: model
model <- train(
  ___,
  ___,
  method = "lm",
  trControl = trainControl(
    method = "cv",
    number = ___,
    verboseIter = TRUE
  )
)
# Print model to console
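For reference, one way the template could be completed is sketched below. It assumes the diamonds data comes from the ggplot2 package. The course's pre-made subset is not available here, so the example draws its own random sample of 1,000 rows under the hypothetical name diamonds_small; results will therefore differ from those in the course.

```r
library(caret)
library(ggplot2)  # assumed source of the diamonds dataset

set.seed(42)
# Hypothetical stand-in for the course's subset: a random sample of 1,000 rows
diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]

# Fit lm model using 10-fold CV: model
model <- train(
  price ~ .,                   # price modeled on all other variables
  diamonds_small,
  method = "lm",
  trControl = trainControl(
    method = "cv",
    number = 10,               # 10 folds
    verboseIter = TRUE         # print progress for each fold
  )
)

# Print model to console
print(model)
```

The printed summary reports resampled performance metrics such as RMSE and R-squared, averaged over the 10 folds.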