Cross-validation data frames
Now that you have withheld a portion of your data as testing data, you can use the remaining portion to find the best-performing model.
In this exercise, you will split the training data into a series of 5 train-validate sets using the vfold_cv() function from the rsample package.
This exercise is part of the course
Machine Learning in the Tidyverse
Exercise instructions
- Build a data frame for 5-fold cross validation from the training_data using vfold_cv() and assign it to cv_split.
- Prepare cv_data by appending two new columns to cv_split:
  - train: containing the train data frames, obtained by mapping training() across the splits column.
  - validate: containing the validate data frames, obtained by mapping testing() across the splits column.
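To make the role of training() and testing() concrete, here is a minimal sketch that inspects a single fold. It assumes the rsample package is installed and uses the built-in mtcars data as a stand-in for training_data; the names folds, first_split, train_df and validate_df are illustrative only.

```r
library(rsample)

set.seed(42)
# mtcars is a stand-in here; the exercise itself uses training_data
folds <- vfold_cv(mtcars, v = 5)

# Each element of the splits column is a single split object
first_split <- folds$splits[[1]]

# Rows used to fit a model in this fold
train_df <- training(first_split)
# Rows held out to validate the model in this fold
validate_df <- testing(first_split)

# Together the two parts account for every row of the original data
nrow(train_df) + nrow(validate_df) == nrow(mtcars)
```

Mapping these two functions over the whole splits column repeats this extraction once per fold.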
Interactive exercise
Complete the sample code to successfully finish this exercise.
set.seed(42)
# Prepare the data frame containing the cross validation partitions
cv_split <- vfold_cv(___, v = ___)
cv_data <- cv_split %>%
  mutate(
    # Extract the train data frame for each split
    train = map(___, ~___(.x)),
    # Extract the validate data frame for each split
    validate = map(___, ~___(.x))
  )
# Use head() to preview cv_data
head(cv_data)
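One possible way to fill in the blanks, shown as a sketch rather than the official solution. It assumes rsample, dplyr and purrr are installed, and it substitutes mtcars for training_data, which the exercise environment normally provides.

```r
library(rsample)
library(dplyr)
library(purrr)

set.seed(42)

# Assumption: mtcars stands in for the exercise's training_data
training_data <- mtcars

# Build the 5-fold cross validation partitions
cv_split <- vfold_cv(training_data, v = 5)

cv_data <- cv_split %>%
  mutate(
    # Extract the train data frame for each split
    train = map(splits, ~training(.x)),
    # Extract the validate data frame for each split
    validate = map(splits, ~testing(.x))
  )

# Preview the result: one row per fold, with list-columns train and validate
head(cv_data)
```

Because train and validate are list-columns, each row of cv_data carries its own pair of data frames, which makes it straightforward to map a model-fitting function over the folds later.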