Create the folds
Splitting data only once into training and test sets is statistically fragile: there is a small chance that your test set contains only high-rated beans, while all the low-rated beans end up in your training set. It also means that you can measure the performance of your model only once.
Cross-validation gives you a more robust estimate of your out-of-sample performance because it evaluates the model several times, each time on a different held-out part of the data, so a single unlucky split matters far less.
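To make the idea concrete, here is a minimal base R sketch of how rows can be assigned to folds. The row count `n`, the fold count `v`, and the name `fold_id` are hypothetical and chosen only for illustration; they are not part of the exercise.

```r
# Hypothetical toy setup: 20 rows standing in for a training set, 5 folds.
set.seed(20)
n <- 20
v <- 5

# Randomly assign each row to one of v folds of equal size.
fold_id <- sample(rep(seq_len(v), length.out = n))

# Each fold is held out once for assessment; the remaining rows are used for fitting.
for (k in seq_len(v)) {
  assess_rows <- which(fold_id == k)
  analysis_rows <- which(fold_id != k)
  cat("Fold", k, ": fit on", length(analysis_rows),
      "rows, assess on", length(assess_rows), "rows\n")
}
```

Every row is used for assessment exactly once, which is why the averaged result depends less on any single split.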
In this exercise, you will create folds of your training data, chocolate_train, which is pre-loaded.
This exercise is part of the course
Machine Learning with Tree-Based Models in R
Exercise instructions
- Set a seed of 20 for reproducibility.
- Create 10 folds of chocolate_train and save the result as chocolate_folds.
Interactive exercise
Try this exercise by completing this sample code.
# Set seed for reproducibility
___
# Build 10 folds
chocolate_folds <- ___(___, v = ___)
print(chocolate_folds)
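For orientation, the same pattern can be sketched on a built-in dataset. This assumes the rsample package (part of tidymodels) is the intended tool, which the exercise text does not state explicitly; `mtcars` and the names `car_folds` and `first_split` are stand-ins chosen for illustration, not the exercise's data.

```r
# Assumption: rsample is installed; mtcars stands in for the training data.
library(rsample)

# Fix the random number generator so the fold assignment is reproducible.
set.seed(20)

# Split the data into 10 folds.
car_folds <- vfold_cv(mtcars, v = 10)
print(car_folds)

# Inspect the first split: rows used for fitting vs. rows held out.
first_split <- car_folds$splits[[1]]
nrow(analysis(first_split))
nrow(assessment(first_split))
```

Each row of the printed object describes one split; `analysis()` and `assessment()` return the fitting and held-out portions of that split.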