Build a random forest model
Here you will use the same cross-validation data to build (using train) and evaluate (using validate) a random forest for each partition. Because these are the same cross-validation partitions you used for your regression models, you can directly compare the performance of the two kinds of model.
Note: We will limit each random forest to 100 trees so that it finishes fitting in a reasonable time. The default number of trees for ranger() is 500.
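As a minimal sketch of what the num.trees argument does, the following fits a single forest on the built-in mtcars data and inspects the fitted object. The mtcars data and the name rf_small are assumptions chosen for illustration only; they are not part of the course data.

```r
library(ranger)

# Illustration only: mtcars stands in for the course data.
rf_small <- ranger(formula = mpg ~ ., data = mtcars,
                   num.trees = 100, seed = 42)

# Number of trees actually grown in this forest
rf_small$num.trees

# Out-of-bag prediction error (mean squared error for regression)
rf_small$prediction.error
```

Fixing seed makes the fit reproducible, which helps when comparing models across runs.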
This exercise is part of the course
Machine Learning in the Tidyverse
Exercise instructions
- Use ranger() to build a random forest predicting life_expectancy using all features in train for each cross-validation partition.
- Add a new column validate_predicted predicting the life_expectancy for the observations in validate using the random forest models you just created.
Interactive exercise
Complete the sample code to successfully finish this exercise.
library(ranger)
# Build a random forest model for each fold
cv_models_rf <- cv_data %>%
  mutate(model = map(___, ~ranger(formula = ___, data = ___,
                                  num.trees = 100, seed = 42)))
# Generate predictions using the random forest model
cv_prep_rf <- cv_models_rf %>%
  mutate(validate_predicted = map2(.x = ___, .y = ___, ~predict(.x, .y)$predictions))
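Once each fold has predictions, evaluating it means comparing those predictions with the observed values in validate. The sketch below shows one way to do that, using mean absolute error as an assumed metric; the exercise text does not name one. The vectors and the names validate_actual and fold_mae are hypothetical placeholders, not results from the course data.

```r
library(purrr)

# Hypothetical per-fold values for illustration; not course results.
validate_actual    <- list(c(70, 65, 80), c(72, 68))
validate_predicted <- list(c(71, 63, 79), c(70, 69))

# Mean absolute error for each fold: average of |actual - predicted|
fold_mae <- map2_dbl(validate_actual, validate_predicted,
                     ~mean(abs(.x - .y)))

# Per-fold errors, then their average across folds
fold_mae
mean(fold_mae)
```

Computing the same metric on the same folds for the regression models gives a like-for-like comparison between the two approaches.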