Random forest performance
It is now time to see whether the random forest models you built in the previous exercise outperform the logistic regression model.
Remember that the recall of the logistic regression model on the validate set was 0.43.
This exercise is part of the course
Machine Learning in the Tidyverse
Exercise instructions
- Prepare the validate_actual and validate_predicted columns for each mtry/fold combination.
- Calculate the recall for each mtry/fold combination.
- Calculate the mean recall for each value of mtry.
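As a reminder of what the recall step measures, here is a minimal base R sketch. The vectors and the helper name recall_manual are made up for illustration and are not part of the course data; the recall() call in the sample code presumably comes from the Metrics package and should give the same result for binary inputs like these.

```r
# Toy logical vectors (invented for illustration, not from the course data)
actual    <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE)
predicted <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)

# Recall = true positives / (true positives + false negatives)
# recall_manual is a hypothetical helper, not a library function
recall_manual <- function(actual, predicted) {
  true_pos  <- sum(actual & predicted)   # actual TRUE, predicted TRUE
  false_neg <- sum(actual & !predicted)  # actual TRUE, predicted FALSE
  true_pos / (true_pos + false_neg)
}

recall_manual(actual, predicted)  # 2 of the 4 actual positives are found: 0.5
```

In the exercise, the same calculation is applied once per mtry/fold combination, which is why the binary vectors are prepared as list columns first.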
Interactive exercise
Try this exercise by completing the sample code.
cv_prep_rf <- cv_models_rf %>%
  mutate(
    # Prepare binary vector of actual Attrition values in validate
    validate_actual = map(validate, ~.x$___ == "___"),
    # Prepare binary vector of predicted Attrition values for validate
    validate_predicted = map2(.x = ___, .y = ___, ~predict(.x, .y, type = "response")$predictions == "Yes")
  )

# Calculate the validate recall for each cross validation fold
cv_perf_recall <- cv_prep_rf %>%
  mutate(recall = map2_dbl(.x = ___, .y = ___, ~recall(actual = .x, predicted = .y)))

# Calculate the mean recall for each mtry used
cv_perf_recall %>%
  group_by(___) %>%
  summarise(mean_recall = mean(___))
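Once the mean recall per mtry is available, it can be compared against the logistic regression baseline of 0.43. The sketch below shows one way to do that with dplyr. The data frame toy_recall and its values are invented purely for illustration and say nothing about how the real models perform.

```r
library(dplyr)

# Hypothetical per-fold recall values, invented purely for illustration
toy_recall <- tibble(
  mtry   = c(2, 2, 4, 4),
  recall = c(0.2, 0.4, 0.3, 0.5)
)

# Validate recall of the logistic regression model, as stated in the exercise
baseline_recall <- 0.43

# Average recall across folds for each mtry, then flag values above the baseline
comparison <- toy_recall %>%
  group_by(mtry) %>%
  summarise(mean_recall = mean(recall)) %>%
  mutate(beats_baseline = mean_recall > baseline_recall)

comparison
```

With your own results in place of toy_recall, the beats_baseline column indicates for which mtry values, if any, the random forest has a higher mean validate recall than the logistic regression model.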