Build & evaluate the best model
Using cross-validation, you identified the best model for predicting life_expectancy from all the features in gapminder. Now that you've selected your model, you can use the independent set of data (testing_data) that you held out to estimate how this model performs on new data.
You will build this model using all of training_data and evaluate it using testing_data.
This exercise is part of the course
Machine Learning in the Tidyverse
Exercise instructions
- Use ranger() to build the best performing model (mtry = 4) using all of the training data. Assign this to best_model.
- Extract the life_expectancy column from testing_data and assign it to test_actual.
- Predict life_expectancy using the best_model on the testing data and assign it to test_predicted.
- Calculate the MAE using the test_actual and test_predicted vectors.
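
The MAE in the last step is the mean of the absolute differences between the actual and predicted values. A minimal base R sketch of that calculation, assuming the exercise's mae() follows this standard definition (mae_sketch and the toy vectors are hypothetical, not part of the course code):

```r
# Hypothetical helper: mean absolute error between two numeric vectors.
# Assumes the mae() used in the exercise computes the same quantity.
mae_sketch <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  mean(abs(actual - predicted))
}

# Toy values for illustration only (not gapminder data)
actual    <- c(70, 65, 80)
predicted <- c(72, 64, 77)

mae_sketch(actual, predicted)  # (2 + 1 + 3) / 3 = 2
```

Because the errors are averaged in the units of the target, an MAE of 2 here would read as "off by two years on average" for a life expectancy measured in years.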
Interactive exercise
Complete the sample code to finish this exercise successfully.
# Build the model using all training data and the best performing parameter
best_model <- ranger(formula = ___, data = ___,
                     mtry = ___, num.trees = 100, seed = 42)
# Prepare the test_actual vector
test_actual <- testing_data$___
# Predict life_expectancy for the testing_data
test_predicted <- predict(___, ___)$predictions
# Calculate the test MAE
mae(___, ___)
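
For comparison, the same fit, predict, and score pattern can be sketched on a different dataset. The example below uses R's built-in mtcars as a stand-in for gapminder, so the split, the variable names (train_df, test_df, fit), and the target mpg are all assumptions made for illustration; it is not the solution to this exercise. The MAE is computed in base R so the block depends only on the ranger package.

```r
library(ranger)

# Hypothetical train/test split on mtcars (stand-in data, not gapminder)
set.seed(42)
train_idx <- sample(nrow(mtcars), size = 24)
train_df  <- mtcars[train_idx, ]
test_df   <- mtcars[-train_idx, ]

# Fit a random forest on all of the training rows with a fixed mtry
fit <- ranger(formula = mpg ~ ., data = train_df,
              mtry = 4, num.trees = 100, seed = 42)

# Actual values from the held-out rows
actual <- test_df$mpg

# Predictions for the held-out rows
predicted <- predict(fit, test_df)$predictions

# Test MAE: mean absolute difference between actual and predicted
test_mae <- mean(abs(actual - predicted))
test_mae
```

The held-out rows play no part in fitting, which is what lets the resulting MAE serve as an estimate of performance on new data.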