Session Ready
Exercise

Evaluate the grid

Earlier in the chapter we split the dataset into three parts: training, validation and test.

A dataset that is not used in training is sometimes referred to as a "holdout" set. A holdout set is used to estimate model performance and although both validation and test sets are considered to be holdout data, there is a key difference:

  • Just like a test set, a validation set is used to evaluate the performance of a model. The difference is that a validation set is specifically used to compare the performance of a group of models with the goal of choosing a "best model" from the group. All the models in a group are evaluated on the same validation set and the model with the best performance is considered to be the winner.
  • Once you have the best model, a final estimate of performance is computed on the test set.
  • A test set should only ever be used to estimate model performance and should not be used in model selection. Typically if you use a test set more than once, you are probably doing something wrong.
Instructions
100 XP
  • Write a loop that evaluates each model in the grade_models list and stores the validation RMSE in a vector called rmse_values.
  • The which.min() function can be applied to the rmse_values vector to identify the index containing the smallest RMSE value.
  • The model with the smallest validation set RMSE will be designated as the "best model".
  • Inspect the model parameters of the best model.
  • Generate predictions on the test set using the best model to compute test set RMSE.