
# Comparing model performance

Plotting gives you a good feel for where the model performs well and where it doesn't. Sometimes, though, it is useful to have a single statistic that scores the model, so you can quantify how good it is and make comparisons across many models. A common statistic is the root mean square error (often abbreviated "RMSE"), which squares the residuals, takes the mean, then takes the square root. A smaller RMSE indicates better predictions. (RMSE values are only comparable between models fit to the same dataset, not across different datasets; sometimes it is possible to normalize the datasets to make cross-dataset comparisons.)
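For observed responses $y_i$ and predicted responses $\hat{y}_i$, the calculation described above can be written as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$$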

Here you'll compare the gradient boosted trees and random forest models.

Instructions

**100 XP**

`both_responses`, containing the predicted and actual year of the track from both models, has been pre-defined as a local tibble.

- Create a sum of squares of residuals dataset.
- Add a `residual` column, equal to the predicted response minus the actual response.
- Group the data by `model`.
- Calculate a summary statistic, `rmse`, equal to the square root of the mean of the `residual`s squared.

- Add a
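The steps above can be sketched with dplyr. The column names `model`, `actual`, and `predicted`, and the example values, are assumptions standing in for the pre-defined `both_responses` tibble:

```r
library(dplyr)

# Hypothetical stand-in for the pre-defined tibble; column names and
# values are assumptions for illustration only.
both_responses <- tibble(
  model     = rep(c("boosted_trees", "random_forest"), each = 3),
  actual    = c(1980, 1995, 2002, 1980, 1995, 2002),
  predicted = c(1983, 1992, 2001, 1990, 1999, 1994)
)

rmse_by_model <- both_responses %>%
  # Add a residual column: predicted response minus actual response.
  mutate(residual = predicted - actual) %>%
  # Group by model so the summary is calculated once per model.
  group_by(model) %>%
  # RMSE: square the residuals, take the mean, then the square root.
  summarize(rmse = sqrt(mean(residual ^ 2)))

rmse_by_model
```

The model with the smaller `rmse` value makes the better predictions on this dataset.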