Calculate RMSE
In this exercise, you will calculate the RMSE of your unemployment model. In the previous coding exercises, you added two columns to the unemployment dataset:
- the model's predictions (predictions column)
- the residuals between the predictions and the outcome (residuals column)
You can calculate the RMSE from a vector of residuals, \(res\), as:
$$ RMSE = \sqrt{\operatorname{mean}(res^2)} $$
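You want RMSE to be small. How small is "small"? One heuristic is to compare the RMSE to the standard deviation of the outcome, which is roughly the error you would get by always predicting the mean of the outcome. With a good model, the RMSE should be smaller than that standard deviation.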
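The formula and the heuristic above can be sketched on a small made-up example. The vectors outcome and predictions below are hypothetical toy values, not the unemployment data, and the residuals are taken here as prediction minus outcome (the sign does not affect RMSE because the residuals are squared).

```r
# Hypothetical toy data for illustration; not the unemployment data frame
outcome <- c(2, 4, 6, 8)
predictions <- c(3, 3, 7, 7)

# Residuals, assumed here to be prediction minus outcome
res <- predictions - outcome

# RMSE: square the residuals, take their mean, then take the square root
rmse <- sqrt(mean(res^2))

# Standard deviation of the outcome, used as the yardstick
sd_outcome <- sd(outcome)

rmse                # 1
sd_outcome          # about 2.58
rmse < sd_outcome   # TRUE
```

In this toy case the RMSE is well below the standard deviation of the outcome, which is the pattern the heuristic looks for.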
The unemployment data frame has been loaded for you.
This exercise is part of the course
Supervised Learning in R: Regression
Exercise instructions
- Review the unemployment data from the previous exercise.
- For convenience, assign the residuals column from unemployment to the variable res.
- Calculate RMSE: square res, take its mean, and then square root it. Assign this to the variable rmse and print it.
  - Tip: you can do this in one step by wrapping the assignment in parentheses: (rmse <- ___)
- Calculate the standard deviation of female_unemployment and assign it to the variable sd_unemployment. Print it. How does the rmse of the model compare to the standard deviation of the data?
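The parentheses tip in the instructions relies on a general R idiom: an assignment wrapped in parentheses is evaluated and its value is printed. A minimal sketch, using a hypothetical vector x and variable len that are not part of the exercise:

```r
# Hypothetical values for illustration only
x <- c(3, 4)

# A plain assignment stores the value without printing it
len <- sqrt(sum(x^2))

# Wrapping the assignment in parentheses stores the value and prints it
(len <- sqrt(sum(x^2)))
```

Both forms leave the same value in len; the parenthesized form only saves a separate print step.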
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print a summary of unemployment
summary(unemployment)
# For convenience put the residuals in the variable res
res <- ___
# Calculate RMSE, assign it to the variable rmse and print it
(rmse <- ___)
# Calculate the standard deviation of female_unemployment and print it
(sd_unemployment <- ___)