Comparing RMSE and root-mean-squared Relative Error
In this exercise, you will show that log-transforming a monetary output before modeling improves mean relative error (but increases RMSE) compared to modeling the monetary output directly. You will compare the results of model.log from the previous exercise to a model (model.abs) that directly fits income.
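For reference, the two metrics being compared are commonly defined as below. The notation is an assumption introduced here, not taken from the exercise: $y_i$ is the actual income, $\hat{y}_i$ the predicted income, and $n$ the number of test rows.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2},
\qquad
\mathrm{RMSE}_{\mathrm{rel}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{\hat{y}_i - y_i}{y_i}\right)^2}
```

RMSE is in the units of the outcome (dollars here), while the relative version is unitless and weights each error by the size of the value being predicted.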
The income_train and income_test datasets have been pre-loaded, along with your model, model.log.
Also available:
- model.abs: a model that directly fits income to the inputs using the formula Income2005 ~ Arith + Word + Parag + Math + AFQT
This exercise is part of the course
Supervised Learning in R: Regression
Exercise instructions
- Fill in the blanks to add predictions from the models to income_test.
  - Don't forget to take the exponent of the predictions from model.log to undo the log transform!
- Fill in the blanks to pivot_longer() the predictions and calculate the residuals and relative error.
- Fill in the blanks to calculate the RMSE and relative RMSE for predictions.
- Which model has larger absolute error? Larger relative error?
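The reminder about undoing the log transform can be sketched on simulated data. Everything in this block is hypothetical and only for illustration: the data frame `sim`, its columns `x` and `income`, and the model `m_log` are not part of the exercise.

```r
# Simulated, strictly positive "income" driven by one input (hypothetical data)
set.seed(1)
n <- 200
sim <- data.frame(x = runif(n, min = 0, max = 10))
sim$income <- exp(9 + 0.2 * sim$x + rnorm(n, sd = 0.5))

# Fit a linear model to the log of the outcome
m_log <- lm(log(income) ~ x, data = sim)

# predict() returns values on the log scale ...
pred_log_scale <- predict(m_log, newdata = sim)

# ... so exponentiate to get predictions back in income units
pred_income <- exp(pred_log_scale)

head(data.frame(actual = sim$income, predicted = pred_income))
```

Skipping the `exp()` step would leave the predictions on the log scale, so comparing them with the actual incomes would be meaningless.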
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# fmla.abs is available
fmla.abs
# model.abs is available
summary(model.abs)
# Add predictions to the test set
income_test <- income_test %>%
  mutate(pred.absmodel = ___(___, income_test),        # predictions from model.abs
         pred.logmodel = ___(___(___, income_test)))   # predictions from model.log

# pivot_longer the predictions and calculate residuals and relative error
income_long <- income_test %>%
  pivot_longer(names_to = 'modeltype', values_to = 'pred',
               cols = c('pred.absmodel', 'pred.logmodel')) %>%
  mutate(residual = ___,   # residuals
         relerr = ___)     # relative error

# Calculate RMSE and relative RMSE and compare
income_long %>%
  group_by(modeltype) %>%     # group by modeltype
  summarize(rmse = ___,       # RMSE
            rmse.rel = ___)   # Root mean squared relative error