
Comparing RMSE and Root-Mean-Squared Relative Error

In this exercise, you will show that log-transforming a monetary output before modeling reduces the root-mean-squared relative error (but increases the RMSE) compared to modeling the monetary output directly. You will compare the results of model.log from the previous exercise to a model (model.abs) that directly fits income.
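
The two error measures being compared can be written out explicitly. This is a sketch using the usual definitions, where the symbols are introduced here for illustration and do not appear in the exercise: y_i is the actual income, \hat{y}_i the model's prediction, and n the number of rows in the test set.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2},
\qquad
\mathrm{RMSE}_{\mathrm{rel}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(\frac{\hat{y}_i - y_i}{y_i}\right)^2}
```

For example, a residual of 10,000 on an actual income of 100,000 is a relative error of 0.1, while the same residual on an actual income of 10,000 is a relative error of 1.0. The two measures therefore weight the same miss very differently depending on the size of the income.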

The income_train and income_test datasets have been pre-loaded, along with your model, model.log.

Also available:

  • model.abs: a model that directly fits income to the inputs using the formula

    Income2005 ~ Arith + Word + Parag + Math + AFQT

This exercise is part of the course

Supervised Learning in R: Regression


Exercise instructions

  • Fill in the blanks to add predictions from the models to income_test.
    • Don’t forget to exponentiate the predictions from model.log with exp() to undo the log transform!
  • Fill in the blanks to pivot_longer() the predictions and calculate the residuals and relative error.
  • Fill in the blanks to calculate the RMSE and relative RMSE for predictions.
    • Which model has larger absolute error? Larger relative error?
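
The steps above can be sketched end to end on made-up data. This is only an illustration under stated assumptions: the data frame, the column names x and y, the helper functions rmse() and rmse_rel(), and the simulated relationship are all hypothetical and are not the exercise's income data or its solution. It uses base R only, so it runs without extra packages.

```r
# Synthetic illustration only: x, y, and every object below are made up.
set.seed(42)
n <- 500
x <- runif(n, 0, 3)
# Positive, right-skewed outcome with multiplicative noise
y <- exp(1 + 0.8 * x + rnorm(n, sd = 0.5))
dat <- data.frame(x = x, y = y)
train <- dat[1:350, ]
test  <- dat[351:n, ]

fit_abs <- lm(y ~ x, data = train)        # models the outcome directly
fit_log <- lm(log(y) ~ x, data = train)   # models the log of the outcome

pred_abs <- predict(fit_abs, newdata = test)
pred_log <- exp(predict(fit_log, newdata = test))  # exp() undoes the log transform

# Hypothetical helpers: root mean squared error and its relative version
rmse     <- function(pred, actual) sqrt(mean((pred - actual)^2))
rmse_rel <- function(pred, actual) sqrt(mean(((pred - actual) / actual)^2))

results <- data.frame(
  modeltype = c("abs", "log"),
  rmse      = c(rmse(pred_abs, test$y),     rmse(pred_log, test$y)),
  rmse.rel  = c(rmse_rel(pred_abs, test$y), rmse_rel(pred_log, test$y))
)
print(results)
```

Comparing the two rows of results answers the same pair of questions posed above, but for this simulated data only; the numbers say nothing about the income models themselves.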

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# fmla.abs is available
fmla.abs

# model.abs is available
summary(model.abs)

# Add predictions to the test set
income_test <- income_test %>%
  mutate(pred.absmodel = ___(___, income_test),        # predictions from model.abs
         pred.logmodel = ___(___(___, income_test)))   # predictions from model.log

# pivot_longer the predictions and calculate residuals and relative error
income_long <- income_test %>% 
  pivot_longer(cols = c('pred.absmodel', 'pred.logmodel'),
               names_to = 'modeltype', values_to = 'pred') %>%
  mutate(residual = ___,   # residuals
         relerr   = ___)   # relative error

# Calculate RMSE and relative RMSE and compare
income_long %>% 
  group_by(modeltype) %>%      # group by modeltype
  summarize(rmse     = ___,    # RMSE
            rmse.rel = ___)    # Root mean squared relative error