
Assessing predictions with RMSE

1. Assessing predictions with RMSE

You just learned about R-squared, the proportion of the total variation in house prices explained by a model. This numerical summary can be used to assess model fit, where models with R-squared values closer to 1 have better fit, and values closer to 0 have poorer fit. Let's now consider another assessment measure, but one more associated with modeling for prediction. In particular, how can you assess the quality of a model's predictions? You'll use a quantity called the root mean square error, which is a slight variation of the sum of squared residuals.

2. Refresher: Residuals

Once again, recall that in your visualization of modeling with two numerical predictor variables, you marked a selection of residuals with red lines: the differences between the observed values and their corresponding fitted/predicted values on the regression plane. The sum of squared residuals takes all such residuals, squares them, and sums them. But what if you took the average instead of the sum? For example, a model might have a large sum of squared residuals merely because it involves a large number of points! By using the average, you correct for this and get a notion of "average prediction error".
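
In symbols, for a data set of n houses with observed values y_i and fitted/predicted values y-hat_i, these quantities are:

$$
\text{residual}_i = y_i - \hat{y}_i,
\qquad
\text{SSR} = \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2,
\qquad
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2.
$$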

3. Mean squared error

You've seen the computation of the sum of squared residuals for Model 1 a few times now.
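
As a concrete sketch in R, assuming the house_prices data from the moderndive package and a Model 1 fit along the lines of the earlier videos (the formula below is an assumption, not necessarily the course's exact model_price_1), the computation looks roughly like this:

library(moderndive)
library(dplyr)

# Create the log10-transformed outcome and predictor used throughout the course
house_prices <- house_prices %>%
  mutate(log10_price = log10(price),
         log10_size  = log10(sqft_living))

# Assumed stand-in for Model 1: two numerical predictors
model_price_1 <- lm(log10_price ~ log10_size + yr_built, data = house_prices)

# Sum of squared residuals
get_regression_points(model_price_1) %>%
  mutate(sq_residuals = residual^2) %>%
  summarize(sum_sq_residuals = sum(sq_residuals))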

4. Mean squared error

Instead of using sum() in the summarize() call, however, let's use the mean() function and assign this to mse, meaning mean squared error. This is the average squared error a predictive model makes. The closer your predictions y-hat are to the observed values y, the smaller the residuals will be, and hence the closer the MSE will be to 0. The further your predictions are, the larger the MSE will be. You observe an MSE of 0.0271, which is the sum of squared residuals of 585 divided by 21,613, the total number of houses. Why is this called the MSE, and not the mean of squared residuals? No reason other than convention; they mean the same thing. Since the MSE involves squared errors, the units of MSE are the units of the outcome variable y squared. Let's instead obtain a measure of error whose units match the units of y.
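
A hedged sketch of this step, reusing the assumed model_price_1 fit from the earlier sketch (the column name sq_residuals is illustrative):

# Mean squared error: swap sum() for mean()
get_regression_points(model_price_1) %>%
  mutate(sq_residuals = residual^2) %>%
  summarize(mse = mean(sq_residuals))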

5. Root mean squared error

You do this via the root mean squared error, or RMSE, which is the square root of the MSE. Note the added mutate() line of code to compute the sqrt(). The RMSE can be thought of as the "typical prediction error" your model will make, and its units match the units of the outcome variable y. While the interpretation of the units in our case, log10 dollars, might not be immediately apparent to everyone, you can imagine that in many other cases it is very useful for these units to match.
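
Continuing the same sketch, the extra mutate() line takes the square root of the MSE:

# Root mean squared error: add a mutate() to square-root the MSE
get_regression_points(model_price_1) %>%
  mutate(sq_residuals = residual^2) %>%
  summarize(mse = mean(sq_residuals)) %>%
  mutate(rmse = sqrt(mse))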

6. RMSE of predictions on new houses

Let's now assess the quality of the predictions of log10_price for the two new houses you saw in the previous video, whose information is saved in the data frame new_houses you created earlier. Recall that you apply the get_regression_points() function to model_price_3,

7. RMSE of predictions on new houses

but also with the newdata argument set to new_houses. You thus obtain predicted values log10_price_hat of 5.34 and 5.94. Now let's take this output and compute the RMSE.
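
Here is a hedged sketch of the prediction call just described, using assumed stand-ins for model_price_3 and new_houses (their exact formula and values come from the earlier videos; the ones below are illustrative only):

# Assumed stand-in for model_price_3
model_price_3 <- lm(log10_price ~ log10_size + yr_built, data = house_prices)

# Two hypothetical new houses with predictor values only (no observed price)
new_houses <- tibble(log10_size = c(2.9, 3.6),
                     yr_built   = c(1980, 2000))

# Predictions for the new houses: returns log10_price_hat but no residual column
get_regression_points(model_price_3, newdata = new_houses)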

8. RMSE of predictions on new houses

You do this by taking the residuals, squaring them, taking the mean rather than the sum, and then taking the square root. You get the following error message: it says the residual column is not found. Why is it not found? Because to compute residuals, you need both the predicted/fitted values y-hat, in this case log10_price_hat, and the observed values y, in this case log10_price. But if you don't have the latter, you can't compute the residuals, and hence you can't compute the RMSE. This illustrates a key restriction in predictive modeling assessment: you can only assess the quality of predictions when you have access to the observed values y. You'll learn about a workaround to this issue shortly.
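
For completeness, here is roughly what that failed attempt looks like in code, again using the assumed stand-in objects from the sketch above:

# This fails: new_houses contains no observed log10_price, so the output of
# get_regression_points() has no residual column to square and average
get_regression_points(model_price_3, newdata = new_houses) %>%
  mutate(sq_residuals = residual^2) %>%
  summarize(mse = mean(sq_residuals)) %>%
  mutate(rmse = sqrt(mse))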

9. Let's practice!

But first, some exercises: you'll compute the RMSE on your own, a measure of prediction error used for model assessment and selection.