1. Root Mean Squared Error (RMSE)
In this lesson, you will learn about a key metric for evaluating the prediction performance of a regression model: root mean squared error.
2. What is Root Mean Squared Error (RMSE)?
RMSE is defined as the square root of the mean squared error of the model on a dataset. You can think of the RMSE as the "typical" prediction error of your model on that data, expressed in the same units as the outcome. Many regression algorithms, like linear regression, are designed to minimize squared error, so RMSE is a natural metric for evaluating them.
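Written out, the definition looks like this. The symbols are assumptions introduced here for illustration: n is the number of rows, y_i is the actual outcome for row i, and \hat{y}_i is the model's prediction for that row.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}
```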
3. RMSE of the Home Sales Price Model
Let’s walk through calculating the RMSE of the home sales price model we saw earlier, on the dataset houseprices. Assume price is the column of actual sale prices and prediction is the column of predicted prices. First, calculate the error vector: the difference between the predicted and actual prices.
4. RMSE of the Home Sales Price Model
Then square each element of the error vector.
5. RMSE of the Home Sales Price Model
Finally, take the mean of the squared errors and then take the square root. This results in an RMSE of about 58.3. Since prices are in thousands of dollars, you can think of this as a typical prediction error of about 58.3 thousand dollars.
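The three steps can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The four prices and predictions are made-up toy values in thousands of dollars, not the houseprices data, and the function name rmse is a hypothetical helper.

```python
import math

# Hypothetical toy data in thousands of dollars (NOT the houseprices dataset).
price = [200.0, 310.0, 150.0, 420.0]        # actual sale prices
prediction = [210.0, 290.0, 180.0, 400.0]   # model predictions


def rmse(actual, predicted):
    """Root mean squared error of `predicted` against `actual`."""
    # Step 1: error vector (prediction minus actual).
    errors = [p - a for a, p in zip(actual, predicted)]
    # Step 2: square each error.
    squared_errors = [e ** 2 for e in errors]
    # Step 3: mean of the squared errors, then the square root.
    mse = sum(squared_errors) / len(squared_errors)
    return math.sqrt(mse)


toy_rmse = rmse(price, prediction)
print(round(toy_rmse, 2))
```

On this toy data the errors are 10, -20, 30, and -20, so the mean squared error is 450 and the RMSE is about 21.2.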
6. Is the RMSE Large or Small?
Is the RMSE large or small? The answer depends on how accurate you need the prediction to be for your specific problem, but one way to evaluate the RMSE is to compare it to the standard deviation of the outcome.
In our example, the standard deviation of the sale price is about 135 thousand dollars. Think of this as the typical difference between a specific house price in the data and the average house price. Because the RMSE (about 58.3) is smaller than the standard deviation (about 135), the model tends to estimate prices better than simply predicting the average price for every house.
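The comparison can be sketched as follows, again on made-up toy values in thousands of dollars rather than the houseprices data. The variable names are hypothetical. The sketch also computes the RMSE of a baseline that predicts the mean price for every house, which equals the population standard deviation and is therefore slightly smaller than the sample standard deviation.

```python
import math
import statistics

# Hypothetical toy data in thousands of dollars (NOT the houseprices dataset).
price = [200.0, 310.0, 150.0, 420.0]        # actual sale prices
prediction = [210.0, 290.0, 180.0, 400.0]   # model predictions

# RMSE of the model's predictions.
model_rmse = math.sqrt(
    sum((p - a) ** 2 for a, p in zip(price, prediction)) / len(price)
)

# Sample standard deviation of the outcome (divides by n - 1).
sd_price = statistics.stdev(price)

# Baseline: predict the mean price for every house and compute its RMSE.
mean_price = statistics.mean(price)
baseline_rmse = math.sqrt(
    sum((mean_price - a) ** 2 for a in price) / len(price)
)

# The model is useful by this yardstick if its RMSE is below the SD.
beats_average = model_rmse < sd_price
print(round(model_rmse, 1), round(sd_price, 1), beats_average)
```

Here the model's RMSE (about 21.2) is well below the standard deviation of the toy prices (about 120.3), so the toy model does better than always predicting the average.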
7. Let's practice!
Now let’s practice fitting models and calculating the RMSE.