1. Quantifying model fit
It's usually important to know whether predictions from your model are nonsense. In this chapter, we'll look at ways of quantifying how good your model is.
2. Bream and perch models
Previously, you ran models on mass versus length for bream and perch. By simply looking at these scatter plots, you can get a sense that there is a linear relationship between mass and length for bream but not for perch.
It would be useful to quantify how strong that linear relationship is.
3. Coefficient of determination
The first metric we'll discuss is the coefficient of determination. This is sometimes called "r-squared". For boring historical reasons, it's written with a lower case r for simple linear regression, and an upper case R when you have more than one explanatory variable.
It is defined as the proportion of the variance in the response variable that is predictable from the explanatory variable. We'll get to a human-readable explanation shortly.
A score of one means you have a perfect fit, and a score of zero means your model is no better than simply predicting the mean of the response.
What constitutes a good score depends on your dataset. A score of zero-point five on a psychological experiment may be exceptionally high because humans are inherently hard to predict, but in other cases a score of zero-point nine may be considered a poor fit.
4. summary()
summary shows several performance metrics at the end of its output. The coefficient of determination is written in the second to last line, and titled "Multiple R-squared". Its value is about zero-point-eight-eight.
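As a sketch of what this looks like in R: the course's real fish dataset isn't included here, so the `bream` data frame below is simulated, and the column names `length_cm` and `mass_g` are assumptions.

```r
# Simulated stand-in for the bream data; column names are assumptions.
set.seed(1)
bream <- data.frame(length_cm = runif(35, 23, 38))
bream$mass_g <- -1035 + 54.5 * bream$length_cm + rnorm(35, sd = 74)

# Fit mass versus length, then inspect the fit.
mdl_bream <- lm(mass_g ~ length_cm, data = bream)
summary(mdl_bream)
# The "Multiple R-squared" line near the bottom of the output
# holds the coefficient of determination.
```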
5. glance()
Since summary isn't easy to program with, a better way to extract the metric is to use glance from broom. Calling glance on the model returns several model metrics in a tibble.
Then you can use dplyr's pull function to pull out the r-squared value.
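A hedged sketch of that workflow, again using simulated stand-in data rather than the course's fish dataset:

```r
library(broom)
library(dplyr)

# Simulated stand-in for the bream data; column names are assumptions.
set.seed(1)
bream <- data.frame(length_cm = runif(35, 23, 38))
bream$mass_g <- -1035 + 54.5 * bream$length_cm + rnorm(35, sd = 74)
mdl_bream <- lm(mass_g ~ length_cm, data = bream)

# glance() returns a one-row tibble of model-level metrics.
glance(mdl_bream)

# pull() extracts a single metric as a plain numeric value.
glance(mdl_bream) %>%
  pull(r.squared)
```

Because glance returns a tibble, the result slots straight into dplyr pipelines, which is what makes it easier to program with than summary.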
6. It's just correlation squared
For simple linear regression, the interpretation of the coefficient of determination is straightforward. It is simply the correlation between the explanatory and response variables, squared.
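You can check that identity directly; as before, the `bream` data frame is a simulated stand-in with assumed column names.

```r
# Simulated stand-in for the bream data; column names are assumptions.
set.seed(1)
bream <- data.frame(length_cm = runif(35, 23, 38))
bream$mass_g <- -1035 + 54.5 * bream$length_cm + rnorm(35, sd = 74)
mdl_bream <- lm(mass_g ~ length_cm, data = bream)

# r-squared from the model versus the squared correlation.
r2_from_model <- summary(mdl_bream)$r.squared
r2_from_cor <- cor(bream$length_cm, bream$mass_g) ^ 2
all.equal(r2_from_model, r2_from_cor)
```

Note this identity only holds for simple linear regression; with more than one explanatory variable, R-squared no longer reduces to a single squared correlation.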
7. Residual standard error (RSE)
The second metric we'll look at is the residual standard error, or RSE.
Recall that each residual is the difference between a predicted value and an observed value. The RSE is, very roughly speaking, a measure of the typical size of the residuals. That is, how much the predictions are typically wrong by.
It has the same unit as the response variable. In the fish models, the response unit is grams.
8. summary() again
summary also displays the RSE. It is shown in the third to last line, titled "Residual standard error". The value for the bream model is about seventy four.
9. glance() again
As with the coefficient of determination, to get the RSE as a variable, it's best to use glance. Here, RSE is named sigma.
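A sketch of pulling it out, again on simulated stand-in data:

```r
library(broom)
library(dplyr)

# Simulated stand-in for the bream data; column names are assumptions.
set.seed(1)
bream <- data.frame(length_cm = runif(35, 23, 38))
bream$mass_g <- -1035 + 54.5 * bream$length_cm + rnorm(35, sd = 74)
mdl_bream <- lm(mass_g ~ length_cm, data = bream)

# In glance()'s output, the residual standard error is named sigma.
glance(mdl_bream) %>%
  pull(sigma)
```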
10. Calculating RSE: residuals squared
To calculate the RSE yourself, it's slightly more complicated. First, you take the square of each residual.
11. Calculating RSE: sum of residuals squared
Then you take the sum of these squared residuals.
12. Calculating RSE: degrees of freedom
Then you calculate the degrees of freedom of the residuals. This is the number of observations minus the number of model coefficients.
13. Calculating RSE: square root of ratio
Finally, you take the square root of the ratio of those two numbers. Reassuringly, the value is still seventy four.
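The four steps above can be sketched in R as follows; the `bream` data frame is again a simulated stand-in with assumed column names.

```r
# Simulated stand-in for the bream data; column names are assumptions.
set.seed(1)
bream <- data.frame(length_cm = runif(35, 23, 38))
bream$mass_g <- -1035 + 54.5 * bream$length_cm + rnorm(35, sd = 74)
mdl_bream <- lm(mass_g ~ length_cm, data = bream)

# Step 1: square each residual.
residuals_sq <- residuals(mdl_bream) ^ 2

# Step 2: sum the squared residuals.
resid_sum_of_sq <- sum(residuals_sq)

# Step 3: degrees of freedom = observations minus model coefficients.
deg_freedom <- nobs(mdl_bream) - length(coef(mdl_bream))

# Step 4: square root of the ratio.
rse <- sqrt(resid_sum_of_sq / deg_freedom)
rse
```

The result should match the sigma value that summary and glance report.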
14. Interpreting RSE
You saw that the RSE for the bream model was seventy four. That means that the difference between predicted bream masses and observed bream masses is typically about seventy four grams.
15. Root-mean-square error (RMSE)
Another related metric is the root-mean-square error. This is calculated in the same way, except in the second to last step you divide by the number of observations rather than the degrees of freedom; that is, you don't subtract the number of coefficients. It performs the same task as residual standard error, namely quantifying how inaccurate the model predictions are, but because it ignores the number of coefficients, it is worse for comparing models. You need to be aware that RMSE exists, but typically you should use RSE instead.
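To make the difference concrete, here's both metrics computed side by side on the same simulated stand-in data:

```r
# Simulated stand-in for the bream data; column names are assumptions.
set.seed(1)
bream <- data.frame(length_cm = runif(35, 23, 38))
bream$mass_g <- -1035 + 54.5 * bream$length_cm + rnorm(35, sd = 74)
mdl_bream <- lm(mass_g ~ length_cm, data = bream)

resid_sum_of_sq <- sum(residuals(mdl_bream) ^ 2)

# RSE divides by the residual degrees of freedom (n minus coefficients).
rse <- sqrt(resid_sum_of_sq / (nobs(mdl_bream) - length(coef(mdl_bream))))

# RMSE divides by the number of observations, n.
rmse <- sqrt(resid_sum_of_sq / nobs(mdl_bream))

c(rse = rse, rmse = rmse)
```

Since n is larger than the degrees of freedom, RMSE is always slightly smaller than RSE for the same model.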
16. Let's practice!
Let's make some metrics.