Exercise

# Linear vs. average

The \(R^2\) gives us a numerical measurement of the strength of fit relative to a null model based on the average of the response variable: $$ \hat{y}_{null} = \bar{y} $$

This model has an \(R^2\) of zero because \(SSE = SST\). That is, since the fitted values (\(\hat{y}_{null}\)) are all equal to the average (\(\bar{y}\)), the residual for each observation is the distance between that observation and the mean of the response. Since we can always fit the null model, it serves as a baseline against which all other models will be compared.
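As a brief worked form of that argument, assuming the usual definitions of \(SSE\) and \(SST\) (they are not spelled out in this exercise, and the index \(i\) over observations is introduced here for illustration):

$$ R^2 = 1 - \frac{SSE}{SST}, \qquad SSE = \sum_{i} (y_i - \hat{y}_i)^2, \qquad SST = \sum_{i} (y_i - \bar{y})^2 $$

For the null model, \(\hat{y}_i = \bar{y}\) for every observation, so \(SSE = SST\) and \(R^2 = 1 - 1 = 0\).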

In the graphic, we visualize the residuals for the null model (`mod_null` at left) vs. the simple linear regression model (`mod_hgt` at right) with height as a single explanatory variable. Try to convince yourself that, if you squared the lengths of the grey arrows on the left and summed them up, you would get a larger value than if you performed the same operation on the grey arrows on the right.

It may be useful to preview these `augment()`-ed data frames with `glimpse()`:

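```
# glimpse() is provided by dplyr
library(dplyr)

# Preview the columns of each augment()-ed data frame
glimpse(mod_null)
glimpse(mod_hgt)
```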

Instructions


- Compute the sum of the squared residuals (SSE) for the null model `mod_null`.
- Compute the sum of the squared residuals (SSE) for the regression model `mod_hgt`.
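
The two computations can be sketched in base R as follows. This is a minimal illustration rather than the exercise solution: the `toy` data frame and every object name below are hypothetical, and the `aug_*` data frames only imitate the `.fitted` and `.resid` columns that an `augment()`-ed data frame is assumed to contain.

```r
# Hypothetical toy data (not the course data set): heights in cm, weights in kg.
toy <- data.frame(
  hgt = c(150, 160, 165, 170, 175, 180, 185, 190),
  wgt = c(52, 58, 63, 66, 70, 77, 80, 88)
)

# Null model: intercept only, so every fitted value equals mean(toy$wgt).
fit_null <- lm(wgt ~ 1, data = toy)

# Simple linear regression with height as the single explanatory variable.
fit_hgt <- lm(wgt ~ hgt, data = toy)

# Stand-ins for augment()-ed data frames (assumed column names .fitted, .resid).
aug_null <- data.frame(.fitted = fitted(fit_null), .resid = residuals(fit_null))
aug_hgt  <- data.frame(.fitted = fitted(fit_hgt),  .resid = residuals(fit_hgt))

# SSE: square each residual, then sum.
sse_null <- sum(aug_null$.resid^2)
sse_hgt  <- sum(aug_hgt$.resid^2)

# For the null model SSE equals SST, so R^2 of the height model is 1 - SSE/SST.
r_squared <- 1 - sse_hgt / sse_null

print(c(sse_null = sse_null, sse_hgt = sse_hgt, r_squared = r_squared))
```

With the real `mod_null` and `mod_hgt`, the same square-then-sum step would presumably be applied to their residual column directly, for example inside a `summarize()` call.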