Exercise

# Accidental "perfection"

It's easy to include extra degrees of freedom in a model by accident, so watch out. One way this can happen is when a variable that's intended to be quantitative is instead treated as categorical (perhaps because there's a non-numeric entry in a column that should be numeric).

In this exercise, you'll simulate this by converting `year` in the `HDD_Minneapolis` data to a categorical variable.
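As a minimal sketch of how this kind of accident can arise, the snippet below uses a made-up vector, `raw_year`, in which one entry is text rather than a number. The vector and its values are hypothetical and are not taken from `HDD_Minneapolis`.

```r
# Hypothetical raw values: one entry is the text "unknown" rather than a number.
raw_year <- c("1990", "1991", "unknown", "1993")

# A single non-numeric entry forces the whole vector to be character,
# so a modeling function would treat it as categorical.
class(raw_year)

# Converting explicitly turns the problem entry into NA,
# which makes it easy to locate.
numeric_year <- suppressWarnings(as.numeric(raw_year))
which(is.na(numeric_year))
```

Checking the class of each column before fitting a model is a cheap way to catch this.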

Instructions


- Build `model_1` with formula `hdd ~ year * month`, a linear architecture, and the `HDD_Minneapolis` dataset.
- Find the R-squared of that model. It's very high, mainly because `month` tells a lot about temperature.
- Change the `year` column in `HDD_Minneapolis` to a categorical variable with `as.character()`, storing the result as `categorical_year`.
- Create `model_2` to be the same as `model_1`, but with `categorical_year` in place of `year`.
- Find the R-squared of `model_2`. Although `year` has just one degree of freedom, `categorical_year` has more than 100!
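The effect on degrees of freedom can be sketched with simulated data and base R's `lm()`. Everything here is an assumption made for illustration: the data frame `sim`, its ten simulated years, and the model names `fit_numeric` and `fit_categorical` are hypothetical stand-ins, not the real `HDD_Minneapolis` data or the course's own modeling functions.

```r
set.seed(1)

# Simulated stand-in data: 10 years x 12 months, one row per combination.
sim <- expand.grid(year = 2001:2010, month = 1:12)

# The response depends on month only; year contributes nothing but noise.
sim$hdd <- 100 * abs(sim$month - 7) + rnorm(nrow(sim), sd = 50)

# Categorical copy of year, made the same way as in the exercise.
sim$categorical_year <- as.character(sim$year)

# Quantitative year: the year main effect is a single coefficient.
fit_numeric <- lm(hdd ~ year * month, data = sim)

# Categorical year: one coefficient per year beyond the reference level,
# plus one interaction coefficient for each of those levels.
fit_categorical <- lm(hdd ~ categorical_year * month, data = sim)

# Count the coefficients in each model.
length(coef(fit_numeric))      # 4: intercept, year, month, year:month
length(coef(fit_categorical))  # 20: intercept, 9 years, month, 9 interactions

# Compare R-squared; the categorical model cannot do worse on the
# data it was fit to, because it contains the numeric model as a special case.
summary(fit_numeric)$r.squared
summary(fit_categorical)$r.squared
```

Even though year has no real effect in this simulation, the extra coefficients can only push R-squared up on the data used for fitting, which is why a higher R-squared alone is not evidence of a better model.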