Exercise

# Avoiding multicollinearity

Back to our sales dataset `salesData`, which is already loaded in the workspace. Additionally, the package `rms` is loaded.

Let's estimate a multiple linear regression! Of course, we want to make use of all the variables in the dataset.

Instructions

- Go ahead and calculate a full model called `salesModel1` using all variables except `id` in order to explain the sales in this month. To do this, fill the right variable names into the following dummy syntax: `response ~ . - excluded_variable`. This can be read as "`response` modeled by all variables except `excluded_variable`."
- Estimate the variance inflation factors using the `vif()` function from the `rms` package.
- In addition to excluding the variable `id`, remove the variables `preferredBrand` and `nBrands` in order to avoid multicollinearity. You do this by subtracting each of them in the formula with `-`. Store the model in an object called `salesModel2`.
- Reestimate the variance inflation factors of the model. Would you accept the results now?
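
The steps above can be sketched on a small simulated data frame. This is only an illustration under stated assumptions: `toy`, its column names other than `id` and `nBrands`, the response name `sales`, and the helper `manual_vif()` are hypothetical and not part of the exercise. The variance inflation factors are computed by hand in base R so that the block runs without `salesData` or `rms`. In the exercise itself you would call `vif()` on the fitted model instead.

```r
# Simulated stand-in for salesData; all values and most column names are hypothetical.
set.seed(1)
n <- 200
toy <- data.frame(id = seq_len(n), nItems = rnorm(n, 50, 10))
toy$nBrands <- 0.5 * toy$nItems + rnorm(n, 0, 1)   # nearly collinear with nItems
toy$meanItemPrice <- rnorm(n, 20, 5)
toy$sales <- 3 * toy$nItems + 2 * toy$meanItemPrice + rnorm(n, 0, 10)

# Hypothetical helper: the VIF of a predictor is 1 / (1 - R^2), where R^2 comes
# from regressing that predictor on all the other predictors.
manual_vif <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

# Full model: response modeled by all variables except id.
model1 <- lm(sales ~ . - id, data = toy)
vif1 <- manual_vif(toy, c("nItems", "nBrands", "meanItemPrice"))
print(vif1)

# Reduced model: additionally subtract the collinear predictor.
model2 <- lm(sales ~ . - id - nBrands, data = toy)
vif2 <- manual_vif(toy, c("nItems", "meanItemPrice"))
print(vif2)
```

In this toy setup `nBrands` is constructed to track `nItems`, so both show inflated values in the first model and drop to roughly 1 once `nBrands` is removed. The same before-and-after comparison is what the last instruction asks you to judge for `salesModel2`.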