Exercise

# Collinearity and inflation (1)

The first several lines of code in the editor calculate the effect size of `educ` on `wage` using a linear model trained on the `CPS85` data. The 95% confidence interval on the effect size ranges from about -0.50 to 1.30 dollars per year of education. In other words, the confidence interval is so wide that you can't even say whether education has a positive or a negative effect on wage.

Notice that `sector`, `exper`, and `age` have all been used as covariates. In this exercise, you'll look at the collinearity among the explanatory variables to see if there is a covariate you can exclude that will dramatically reduce the width of the confidence interval.

The `collinearity()` function (from the `statisticalModeling` package) calculates how much the standard error of an effect size might (at a maximum) be inflated by collinearity with the other explanatory variables.
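To build intuition for why collinearity inflates uncertainty, here is a minimal sketch on synthetic data. It is written in Python rather than the R used in the course, it is not the implementation of `collinearity()`, and the helper name `se_inflation` and all the data are made up for illustration. It assumes the simplest case of one other explanatory variable, where the inflation of the standard error is \(\sqrt{1/(1 - r^2)}\) and \(r\) is the correlation between the two variables.

```python
import math
import random


def se_inflation(x, other):
    """Standard-error inflation for x given ONE other explanatory variable.

    Hypothetical helper for illustration only. With a single other variable,
    the R^2 from regressing x on it equals the squared Pearson correlation,
    so the variance inflation factor is 1 / (1 - r^2).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_o = sum(other) / n
    sxy = sum((a - mean_x) * (b - mean_o) for a, b in zip(x, other))
    sxx = sum((a - mean_x) ** 2 for a in x)
    soo = sum((b - mean_o) ** 2 for b in other)
    r_squared = sxy ** 2 / (sxx * soo)
    vif = 1.0 / (1.0 - r_squared)  # variance inflation
    return math.sqrt(vif)          # standard-error inflation


rng = random.Random(0)  # fixed seed so the sketch is reproducible
x = [rng.gauss(0, 1) for _ in range(500)]
weakly_related = [rng.gauss(0, 1) for _ in x]           # independent noise
nearly_collinear = [a + rng.gauss(0, 0.05) for a in x]  # almost a copy of x

inflation_weak = se_inflation(x, weakly_related)
inflation_strong = se_inflation(x, nearly_collinear)
print(f"weakly related:   {inflation_weak:.2f}")
print(f"nearly collinear: {inflation_strong:.2f}")
```

An inflation close to 1 means the other variable costs almost nothing in precision. A large value means the two variables carry nearly the same information, so the model cannot separate their effects.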

Instructions


- Examine the confidence interval on the effect of `educ` on `wage` for `model_1`.
- Use the `collinearity()` function to assess the worst possible inflation introduced by collinearity among the explanatory variables. Note the huge variance inflation (\(15.27^2\)) on `educ`.
- Again using `collinearity()`, try omitting each of the covariates in turn to find one that can be left out that will dramatically reduce the inflation.
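As a rough worked example of what the figure \(15.27^2\) implies, assume that 15.27 is a standard-error inflation factor (as the squared value in the instructions suggests) and use the usual definition of the variance inflation factor. Here \(R_j^2\) is the \(R^2\) from regressing explanatory variable \(j\) on the other explanatory variables:

\[
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad \text{standard-error inflation}_j = \sqrt{\mathrm{VIF}_j}
\]

\[
\mathrm{VIF}_{\text{educ}} = 15.27^2 \approx 233.2 \quad\Longrightarrow\quad R_{\text{educ}}^2 = 1 - \frac{1}{233.2} \approx 0.996
\]

Under that assumption, the other explanatory variables account for nearly all of the variation in `educ`, which is why the confidence interval on its effect size is so wide.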