
Dealing with multicollinearity

In the previous exercise, you detected multicollinearity in your model by reviewing the variance inflation factor (VIF) of each independent variable. Follow the steps below to remove it:

  • Step 1: Calculate the VIF of each variable in the model
  • Step 2: Check whether any variable has a VIF greater than or equal to 5
    • Step 2a: If exactly one variable has a VIF greater than or equal to 5, remove it from the model
    • Step 2b: If several variables have a VIF greater than or equal to 5, remove only the one with the highest VIF
  • Step 3: Repeat steps 1 and 2 until the VIF of every variable is less than 5
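
The loop described in the steps above can be sketched in base R. This is only an illustration under stated assumptions, not the course's solution: `compute_vif` and `drop_high_vif` are hypothetical helper names, the data is simulated rather than taken from `train_set_multi`, and the VIF is computed with the textbook formula 1 / (1 - R²) from auxiliary linear regressions on numeric predictors. A VIF function applied to a fitted `glm` may report somewhat different values, and factor predictors would need a generalised VIF instead.

```r
# Hypothetical helper: VIF of each numeric predictor, computed as
# 1 / (1 - R^2) from regressing that predictor on all the others.
compute_vif <- function(predictors) {
  sapply(names(predictors), function(v) {
    others <- setdiff(names(predictors), v)
    fit <- lm(reformulate(others, response = v), data = predictors)
    1 / (1 - summary(fit)$r.squared)
  })
}

# Hypothetical helper: steps 1 to 3 as a loop. Each pass recomputes the
# VIFs and drops only the single variable with the highest VIF.
drop_high_vif <- function(predictors, threshold = 5) {
  while (ncol(predictors) > 1) {
    vifs <- compute_vif(predictors)          # Step 1
    if (max(vifs) < threshold) break         # Step 3 stopping rule
    worst <- names(which.max(vifs))          # Step 2b
    predictors <- predictors[, names(predictors) != worst, drop = FALSE]
  }
  predictors
}

# Simulated data (not the course data): x2 is nearly a copy of x1,
# while x3 is independent of both.
set.seed(42)
n <- 200
x1 <- rnorm(n)
toy <- data.frame(x1 = x1, x2 = x1 + rnorm(n, sd = 0.1), x3 = rnorm(n))

kept <- drop_high_vif(toy)
names(kept)        # variables that survive the procedure
compute_vif(kept)  # all remaining VIFs should now be below 5
```

Recomputing the VIFs after every removal matters because dropping one variable changes the VIF of every variable that was correlated with it. Removing all high-VIF variables in a single pass could discard more predictors than necessary.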

This exercise is part of the course

HR Analytics: Predicting Employee Churn in R


Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Remove level
model_1 <- glm(turnover ~ . - ___, family = "binomial", 
               data = train_set_multi)

# Check multicollinearity again
___

# Which variable has the highest VIF value?
highest <- ___