
Dealing with multicollinearity

In the previous exercise, you found that multicollinearity exists in your model by reviewing the VIF values of independent variables. Follow the steps below to remove multicollinearity:

  • Step 1: Calculate the VIF of each variable in the model
  • Step 2: Check whether any variable has a VIF greater than or equal to 5
    • Step 2a: If exactly one variable does, remove it from the model
    • Step 2b: If several variables do, remove only the one with the highest VIF
  • Step 3: Repeat steps 1 and 2 until the VIF of every variable is less than 5
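
The steps above can be sketched as a loop. This is only an illustration under stated assumptions: the data frame `toy`, its columns `x1`, `x2`, `x3`, and the helper `manual_vif()` are hypothetical and not part of the exercise. The helper computes the classic VIF, 1 / (1 - R^2), by hand for numeric predictors so that the sketch runs in base R; the exercise itself presumably relies on a ready-made `vif()` function, which may report slightly different values for a logistic model.

```r
# Hypothetical toy data: x2 is nearly a copy of x1, so both have a high VIF.
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
toy <- data.frame(
  turnover = rbinom(n, 1, plogis(0.5 * x1 - 0.3 * x3)),
  x1 = x1, x2 = x2, x3 = x3
)

# Hypothetical helper: classic VIF for numeric predictors, 1 / (1 - R^2),
# where R^2 comes from regressing each predictor on all the others.
manual_vif <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}

predictors <- c("x1", "x2", "x3")
repeat {
  if (length(predictors) < 2) break      # VIF needs at least two predictors
  vifs <- manual_vif(toy, predictors)    # Step 1: compute VIF values
  if (all(vifs < 5)) break               # Step 3: stop once every VIF is below 5
  worst <- names(which.max(vifs))        # Step 2b: pick only the highest VIF
  predictors <- setdiff(predictors, worst)
}

# Refit the logistic model on the predictors that survived.
final_model <- glm(reformulate(predictors, response = "turnover"),
                   family = "binomial", data = toy)
```

Removing one variable per pass matters because dropping the worst offender usually lowers the VIF of the variables it was correlated with, so they may no longer need to be removed.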

This exercise is part of the course

HR Analytics: Predicting Employee Churn in R


Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Remove level
model_1 <- glm(turnover ~ . - ___, family = "binomial", 
               data = train_set_multi)

# Check multicollinearity again
___

# Which variable has the highest VIF value?
highest <- ___