Dealing with multicollinearity

In the previous exercise, you found that multicollinearity exists in your model by reviewing the variance inflation factor (VIF) values of the independent variables. Follow the steps below to remove multicollinearity:

  • Step 1: Calculate the VIF of each variable in the model
  • Step 2: Check whether any variable has a VIF greater than or equal to 5
    • Step 2a: If exactly one variable has a VIF greater than or equal to 5, remove it from the model
    • Step 2b: If several variables have a VIF greater than or equal to 5, remove only the one with the highest VIF
  • Step 3: Repeat steps 1 and 2 until the VIF of every variable is less than 5
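
The steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the course's solution: `simple_vif()` and `drop_collinear()` are hypothetical helpers, the data is simulated, and the VIF is computed in base R as the diagonal of the inverse correlation matrix of numeric predictors. A `vif()` function applied to a fitted `glm` may give somewhat different values, so treat this only as a picture of the remove-one-and-recheck procedure.

```r
# Hypothetical helper: classical VIF for numeric predictors, taken from the
# diagonal of the inverse correlation matrix (equal to 1 / (1 - R^2_j)).
simple_vif <- function(data, predictors) {
  vifs <- diag(solve(cor(data[, predictors, drop = FALSE])))
  names(vifs) <- predictors
  vifs
}

# Hypothetical helper: drop the single highest-VIF predictor, recompute,
# and repeat until every remaining VIF is below the threshold.
drop_collinear <- function(data, predictors, threshold = 5) {
  while (length(predictors) > 1) {
    vifs <- simple_vif(data, predictors)      # Step 1
    if (max(vifs) < threshold) break          # Step 3 stopping rule
    worst <- names(which.max(vifs))           # Step 2b: highest VIF only
    predictors <- setdiff(predictors, worst)  # remove it, then loop again
  }
  predictors
}

# Toy data, simulated for illustration only: x2 is nearly a copy of x1.
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
toy <- data.frame(x1 = x1, x2 = x2, x3 = x3)

kept <- drop_collinear(toy, c("x1", "x2", "x3"))
print(kept)
print(simple_vif(toy, kept))
```

Removing one variable per pass matters because dropping a single predictor can lower the VIF of the others, so a variable that looked problematic in the first pass may be fine in the next.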

This exercise is part of the course

HR Analytics: Predicting Employee Churn in R

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Remove level
model_1 <- glm(turnover ~ . - ___, family = "binomial", 
               data = train_set_multi)

# Check multicollinearity again
___

# Which variable has the highest VIF value?
highest <- ___