
Sifting through variable importance

The attrition dataset contains 839 observations and 30 predictors for "Attrition." You are interested in exploring the trade-off between the performance of a model that uses all available predictors and that of a reduced model based on a few informative variables.

In this exercise, you'll fit a model and have a look at the variable importance of this fitted model. In the following exercise, you'll assess model performance using this model compared to using a reduced model.

The train and test splits and the vip() package are available in your environment, along with a predeclared logistic regression model.

This exercise is part of the course

Feature Engineering in R


Exercise instructions

  • Create a recipe that models Attrition using all predictors.
  • Fit the workflow to the training data.
  • Use the fit_full object to graph the variable importance of your model.
  • Apply the extract_fit_parsnip() function before vip() to feed it the required information.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create a recipe that models Attrition using all the predictors
recipe_full <- ___(___, data = train)

workflow_full <- workflow() %>%
  add_model(model) %>%
  add_recipe(recipe_full)

# Fit the workflow to the training data
fit_full <- ___ %>%
  ___(data = train)

# Use the fit_full object to graph the variable importance of your model. Apply extract_fit_parsnip() function before vip()
fit_full %>% ___() %>%
  ___(aesthetics = list(fill = "steelblue"))
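One way to fill in the blanks above is sketched below. It assumes the tidymodels and vip packages are attached and that `model`, `train`, and `test` are the predeclared objects described earlier; the exact solution expected by the course checker may differ slightly.

```r
library(tidymodels)
library(vip)

# Create a recipe that models Attrition using all the predictors
recipe_full <- recipe(Attrition ~ ., data = train)

workflow_full <- workflow() %>%
  add_model(model) %>%
  add_recipe(recipe_full)

# Fit the workflow to the training data
fit_full <- workflow_full %>%
  fit(data = train)

# Extract the underlying parsnip fit, then graph variable importance
fit_full %>%
  extract_fit_parsnip() %>%
  vip(aesthetics = list(fill = "steelblue"))
```

The `extract_fit_parsnip()` step is needed because `vip()` works on the fitted model object itself, not on the workflow that wraps it.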