Sifting through variable importance
The attrition dataset contains 839 observations and 30 predictors for the outcome "Attrition." You are interested in exploring the trade-off between the performance of a model that uses all available predictors and that of a reduced model based on a few informative variables.
In this exercise, you'll fit a model and have a look at the variable importance of this fitted model. In the following exercise, you'll assess model performance using this model compared to using a reduced model.
The train and test splits and the vip() package are available in your environment, along with a predeclared logistic regression model.
This exercise is part of the course Feature Engineering in R.
Exercise instructions
- Create a recipe that models Attrition using all predictors.
- Fit the workflow to the training data.
- Use the fit_full object to graph the variable importance of your model. Apply the extract_fit_parsnip() function before vip() to feed it the required information.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a recipe that models Attrition using all the predictors
recipe_full <- ___(___, data = train)

workflow_full <- workflow() %>%
  add_model(model) %>%
  add_recipe(recipe_full)

# Fit the workflow to the training data
fit_full <- ___ %>%
  ___(data = train)

# Use the fit_full object to graph the variable importance of your model
# Apply extract_fit_parsnip() before vip()
fit_full %>% ___() %>%
  ___(aesthetics = list(fill = "steelblue"))
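One way the blanks could be filled in is sketched below. This assumes the tidymodels and vip packages, the train split, and the predeclared logistic regression specification `model` are available in your environment, as the exercise states.

```r
# Assumed to be attached already in the exercise environment
library(tidymodels)
library(vip)

# Create a recipe that models Attrition using all the predictors
recipe_full <- recipe(Attrition ~ ., data = train)

# Bundle the model specification and the recipe into a workflow
workflow_full <- workflow() %>%
  add_model(model) %>%
  add_recipe(recipe_full)

# Fit the workflow to the training data
fit_full <- workflow_full %>%
  fit(data = train)

# Extract the underlying parsnip fit, then graph variable importance
fit_full %>%
  extract_fit_parsnip() %>%
  vip(aesthetics = list(fill = "steelblue"))
```

extract_fit_parsnip() is needed because vip() expects a fitted model object, not a workflow; the aesthetics argument passes fill color through to the underlying ggplot2 bars.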