Complete feature engineering pipeline
The recipes package is designed to encode multiple feature engineering steps into one object, making it easier to maintain data transformations in a machine learning workflow.
In this exercise, you will train a feature engineering pipeline to prepare the telecommunications data for modeling.
The telecom_df tibble, as well as your telecom_training and telecom_test datasets from the previous exercises, have been loaded into your workspace.
This exercise is part of the course
Modeling with tidymodels in R
Exercise instructions
- Create a recipe that predicts canceled_service using all predictor variables in the training data.
- Remove correlated predictor variables using a 0.8 threshold value.
- Normalize all numeric predictors.
- Create dummy variables for all nominal predictors.
- Train your recipe on the training data and apply it to the test data.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a recipe that predicts canceled_service using the training data
telecom_recipe <- ___ %>%
  # Remove correlated predictors
  ___ %>%
  # Normalize numeric predictors
  ___ %>%
  # Create dummy variables
  ___

# Train your recipe and apply it to the test data
telecom_recipe %>%
  ___ %>%
  ___
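To see how the steps listed in the instructions fit together, here is a minimal sketch run on synthetic data rather than the telecom data, which is not reproduced here. The toy_df tibble, its column names (other than canceled_service), and the toy_* objects are hypothetical stand-ins invented for illustration. The sketch assumes a recent version of recipes that provides the all_numeric_predictors() and all_nominal_predictors() selectors; it is one possible way to fill in a pipeline of this shape, not necessarily the intended solution.

```r
library(tidymodels)

set.seed(123)
n <- 200

# Hypothetical stand-in for the telecom data
monthly_charges <- rnorm(n, mean = 70, sd = 20)
toy_df <- tibble(
  canceled_service = factor(sample(c("yes", "no"), n, replace = TRUE),
                            levels = c("yes", "no")),
  monthly_charges = monthly_charges,
  # Almost a rescaled copy of monthly_charges, so the pair is highly correlated
  total_charges = monthly_charges * 12 + rnorm(n, sd = 1),
  months_with_company = runif(n, min = 1, max = 72),
  internet_service = factor(sample(c("fiber_optic", "digital"), n, replace = TRUE))
)

# Split into training and test sets, stratified by the outcome
toy_split <- initial_split(toy_df, prop = 0.75, strata = canceled_service)
toy_training <- training(toy_split)
toy_test <- testing(toy_split)

# Specify the recipe: outcome ~ all predictors, then chain the steps
toy_recipe <- recipe(canceled_service ~ ., data = toy_training) %>%
  # Remove numeric predictors whose pairwise correlation exceeds 0.8
  step_corr(all_numeric_predictors(), threshold = 0.8) %>%
  # Center and scale the remaining numeric predictors
  step_normalize(all_numeric_predictors()) %>%
  # Convert nominal predictors to dummy variables
  step_dummy(all_nominal_predictors())

# Estimate the steps on the training data, then apply them to the test data
toy_test_prepped <- toy_recipe %>%
  prep(training = toy_training) %>%
  bake(new_data = toy_test)

toy_test_prepped
```

Because prep() estimates the correlations, means, and standard deviations from the training data only, bake() applies those same training-derived values to the test data, which avoids leaking information from the test set into the preprocessing.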