Into the matrix
You identified six distinct values for EducationField. But you suspect that others might show up as you run the model on new data. To prepare for this, you will create a hash index with 50 terms. The textrecipes package and the attrition_train and attrition_test splits are already loaded.
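To see why a hash index copes with categories that were not present at training time, here is a minimal base R sketch of the idea. It is an illustration only: toy_hash and hash_dummies are hypothetical helpers written for this example, and the toy hash is not the hash function that textrecipes uses internally. Each value is mapped to one of num_terms columns by a deterministic function of its characters, so an unseen value such as "A Brand New Field" still lands in one of the existing 50 columns.

```r
# Toy hash (an assumption for illustration, NOT the textrecipes hash):
# fold a weighted sum of character codes into the range 1..num_terms.
toy_hash <- function(x, num_terms) {
  codes <- utf8ToInt(x)
  (sum(codes * seq_along(codes)) %% num_terms) + 1L
}

# Build a 0/1 indicator matrix with one row per value and
# num_terms columns; each row has a single 1 in its hashed column.
hash_dummies <- function(values, num_terms = 50L) {
  values <- as.character(values)
  out <- matrix(0L, nrow = length(values), ncol = num_terms)
  for (i in seq_along(values)) {
    out[i, toy_hash(values[i], num_terms)] <- 1L
  }
  colnames(out) <- sprintf("hash_%02d", seq_len(num_terms))
  out
}

# The last value stands in for a category never seen during training.
fields <- c("Life Sciences", "Medical", "Marketing", "A Brand New Field")
encoded <- hash_dummies(fields, num_terms = 50L)

dim(encoded)      # 4 rows, 50 columns
rowSums(encoded)  # each row activates exactly one column
```

The trade-off is that two different values can hash to the same column (a collision). Using more terms than there are known categories, as with 50 terms for six known fields, makes collisions less likely but does not rule them out.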
This exercise is part of the course Feature Engineering in R
Exercise instructions
- Add a step to the recipe that generates a dummy_hash index for EducationField.
- Prepare the recipe.
- Bake the prepared recipe.
- Bind the baked recipe table and the EducationField values into one table, then print the first 7 rows of columns 1 and 18 to 20.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
recipe <- recipe(~EducationField, data = attrition_train) %>%
# Add a step to the recipe that generates a dummy_hash index for EducationField
___(EducationField, prefix = NULL, signed = FALSE, num_terms = 50L)
# Prepare the recipe
object <- recipe %>%
___
# Bake the prepped recipe
baked <- ___(object, new_data = attrition_test)
# Bind the baked recipe table and the EducationField values into one table
bind_cols(___, baked)[1:7,c(1,18:20)]