Create a missing values recipe
In the previous exercises, you manually calculated the missing value ratio and created a filter to reduce the dimensionality of house_sales_df. The tidymodels package contains a recipe step that applies a missing value ratio filter automatically: step_filter_missing(). The advantage of the tidymodels approach is that it allows you to reuse the recipe on other data sets and simplifies the move to a production environment. In this exercise, you will use the step_filter_missing() function to perform dimensionality reduction on house_sales_df based on missing values.
The tidyverse and tidymodels packages have been loaded for you.
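For comparison, the manual approach from the earlier exercises can be sketched in base R roughly as follows. This is a minimal illustration, not the course's own code: toy_df and its columns are hypothetical stand-ins for house_sales_df, and the 0.5 cutoff is chosen only to match the threshold used in this exercise.

```r
# Hypothetical toy data standing in for house_sales_df
toy_df <- data.frame(
  a = c(1, NA, 3, 4),         # 1 of 4 values missing
  b = c(NA, NA, NA, 4),       # 3 of 4 values missing
  c = c("x", "y", NA, "z")    # 1 of 4 values missing
)

# Missing value ratio per column: the share of NA entries
missing_ratio <- colMeans(is.na(toy_df))

# Keep only columns whose missing ratio does not exceed the threshold
keep_cols <- names(missing_ratio)[missing_ratio <= 0.5]
reduced_df <- toy_df[, keep_cols, drop = FALSE]

print(missing_ratio)
print(names(reduced_df))
```

A recipe step packages the same idea so that the columns chosen on one data set can be applied consistently to another.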
This exercise is part of the course Dimensionality Reduction in R.
Exercise instructions
- Use recipe() to create a missing values filter with a threshold of 0.5.
- Apply the missing_vals_recipe to house_sales_df.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create missing values recipe
missing_vals_recipe <-
  ___(___ ~ ., data = ___) %>%
  ___(___(), ___ = ___) %>%
  prep()

# Apply recipe to data
___ <-
  ___(___, ___ = ___)

# Display the first five rows of data
___ %>% ___(___)
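One way the template might be completed is sketched below on a small made-up data frame, since house_sales_df itself is not reproduced here. The data frame toy_sales_df, its column names, the choice of price as the outcome, and the result name filtered_sales_df are all assumptions made for illustration; this is not necessarily the course's reference solution. The sketch relies on step_filter_missing() from the recipes package (part of tidymodels), which removes columns whose proportion of missing values exceeds the threshold.

```r
library(dplyr)    # provides %>% and general data manipulation
library(recipes)  # provides recipe(), step_filter_missing(), prep(), bake()

# Hypothetical toy data standing in for house_sales_df
toy_sales_df <- data.frame(
  price          = c(200, 350, 410, 280, 530, 390),
  sqft           = c(1100, 1800, 2100, 1500, 2600, 1900),
  year_renovated = c(NA, NA, 2005, NA, NA, 2015),         # 4 of 6 missing
  lot_size       = c(5000, NA, 7200, 6100, 8000, 6900)    # 1 of 6 missing
)

# Create missing values recipe: drop predictors with more than 50% missing
missing_vals_recipe <-
  recipe(price ~ ., data = toy_sales_df) %>%
  step_filter_missing(all_predictors(), threshold = 0.5) %>%
  prep()

# Apply recipe to data; new_data = NULL returns the processed training data
filtered_sales_df <-
  bake(missing_vals_recipe, new_data = NULL)

# Display the first five rows of data
filtered_sales_df %>% head(5)
```

In this toy example, year_renovated should be dropped because two thirds of its values are missing, while lot_size stays because its missing ratio is below 0.5. Passing a different data frame as new_data to bake() applies the same column selection to it, which is the reuse benefit described above.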