
Create a missing values recipe

In the previous exercises, you manually calculated the missing value ratio and created a filter to reduce the dimensionality of house_sales_df. The tidymodels package contains a recipe step that applies a missing value ratio filter automatically: step_filter_missing(). The advantage of the tidymodels approach is that it allows you to reuse the recipe on other data sets and simplifies the move to a production environment. In this exercise, you will use the step_filter_missing() function to perform dimensionality reduction on house_sales_df based on missing values.

The tidyverse and tidymodels packages have been loaded for you.
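
To illustrate the idea without giving away the exercise, here is a minimal sketch on a made-up data frame. The names toy_df, new_df, toy_recipe, toy_filtered, and new_filtered are hypothetical and are not part of the course data. It assumes a recipes version that provides step_filter_missing(), which drops columns whose proportion of missing values exceeds the threshold.

```r
library(tidymodels)  # attaches recipes (recipe, prep, bake) and dplyr (%>%)

# Hypothetical toy data: some_missing is 25% NA, mostly_missing is 75% NA
toy_df <- data.frame(
  price          = c(100, 200, 300, 400),
  sqft           = c(10, 20, 30, 40),
  some_missing   = c(1, NA, 3, 4),
  mostly_missing = c(NA, NA, NA, 4)
)

# Manual missing value ratio per column, for comparison with the recipe step
colMeans(is.na(toy_df))

# Define the recipe, add the missing values filter, and estimate it with prep()
toy_recipe <-
  recipe(price ~ ., data = toy_df) %>%
  step_filter_missing(all_predictors(), threshold = 0.5) %>%
  prep()

# new_data = NULL returns the processed training data
toy_filtered <- bake(toy_recipe, new_data = NULL)
names(toy_filtered)

# Reuse the prepped recipe on another (hypothetical) data set with the same columns
new_df <- data.frame(
  price          = c(500, 600),
  sqft           = c(50, 60),
  some_missing   = c(5, 6),
  mostly_missing = c(7, 8)
)
new_filtered <- bake(toy_recipe, new_data = new_df)
names(new_filtered)
```

Because the columns to remove are decided when prep() runs on the training data, the same columns are dropped from new_df even though mostly_missing has no missing values there. This is what makes the recipe reusable across data sets.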

This exercise is part of the course

Dimensionality Reduction in R


Exercise instructions

  • Use recipe() to create a missing values filter with a threshold of 0.5.
  • Apply the missing_vals_recipe to house_sales_df.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create missing values recipe
missing_vals_recipe <- 
  ___(___ ~ ., data = ___) %>% 
  ___(___(), ___ = ___) %>% 
  prep()
  
# Apply recipe to data
___ <- 
  ___(___, ___ = ___)

# Display the first five rows of data
___ %>% ___(___)