
Create a low-variance recipe

The tidymodels package provides a better way to filter zero- and near-zero-variance features with its step_zv() and step_nzv() functions, respectively. step_zv() removes features that contain only a single value. step_nzv() identifies near-zero-variance features by examining two quantities in each feature: the percentage of unique values, and the ratio of the frequency of the most common value to the frequency of the second most common value. This approach is more robust than the simple variance cutoff we used previously.
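
To make the two quantities concrete, here is a minimal base R sketch that computes them by hand for a made-up feature. The vector `x` is hypothetical, and the thresholds (a frequency ratio above 95/5 and fewer than 10 percent unique values) are assumed to match the step_nzv() defaults for freq_cut and unique_cut; check the documentation for your installed version.

```r
# Hypothetical toy feature: 97 zeros, 2 ones, and a single two
x <- c(rep(0, 97), rep(1, 2), 2)

# Frequency of each value, most common first
counts <- sort(table(x), decreasing = TRUE)

# Ratio of the most common value's frequency to the second most common
freq_ratio <- as.numeric(counts[1] / counts[2])

# Percentage of distinct values among all observations
pct_unique <- 100 * length(unique(x)) / length(x)

# Assumed thresholds: freq_cut = 95/5, unique_cut = 10
is_near_zero_var <- freq_ratio > 95 / 5 && pct_unique < 10

print(c(freq_ratio = freq_ratio, pct_unique = pct_unique))
print(is_near_zero_var)
```

A feature is flagged only when both conditions hold, so a binary feature with a balanced split is kept even though it has few unique values.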

In addition, you will use the step_scale() recipe step to scale each numeric feature to a standard deviation of one. Normalizing the data this way is generally a good idea because it makes variances comparable across features.
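
As a rough illustration of what scaling does, the base R sketch below divides two made-up features by their standard deviations. The vectors `small` and `large` are hypothetical and are not columns of house_sales_df; step_scale() applies the same kind of division inside a recipe, without centering the data.

```r
set.seed(42)

# Two hypothetical features on very different scales
small <- rnorm(50, sd = 0.5)
large <- rnorm(50, sd = 200)

# Variances before scaling differ by orders of magnitude
print(c(small = var(small), large = var(large)))

# Divide each feature by its own standard deviation
scaled_small <- small / sd(small)
scaled_large <- large / sd(large)

# After scaling, both variances equal one (up to floating-point error)
print(c(small = var(scaled_small), large = var(scaled_large)))
```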

The house_sales_df data frame is available for you to use. The target variable is price. The tidyverse and tidymodels packages have also been loaded for you.

This exercise is part of the course Dimensionality Reduction in R.

Exercise instructions

  • Define a recipe for a low-variance filter and prepare it using house_sales_df.
  • Apply the recipe to house_sales_df and store the filtered data in filtered_house_sales_df.
  • Display the features that the recipe filtered in the step_nzv() step.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Prepare recipe
low_variance_recipe <- recipe(___ ~ ___, ___ = ___) %>% 
  step_zv(___) %>% 
  ___(___) %>% 
  ___(___) %>% 
  prep()

# Apply recipe
filtered_house_sales_df <- ___(___, new_data = ___)

# View list of features removed by the near-zero variance step 
tidy(___, number = ___)
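
For reference, the same workflow can be sketched on a small made-up data frame rather than house_sales_df. Everything named toy_* below is hypothetical, and the order of the steps (and therefore the step number passed to tidy()) is an assumption made for this illustration, not necessarily the order the exercise expects.

```r
library(recipes)

set.seed(1)

# Hypothetical data: one constant column, one nearly constant column,
# and one ordinary numeric column, plus a numeric target y
toy_df <- data.frame(
  y       = rnorm(100),
  x_const = rep(1, 100),
  x_rare  = c(rep(0, 99), 1),
  x_ok    = rnorm(100, sd = 5)
)

toy_recipe <- recipe(y ~ ., data = toy_df) %>%
  step_zv(all_predictors()) %>%             # step 1: drop constant columns
  step_scale(all_numeric_predictors()) %>%  # step 2: scale to unit standard deviation
  step_nzv(all_predictors()) %>%            # step 3: drop near-constant columns
  prep()

# Apply the prepared recipe to the data it was trained on
toy_filtered <- bake(toy_recipe, new_data = NULL)
print(names(toy_filtered))

# Columns removed by the near-zero-variance step (step 3 in this sketch)
toy_removed <- tidy(toy_recipe, number = 3)
print(toy_removed)
```

Because scaling changes neither the number of unique values nor their relative frequencies, placing step_scale() before or after step_nzv() should flag the same columns; only the step number used in tidy() changes.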