Create a missing values filter
The zero-variance filter only removes some of the low-information features. Features may also contain little to no information because they have a high number of missing values. In this exercise, you'll create a missing values filter. You'll take an extreme approach and remove any feature with at least one missing value, which means you could remove features with significant information.
house_sales_df is available in the console, and the tidyverse package has been loaded for you.
This exercise is part of the course Dimensionality Reduction in R.
Exercise instructions
- Create a missing values filter using summarize(), across(), sum(), and is.na() to remove features with one or more missing values, and store it in na_filter.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a missing values filter
___ <- ___ %>%
  ___(across(everything(), ~ ___)) %>%
  pivot_longer(everything(), names_to = "feature", values_to = "NA_count") %>%
  ___(___ > ___) %>%
  pull(feature)
na_filter
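
To see how the pieces fit together, here is a minimal sketch of the same pipeline run on a small made-up data frame. The name toy_df, its columns, and its values are hypothetical stand-ins, not the contents of house_sales_df, and the last step (dropping the flagged columns with select() and all_of()) is an assumed way of applying the filter rather than part of the exercise.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for house_sales_df (assumption)
toy_df <- tibble(
  price    = c(250000, 310000, 199000, 420000),
  sqft     = c(1400, NA, 1100, 2300),
  bedrooms = c(3, 4, 2, 4),
  garage   = c(NA, NA, 1, 2)
)

# Count the missing values in every column, reshape to one row per
# feature, keep features with at least one NA, and extract their names
na_filter <- toy_df %>%
  summarize(across(everything(), ~ sum(is.na(.x)))) %>%
  pivot_longer(everything(), names_to = "feature", values_to = "NA_count") %>%
  filter(NA_count > 0) %>%
  pull(feature)

na_filter

# Apply the filter: drop every feature that was flagged
toy_filtered <- toy_df %>%
  select(-all_of(na_filter))

names(toy_filtered)
```

Because the threshold is NA_count > 0, a single missing value is enough to flag a column. In this toy data, sqft is flagged even though three of its four values are present, which illustrates how the extreme approach can discard informative features.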