Preserving the most common levels
Sometimes you don't want to keep levels by proportion, but instead want to keep the n most common levels. Let's see how the levels kept for MLMethodNextYearSelect change when we keep by number instead of by proportion. multiple_choice_responses has been loaded for you.
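To make the difference concrete, here is a minimal sketch that lumps the same factor both ways. It assumes the forcats function fct_lump() with its prop and n arguments; the methods vector is made-up toy data, not the survey responses.

```r
library(forcats)

# Hypothetical toy factor (not the course data): five levels with uneven counts
methods <- factor(c(rep("a", 50), rep("b", 30), rep("c", 12),
                    rep("d", 5), rep("e", 3)))

# Keep by proportion: levels covering more than 10% of values survive,
# everything else is lumped into the default "Other" level
by_prop <- fct_lump(methods, prop = 0.10)
levels(by_prop)

# Keep by number: only the 2 most common levels survive
by_n <- fct_lump(methods, n = 2)
levels(by_n)
```

With this toy data the proportion rule keeps three of the original levels while the count rule keeps two, so the two approaches can retain different sets of levels from the same factor.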
This exercise is part of the course
Categorical Data in the Tidyverse
Instructions
- Remove people who didn't select a method.
- Create a new variable, ml_method, from MLMethodNextYearSelect that preserves the 5 most common methods and lumps the rest as "other method" using the argument other_level.
- Count the frequency of each ml_method, sorting in descending order.
Hands-on interactive exercise
Try this exercise by completing this sample code.
multiple_choice_responses %>%
  # Remove NAs
  filter(___) %>%
  # Create ml_method, retaining the 5 most common methods and renaming others "other method"
  mutate(ml_method = ___(MLMethodNextYearSelect, ___, other_level = ___)) %>%
  # Count the frequency of your new variable, sorted in descending order
  ___(ml_method, ___)
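One possible way to fill in the blanks is sketched below. It is an illustration rather than the course's official solution: it assumes dplyr's filter(), mutate() and count() together with forcats' fct_lump(), and it runs on a small made-up data frame, toy_responses (a hypothetical stand-in), because multiple_choice_responses is only available inside the exercise environment.

```r
library(dplyr)
library(forcats)

# Hypothetical stand-in for multiple_choice_responses: one column, a few NAs
toy_responses <- tibble(
  MLMethodNextYearSelect = c(
    rep("Deep learning", 6), rep("Neural Nets", 5),
    rep("Time Series Analysis", 4), rep("Bayesian Methods", 3),
    rep("Text Mining", 2), "Genetic Algorithms",
    "Social Network Analysis", NA, NA
  )
)

result <- toy_responses %>%
  # Remove NAs: drop respondents who didn't select a method
  filter(!is.na(MLMethodNextYearSelect)) %>%
  # Keep the 5 most common methods and relabel the rest "other method"
  mutate(ml_method = fct_lump(MLMethodNextYearSelect, n = 5,
                              other_level = "other method")) %>%
  # Count each ml_method, most frequent first
  count(ml_method, sort = TRUE)

result
```

In recent forcats releases fct_lump() is marked as superseded; fct_lump_n(MLMethodNextYearSelect, n = 5, other_level = "other method") should behave the same way if you prefer the newer function.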