Preserving the most common levels

Sometimes you want to keep the n most common levels rather than the levels above a given proportion. Let's see how the levels kept for MLMethodNextYearSelect change when we keep them by number instead of by proportion. multiple_choice_responses has been loaded for you.
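To make the difference concrete, here is a minimal sketch on a small made-up factor (the `ml_methods` vector is hypothetical, not the course data). It assumes the lumping function is `fct_lump()` from forcats, which accepts either `n` or `prop`:

```r
library(forcats)

# Hypothetical toy data, not taken from multiple_choice_responses
ml_methods <- factor(c(
  rep("Deep learning", 5), rep("Neural Nets", 3),
  rep("Time Series", 2), "Bayesian", "Text Mining"
))

# Keep the 2 most common levels; lump everything else together
by_number <- fct_lump(ml_methods, n = 2, other_level = "other method")

# Keep levels that account for more than 10% of the values
by_prop <- fct_lump(ml_methods, prop = 0.1, other_level = "other method")

levels(by_number)
levels(by_prop)
```

With `n`, the number of retained levels is fixed in advance; with `prop`, it depends on how the values happen to be distributed.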

This exercise is part of the course

Categorical Data in the Tidyverse

Exercise instructions

  • Remove people who didn't select a method.
  • Create a new variable, ml_method, from MLMethodNextYearSelect that preserves the 5 most common methods and lumps the rest as "other method" using the argument other_level.
  • Count the frequency of each ml_method, sorting in descending order.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

multiple_choice_responses %>%
  # Remove NAs 
  filter(___) %>%
  # Create ml_method, retaining the 5 most common methods and renaming others "other method" 
  mutate(ml_method = ___(MLMethodNextYearSelect, ___, other_level = ___)) %>%
  # Count the frequency of your new variable, sorted in descending order
  ___(ml_method, ___)
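
The same three steps can be tried on a small stand-in data frame. This is only a sketch: `toy_responses` is hypothetical, `n = 2` is used instead of 5 because the toy data has few levels, and the choice of `fct_lump()` and `count()` for the blanks is an assumption rather than something the exercise states.

```r
library(dplyr)
library(forcats)

# Hypothetical stand-in for multiple_choice_responses
toy_responses <- tibble(
  MLMethodNextYearSelect = c(
    rep("Deep learning", 4), rep("Neural Nets", 3),
    "Time Series", "Bayesian Methods", NA, NA
  )
)

toy_counts <- toy_responses %>%
  # Remove rows where no method was selected
  filter(!is.na(MLMethodNextYearSelect)) %>%
  # Keep the 2 most common methods and lump the rest
  mutate(ml_method = fct_lump(MLMethodNextYearSelect, n = 2,
                              other_level = "other method")) %>%
  # Count each level, most frequent first
  count(ml_method, sort = TRUE)

toy_counts
```

Because `sort = TRUE` orders by frequency alone, the lumped level can appear anywhere in the result, not necessarily last.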