Examining number of levels
dplyr has two other functions that can come in handy when exploring a dataset. The first is slice_max(var, n = x), which gets us the x rows of a dataset with the highest values of var. The other is pull(), which extracts a single column as a vector, dropping the column name and leaving only the value(s) from the column.
For example, if we wanted to get, as a set of values, the top two mpg values from the classic mtcars dataset, we would write:
mtcars %>%
  slice_max(mpg, n = 2) %>%
  pull(mpg)
This gets us:
[1] 33.9 32.4
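To make the difference between a one-column data frame and a bare vector concrete, here is a minimal sketch on the same mtcars example. The object names top_two, top_two_df, and top_two_vec are illustrative only and are not part of the exercise.

```r
library(dplyr)

# The two rows with the highest mpg; slice_max() orders them from highest to lowest
top_two <- mtcars %>%
  slice_max(mpg, n = 2)

# select() keeps the column name and returns a data frame with one column
top_two_df <- top_two %>%
  select(mpg)

# pull() drops the column name and returns a plain numeric vector
top_two_vec <- top_two %>%
  pull(mpg)

is.data.frame(top_two_df)  # TRUE
is.numeric(top_two_vec)    # TRUE
```

Note that slice_max() keeps ties by default (with_ties = TRUE), so it can return more than n rows when several rows share the cutoff value.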
This exercise is part of the course
Categorical Data in the Tidyverse
Exercise instructions
- Use slice_max() to print out the 3 rows with the highest number of factor levels.
- Filtering for the variable CurrentJobTitleSelect, pull the number of levels it has.
Interactive exercise
Try this exercise by completing this sample code.
# Select the 3 rows with the highest number of levels
number_of_levels %>%
  ___(num_levels, n = 3)

number_of_levels %>%
  # Filter for where the column called variable equals CurrentJobTitleSelect
  filter(___) %>%
  # Pull num_levels
  ___
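The same filter-then-pull pattern can be tried on a small stand-in table before working with the real data. The sketch below assumes a table shaped like number_of_levels, with one row per variable and columns named variable and num_levels; the table toy_levels and all of its values are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for number_of_levels (values are invented)
toy_levels <- tibble(
  variable   = c("Country", "Gender", "Major", "Employer"),
  num_levels = c(52L, 4L, 15L, 9L)
)

# The two rows with the most levels, highest first
toy_levels %>%
  slice_max(num_levels, n = 2)

# The number of levels for one variable, extracted as a bare value
major_levels <- toy_levels %>%
  filter(variable == "Major") %>%
  pull(num_levels)
```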