
Examining number of levels

dplyr has two other functions that can come in handy when exploring a dataset. The first is slice_max(var, n = x), which returns the x rows with the largest values of var, ordered from highest to lowest. The other is pull(), which extracts a single column and drops its name, leaving only the values as a vector.

For example, if we wanted to get, as a set of values, the top two mpg values from the classic mtcars dataset, we would write:

mtcars %>%
  slice_max(mpg, n = 2) %>%
  pull(mpg)

This gets us:

[1] 33.9 32.4
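One detail worth knowing is how slice_max() treats ties. The sketch below uses a small made-up data frame (the name scores and its values are assumptions for illustration, not part of the course data) to show that, by default, tied rows are all kept, so you can get more than n rows back.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration
scores <- data.frame(
  name  = c("a", "b", "c", "d"),
  score = c(10, 30, 30, 20),
  stringsAsFactors = FALSE
)

# By default (with_ties = TRUE), rows tied for the top value are all returned
with_ties_kept <- scores %>%
  slice_max(score, n = 1)

# with_ties = FALSE returns exactly n rows
no_ties <- scores %>%
  slice_max(score, n = 1, with_ties = FALSE)

# pull() turns the selected column into a plain vector
top_scores <- scores %>%
  slice_max(score, n = 2) %>%
  pull(score)
```

Here with_ties_kept has two rows because two rows share the highest score, while no_ties has exactly one.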

This exercise is part of the course

Categorical Data in the Tidyverse


Exercise instructions

  • Use slice_max() to print the 3 rows with the highest number of factor levels.
  • Filter for the row where variable is CurrentJobTitleSelect, then pull the number of levels it has.

Interactive exercise

Try this exercise by completing the sample code.

# Select the 3 rows with the highest number of levels
number_of_levels %>%
  ___(num_levels, n = 3)

number_of_levels %>%
  # Filter for where the column called variable equals CurrentJobTitleSelect
  filter(___) %>%
  # Pull num_levels
  ___
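As a rough illustration of the same pattern on different data, here is a minimal sketch using a hypothetical stand-in for number_of_levels. The data frame toy_levels, its variable names, and its counts are all invented; only the column names variable and num_levels are taken from the exercise.

```r
library(dplyr)

# Hypothetical stand-in for number_of_levels; names and counts are invented
toy_levels <- data.frame(
  variable   = c("var_a", "var_b", "var_c", "var_d"),
  num_levels = c(4L, 12L, 7L, 2L),
  stringsAsFactors = FALSE
)

# The 3 rows with the most levels, highest first
top_three <- toy_levels %>%
  slice_max(num_levels, n = 3)

# Number of levels for a single variable, returned as a bare value
var_c_levels <- toy_levels %>%
  filter(variable == "var_c") %>%
  pull(num_levels)
```

Ending the pipeline with pull() rather than select() gives a plain integer instead of a one-column data frame, which is convenient when the value feeds into another calculation.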