Exercise

Summarizing data

Let's now make a faceted plot to compare usefulness across different learning platforms.

In this exercise, we'll introduce a new dplyr function, add_count(). add_count() adds a column, n, to the dataset while keeping the same number of rows as the original. Just like count(), n defaults to the number of rows in each group, but you can change that with the wt (weight) argument: set wt to another column, and n becomes the sum of that column within each group.

Let's say you wanted to add a column to iris that is the sum of the Petal.Length for all the flowers of the same Species. You would write:

iris %>%
   add_count(Species, wt = Petal.Length) %>%
   select(Species, Petal.Length, n)

This would give you back:

# A tibble: 150 x 3
   Species Petal.Length     n
   <fct>          <dbl> <dbl>
 1 setosa           1.4  73.1
 2 setosa           1.4  73.1
 3 virginica        6.4  278.
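
To make the contrast with count() concrete, here is a minimal sketch using the same built-in iris data. The object names (species_counts, iris_with_n, iris_weighted) are only illustrative and are not part of the exercise.

```r
library(dplyr)

# count() collapses the data to one row per group
species_counts <- iris %>%
  count(Species)

# add_count() keeps every original row and appends the group size as n
iris_with_n <- iris %>%
  add_count(Species)

# With wt, n becomes the per-group sum of the weighting column
iris_weighted <- iris %>%
  add_count(Species, wt = Petal.Length)

nrow(species_counts)  # one row per species
nrow(iris_with_n)     # same number of rows as iris
```

Because add_count() leaves the rows in place, n can be used in a later mutate() alongside the original columns, which is not possible after count() has collapsed them.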

Instructions 1/4

  • Use count() to change the dataset to have one row per learning_platform and usefulness pair, with a column giving the number of entries with that pairing.
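
As a rough sketch of what this step might look like, the example below counts pairs in a small made-up data frame. The name platform_usefulness and its values are hypothetical stand-ins, since the course dataset itself is not shown here; only the column names learning_platform and usefulness come from the instruction.

```r
library(dplyr)

# Hypothetical stand-in for the course dataset (values are invented)
platform_usefulness <- data.frame(
  learning_platform = c("Blogs", "Blogs", "Blogs", "Courses", "Courses"),
  usefulness = c("Very useful", "Very useful", "Somewhat useful",
                 "Very useful", "Not Useful")
)

# One row per learning_platform and usefulness pair;
# n holds the number of entries with that pairing
pair_counts <- platform_usefulness %>%
  count(learning_platform, usefulness)

pair_counts
```

Passing both columns to count() groups by their combination, so each distinct pair appears exactly once in the result.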