Summarizing data

Let's now make a faceted plot to compare usefulness across different learning platforms.

In this exercise, we'll introduce a new dplyr function, add_count(). add_count() adds a column named n to the dataset while keeping the same number of rows as the original. Just as with count(), n defaults to the number of rows in each group, but you can change that with the wt (weight) argument: set wt to another column, and n becomes the sum of that column within each group.

Let's say you wanted to add a column to iris that is the sum of the Petal.Length for all the flowers of the same Species. You would write:

library(dplyr)

iris %>%
  add_count(Species, wt = Petal.Length) %>%
  select(Species, Petal.Length, n)

This would give you back:

# A tibble: 150 x 3
   Species Petal.Length     n
   <fct>          <dbl> <dbl>
 1 setosa           1.4  73.1
 2 setosa           1.4  73.1
 3 virginica        6.4  278.
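
To make the difference between count() and add_count() concrete, here is a minimal sketch on the same iris data. The variable names per_species and with_totals are only illustrative. Both calls compute the same per-species sums, but count() collapses the data to one row per group, whereas add_count() keeps every original row.

```r
library(dplyr)

# count() collapses the data to one row per Species;
# with wt set, n is the sum of Petal.Length in each group.
per_species <- iris %>%
  count(Species, wt = Petal.Length)

# add_count() computes the same sums but keeps all original rows,
# repeating the group total in the new column n.
with_totals <- iris %>%
  add_count(Species, wt = Petal.Length)

nrow(per_species)  # 3
nrow(with_totals)  # 150
```

Use count() when you want a summary table, and add_count() when you still need the individual rows alongside the group total.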

Use count() to reduce the dataset to one row per learning_platform and usefulness pair, with a column n giving the number of entries for that pairing.
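
As a rough sketch of what this step looks like, the example below runs count() on a small, made-up tibble. The responses data and its values are hypothetical stand-ins for the exercise dataset, which is not shown here; only the column names learning_platform and usefulness come from the instruction.

```r
library(dplyr)

# Hypothetical toy data standing in for the exercise dataset
responses <- tibble(
  learning_platform = c("Courses", "Courses", "Blogs", "Blogs", "Blogs"),
  usefulness = c("Very useful", "Very useful",
                 "Somewhat useful", "Very useful", "Very useful")
)

# One row per learning_platform / usefulness pair;
# n is the number of entries with that pairing.
platform_counts <- responses %>%
  count(learning_platform, usefulness)

platform_counts
```

Passing both columns to count() groups by their combination, so each distinct pair appears exactly once in the result.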
