purrr and histograms

Now you're going to put together everything you've learned: you will start with two different lists and turn them into a faceted histogram. You're going to work again with the Star Wars data from the sw_films and sw_people datasets to answer a question:

  • What is the distribution of heights of characters in each of the Star Wars films?

Different movies take place on different sets of planets, so you might expect the characters' heights to be distributed differently from film to film. Your first task is to transform the two datasets, which are lists, into data frames, since ggplot() requires a data frame as input. Then you will join them together and plot the result: a histogram with a separate facet, or subplot, for each film.
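The list-to-data-frame step can be sketched on toy data. The example below is a minimal illustration, not the exercise solution: toy_films is a hypothetical stand-in for sw_films, and it is assumed that each film is a named list holding a "title" string and a "characters" vector.

```r
library(purrr)
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical stand-in for sw_films: one named list per film
toy_films <- list(
  list(title = "Film A", characters = c("url/1", "url/2")),
  list(title = "Film B", characters = c("url/2", "url/3"))
)

# map_chr() pulls one string per film; map() keeps each film's
# character vector as a list-column; unnest() gives one row per
# film/character pair
toy_film_by_character <- tibble(filmtitle = map_chr(toy_films, "title")) %>%
  mutate(characters = map(toy_films, "characters")) %>%
  unnest(cols = c(characters))

print(toy_film_by_character)
```

After unnest(), each film title is repeated once for every character it contains, which is the shape needed for a later join on the character key.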

This exercise is part of the course

Foundations of Functional Programming with purrr

Exercise instructions

  • Create a data frame with the "title" of each film, and the "characters" from each film in the sw_films dataset.
  • Create a data frame with the "height", "mass", "name", and "url" elements from sw_people.
  • Join the two data frames together using the "characters" and "url" keys.
  • Create a ggplot() histogram with x = height, faceted by filmtitle.
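The extraction and join steps can be sketched in the same spirit. Everything prefixed with toy_ is hypothetical: toy_people stands in for sw_people, and toy_film_by_character for a long table of films and character keys. The sketch assumes the height and mass values are stored as strings, which is why they are converted after the join.

```r
library(purrr)
library(tibble)
library(dplyr)

# Hypothetical stand-in for sw_people: one named list per character,
# with an extra element that should not end up in the data frame
toy_people <- list(
  list(name = "Person 1", height = "172", mass = "77",
       url = "url/1", eye_color = "blue"),
  list(name = "Person 2", height = "96", mass = "32",
       url = "url/2", eye_color = "red")
)

# Hypothetical long table: one row per film/character pair
toy_film_by_character <- tibble(
  filmtitle  = c("Film A", "Film A", "Film B"),
  characters = c("url/1", "url/2", "url/2")
)

# `[` subsets each list to the named elements; map_dfr() row-binds
# the results into a single data frame
toy_characters <- map_dfr(toy_people, `[`, c("height", "mass", "name", "url"))

# The key columns have different names, so `by` maps the left-hand
# name ("characters") to the right-hand name ("url")
toy_character_data <- inner_join(toy_film_by_character, toy_characters,
                                 by = c("characters" = "url")) %>%
  mutate(height = as.numeric(height), mass = as.numeric(mass))

print(toy_character_data)
```

If a height or mass string cannot be parsed as a number, as.numeric() returns NA with a warning, so it may be worth checking for missing values before plotting.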

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Turn data into correct data frame format
film_by_character <- tibble(filmtitle = map____(___, ___)) %>%
    mutate(filmtitle, characters = map(___, ___)) %>%
    unnest(cols = c(characters))

# Pull out elements from sw_people
sw_characters <- map____(___, `[`, c(___, ___, ___, ___))

# Join the two new objects
character_data <- inner_join(___, ___, by = c(___ = ___)) %>%
    # Make sure the columns are numbers
    mutate(height = as.numeric(height), mass = as.numeric(mass))

# Plot the heights, faceted by film title
ggplot(character_data, aes(x = ___)) +
  geom_histogram(stat = "count") +
  facet_wrap(~ ___)
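
The template above uses stat = "count", which draws one bar per distinct height value. As a hedged alternative, not the exercise's method, the sketch below bins the heights with binwidth instead. The toy_heights data frame and the bin width of 10 are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical joined data: one row per film/character pair
toy_heights <- data.frame(
  filmtitle = c("Film A", "Film A", "Film A", "Film B", "Film B"),
  height    = c(172, 96, 180, 96, 202)
)

# binwidth groups nearby heights into one bar; facet_wrap() draws
# one panel per film title
p <- ggplot(toy_heights, aes(x = height)) +
  geom_histogram(binwidth = 10) +
  facet_wrap(~ filmtitle)

# Call print(p) to draw the plot on the active graphics device
```

Binning tends to give a smoother picture when few characters share exactly the same height, while counting distinct values preserves every individual measurement.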