
Exercise 5. group_by

Now let's practice using the group_by function.

This is a very common operation in data science: you split a data table into groups and then compute summary statistics for each group.

We will compute the average and standard deviation of systolic blood pressure for females, separately for each age group. Recall that the age groups are stored in the variable AgeDecade.

This exercise is part of the course

Data Science Visualization - Module 2


Exercise instructions

  • Use the functions filter, group_by, summarize, and the pipe %>% to compute the average and standard deviation of systolic blood pressure for females for each age group separately.
  • Within summarize, save the average and standard deviation of systolic blood pressure (BPSysAve) as average and standard_deviation.
  • Note: ignore warnings about implicit NAs. These warnings will not prevent your code from running or being graded correctly.
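
The filter, group_by, summarize pattern described above can be sketched on a small made-up table. This is only an illustration, not the exercise solution: the data frame toy, its columns sex, age_band, and bp, and all of its values are hypothetical stand-ins for NHANES. The use of na.rm = TRUE is an assumption about how missing readings should be handled.

```r
library(dplyr)

# Hypothetical toy data standing in for NHANES; every value is made up.
toy <- tibble(
  sex      = c("female", "female", "female", "female", "female", "male", "male"),
  age_band = c("20-29", "20-29", "30-39", "30-39", "30-39", "20-29", "30-39"),
  bp       = c(110, 114, 120, 126, NA, 118, 125)
)

toy_summary <- toy %>%
  filter(sex == "female") %>%   # keep only the rows of interest
  group_by(age_band) %>%        # form one group per age band
  summarize(
    # Each name on the left becomes a column in the result.
    # na.rm = TRUE (an assumption here) drops missing readings first.
    average            = mean(bp, na.rm = TRUE),
    standard_deviation = sd(bp, na.rm = TRUE)
  )

toy_summary
```

The result has one row per group and one column per named summary. Without na.rm = TRUE, any group containing a missing value would get NA for both statistics.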

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

library(dplyr)
library(NHANES)
data(NHANES)
# Complete the pipeline with group_by and summarize
NHANES %>%
  filter(Gender == "female") %>%