Practice Exercise. National Center for Health Statistics

To practice our dplyr skills, we will work with data from a survey conducted by the United States National Center for Health Statistics (NCHS). The center has conducted a series of health and nutrition surveys since the 1960s.

Starting in 1999, about 5,000 individuals of all ages have been interviewed every year, and they then complete the health examination component of the survey. Part of this dataset is available via the NHANES package, which can be loaded this way:

library(NHANES)
data(NHANES)

The NHANES data has many missing values. Remember that the main summarization functions in R, such as mean and sd, return NA if any entry of the input vector is NA. Here is an example:

library(dslabs)
data(na_example)
mean(na_example)
sd(na_example)

To ignore the NAs, we can use the na.rm argument:

mean(na_example, na.rm = TRUE)
sd(na_example, na.rm = TRUE)
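
The same behavior can be checked on a small vector where the answer is easy to verify by hand. This is a minimal sketch using only base R; the vector x and its values are made up for illustration and do not come from na_example or NHANES.

```r
# Toy vector with one missing entry (hypothetical values, not from NHANES)
x <- c(2, 4, NA, 8, 10)

# A single NA in the input propagates to the result
mean_with_na <- mean(x)

# Count how many entries are missing
n_missing <- sum(is.na(x))

# na.rm = TRUE drops the NAs before computing the summary
mean_no_na <- mean(x, na.rm = TRUE)
sd_no_na <- sd(x, na.rm = TRUE)

# Filtering the NAs out by hand gives the same mean
mean_manual <- mean(x[!is.na(x)])

print(mean_with_na)
print(n_missing)
print(mean_no_na)
```

Counting the NAs first with sum(is.na(x)) is a quick way to see how much data na.rm = TRUE will silently discard.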

Try running this code, then let us know you are ready to proceed with the analysis.

This exercise is part of the course

Data Science Visualization - Module 2
