
NHANES dataset construction

As downloaded from the NHANES website, the NHANES datasets are available only as separate .XPT files, the SAS transport format. Luckily for us, the haven package can read these files directly into R.

Let's combine the NHANES Demographics, Medical Conditions, and Body Measures datasets, which are available in their raw .XPT format through the variables DEMO_file, MCQ_file, and BMX_file. Join all 3 datasets on the SEQN variable. A good way to do this is with Reduce(), which applies a two-argument function successively across the elements of a list: the first two datasets are joined, and that result is then joined with the third.

The joining code, which is provided for you, does the following:

  • Creates a list of all 3 datasets (nhanes_demo, nhanes_medical, nhanes_bodymeasures).
  • Uses a custom function inside of Reduce() to inner join all 3 datasets with the "SEQN" variable.
  • Saves this as the nhanes_combined dataset.
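To see what the Reduce() step does, here is a minimal sketch on three made-up data frames. The names demo, medical, and body and all of their values are hypothetical stand-ins, not NHANES data, and base R's merge() is swapped in for inner_join() so the sketch runs without dplyr. With its default settings, merge() keeps only the keys present in both inputs, which is inner-join behavior.

```r
# Hypothetical toy data standing in for the three NHANES datasets
demo    <- data.frame(SEQN = c(1, 2, 3), age = c(34, 51, 29))
medical <- data.frame(SEQN = c(2, 3, 4), asthma = c("yes", "no", "no"))
body    <- data.frame(SEQN = c(1, 2, 3), bmi = c(22.5, 27.1, 24.8))

# Reduce() joins demo with medical first, then joins that result with body.
# merge() with default arguments keeps only SEQN values found in both inputs.
combined <- Reduce(
  function(df1, df2) merge(df1, df2, by = "SEQN"),
  list(demo, medical, body)
)

# Only SEQN 2 and 3 appear in all three data frames, so two rows remain
print(combined)
```

Because each step is an inner join, a respondent missing from any one dataset is dropped from the combined result.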

This exercise is part of the course

Experimental Design in R


Exercise instructions

  • Load the haven package.
  • Import the three data files with separate calls to read_xpt(), passing DEMO_file, MCQ_file, and BMX_file as the inputs, and save the results as nhanes_demo, nhanes_medical, and nhanes_bodymeasures, respectively.
  • Create nhanes_combined by merging the 3 datasets you just imported, using the provided code.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load haven
___

# Import the three datasets using read_xpt()
nhanes_demo <- read_xpt(DEMO_file)
___
___

# Merge the 3 datasets you just created to create nhanes_combined
nhanes_combined <- list(nhanes_demo, nhanes_medical, nhanes_bodymeasures) %>%
  Reduce(function(df1, df2) inner_join(df1, df2, by = "SEQN"), .)