NHANES dataset construction
The NHANES website distributes its datasets only as separate .XPT files, a format native to SAS. Luckily for us, the haven package can read them.
Let's combine the NHANES Demographics, Medical Conditions, and Body Measures datasets, which are available in their raw .XPT format through the variables DEMO_file, MCQ_file, and BMX_file. Join all 3 datasets on the SEQN variable. A good way to do this is with Reduce(), which applies a two-argument function successively to the elements of a list and so collapses them into a single result.
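To make the idea concrete, here is a minimal sketch of how Reduce() folds a sequence with a two-argument function. The numbers and the names total and running are made up for illustration and are not part of the exercise.

```r
# Reduce() applies a two-argument function from left to right:
# ((1 + 2) + 3) + 4
total <- Reduce(`+`, 1:4)

# accumulate = TRUE keeps every intermediate result as well
running <- Reduce(`+`, 1:4, accumulate = TRUE)

print(total)
print(running)
```

The same pattern works when the elements are data frames and the function is a join, which is how the exercise uses it.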
The joining code, which is provided for you, does the following:
- Creates a list of all 3 datasets (nhanes_demo, nhanes_medical, nhanes_bodymeasures).
- Uses a custom function inside of Reduce() to inner join all 3 datasets with the "SEQN" variable.
- Saves this as the nhanes_combined dataset.
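The steps above can be sketched on toy data. This sketch uses base R's merge() as a stand-in for dplyr's inner_join() so that it runs without extra packages. The three small data frames and their columns (age, asthma, bmi) are invented; only the SEQN key comes from the exercise.

```r
# Three invented tables that share the SEQN identifier
demo <- data.frame(SEQN = c(1, 2, 3, 4), age = c(34, 51, 27, 63))
medical <- data.frame(SEQN = c(2, 3, 4, 5), asthma = c("no", "yes", "no", "no"))
measures <- data.frame(SEQN = c(1, 3, 4), bmi = c(22.5, 30.1, 26.4))

# Step 1: put the tables in a list
tables <- list(demo, medical, measures)

# Step 2: merge pairwise. With its defaults, merge() keeps only rows
# whose SEQN appears in both inputs, which is inner-join behaviour
combined <- Reduce(function(df1, df2) merge(df1, df2, by = "SEQN"), tables)

# Step 3: the result holds one row per SEQN found in all three tables
print(combined)
```

Only the SEQN values present in every table survive, so a participant missing from any one dataset is dropped from the combined result.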
This exercise is part of the course Experimental Design in R.
Exercise instructions
- Load the haven package.
- Import the three data files with separate calls to read_xpt(), passing DEMO_file, MCQ_file, and BMX_file as the inputs and saving the results as nhanes_demo, nhanes_medical, and nhanes_bodymeasures, respectively.
- Create nhanes_combined by merging the 3 datasets you just imported, using the provided code.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load haven
___
# Import the three datasets using read_xpt()
nhanes_demo <- read_xpt(DEMO_file)
___
___
# Merge the 3 datasets you just created to create nhanes_combined
# (%>% and inner_join() come from dplyr, assumed to be loaded in the exercise session)
nhanes_combined <- list(nhanes_demo, nhanes_medical, nhanes_bodymeasures) %>%
  Reduce(function(df1, df2) inner_join(df1, df2, by = "SEQN"), .)
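Because an inner join keeps only participants present in every table, it can help to check how many SEQN values the tables share before merging. A minimal sketch on invented ID vectors (the names ids_demo, ids_medical, and ids_measures are hypothetical):

```r
# Invented identifier vectors standing in for the SEQN columns
ids_demo <- c(1, 2, 3, 4)
ids_medical <- c(2, 3, 4, 5)
ids_measures <- c(1, 3, 4)

# Reduce() with intersect() finds the IDs common to all three vectors
shared <- Reduce(intersect, list(ids_demo, ids_medical, ids_measures))

# If SEQN is unique within each table, the inner join has this many rows
print(length(shared))
```

With the real data, the same check could be run on nhanes_demo$SEQN, nhanes_medical$SEQN, and nhanes_bodymeasures$SEQN and compared against nrow(nhanes_combined).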