Creating shadow matrix data
Missing data can be tricky to think about, as missing values don't usually announce themselves, and instead hide amongst the weeds of the data.
One way to help expose missing values is to change the way we think about the data, by treating every single data value as either missing or not missing.
The as_shadow() function in R transforms a dataframe into a shadow matrix, a special data format where the values are either missing (NA), or Not Missing (!NA).
The column names of a shadow matrix are the same as those of the data, but with the suffix _NA added.
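As a minimal sketch of what a shadow matrix looks like, the following assumes the naniar package (which provides as_shadow()) is installed, and uses R's built-in airquality data as a stand-in rather than the oceanbuoys data from the exercise:

```r
library(naniar)

# airquality ships with base R; its Ozone and Solar.R columns contain NAs
aq_shadow <- as_shadow(airquality)

# Every column keeps its original name, with the _NA suffix added
names(aq_shadow)

# Each shadow column records only whether a value was missing (NA) or not (!NA)
table(aq_shadow$Ozone_NA)
```

The shadow matrix has the same number of rows and columns as the original data, so each cell lines up with the value it describes.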
To keep track of data values and compare them to their missingness state, use the bind_shadow() function. Data in this format, with the shadow matrix columns bound to the regular data, is called nabular data.
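A short sketch of nabular data, again assuming naniar is installed and using the built-in airquality data as an illustrative stand-in for the exercise's dataset. The only_miss argument restricts the shadow columns to variables that actually contain missing values:

```r
library(naniar)

# Bind the shadow matrix to the data: one _NA column per original column
aq_nabular <- bind_shadow(airquality)
ncol(airquality)
ncol(aq_nabular)

# Nabular data lets us compare values by the missingness of another variable,
# here the mean temperature when Ozone is missing versus not missing
tapply(aq_nabular$Temp, aq_nabular$Ozone_NA, mean)

# Keep shadow columns only for variables that have missing values
aq_nabular_miss <- bind_shadow(airquality, only_miss = TRUE)
names(aq_nabular_miss)
```

Because the shadow columns sit beside the original values, ordinary grouping and summarising tools can be used to explore how missingness in one variable relates to the values of another.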
This exercise is part of the course Dealing With Missing Data in R
Exercise instructions
Using the oceanbuoys dataset:
- Create shadow matrix data with as_shadow()
- Create nabular data by binding the shadow to the data with bind_shadow()
- Bind only the variables with missing values by using bind_shadow(only_miss = TRUE)
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create shadow matrix data with `as_shadow()`
___(___)
# Create nabular data by binding the shadow to the data with `bind_shadow()`
___(___)
# Bind only the variables with missing values by using `bind_shadow(only_miss = TRUE)`
___(___, ___ = TRUE)