Exercise

Creating shadow matrix data

Missing data can be tricky to think about, because missing values don't usually announce themselves; instead, they hide amongst the weeds of the data.

One way to help expose missing values is to change the way we think about the data: treat every single data value as being either missing or not missing.

The as_shadow() function from the naniar package transforms a data frame into a shadow matrix, a special data format in which every value is recorded as either missing (NA) or not missing (!NA).

The column names of a shadow matrix are the same as those of the data, with the suffix _NA added.
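As a rough illustration of the shadow matrix format, the sketch below applies as_shadow() to a small made-up data frame. The data frame `toy` and its columns `temp`, `wind`, and `site` are hypothetical and are not part of the exercise; the sketch assumes the naniar package is installed.

```r
library(naniar)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  temp = c(21.5, NA, 19.0, 22.1),
  wind = c(3.2, 4.1, NA, NA),
  site = c("a", "b", "c", "d")
)

# Each column becomes a factor recording whether each value is NA or !NA
toy_shadow <- as_shadow(toy)

# Column names are the original names with the _NA suffix
names(toy_shadow)

# Inspect the missingness state of every value
toy_shadow
```

Every column gets a shadow here, including `site`, which has no missing values at all.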

To keep track of data values and compare them with their missingness state, use the bind_shadow() function. Data in this format, with the shadow matrix columns bound to the regular data, is called nabular data.
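One possible use of nabular data is to summarise one variable according to whether another is missing. The sketch below assumes naniar and dplyr are installed; the data frame `toy` and its columns are made up for illustration and are not part of the exercise.

```r
library(naniar)
library(dplyr)

# Hypothetical toy data: temp has one missing value, wind is complete
toy <- data.frame(
  temp = c(21.5, NA, 19.0, 22.1),
  wind = c(3.2, 4.1, 5.0, 2.8)
)

# Nabular data: the original columns followed by their shadow columns
toy_nabular <- bind_shadow(toy)
names(toy_nabular)

# Compare wind between rows where temp is missing and rows where it is not
wind_by_temp_missing <- toy_nabular %>%
  group_by(temp_NA) %>%
  summarise(mean_wind = mean(wind))

wind_by_temp_missing
```

Because the shadow columns sit next to the values they describe, the missingness of `temp` can be used like any other grouping variable.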

Instructions

Using the oceanbuoys dataset:

  • Create shadow matrix data with as_shadow()
  • Create nabular data by binding the shadow to the data with bind_shadow()
  • Bind only the variables with missing values by using bind_shadow(only_miss = TRUE)
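To see what only_miss = TRUE changes without working through the exercise itself, the sketch below uses R's built-in airquality dataset instead of oceanbuoys. That choice is an assumption made for illustration: airquality has missing values in only two of its six columns, Ozone and Solar.R. The sketch assumes the naniar package is installed.

```r
library(naniar)

# Shadow columns for every variable
aq_all <- bind_shadow(airquality)

# Shadow columns only for the variables that contain missing values
aq_miss <- bind_shadow(airquality, only_miss = TRUE)

# Compare the number of columns in each result
ncol(aq_all)
ncol(aq_miss)

# Which shadow columns were added when only_miss = TRUE?
setdiff(names(aq_miss), names(airquality))
```

Restricting the shadow to variables with missing values keeps the nabular data narrower, which can help when most columns are complete.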