Dealing With Missing Data in R

Exercise

Other summaries of missingness

Some summaries of missingness are particularly useful for specific types of data. Two examples are miss_var_span() and miss_var_run().

  • miss_var_span() counts the missing values in a specified variable within each consecutive span of a fixed number of rows. This is useful for time series data, for example to look for weekly (7-day) patterns of missingness.

  • miss_var_run() calculates the lengths of "runs" or "streaks" of consecutive missing and complete values in a specified variable. This is useful for finding unusual patterns of missingness; for example, you might find a repeating pattern of 5 complete values followed by 5 missing values.

Both miss_var_span() and miss_var_run() work with the group_by() function from dplyr.
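As a rough illustration of the two functions described above, here is a minimal sketch using the pedestrian dataset that ships with naniar. The choice of the hourly_counts variable and the span of 168 rows (24 × 7) are assumptions made for this example; 168 rows correspond to one week only if the rows are consecutive hourly readings.

```r
library(naniar)
library(dplyr)

# Missingness of hourly_counts in consecutive blocks of 168 rows.
# Assumption: 168 = 24 * 7 is chosen for illustration only.
spans <- miss_var_span(pedestrian, var = hourly_counts, span_every = 168)
head(spans)

# Lengths of consecutive runs of missing and complete values.
runs <- miss_var_run(pedestrian, var = hourly_counts)
head(runs)

# The same run summary, computed separately within each month.
runs_by_month <- pedestrian %>%
  group_by(month) %>%
  miss_var_run(var = hourly_counts)
head(runs_by_month)
```

Each call returns a data frame, so the results can be filtered, arranged, or plotted like any other summary table.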

Instructions

100 XP

Using the pedestrian dataset from naniar:

  • Calculate summaries of missingness for a variable in the dataset using miss_var_span(), with a span of 4000.
  • Calculate summaries of missingness for a variable in the dataset using miss_var_run().
  • Combine these summaries with dplyr's group_by(), grouping by month.