Tabulating Missingness
The summaries of missingness we just calculated give us the number and percentage of missing observations for the cases and variables.
Another way to summarize missingness is to tabulate how many variables, or how many cases, have 0, 1, 2, 3, or more missing values.
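To make the idea concrete, here is a minimal base R sketch of such a tabulation on the built-in airquality dataset. It does not use the naniar functions from this exercise, and the names n_miss_per_var, n_miss_per_case, var_table and case_table are illustrative only.

```r
# Count the missing values in each variable (column) and each case (row)
n_miss_per_var  <- colSums(is.na(airquality))
n_miss_per_case <- rowSums(is.na(airquality))

# Tabulate: how many variables / cases have 0, 1, 2, ... missing values?
var_table  <- table(n_miss_per_var)
case_table <- table(n_miss_per_case)

print(var_table)
print(case_table)
```

Each table maps a number of missing values to the count of variables (or cases) that have exactly that many.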
In this exercise we are going to tabulate the number of missings in each case and variable using miss_var_table() and miss_case_table(), and also combine these summaries with the group_by() function from dplyr to explore the summaries over a grouping variable in the dataset.
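What tabulating "over a grouping variable" means can be sketched in base R, assuming we split airquality by its Month column and tabulate per-case missingness within each piece. This illustrates the concept only and is not how group_by() works internally. The names by_month and case_tables_by_month are illustrative.

```r
# Split the data into one data frame per Month
by_month <- split(airquality, airquality$Month)

# Within each month, count missings per case and tabulate those counts
case_tables_by_month <- lapply(by_month, function(d) table(rowSums(is.na(d))))

print(case_tables_by_month)
```

The result is a list with one small table per month, so missingness patterns can be compared across groups.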
This exercise is part of the course Dealing With Missing Data in R.
Exercise instructions
For the airquality dataset:
- Tabulate missingness for each variable using miss_var_table().
- Tabulate missingness for every case using miss_case_table().
- Combine the previous tabulations with dplyr's group_by() function to create tabulations for each variable and case, by each Month.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
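# Load naniar (miss_var_table(), miss_case_table()) and dplyr (group_by(), %>%)
library(naniar)
library(dplyr)

# Tabulate missingness in each variable and case of the `airquality` dataset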
___(airquality)
___(___)
# Tabulate the missingness in each variable, grouped by Month, in the `airquality` dataset
airquality %>% group_by(___) %>% miss_var_table()
# Tabulate the missingness in each case, grouped by Month, in the `airquality` dataset
airquality %>% ___ %>% miss_case_table()