Performing grouped summaries of missingness
Now that you can create nabular data, let's use it to explore the data. Let's calculate summary statistics based on the missingness of another variable.
To do this we are going to use the following steps:
First, bind_shadow() turns the data into nabular data.
Next, perform some summaries on the data using group_by() and summarize() to calculate the mean and standard deviation, using the mean() and sd() functions.
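As a rough illustration of these two steps, here is a minimal sketch on a different dataset, so the exercise itself is left for you to complete. It assumes the naniar and dplyr packages are installed and uses the built-in airquality data, where Ozone has missing values and Temp does not. The result name temp_by_ozone_missingness and the column names temp_mean and temp_sd are made up for this example.

```r
library(dplyr)   # group_by(), summarize(), %>%
library(naniar)  # bind_shadow()

# airquality ships with base R; Ozone has missing values, Temp does not.
temp_by_ozone_missingness <- airquality %>%
  bind_shadow() %>%        # adds shadow columns such as Ozone_NA
  group_by(Ozone_NA) %>%   # one group per missingness state of Ozone
  summarize(temp_mean = mean(Temp),  # mean of Temp within each group
            temp_sd = sd(Temp))      # standard deviation of Temp within each group

temp_by_ozone_missingness
```

The result has one row per missingness state of Ozone, so you can compare the temperature summaries when Ozone is observed against when it is missing. If the summarized variable itself had missing values, mean() and sd() would return NA unless you passed na.rm = TRUE.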
This exercise is part of the course
Dealing With Missing Data in R
Exercise instructions
For the oceanbuoys dataset: bind_shadow(), then group_by() for the missingness of humidity (humidity_NA) and calculate the means and standard deviations for wind east west (wind_ew) using summarize() from dplyr.
Repeat this, but calculating summaries for wind north south (wind_ns).
Hands-on interactive exercise
Try this exercise by completing this sample code.
# `bind_shadow()` and `group_by()` humidity missingness (`humidity_NA`)
oceanbuoys %>%
  ___() %>%
  group_by(___) %>%
  summarize(wind_ew_mean = mean(___), # calculate mean of wind_ew
            wind_ew_sd = ___(___))    # calculate standard deviation of wind_ew
# Repeat this, but calculating summaries for wind north south (`wind_ns`).
___ %>%
  ___ %>%
  group_by(___) %>%
  summarize(___ = ___(___),
            ___ = ___(___))