Performing grouped summaries of missingness
Now that you can create nabular data, let's use it to explore the data. Let's calculate summary statistics based on the missingness of another variable.
To do this we are going to use the following steps:
First, bind_shadow() turns the data into nabular data.
Next, perform some summaries on the data using group_by() and summarize() to calculate the mean and standard deviation, using the mean() and sd() functions.
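As a minimal sketch of these steps, the example below applies them to R's built-in airquality data rather than the exercise's oceanbuoys data, assuming the naniar and dplyr packages are installed. It groups by the missingness of Ozone (the Ozone_NA shadow column that bind_shadow() adds) and summarises Temp. The output column names temp_mean and temp_sd are illustrative choices, not part of the course material.

```r
library(naniar)  # provides bind_shadow()
library(dplyr)   # provides %>%, group_by(), summarize()

# airquality ships with base R; Ozone contains missing values, Temp does not
temp_by_ozone_missingness <- airquality %>%
  bind_shadow() %>%                 # add one shadow column per variable, e.g. Ozone_NA
  group_by(Ozone_NA) %>%            # groups: "!NA" (observed) and "NA" (missing)
  summarize(temp_mean = mean(Temp), # mean of Temp within each group
            temp_sd = sd(Temp))     # standard deviation of Temp within each group

temp_by_ozone_missingness
```

The result has one row per level of Ozone_NA, so the two groups can be compared side by side. If the summarised variable itself contained missing values, mean() and sd() would return NA unless na.rm = TRUE were passed.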
This exercise is part of the course
Dealing With Missing Data in R
Exercise instructions
For the oceanbuoys dataset:
Use bind_shadow(), then group_by() the missingness of humidity (humidity_NA), and calculate the mean and standard deviation of east-west wind (wind_ew) using summarize() from dplyr.
Repeat this, but calculate the summaries for north-south wind (wind_ns).
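Before summarising, it can help to look at the grouping variable itself. The sketch below, assuming the naniar and dplyr packages are installed, counts the rows in each level of humidity_NA. The count() step is not part of the exercise and is used here only for inspection.

```r
library(naniar)  # provides bind_shadow() and the oceanbuoys data
library(dplyr)   # provides %>% and count()

humidity_counts <- oceanbuoys %>%
  bind_shadow() %>%
  count(humidity_NA)  # rows where humidity is observed ("!NA") vs missing ("NA")

humidity_counts
```

Each row of this table corresponds to one group that group_by(humidity_NA) would produce in the summaries.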
Interactive exercise
Complete the sample code to successfully finish this exercise.
# `bind_shadow()` and `group_by()` humidity missingness (`humidity_NA`)
oceanbuoys %>%
  ___() %>%
  group_by(___) %>%
  summarize(wind_ew_mean = mean(___), # calculate mean of wind_ew
            wind_ew_sd = ___(___)) # calculate standard deviation of wind_ew

# Repeat this, but calculating summaries for wind north south (`wind_ns`).
___ %>%
  ___ %>%
  group_by(___) %>%
  summarize(___ = ___(___),
            ___ = ___(___))