LoslegenKostenlos loslegen

Performing grouped summaries of missingness

Now that you can create nabular data, let's use it to explore the data. Let's calculate summary statistics based on the missingness of another variable.

To do this we are going to use the following steps:

  • First, bind_shadow() turns the data into nabular data.

  • Next, perform some summaries on the data using group_by() and summarize() to calculate the mean and standard deviation, using the mean() and sd() functions.

Diese Übung ist Teil des Kurses

Dealing With Missing Data in R

Kurs anzeigen

Anleitung zur Übung

  • For the oceanbuoys dataset:

  • bind_shadow(), then group_by() for the missingness of humidity (humidity_NA) and calculate the means and standard deviations for wind east west (wind_ew) using summarize() from dplyr.

  • Repeat this, but calculating summaries for wind north south (wind_ns).

Interaktive Übung

Vervollständige den Beispielcode, um diese Übung erfolgreich abzuschließen.

# `bind_shadow()` and `group_by()` humidity missingness (`humidity_NA`)
oceanbuoys %>%
  ___() %>%
  group_by(___) %>% 
  summarize(wind_ew_mean = mean(___), # calculate mean of wind_ew
            wind_ew_sd = ___)) # calculate standard deviation of wind_ew
  
# Repeat this, but calculating summaries for wind north south (`wind_ns`).
___ %>%
  ___ %>%
  group_by(___) %>%
  summarize(___ = ___(___),
            ___ = ___(___))
Code bearbeiten und ausführen