
Performing grouped summaries of missingness

Now that you can create nabular data, let's use it to explore the data by calculating summary statistics of one variable, grouped by whether another variable is missing.

To do this, we'll follow these steps:

  • First, bind_shadow() turns the data into nabular data.

  • Next, group_by() a shadow variable and use summarize() with mean() and sd() to calculate the mean and standard deviation within each missingness group.
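The two steps above can be sketched on a different dataset, so the exercise itself stays open. This is a minimal illustration, assuming the naniar and dplyr packages are installed; it uses the built-in airquality data, where Ozone has missing values and Temp has none. The column names temp_mean and temp_sd, and the object name airquality_summary, are chosen here for illustration only.

```r
library(dplyr)
library(naniar)

# airquality ships with base R; Ozone contains missing values, Temp does not
airquality_summary <- airquality %>%
  bind_shadow() %>%        # adds shadow columns such as Ozone_NA
  group_by(Ozone_NA) %>%   # one group per missingness level: "!NA" and "NA"
  summarize(temp_mean = mean(Temp),  # mean of Temp within each group
            temp_sd = sd(Temp))      # standard deviation of Temp within each group

airquality_summary
```

The result has one row per missingness level of Ozone, which makes it easy to see whether Temp differs when Ozone is missing. If the summarized variable itself contained missing values, mean() and sd() would need na.rm = TRUE.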

This exercise is part of the course Dealing With Missing Data in R.

Exercise instructions

  • For the oceanbuoys dataset:

  • Apply bind_shadow(), then group_by() the missingness of humidity (humidity_NA), and calculate the mean and standard deviation of east-west wind (wind_ew) using summarize() from dplyr.

  • Repeat this, but calculate the summaries for north-south wind (wind_ns).

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# `bind_shadow()` and `group_by()` humidity missingness (`humidity_NA`)
oceanbuoys %>%
  ___() %>%
  group_by(___) %>% 
  summarize(wind_ew_mean = mean(___), # calculate mean of wind_ew
            wind_ew_sd = ___(___)) # calculate standard deviation of wind_ew
  
# Repeat this, but calculating summaries for wind north south (`wind_ns`).
___ %>%
  ___ %>%
  group_by(___) %>%
  summarize(___ = ___(___),
            ___ = ___(___))