Evaluating bad imputations
In order to evaluate imputations, it helps to know what something bad looks like. To explore this, let's look at a typically bad imputation method: imputing using the mean value.
In this exercise, we explore how mean imputation behaves by drawing box plots of the imputed values in the oceanbuoys dataset.
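To make the problem concrete before touching the real data, here is a minimal base R sketch of what mean imputation does. The vector x is a hypothetical toy example, not taken from oceanbuoys: every missing value is replaced by the same number, so the imputed values have no spread and the overall standard deviation shrinks.

```r
# Hypothetical toy measurements with two missing values (not from oceanbuoys)
x <- c(21.5, NA, 23.1, 22.8, NA, 24.0, 22.2)

# Remember which positions were missing before imputing
was_missing <- is.na(x)

# Mean imputation: fill every gap with the mean of the observed values
x_imp <- x
x_imp[was_missing] <- mean(x, na.rm = TRUE)

# All imputed values are identical, so they carry no variability
unique(x_imp[was_missing])

# The spread after imputation is smaller than the spread of the observed data
sd(x, na.rm = TRUE)
sd(x_imp)
```

In a box plot split by missingness, this is why the imputed group collapses to a single flat line at the mean.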
This exercise is part of the course Dealing With Missing Data in R.
Exercise instructions
For the oceanbuoys dataset:
- Impute the mean value with impute_mean_all(), and track these imputations with add_label_shadow().
- Explore the imputed values in humidity (humidity) using a box plot.
- Explore the imputed values in air temperature (air_temp_c) using a box plot.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Impute the mean value and track the imputations
ocean_imp_mean <- bind_shadow(___) %>%
  ___() %>%
  ___()

# Explore the mean values in humidity in the imputed dataset
ggplot(___,
       aes(x = ___, y = ___)) +
  geom_boxplot()

# Explore the values in air temperature in the imputed dataset
ggplot(___,
       aes(x = ___, y = ___)) +
  geom_boxplot()
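One possible way to complete the template is sketched below. It is not the official solution. It assumes the naniar, dplyr, and ggplot2 packages are installed, and that the installed naniar version still provides impute_mean_all(). Putting the shadow columns humidity_NA and air_temp_c_NA (created by bind_shadow()) on the x-axis is an assumption about what the blanks expect. The plot objects p_humidity and p_air_temp are names introduced here for illustration.

```r
library(naniar)   # oceanbuoys, bind_shadow(), impute_mean_all(), add_label_shadow()
library(dplyr)    # provides the %>% pipe
library(ggplot2)

# Bind shadow columns (e.g. humidity_NA) so missingness is recorded,
# impute every column with its mean, then add the any_missing label
ocean_imp_mean <- bind_shadow(oceanbuoys) %>%
  impute_mean_all() %>%
  add_label_shadow()

# Humidity: compare originally observed ("!NA") with imputed ("NA") values
p_humidity <- ggplot(ocean_imp_mean,
                     aes(x = humidity_NA, y = humidity)) +
  geom_boxplot()

# Air temperature: same comparison using its shadow column
p_air_temp <- ggplot(ocean_imp_mean,
                     aes(x = air_temp_c_NA, y = air_temp_c)) +
  geom_boxplot()

# Draw the plots
print(p_humidity)
print(p_air_temp)
```

In each plot, the box for the imputed group should be squashed to a single value (the column mean), which is the visual signature of a poor imputation.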