Evaluating bad imputations
To evaluate imputations, it helps to know what a bad one looks like. To explore this, let's look at a typically bad imputation method: replacing every missing value with the mean of the observed values.
In this exercise, we will use box plots to explore how mean imputation behaves on the oceanbuoys dataset.
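As a rough illustration of why mean imputation tends to be a poor choice, here is a minimal base R sketch. The vector x is made-up toy data, not taken from oceanbuoys, and the variable names are illustrative only.

```r
# Toy vector with two missing values (hypothetical data, not from oceanbuoys)
x <- c(2, 4, NA, 8, NA, 10)

# Mean of the observed values only
x_mean <- mean(x, na.rm = TRUE)

# Replace every missing value with that single number
x_imp <- ifelse(is.na(x), x_mean, x)

# Variance of the observed values, and variance after imputation
var_observed <- var(x, na.rm = TRUE)
var_imputed <- var(x_imp)

print(c(mean = x_mean, var_observed = var_observed, var_imputed = var_imputed))
```

In this toy case the mean is 6, and the variance drops from about 13.3 to 8. Every imputed value is identical, so the imputed data has less spread than the observed data.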
This exercise is part of the course
Dealing With Missing Data in R
Exercise instructions
For the oceanbuoys dataset:
- Impute the mean value with impute_mean_all(), and track these imputations with add_label_shadow().
- Explore the imputed values in humidity (humidity) using a box plot.
- Explore the imputed values in air temperature (air_temp_c) using a box plot.
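To see what the tracking step adds, the following sketch runs the same pipeline on a small made-up data frame. The data frame toy and its values are hypothetical, and the sketch assumes an installed version of naniar that provides bind_shadow(), impute_mean_all(), and add_label_shadow().

```r
library(naniar)
library(dplyr)

# Hypothetical toy data with one missing value per column
toy <- data.frame(
  humidity   = c(80, NA, 90),
  air_temp_c = c(25, 27, NA)
)

toy_imp <- bind_shadow(toy) %>%  # add one "_NA" shadow column per variable
  impute_mean_all() %>%          # replace missing values with column means
  add_label_shadow()             # add an any_missing label per row

# Inspect the tracking columns alongside the imputed values
names(toy_imp)
toy_imp$humidity
toy_imp$humidity_NA
```

The shadow columns record which values were originally missing, so the imputed values can still be identified after the NA entries have been filled in.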
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Impute the mean value and track the imputations
ocean_imp_mean <- bind_shadow(___) %>%
  ___() %>%
  ___()

# Explore the mean values in humidity in the imputed dataset
ggplot(___,
       aes(x = ___, y = ___)) +
  geom_boxplot()

# Explore the values in air temperature in the imputed dataset
ggplot(___,
       aes(x = ___, y = ___)) +
  geom_boxplot()
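One way the blanks above might be filled in is sketched below. It is not necessarily the course's reference solution. It assumes that naniar, dplyr, and ggplot2 are installed, that oceanbuoys is the dataset shipped with naniar, and that bind_shadow() names its shadow columns humidity_NA and air_temp_c_NA. The plot object names p_humidity and p_air_temp are illustrative.

```r
library(naniar)
library(dplyr)
library(ggplot2)

# Impute the mean value and track the imputations
ocean_imp_mean <- bind_shadow(oceanbuoys) %>%
  impute_mean_all() %>%
  add_label_shadow()

# Humidity, split by whether the value was originally missing
p_humidity <- ggplot(ocean_imp_mean,
                     aes(x = humidity_NA, y = humidity)) +
  geom_boxplot()

# Air temperature, split by whether the value was originally missing
p_air_temp <- ggplot(ocean_imp_mean,
                     aes(x = air_temp_c_NA, y = air_temp_c)) +
  geom_boxplot()

print(p_humidity)
print(p_air_temp)
```

Because every imputed value in a column equals the same mean, the box for the originally missing group should collapse to a single flat line, while the observed group shows its usual spread.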