Evaluating imputations (many models & variables)

When you build up an imputation model, it's a good idea to compare it to another method.

In this exercise, you will add a final imputation model that includes an extra piece of information, year, which helps explain some of the variation in the data. You will then compare the imputed values across models, as in the previous lesson.

This exercise is part of the course

Dealing With Missing Data in R

Exercise instructions

Using the oceanbuoys dataset:

  • Impute data using impute_lm(), adding year to the model.
  • Bind the imputation methods together, placing ocean_imp_mean into mean, ocean_imp_lm_wind into lm_wind, and ocean_imp_lm_wind_year into lm_wind_year.
  • Look at the values of air_temp_c (on the x-axis) and humidity (on the y-axis), coloring by any missings, and faceting by imputation model.
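
The same three steps can be sketched on a different dataset. This is a minimal illustration, assuming the naniar, simputation, dplyr, and ggplot2 packages are installed; it uses R's built-in airquality data rather than oceanbuoys, and the object names (aq_imp_lm_temp, aq_imp_lm_temp_wind, aq_bound) are hypothetical, chosen only for this example.

```r
library(dplyr)
library(naniar)
library(simputation)
library(ggplot2)

# Model 1: impute Ozone from Temp only, keeping the shadow columns
# so we can still tell which values were originally missing
aq_imp_lm_temp <- bind_shadow(airquality) %>%
  impute_lm(Ozone ~ Temp) %>%
  add_label_shadow()

# Model 2: the same model with one extra predictor (Wind)
aq_imp_lm_temp_wind <- bind_shadow(airquality) %>%
  impute_lm(Ozone ~ Temp + Wind) %>%
  add_label_shadow()

# Stack both results; .id records which model each row came from
aq_bound <- bind_rows(lm_temp = aq_imp_lm_temp,
                      lm_temp_wind = aq_imp_lm_temp_wind,
                      .id = "imp_model")

# One panel per imputation model, colored by original missingness
ggplot(aq_bound, aes(x = Temp, y = Ozone, color = any_missing)) +
  geom_point() +
  facet_wrap(~imp_model)
```

Because both models are stacked into one data frame, a single facet_wrap() call is enough to place the imputed values side by side.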

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Build a model adding year as a predictor
ocean_imp_lm_wind_year <- bind_shadow(___) %>%
  impute_lm(air_temp_c ~ wind_ew + wind_ns + ___) %>%
  impute_lm(humidity ~ wind_ew + wind_ns + ___) %>%
  add_label_shadow()

# Bind the mean, lm_wind, and lm_wind_year models together
bound_models <- bind_rows(mean = ocean_imp_mean,
                          lm_wind = ocean_imp_lm_wind,
                          lm_wind_year = ___,
                          .id = "imp_model")

# Explore air_temp_c and humidity, coloring by any missings, and faceting by imputation model
ggplot(___, aes(x = ___, y = ___, color = any_missing)) +
  geom_point() +
  facet_wrap(~___)