Combining and comparing many imputation models

To evaluate different imputation methods, you need to put their results into a single data frame. In this exercise, you will compare three approaches to handling missing data in the oceanbuoys dataset.

  • The first method uses only the complete cases and is loaded as ocean_cc.
  • The second method imputes values with a linear model that uses the wind variables as predictors and is loaded as ocean_imp_lm_wind.

You will create the third dataset, ocean_imp_lm_all, by using a linear model to impute the variables sea_temp_c, air_temp_c, and humidity from the predictors wind_ew, wind_ns, year, latitude, and longitude.

You will then bind all of the datasets together (ocean_cc, ocean_imp_lm_wind, and ocean_imp_lm_all), calling it bound_models.
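The binding step relies on the `.id` argument of `bind_rows()`, which records the name given to each input in a new column. The following is a minimal sketch of that behavior using two hypothetical toy data frames (`df_a`, `df_b`) and a hypothetical column name `source`; none of these names come from the exercise.

```r
library(dplyr)

# Hypothetical toy data frames standing in for the imputed datasets
df_a <- tibble(x = c(1, 2))
df_b <- tibble(x = 3)

# Each row is tagged with the name of the data frame it came from
toy_bound <- bind_rows(a = df_a, b = df_b, .id = "source")
toy_bound
```

Because every row keeps a label for its origin, the combined data can later be grouped or faceted by that column to compare the methods side by side.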

This exercise is part of the course

Dealing With Missing Data in R

Instructions

  • Create an imputed dataset named ocean_imp_lm_all by using a linear model to impute the variables sea_temp_c, air_temp_c, and humidity from the predictors wind_ew, wind_ns, year, latitude, and longitude.
  • Bind all of the datasets together into the same object, calling it bound_models.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create an imputed dataset using a linear model
ocean_imp_lm_all <- bind_shadow(oceanbuoys) %>%
  add_label_shadow() %>%
  impute_lm(sea_temp_c ~ wind_ew + wind_ns + ___ + ___ + ___) %>%
  impute_lm(air_temp_c ~ wind_ew + wind_ns + ___ + ___ + ___) %>%
  impute_lm(humidity ~ wind_ew + wind_ns + ___ + ___ + ___)

# Bind the datasets
bound_models <- bind_rows(cc = ___,
                          imp_lm_wind = ___,
                          imp_lm_all = ___,
                          .id = "imp_model")

# Look at the models
bound_models
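One way the completed template could look is sketched below. The blanks are filled with the predictors and dataset names listed in the instructions. The exercise only says that ocean_cc and ocean_imp_lm_wind are preloaded, so the way they are rebuilt here (including which variables the wind-only model imputes) is an assumption made so that the sketch runs on its own. It also assumes that `oceanbuoys`, `bind_shadow()`, and `add_label_shadow()` come from the naniar package and `impute_lm()` from the simputation package.

```r
library(dplyr)       # bind_rows() and the pipe
library(naniar)      # oceanbuoys, bind_shadow(), add_label_shadow()
library(simputation) # impute_lm()

# Assumed reconstruction of the preloaded complete-case dataset
ocean_cc <- oceanbuoys %>%
  na.omit() %>%
  bind_shadow() %>%
  add_label_shadow()

# Assumed reconstruction of the preloaded wind-only imputation
ocean_imp_lm_wind <- bind_shadow(oceanbuoys) %>%
  add_label_shadow() %>%
  impute_lm(air_temp_c ~ wind_ew + wind_ns) %>%
  impute_lm(humidity ~ wind_ew + wind_ns)

# Create an imputed dataset using a linear model with all predictors
ocean_imp_lm_all <- bind_shadow(oceanbuoys) %>%
  add_label_shadow() %>%
  impute_lm(sea_temp_c ~ wind_ew + wind_ns + year + latitude + longitude) %>%
  impute_lm(air_temp_c ~ wind_ew + wind_ns + year + latitude + longitude) %>%
  impute_lm(humidity ~ wind_ew + wind_ns + year + latitude + longitude)

# Bind the datasets, recording the method name in imp_model
bound_models <- bind_rows(cc = ocean_cc,
                          imp_lm_wind = ocean_imp_lm_wind,
                          imp_lm_all = ocean_imp_lm_all,
                          .id = "imp_model")

# Look at the models
bound_models
```

Calling `add_label_shadow()` before the imputation steps means the missingness label reflects the original data, so imputed rows stay identifiable after the values are filled in.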