Evaluating imputation models

In the previous exercise, you created two imputation models named imp_mean and imp_lm. Very cool!

Two more models have been created for you, namely imp_median and imp_cc. The former replaces the missing values of every variable with that variable's median, whereas the latter retains only complete cases (cc) by removing any instance that has a missing value. These four models have been put together in the imp_models_long tibble, which is available in your workspace.
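The two strategies can be sketched in base R on a toy data frame. This is only an illustration under stated assumptions: the data, the column names x and y, and the object names ending in _sketch are hypothetical, and the code is not necessarily how imp_median and imp_cc were built for this exercise.

```r
# Hypothetical toy data with missing values (not the exercise data).
df <- data.frame(
  x = c(1, 2, NA, 4, 5),
  y = c(10, NA, 30, 40, 50)
)

# Median imputation: fill each column's NAs with that column's observed median.
imp_median_sketch <- as.data.frame(lapply(df, function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
}))

# Complete-case analysis: keep only the rows without any missing value.
imp_cc_sketch <- df[complete.cases(df), ]

imp_median_sketch
imp_cc_sketch
```

Note the trade-off: median imputation keeps every row but fills the gaps with a single value per column, while the complete-case approach keeps only observed values but discards rows.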

In this exercise, you will examine the output of these imputation models and judge for yourself which one performed best. From an internal evaluation perspective, this means choosing the imputation model that best preserves the original distribution of the imputed variable (as seen in the complete cases, imp_cc) and its relationships to other variables.
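One way to picture this kind of internal evaluation is to compare a few summary statistics per model against the complete cases. The sketch below is a hedged illustration in base R: the toy data, the columns imp_model, x and y, and every object name are assumptions, and the real imp_models_long tibble may be structured differently.

```r
# Hypothetical raw data: x has one missing value, y is fully observed.
raw <- data.frame(x = c(1, 2, NA, 4, 5, 6), y = c(2, 4, 6, 8, 10, 12))

# Complete cases serve as the reference for the original distribution.
cc <- raw[complete.cases(raw), ]

# Mean imputation of x, used here as the model under evaluation.
mean_imp <- raw
mean_imp$x[is.na(mean_imp$x)] <- mean(raw$x, na.rm = TRUE)

# Stack both results in long format, one label per model.
models_long <- rbind(
  data.frame(imp_model = "cc", cc),
  data.frame(imp_model = "mean", mean_imp)
)

# Per model: mean and spread of x, and its correlation with y.
summary_by_model <- do.call(rbind, lapply(
  split(models_long, models_long$imp_model),
  function(d) data.frame(
    imp_model = d$imp_model[1],
    mean_x = mean(d$x),
    sd_x = sd(d$x),
    cor_xy = cor(d$x, d$y)
  )
))

summary_by_model
```

In this toy case the mean of x is unchanged, but its standard deviation and its correlation with y both drop after mean imputation, which is the kind of distortion an internal evaluation looks for.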

Instructions 1/4
  • Peek at the first few rows of the imp_models_long tibble available in your workspace.
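As a minimal sketch of what peeking looks like, head() returns the first six rows of a data frame or tibble by default. Because imp_models_long exists only in the exercise workspace, the example uses a hypothetical stand-in named imp_models_long_demo with made-up columns.

```r
# Hypothetical stand-in for the workspace tibble; names and values are made up.
imp_models_long_demo <- data.frame(
  imp_model = rep(c("mean", "lm", "median", "cc"), each = 3),
  value = c(3.1, 2.8, 3.0, 3.3, 2.7, 3.2, 3.0, 3.0, 2.9, 3.4, 2.6, 3.1)
)

# Show the first six rows (the default for head()).
first_rows <- head(imp_models_long_demo)
first_rows

# Pass n to see a different number of rows.
head(imp_models_long_demo, n = 3)
```

In the exercise itself, the same call would be applied directly to imp_models_long.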