Exercise

Model-based imputation with multiple variable types

Great job on writing the function that implements logistic regression imputation by drawing from the conditional distribution. That's some pretty advanced statistics you have coded! In this exercise, you will combine what you have learned so far about model-based imputation to impute different types of variables in the tao data.

Your task is to iterate over variables, just as you did in the previous chapter, and impute two variables:

  • is_hot, a new binary variable that was created out of air_temp, which is 1 if air_temp is at or above 26 degrees and is 0 otherwise;
  • humidity, a continuous variable you are already familiar with.
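To make the definition of is_hot concrete, here is a minimal sketch of how such a flag could be derived from air_temp. The data frame `toy` and its values are made up for illustration and are not the tao data; in the exercise, is_hot has already been created for you.

```r
# Toy stand-in for the tao data; the values are invented for illustration.
toy <- data.frame(air_temp = c(24.5, 26.0, 27.3, NA, 25.9))

# 1 if air_temp is at or above 26 degrees, 0 otherwise.
# A missing air_temp yields a missing is_hot.
toy$is_hot <- as.integer(toy$air_temp >= 26)

print(toy)
```

Note that a row with a missing air_temp ends up with a missing is_hot as well, which is why is_hot needs imputing in the first place.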

You will need the linear regression imputation function you have used before, as well as your own function for logistic regression. Let's get to it!

Instructions

  • Set is_hot to NA in places where it was originally missing.
  • Impute is_hot with logistic regression, using sea_surface_temp as the only predictor; use your function impute_logreg().
  • Set humidity to NA in places where it was originally missing.
  • Impute humidity with linear regression, using sea_surface_temp and air_temp as predictors.
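The four steps above can be sketched in base R as follows. This is only an illustration under several assumptions, not the exercise solution: the data are simulated rather than the real tao data, `impute_logreg_sketch()` is a hypothetical stand-in for your own impute_logreg(), base R's `lm()` and `predict()` replace the course's linear regression imputation function, the initial fill (most common class and mean) is a simple placeholder, and the predictors are kept complete for simplicity.

```r
set.seed(42)

# Simulated stand-in for the tao data (hypothetical values).
n <- 200
sst <- rnorm(n, mean = 27, sd = 1.5)
air <- sst - 1 + rnorm(n, sd = 0.5)
toy <- data.frame(
  sea_surface_temp = sst,
  air_temp = air,
  humidity = 80 + 1.5 * (air - 26) + rnorm(n, sd = 2),
  is_hot = as.integer(air >= 26)
)
toy$is_hot[sample(n, 20)] <- NA
toy$humidity[sample(n, 20)] <- NA

# Remember where values were originally missing.
missing_is_hot <- is.na(toy$is_hot)
missing_humidity <- is.na(toy$humidity)

# Hypothetical stand-in for impute_logreg(): fit a logistic regression on the
# observed rows, then draw 0/1 values from the predicted probabilities.
impute_logreg_sketch <- function(df, formula) {
  target <- all.vars(formula)[1]
  to_fill <- is.na(df[[target]])
  model <- glm(formula, data = df, family = binomial)
  probs <- predict(model, newdata = df[to_fill, , drop = FALSE],
                   type = "response")
  df[[target]][to_fill] <- rbinom(sum(to_fill), size = 1, prob = probs)
  df
}

# Simple initial fill (an assumption): most common class and mean.
toy$is_hot[missing_is_hot] <- as.integer(mean(toy$is_hot, na.rm = TRUE) >= 0.5)
toy$humidity[missing_humidity] <- mean(toy$humidity, na.rm = TRUE)

for (i in 1:5) {
  # Reset is_hot where it was originally missing, then impute it.
  toy$is_hot[missing_is_hot] <- NA
  toy <- impute_logreg_sketch(toy, is_hot ~ sea_surface_temp)

  # Reset humidity where it was originally missing, then impute it.
  toy$humidity[missing_humidity] <- NA
  hum_model <- lm(humidity ~ sea_surface_temp + air_temp, data = toy)
  toy$humidity[missing_humidity] <-
    predict(hum_model, newdata = toy[missing_humidity, , drop = FALSE])
}
```

Resetting each variable to NA before imputing it ensures that the model is fitted only on values that were actually observed, never on values filled in during an earlier pass.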