Assessing imputation quality with margin plot
In the last exercise, you have mean-imputed air_temp and added an indicator variable to denote which values were imputed, called air_temp_imp. Time to see how well this works.
Upon examining the tao data, you might have noticed that it also contains a variable called sea_surface_temp, which could reasonably be expected to be positively correlated with air_temp. If that's the case, you would expect these two temperatures to be both high or both low at the same time. Imputing mean air temperature when the sea temperature is high or low would break this relation.
To find out, in this exercise you will select the two temperature variables and the indicator variable and use them to draw a margin plot. Let's assess the mean imputation!
Cet exercice fait partie du cours
Handling Missing Data with Imputations in R
Exercice interactif pratique
Essayez cet exercice en complétant cet exemple de code.
# Draw a margin plot of air_temp vs sea_surface_temp
tao_imp %>%
select(___, ___, ___) %>%
___(delimiter = ___)