Assessing imputation quality with a margin plot

In the last exercise, you mean-imputed air_temp and added an indicator variable, air_temp_imp, to denote which values were imputed. Time to see how well this works.

Upon examining the tao data, you might have noticed that it also contains a variable called sea_surface_temp, which could reasonably be expected to be positively correlated with air_temp. If that's the case, you would expect the two temperatures to be both high or both low at the same time. Mean imputation assigns every missing air_temp the same value regardless of sea_surface_temp, so imputing the mean air temperature when the sea temperature is unusually high or low would break this relationship.
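The effect described above can be illustrated with a minimal base R sketch. It uses simulated data, not the tao dataset: the sample size, the temperature distributions, and the choice to remove air_temp for the 40 warmest sea-temperature observations are all assumptions made for illustration, and air_temp_obs and air_temp_filled are hypothetical names.

```r
# Simulated stand-in for two positively correlated temperatures (NOT the tao data).
set.seed(42)
n <- 200
sea_surface_temp <- rnorm(n, mean = 25, sd = 2)
air_temp <- sea_surface_temp - 1 + rnorm(n, sd = 0.5)

# Assumed missingness pattern: air_temp is missing for the 40 warmest sea temperatures.
missing_idx <- order(sea_surface_temp, decreasing = TRUE)[1:40]
air_temp_obs <- air_temp
air_temp_obs[missing_idx] <- NA

# Mean-impute and keep a logical indicator of which values were imputed.
air_temp_imp <- is.na(air_temp_obs)
air_temp_filled <- ifelse(air_temp_imp,
                          mean(air_temp_obs, na.rm = TRUE),
                          air_temp_obs)

# Compare the correlation before values went missing and after mean imputation.
cor_true <- cor(air_temp, sea_surface_temp)
cor_imputed <- cor(air_temp_filled, sea_surface_temp)
print(c(true = cor_true, imputed = cor_imputed))
```

In this simulated setting, the imputed points all share one air temperature while their sea temperatures are high, so the correlation after imputation comes out weaker than in the complete data. Whether the same happens in tao depends on how its missing values are actually distributed.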

To find out, in this exercise you will select the two temperature variables and the indicator variable, and use them to draw a margin plot. Let's assess the mean imputation!

This exercise is part of the course

Handling Missing Data with Imputations in R

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Draw a margin plot of air_temp vs sea_surface_temp
tao_imp %>% 
  select(___, ___, ___) %>%
  ___(delimiter = ___)