Detecting convergence
Great job iterating over the variables in the last exercise! But how many iterations are needed? When the imputed values don't change with the new iteration, we can stop.
You will now extend your code to compute the differences between the imputed variables in subsequent iterations. To do this, you will use the Mean Absolute Percentage Change function, defined for you as follows:
mapc <- function(a, b) {
mean(abs(b - a) / a, na.rm = TRUE)
}
mapc()
outputs a single number that tells you how much b
differs from a
. You will use it to check how much the imputed variables change across iterations. Based on this, you will decide how many of them are needed!
The boolean masks missing_air_temp
and missing_humidity
are available for you, as is the hotdeck-initialized tao_imp
data.
This exercise is part of the course
Handling Missing Data with Imputations in R
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
diff_air_temp <- ___
diff_humidity <- ___
for (i in 1:5) {
# Assign the outcome of the previous iteration (or initialization) to prev_iter
prev_iter <- ___
# Impute air_temp and humidity at originally missing locations
tao_imp$air_temp[missing_air_temp] <- NA
tao_imp <- impute_lm(tao_imp, air_temp ~ year + latitude + sea_surface_temp + humidity)
tao_imp$humidity[missing_humidity] <- NA
tao_imp <- impute_lm(tao_imp, humidity ~ year + latitude + sea_surface_temp + air_temp)
}