Linear regression imputation
Sometimes, you can use domain knowledge, previous research or simply your common sense to describe the relations between the variables in your data. In such cases, model-based imputation is a great solution, as it allows you to impute each variable according to a statistical model that you can specify yourself, taking into account any assumptions you might have about how the variables impact each other.
For continuous variables, a popular model choice is linear regression. It doesn't restrict you to linear relations, though! You can always include a square or a logarithm of a variable in the predictors. In this exercise, you will work with the simputation package to run a single linear regression imputation on the tao data and analyze the results. Let's give it a try!
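To make the point about non-linear terms concrete, here is a minimal sketch on a small made-up data frame. The toy data and its column names x and y are hypothetical and are not the tao data; the sketch also assumes the simputation package is installed. It uses impute_lm() with a formula whose left-hand side names the variable to impute and whose right-hand side lists the predictors, including a squared term.

```r
# Assumes the simputation package is installed
library(simputation)

# Hypothetical toy data (not the tao data): y depends on x quadratically
set.seed(1)
toy <- data.frame(x = seq(1, 10, length.out = 50))
toy$y <- 2 + 0.5 * toy$x^2 + rnorm(50, sd = 0.5)
toy$y[c(5, 20, 35)] <- NA  # remove three values so there is something to impute

# Left of ~: variable to impute; right of ~: predictors.
# I(x^2) adds a squared term, so the fit is not restricted to a straight line.
toy_imp <- impute_lm(toy, y ~ x + I(x^2))

# The previously missing entries now hold the model's predictions
toy_imp$y[c(5, 20, 35)]
```

Observed values are left untouched; only the missing entries of y are filled in. A row whose predictors are themselves missing would typically stay NA, because the model cannot produce a prediction for it.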
This exercise is part of the course
Handling Missing Data with Imputations in R
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Load the simputation package
___
# Impute air_temp and humidity with linear regression
formula <- ____
tao_imp <- ___(tao, formula)