Using a predictor matrix
An important decision when using model-based imputation is which variables to include as predictors, and in which models. In mice(), this is governed by the predictor matrix; by default, all variables are used to impute all others.
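To make the default concrete, here is a minimal sketch that builds the default predictor matrix with make.predictorMatrix() from mice. It uses nhanes, a small example dataset shipped with mice, as a stand-in for the exercise data; that substitution is an assumption made only for illustration.

```r
library(mice)

# nhanes ships with mice and is used here only as a stand-in dataset.
pred_default <- make.predictorMatrix(nhanes)

# Each row is a variable to be imputed, each column a candidate predictor.
# A 1 means "use this column to impute this row"; the diagonal is 0
# because a variable is not used to impute itself.
print(pred_default)
```

Every off-diagonal entry is 1, which is what "all variables are used to impute all others" looks like in matrix form.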
If the data contain many variables, or there is little time for proper model selection, you can use mice's functionality to create a predictor matrix based on the correlations between the variables. This matrix can then be passed to mice(). In this exercise, you will practice exactly this: first, you will build a predictor matrix such that each variable is imputed using the variables most correlated with it; then, you will feed your predictor matrix to the imputing function. Let's try this simple model selection!
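The two steps described above might look roughly like the following sketch. It is run on nhanes (a dataset bundled with mice) rather than on the exercise data, and the values chosen for m and seed are illustrative assumptions, not part of the exercise.

```r
library(mice)

# Step 1: correlation-based predictor selection.
# quickpred() keeps a predictor for a target variable when its absolute
# correlation with the target, or with the target's response indicator,
# exceeds mincor.
pred_mat <- quickpred(nhanes, mincor = 0.1)
print(pred_mat)

# Step 2: hand the matrix to mice() via the predictorMatrix argument.
# m = 5 and seed = 123 are arbitrary choices for this sketch.
imp <- mice(nhanes, m = 5, predictorMatrix = pred_mat,
            seed = 123, printFlag = FALSE)

# Extract the first completed dataset for inspection.
completed <- mice::complete(imp, 1)
head(completed)
```

Raising mincor makes the selection stricter, so each imputation model ends up with fewer predictors.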
This exercise is part of the course Handling Missing Data with Imputations in R.
Have a go at this exercise by completing this sample code.
# Create predictor matrix with minimum correlation of 0.1
pred_mat <- ___(biopics, mincor = ___)