
The mice flow: mice - with - pool

Multiple imputation by chained equations, or MICE, lets us quantify the uncertainty introduced by imputation. The data set is imputed several times with model-based imputation, each time drawing the imputed values from conditional distributions, so each imputed data set is slightly different. An analysis is then conducted on each imputed data set and the results are pooled, yielding the quantities of interest along with confidence intervals that reflect the imputation uncertainty.
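To make the pooling step concrete, here is a minimal sketch of Rubin's rules for a single coefficient, written in base R. The estimates and standard errors are made-up numbers chosen for illustration, not results from the biopics data, and the variable names are hypothetical. In practice pool() carries out this combination for you, and it also works out the degrees of freedom needed for confidence intervals.

```r
# Hypothetical results for one regression coefficient,
# one value per imputed data set (made-up numbers for illustration).
estimates  <- c(1.8, 2.1, 2.0, 1.9, 2.2)
std_errors <- c(0.50, 0.48, 0.52, 0.49, 0.51)

m <- length(estimates)  # number of imputations

# Pooled point estimate: the average across imputed data sets
pooled_estimate <- mean(estimates)

# Within-imputation variance: average of the squared standard errors
within_var <- mean(std_errors^2)

# Between-imputation variance: how much the estimates differ across imputations
between_var <- var(estimates)

# Total variance combines both sources of uncertainty
total_var <- within_var + (1 + 1 / m) * between_var
pooled_se <- sqrt(total_var)

print(c(estimate = pooled_estimate, se = pooled_se))
```

The pooled standard error is larger than the average per-imputation standard error because the between-imputation term adds the uncertainty that comes from not knowing the missing values.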

In this exercise, you will practice the typical MICE flow: mice() - with() - pool(). You will perform a regression analysis on the biopics data to see which subject occupation, sub_type, is associated with the highest movie earnings. Let's play with mice!

This exercise is part of the course

Handling Missing Data with Imputations in R


Exercise instructions

  • Load the mice package and impute biopics with mice() using 5 imputations, assigning the result to biopics_multiimp.
  • Fit a linear regression model that explains earnings using year and sub_type to each imputed data set, assigning the result to lm_multiimp.
  • Pool the regression models saved in lm_multiimp together, assigning the result to lm_pooled.
  • Summarize lm_pooled such that it produces confidence intervals with a 95% confidence level.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Load mice package
___

# Impute biopics with mice using 5 imputations
biopics_multiimp <- ___(___, m = ___, seed = 3108)

# Fit linear regression to each imputed data set 
lm_multiimp <- ___(___, ___)

# Pool and summarize regression results
lm_pooled <- ___(___)
___(___, conf.int = ___, conf.level = ___)
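For reference, the same mice() - with() - pool() pattern can be sketched on a stand-in data set. The biopics data is not bundled with mice, so this example assumes the nhanes data frame that ships with the package, and the model chl ~ age + bmi is chosen only for illustration. For the exercise, the analogous model would explain earnings using year and sub_type on biopics.

```r
# Sketch of the mice() - with() - pool() flow on a stand-in data set.
# Assumption: the mice package is installed; nhanes ships with it.
library(mice)

# Step 1: create 5 imputed data sets (printFlag = FALSE silences the iteration log)
nhanes_multiimp <- mice(nhanes, m = 5, seed = 3108, printFlag = FALSE)

# Step 2: fit the same linear regression to each imputed data set
lm_multiimp <- with(nhanes_multiimp, lm(chl ~ age + bmi))

# Step 3: pool the per-imputation fits into a single set of estimates
lm_pooled <- pool(lm_multiimp)

# Step 4: summarize with 95% confidence intervals
pooled_summary <- summary(lm_pooled, conf.int = TRUE, conf.level = 0.95)
print(pooled_summary)
```

Fixing the seed makes the imputations reproducible, so the pooled estimates stay the same from run to run.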