
Bootstrapping the average maternal age

Maternal age, or the age of a mother at the time of giving birth, is an important marker of natal health in a population. Too high or low a maternal age can have adverse effects on the outcome of the birth.

You work for the US Department of Health as a Data Analyst. You are given a list, ls_df, of 51 data frames, one for each US state and Washington DC. Each data frame contains the column maternal_age. Your boss would like you to bootstrap a distribution of the mean maternal age for each state. You have already written a loop to do the bootstrap on a single data frame. You need to parallelize this calculation. The parallel package has been loaded for you.

This exercise is part of the course

Parallel Programming in R


Exercise instructions

  • Wrap the bootstrap loop into a function that returns the distribution of the mean.
  • Set up a cluster of four cores.
  • Apply the bootstrap function to ls_df in parallel using parLapply().
  • Stop the cluster once all calculations are done.

Interactive exercise

Try this exercise by completing the sample code below.

# Wrap the loop into a function
boot_mean <- ___ (df) ___
  est <- rep(0, 1e3)
  for (i in 1:1e3) {
    b <- sample(df$maternal_age, replace = TRUE)
    est[i] <- mean(b)
  }
  return(est)
___
# Make a cluster of four
cl <- ___
# Apply function to ls_df in parallel
state_dist <- ___
# Stop cluster
___(cl)
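
For reference, here is a minimal sketch of one way the blanks could be filled in. It is not necessarily the course's official solution. The real ls_df is not available here, so the example builds a small hypothetical stand-in of three simulated data frames. The column name maternal_age follows the exercise description. The call to clusterSetRNGStream() is an optional addition for reproducible random numbers and is not something the exercise asks for.

```r
library(parallel)

# Hypothetical stand-in for ls_df: three small data frames of simulated ages
# (the real exercise supplies 51 data frames, one per state plus Washington DC)
set.seed(1)
ls_df <- lapply(1:3, function(k) {
  data.frame(maternal_age = round(rnorm(200, mean = 28, sd = 6)))
})

# Wrap the loop into a function that returns the bootstrap distribution
boot_mean <- function(df) {
  est <- numeric(1e3)
  for (i in seq_len(1e3)) {
    # Resample the ages with replacement and record the mean
    b <- sample(df$maternal_age, replace = TRUE)
    est[i] <- mean(b)
  }
  est
}

# Make a cluster of four workers
cl <- makeCluster(4)

# Optional (an assumption, not part of the exercise): reproducible RNG streams
clusterSetRNGStream(cl, 123)

# Apply the function to each data frame in parallel
state_dist <- parLapply(cl, ls_df, boot_mean)

# Stop the cluster once all calculations are done
stopCluster(cl)
```

The result, state_dist, is a list with one numeric vector of 1,000 bootstrap means per data frame, in the same order as ls_df.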