Constructing a CI

You've seen one example of how p-hat can vary upon resampling, but you need to repeat the resampling many times to get a good estimate of its variability. Here you will compute a full bootstrap distribution to estimate the standard error (SE), which is then used to form a confidence interval. You'll use an additional verb from infer, calculate(), to streamline the process of computing many statistics from many datasets.

Take a moment to inspect the output of calculate(). This function reduces your data frame to just two columns: "stat", which holds each computed statistic, and "replicate", which identifies the bootstrap replicate it came from.
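To make that two-column shape concrete, here is a minimal base-R sketch of the same idea, written without infer. The data frame `toy`, its column `conf`, and the sample size are made up for illustration and are not the course's gss2016 data.

```r
# Hypothetical data: `toy` and its column `conf` are illustrative only
set.seed(1)
toy <- data.frame(
  conf = sample(c("High", "Low"), size = 100, replace = TRUE, prob = c(0.2, 0.8))
)

n_reps <- 500

# One row per bootstrap replicate: resample the rows with replacement,
# then record the proportion of "High" in that resample
boot_sketch <- data.frame(
  replicate = seq_len(n_reps),
  stat = vapply(seq_len(n_reps), function(i) {
    resampled <- sample(toy$conf, size = nrow(toy), replace = TRUE)
    mean(resampled == "High")
  }, numeric(1))
)

head(boot_sketch)
```

Each row pairs a replicate number with the proportion computed from that resample, which is the same layout described above.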

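When you plot your bootstrap distribution, you'll find that it's bell-shaped. Because roughly 95% of a bell-shaped (approximately normal) distribution lies within two standard deviations of its center, you can add and subtract two SEs from p-hat to get an approximate 95% interval.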
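The interval arithmetic can be sketched in base R as follows. This assumes a made-up sample `x` rather than the course data, and takes the SE to be the standard deviation of the bootstrap proportions.

```r
# Hypothetical sample: `x` is illustrative only, not the gss2016 data
set.seed(2)
x <- sample(c("High", "Low"), size = 150, replace = TRUE, prob = c(0.2, 0.8))

# Observed sample proportion of "High"
p_hat <- mean(x == "High")

# 500 bootstrap proportions, each from a resample of x with replacement
boot_stats <- replicate(500, mean(sample(x, replace = TRUE) == "High"))

# SE is estimated by the standard deviation of the bootstrap distribution
se <- sd(boot_stats)

# Approximate 95% interval: p-hat plus or minus two SEs
ci <- c(lower = p_hat - 2 * se, upper = p_hat + 2 * se)
ci
```

The multiplier 2 is a rounded stand-in for the normal quantile of about 1.96, so the resulting interval is approximate and relies on the bootstrap distribution being roughly bell-shaped.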

This exercise is part of the course

Inference for Categorical Data in R

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create bootstrap distribution for proportion with High conf
boot_dist <- gss2016 %>%
  # Specify the response and success
  specify(response = ___, ___ = "___") %>%
  # Generate 500 bootstrap reps
  generate(___ = ___, type = "bootstrap") %>%
  # Calculate proportions
  calculate(stat = "___")

# See the result
boot_dist