Bootstrap t-confidence interval

The previous exercises told you two things:

  1. You can measure the variability associated with \(\hat{p}\) by resampling from the original sample.
  2. Once you know the variability of \(\hat{p}\), you can use it to gauge how far \(\hat{p}\) is likely to be from the true proportion.

Note that the rate of closeness (here 95%) refers to how often a sample is chosen so that it is close to the population parameter. You will never know whether a particular dataset is close to the parameter or far from it, but you do know that over your lifetime, about 95% of the samples you collect should give you estimates that are within \(2SE\) of the true population parameter.

The votes from a single poll, one_poll, and the data from 1000 bootstrap resamples, one_poll_boot, are available in your workspace. These are based on Experiment 2 from earlier in the chapter.

As in the previous exercise, the number that measures the variability of a statistic is referred to as its standard error.
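
Written out, the standard error and the resulting interval look roughly like this. The notation is an assumption made for illustration: \(B\) is the number of bootstrap resamples (1000 here), \(\hat{p}^*_b\) is the proportion of "yes" votes in resample \(b\), and \(\bar{p}^*\) is the average of those proportions. The \(B - 1\) denominator matches what sd() computes in R.

\[
SE = \sqrt{\frac{1}{B - 1} \sum_{b=1}^{B} \left(\hat{p}^*_b - \bar{p}^*\right)^2},
\qquad
\text{interval} = \left(\hat{p} - 2SE,\ \hat{p} + 2SE\right)
\]

With made-up numbers (not taken from the course data), \(\hat{p} = 0.60\) and \(SE = 0.09\) would give \(0.60 \pm 2 \times 0.09\), that is, the interval \((0.42,\ 0.78)\).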

This exercise is part of the course

Foundations of Inference in R


Exercise instructions

  • Calculate \(\hat{p}\) and assign the result to p_hat. In the call to summarize(), calculate stat as the mean of vote equalling "yes".
  • Find an interval of values that are plausible for the true parameter by calculating \(\hat{p} \pm 2SE\).
    • The lower bound of the confidence interval is p_hat minus twice the standard error of stat. Use sd() to calculate the standard error.
    • The upper bound is p_hat plus twice the standard error of stat.
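
The same steps can be sketched in base R on simulated data. This is only an illustration under stated assumptions: the votes vector is a hypothetical stand-in for one_poll (30 simulated votes, not the course data), and the resampling is done with replicate() and sample() rather than the specify() / generate() / calculate() pipeline used in the exercise.

```r
# Hypothetical stand-in for one_poll: 30 simulated votes (not the course data)
set.seed(42)
votes <- sample(c("yes", "no"), size = 30, replace = TRUE, prob = c(0.6, 0.4))

# Sample proportion of "yes" votes
p_hat <- mean(votes == "yes")

# 1000 bootstrap resamples: draw from the votes with replacement and
# record the proportion of "yes" in each resample
boot_stats <- replicate(1000, mean(sample(votes, replace = TRUE) == "yes"))

# Standard error: the standard deviation of the bootstrap proportions
se <- sd(boot_stats)

# Interval of plausible values: p_hat plus or minus 2 standard errors
ci <- c(lower = p_hat - 2 * se, upper = p_hat + 2 * se)
print(ci)
```

The interval is centered on p_hat from the original sample, not on the mean of the bootstrap proportions; the resamples are used only to estimate the standard error.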

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# From previous exercises
one_poll <- all_polls %>%
  filter(poll == 1) %>%
  select(vote)
one_poll_boot <- one_poll %>%
  specify(response = vote, success = "yes") %>%
  generate(reps = 1000, type = "bootstrap") %>% 
  calculate(stat = "prop")
  
p_hat <- one_poll %>%
  # Calculate proportion of yes votes
  summarize(stat = ___) %>%
  pull()

# Create an interval of plausible values
one_poll_boot %>%
  summarize(
    # Lower bound is p_hat minus 2 std errs
    lower = ___,
    # Upper bound is p_hat plus 2 std errs
    upper = ___
  )