Resampling from a sample

To investigate how much the estimates of a population proportion change from sample to sample, you will set up two sampling experiments.

In the first experiment, you will simulate repeated samples from a population. In the second, you will choose a single sample from the first experiment and repeatedly resample from that sample: a method called bootstrapping. More specifically:

Experiment 1: Assume the true proportion of people who will vote for Candidate X is 0.6. Repeatedly sample 30 people from the population and measure the variability of \(\hat{p}\) (the sample proportion).

Experiment 2: Take one sample of size 30 from the same population. Repeatedly sample 30 people (with replacement!) from the original sample and measure the variability of \(\hat{p}^*\) (the resample proportion).
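The two experiments can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the population is modeled as independent Bernoulli draws with success probability 0.6, the number of repetitions (1000) and the seed are arbitrary choices, and the names `p_hat`, `one_sample`, and `p_hat_star` are hypothetical.

```r
set.seed(123)    # arbitrary seed, used only for reproducibility

n      <- 30     # sample size
p_true <- 0.6    # assumed true proportion voting for Candidate X
reps   <- 1000   # number of samples / resamples (an assumption)

# Experiment 1: repeatedly sample n people from the population
# (1 = votes for Candidate X, 0 = does not)
p_hat <- replicate(reps, mean(rbinom(n, size = 1, prob = p_true)))

# Experiment 2: take one sample, then repeatedly resample n people
# from that sample with replacement
one_sample <- rbinom(n, size = 1, prob = p_true)
p_hat_star <- replicate(
  reps,
  mean(sample(one_sample, size = n, replace = TRUE))
)

# Variability of the sample proportions and of the resample proportions
sd(p_hat)
sd(p_hat_star)
```

The two standard deviations should be of similar size, although the second depends on which single sample happened to be drawn.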

It's important to realize that the first experiment relies on knowing the population and is typically impossible in practice. The second relies only on the sample of data and is therefore easy to implement for any statistic. Fortunately, as you will see, the variability of the proportion of "successes" is approximately the same in both cases: \(\hat{p}\) varies across samples from the population about as much as \(\hat{p}^*\) varies across resamples from a single sample.
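As a rough check on why the two variabilities should agree, the standard formulas for the spread of a proportion can be compared. This assumes independent draws of size \(n = 30\) with true proportion \(p = 0.6\); the symbols \(\mathrm{SE}\) and \(n\) are introduced here for illustration.

\[
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.6 \times 0.4}{30}} \approx 0.089
\]

\[
\mathrm{SE}(\hat{p}^*) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\]

The second expression is the spread of the resample proportion given one particular sample. It replaces the unknown \(p\) with the observed \(\hat{p}\), so the two are close whenever \(\hat{p}\) is close to \(p\).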

We have created 1000 random samples, each of size 30, from the population. The resulting data frame, all_polls, is available in your workspace. Take a look before getting started.

Compute the sample proportion for each of the 1000 original samples, assigning the result to ex1_props.
- Group by poll.
- Summarize to calculate stat as the mean() of vote == "yes", that is, the proportion of "yes" votes in each poll.
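One possible way to carry out these steps with dplyr is sketched below. Because all_polls exists only in the course workspace, the sketch first builds a stand-in with the same shape; the stand-in, its seed, and the final variability check are assumptions made for illustration, while the column names poll and vote follow the instructions above.

```r
library(dplyr)

set.seed(42)  # arbitrary seed, used only for reproducibility

# Stand-in for all_polls (an assumption): 1000 polls of 30 voters each,
# with a 0.6 chance that any given vote is "yes"
all_polls <- data.frame(
  poll = rep(1:1000, each = 30),
  vote = sample(c("yes", "no"), size = 1000 * 30,
                replace = TRUE, prob = c(0.6, 0.4))
)

# Sample proportion of "yes" votes within each poll
ex1_props <- all_polls %>%
  group_by(poll) %>%
  summarize(stat = mean(vote == "yes"))

# Variability of the sample proportions across polls
ex1_props %>%
  summarize(variability = sd(stat))
```

Taking mean() of a logical vector works because R treats TRUE as 1 and FALSE as 0, so the mean is the fraction of TRUE values.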
