
Simulation-based CI for slope

1. Simulation-based CI for slope

Bootstrapping is another resampling method that allows us to estimate the sampling distribution of the statistic of interest (here, the slope). Because the goal is now a confidence interval (CI) rather than a hypothesis test, there is no null hypothesis, so there is no reason to permute either of the variables.

2. Bootstrap resampling

Recall that the idea behind bootstrap sampling is to resample from the original dataset *with* replacement. That means that, in each bootstrap sample, some of the observations appear more than once and others are left out. In the linear model setting, the unit being resampled is a pair of observations: each set of twins is drawn, with replacement, as a pair. The slope is then calculated from the bootstrap resample of the original observations. Notice that neither variable was permuted when resampling the observations: each twin stays paired with their original sibling.
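To make the resampling step concrete, here is a minimal Python sketch of one bootstrap resample and its slope. It is not the code used in this lesson: the `slope` and `bootstrap_resample` helpers are hypothetical names, and the `twins` list is synthetic stand-in data rather than the real twins dataset.

```python
import random

def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

def bootstrap_resample(pairs, rng):
    """Draw len(pairs) pairs with replacement, keeping each (x, y) together."""
    return rng.choices(pairs, k=len(pairs))

rng = random.Random(1)

# Synthetic stand-in data (an assumption, not the real twins dataset).
twins = []
for _ in range(30):
    x = rng.gauss(100, 15)
    y = 0.9 * x + rng.gauss(0, 5)
    twins.append((x, y))

resample = bootstrap_resample(twins, rng)  # some pairs repeat, others drop out
boot_slope = slope(resample)               # slope of this one bootstrap sample
```

Because whole pairs are drawn, every `(x, y)` in `resample` is an intact pair from the original data; only how often each pair appears changes.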

3. Permutation vs. bootstrap variability

Applying both resampling techniques to the same dataset helps us understand the difference between permuting the data and bootstrapping the data. The main difference between these two plots is that the permuted slopes are centered on the flat line (in black on the left), whereas the bootstrapped slopes are centered on the observed regression line (in black on the right).

4. Permutation vs. bootstrap code

The `infer` package allows for repeated sampling from one of two possible models. In the first, the `generate` step permutes the variables so that the null hypothesis is true, which allows for hypothesis testing. In the second, the `generate` step bootstraps the data to estimate the variability of the slope, which is what a confidence interval requires.
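The two resampling modes can be sketched side by side in Python. This is only an illustration of the idea, not the `infer` API: the `generate` function below is a hypothetical helper that happens to share the name, and its `kind` argument is an assumption made for this example.

```python
import random

def generate(pairs, kind, rng):
    """Return one resampled dataset; kind is "permute" or "bootstrap"."""
    if kind == "permute":
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        rng.shuffle(ys)  # break the x-y link so that the null (slope 0) holds
        return list(zip(xs, ys))
    if kind == "bootstrap":
        # Keep each (x, y) pair intact; draw pairs with replacement.
        return rng.choices(pairs, k=len(pairs))
    raise ValueError("kind must be 'permute' or 'bootstrap'")

rng = random.Random(2)
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1), (5.0, 9.8)]  # toy data

permuted = generate(data, "permute", rng)
bootstrapped = generate(data, "bootstrap", rng)
```

The permuted dataset keeps every original value but scrambles which `y` goes with which `x`; the bootstrapped dataset keeps the pairing but changes which pairs are present.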

5. Sampling distribution: randomization vs. bootstrap

When the permuted and bootstrapped data are compared, we can see, for example, that the permuted slopes are centered around zero, whereas the bootstrapped slopes are centered around the original slope statistic of 0.9.
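The contrast between the two centers can be checked with a small simulation. The sketch below uses synthetic data generated with a slope near 0.9 (an assumption; it is not the twins dataset) and hypothetical helper names. It also computes a percentile interval from the bootstrapped slopes, which is one common way to form a bootstrap CI and is assumed here rather than taken from the lesson.

```python
import random

def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

rng = random.Random(3)

# Synthetic stand-in data with a true slope of 0.9 (an assumption).
data = []
for _ in range(50):
    x = rng.gauss(100, 15)
    data.append((x, 0.9 * x + rng.gauss(0, 10)))

observed = slope(data)
xs = [x for x, _ in data]
ys = [y for _, y in data]
reps = 1000

# Permutation: shuffle y against x, so the slopes describe the null model.
perm_slopes = []
for _ in range(reps):
    shuffled = ys[:]
    rng.shuffle(shuffled)
    perm_slopes.append(slope(list(zip(xs, shuffled))))

# Bootstrap: resample intact pairs with replacement.
boot_slopes = [slope(rng.choices(data, k=len(data))) for _ in range(reps)]

perm_center = sum(perm_slopes) / reps
boot_center = sum(boot_slopes) / reps

# 95% percentile interval: the middle 95% of the sorted bootstrap slopes.
boot_sorted = sorted(boot_slopes)
ci_low = boot_sorted[int(0.025 * reps)]
ci_high = boot_sorted[int(0.975 * reps) - 1]
```

Here `perm_center` should sit close to zero and `boot_center` close to `observed`, mirroring the two distributions described above.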

6. Let's practice!

Thanks for following along with this video. Now it's your turn to practice!