Hypothesis testing for comparing two means

1. Hypothesis testing for comparing two means via simulation

In this video we demonstrate setting up and conducting a hypothesis test to compare means from two independent distributions using simulation.

2. Motivation

The motivating question is whether the use of embryonic stem cells helps improve heart function following a heart attack. The data can be found in the openintro package. These data were collected as part of a study in which sheep that had had a heart attack were randomly assigned to the embryonic stem cell therapy (labeled as "esc" in the data) or to a traditional therapy as a control. The researchers measured the heart pumping capacity of these sheep before and after the study. If a sheep's heart pumping capacity increased from before to after the study, this indicates a stronger recovery. In our analysis we want to evaluate the effect of the embryonic stem cell therapy on heart pumping capacity relative to the control group.

3. Analysis outline

In order to do this, we will first calculate the difference between the before-treatment and after-treatment heart pumping capacities for each sheep. We call this variable "change". We want to evaluate whether the data suggest that the average change is higher for the treatment group than for the control group.
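As a minimal sketch of this first step, the change variable could be computed as follows. This assumes the data frame is available in openintro as `stem_cell` with columns `trmt` ("ctrl" or "esc"), `before`, and `after`; older releases of the package may use different names. The name `stem_cell_change` is only illustrative.

```r
library(openintro)  # assumed to provide the `stem_cell` data frame
library(dplyr)

# Assumption: columns are `trmt`, `before`, and `after`
stem_cell_change <- stem_cell %>%
  mutate(change = after - before)  # positive change = improved pumping capacity

# Average change within each group
stem_cell_change %>%
  group_by(trmt) %>%
  summarise(mean_change = mean(change))
```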

4. Analysis outline

Then, we set our hypotheses. Our null hypothesis should state the status quo, in other words, "there is nothing going on". In context, this means no difference between the average changes in the treatment and control groups. The alternative hypothesis says that the average change is higher for those in the treatment group compared to the control.
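Written in symbols, and using the notation (not from the video) that mu_esc and mu_ctrl stand for the average change in the treatment and control groups, the hypotheses could be stated as:

```latex
H_0: \mu_{esc} - \mu_{ctrl} = 0
\qquad \text{versus} \qquad
H_A: \mu_{esc} - \mu_{ctrl} > 0
```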

5. Analysis outline

Finally, we conduct the hypothesis test. Conceptually, here is how we go about it. First, we write the values of "change" on 18 index cards. Then we shuffle these cards and randomly split them into two equal-sized decks, one representing treatment and the other representing control. We then calculate and record the test statistic: the difference in average change between treatment and control. At this point we have no control over which card ended up in which pile. Since the cards are randomly shuffled into the two decks, we would not expect to see a systematic difference between the average change values in each deck. In other words, we would expect the simulated difference between the two sample means to be 0 on average. But of course, in any one shuffle this number will not be exactly 0. Just by random chance it can be a bit different from 0, or quite different from 0. We repeat the simulation many times in order to get a sense of how much the simulated difference in means varies. Finally, we calculate the p-value as the proportion of simulations where the test statistic is at least as extreme as the observed difference in sample means. But, obviously, we don't actually do this by shuffling index cards...
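The index card procedure can be sketched in a few lines of base R, where shuffling the group labels plays the role of shuffling the cards into two decks. This is only an illustration, not the approach used in the rest of the video. It assumes the `stem_cell` data frame from openintro has columns `trmt`, `before`, and `after`; the helper `diff_in_means`, the seed, and the number of shuffles are arbitrary choices made for this example.

```r
library(openintro)  # assumed to provide the `stem_cell` data frame

# Assumption: columns are `trmt` ("ctrl"/"esc"), `before`, and `after`
change <- stem_cell$after - stem_cell$before
group  <- stem_cell$trmt

# Hypothetical helper: difference in mean change, treatment minus control
diff_in_means <- function(values, labels) {
  mean(values[labels == "esc"]) - mean(values[labels == "ctrl"])
}

observed <- diff_in_means(change, group)

set.seed(1)          # arbitrary seed, only for reproducibility
n_shuffles <- 1000   # arbitrary number of repetitions

# Each repetition shuffles the labels (the "cards") and recomputes the statistic
simulated <- replicate(n_shuffles, diff_in_means(change, sample(group)))

# One-sided p-value: proportion of shuffles at least as large as the observed difference
p_value <- mean(simulated >= observed)
p_value
```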

6. Hypothesis test: generate resamples

Instead, we use computation to conduct this simulation, specifically using the infer package in R.

7. Hypothesis test: generate resamples

We start with the data frame containing our variables of interest, and specify our model as response vs. explanatory. In our example the response variable is change and the explanatory variable is treatment group.

8. Hypothesis test: generate resamples

We then declare the null hypothesis. The null argument in the hypothesize function can be set to "point" or to "independence". Here we use "independence", indicating that the response and explanatory variables are independent of each other.

9. Hypothesis test: generate resamples

Then, using the data, the model we specified, and the null hypothesis we declared, we generate many resamples by permuting the labels of the two groups.

10. Hypothesis test: generate resamples

Finally, we calculate the test statistic for each of our resamples. The statistic we're interested in is the difference in means.
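Put together, the four steps described so far might look like the sketch below. It assumes the `stem_cell` data frame from openintro with columns `trmt`, `before`, and `after`; the names `stem_cell_change` and `null_dist`, the seed, and the choice of 1000 resamples are illustrative and not taken from the video.

```r
library(openintro)  # assumed to provide the `stem_cell` data frame
library(dplyr)
library(infer)

# Assumption: columns are `trmt` ("ctrl"/"esc"), `before`, and `after`
stem_cell_change <- stem_cell %>%
  mutate(change = after - before)

set.seed(1)  # arbitrary seed, only for reproducibility

null_dist <- stem_cell_change %>%
  specify(change ~ trmt) %>%                  # response ~ explanatory
  hypothesize(null = "independence") %>%      # declare the null hypothesis
  generate(reps = 1000, type = "permute") %>% # permute the group labels
  calculate(stat = "diff in means",
            order = c("esc", "ctrl"))         # treatment minus control

null_dist
```

The `order` argument fixes the direction of the subtraction, so that positive values of `stat` correspond to a larger average change in the treatment group.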

11. Hypothesis test: calculate p-value

Using the simulated sample statistics, we calculate the p-value as the proportion of simulations where the simulated difference between the sample means is at least as extreme as the one observed.
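As a sketch of this last step, the p-value can be computed by comparing the simulated statistics to the observed one. The block rebuilds the simulated statistics so that it runs on its own. It makes the same assumptions about the `stem_cell` data frame (columns `trmt`, `before`, `after`), and the object names, seed, and number of resamples are illustrative.

```r
library(openintro)  # assumed to provide the `stem_cell` data frame
library(dplyr)
library(infer)

# Assumption: columns are `trmt` ("ctrl"/"esc"), `before`, and `after`
stem_cell_change <- stem_cell %>%
  mutate(change = after - before)

# Observed difference in mean change, treatment minus control
obs_diff <- stem_cell_change %>%
  specify(change ~ trmt) %>%
  calculate(stat = "diff in means", order = c("esc", "ctrl"))

set.seed(1)  # arbitrary seed, only for reproducibility
null_dist <- stem_cell_change %>%
  specify(change ~ trmt) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = c("esc", "ctrl"))

# By hand: proportion of simulated differences at least as large as the observed one
p_value <- mean(null_dist$stat >= obs_diff$stat)
p_value

# The same one-sided calculation using infer's helper
null_dist %>%
  get_p_value(obs_stat = obs_diff, direction = "greater")
```

Because the alternative hypothesis is one-sided (a higher average change in the treatment group), "at least as extreme" here means at least as large as the observed difference.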

12. Let's practice!

Now it's your turn.
