Hypothesis testing - Difference of means

We want to test the hypothesis that there is a difference in the average donations received from A and B. Previously, you learned how to generate one permutation of the data. Now, we will generate a null distribution of the difference in means and then calculate the p-value.

For the null distribution, we first generate multiple permuted datasets and store the difference in means for each one. We then calculate the test statistic as the difference in means in the original dataset. Finally, we approximate the p-value as twice the fraction of permuted datasets whose difference in means is greater than or equal to the absolute value of the test statistic (a two-sided hypothesis). A p-value below a chosen threshold, say 0.05, would then be taken as statistically significant.
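The p-value approximation described above can be written compactly. The symbols here are introduced only for illustration and are not part of the exercise: R is the number of permuted datasets, d_r is the difference in means in the r-th permuted dataset, and t is the test statistic from the original data.

```latex
\hat{p} \;=\; \frac{2}{R} \sum_{r=1}^{R} \mathbf{1}\!\left[\, d_r \ge |t| \,\right],
\qquad
t \;=\; \bar{x}_A - \bar{x}_B
```

Here the indicator equals 1 when the permuted difference is at least as large as |t| and 0 otherwise, so the sum counts the qualifying permutations.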

Generate multiple permutations of donations_A and donations_B and assign them to perm.
Set samples equal to the difference in means of permuted_A_datasets and permuted_B_datasets. Set axis=1 to get one mean per permuted dataset instead of a single overall mean.
Set test_stat equal to the difference in means of donations_A and donations_B.
Calculate the p-value p_val as twice the fraction of samples greater than or equal to the absolute value of test_stat.
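
The steps above can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the exercise's reference solution: the donation arrays are synthetic stand-ins, and reps, data, n_A, and the random seed are illustrative choices that do not come from the exercise.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumed seed, for reproducibility only

# Illustrative data only; the exercise supplies its own donations_A and donations_B.
donations_A = rng.exponential(scale=6.0, size=500)
donations_B = rng.exponential(scale=6.5, size=700)

reps = 1000  # number of permuted datasets (an assumed value)
data = np.concatenate([donations_A, donations_B])
n_A = len(donations_A)

# Each row of perm is one random reordering of the indices 0..len(data)-1.
perm = np.array([rng.permutation(len(data)) for _ in range(reps)])

# Split every permuted index row into a pseudo-A group and a pseudo-B group.
permuted_A_datasets = data[perm[:, :n_A]]
permuted_B_datasets = data[perm[:, n_A:]]

# axis=1 gives one mean per permuted dataset (per row), not one overall mean.
samples = permuted_A_datasets.mean(axis=1) - permuted_B_datasets.mean(axis=1)

# Test statistic: difference in means in the original, unpermuted data.
test_stat = donations_A.mean() - donations_B.mean()

# Two-sided p-value approximated as twice the one-sided tail fraction.
p_val = 2 * np.sum(samples >= np.abs(test_stat)) / reps

print(f"test_stat = {test_stat:.4f}, p_val = {p_val:.4f}")
```

Because the tail fraction is doubled, p_val computed this way can come out slightly above 1 when test_stat is close to zero; treating such values as 1 is a reasonable convention.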
