1. Permutation testing
In the final section of this chapter, we will learn to perform significance testing using permutations. The theory behind permutation testing is beyond the scope of this course, but at a very high level, permutation testing tries to obtain the distribution of the test statistic under the null hypothesis without making strong assumptions about the data. This is in contrast to classical tests like the t-test and the chi-squared test, which rely on specific probability distributions.
2. Steps involved
A permutation test is a kind of non-parametric test. Its main assumption is that, under the null hypothesis, the group labels are exchangeable, meaning the treatment groups could plausibly have come from the same distribution. Consider data from two groups A and B. A permutation test typically involves the following steps. First, we determine the test statistic. This could be anything, but usually we'd like to know the difference in means of the two groups. Next, the observations are pooled and a new dataset is generated for every possible permutation of labels in groups A and B. In practice, enumerating every single permutation quickly becomes infeasible, so we use a random sample of the possible permutations.
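The pooling and relabeling step can be sketched in a few lines of NumPy. This is only a minimal illustration: the group values are made up, and the helper name `permuted_datasets` is hypothetical rather than anything defined in the course.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is reproducible

# Hypothetical observations for the two groups (illustrative values, not real data)
group_a = np.array([12.0, 15.5, 9.8, 14.2, 11.1])
group_b = np.array([10.4, 8.9, 13.0, 9.5])

def permuted_datasets(a, b, n_permutations, rng):
    """Yield (new_a, new_b) pairs made by randomly reassigning group labels."""
    pooled = np.concatenate([a, b])
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)  # shuffled copy of the pooled data
        # The first len(a) values play the role of A, the rest play the role of B
        yield shuffled[:len(a)], shuffled[len(a):]

# A random sample of 1000 relabelings instead of every possible permutation
samples = list(permuted_datasets(group_a, group_b, 1000, rng))
```

Each relabeled dataset keeps the original group sizes and contains exactly the same pooled values; only the assignment of values to groups changes.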
3. Steps involved
As a next step, we calculate the difference in means for each of these datasets. This set of calculated differences gives the distribution of the difference in means under the null hypothesis that the group labels are irrelevant. It is exact if every permutation is used and an approximation if we use a random sample. As a final step, we check where the observed test statistic falls in this distribution. For instance, if it falls within the central 95% of the distribution, we fail to reject the null hypothesis; that is, we have no evidence of a real difference between groups A and B. You can also use this distribution to obtain a p-value: the proportion of permuted differences that are at least as extreme as the observed one. Thus, permutation tests are quite simple and intuitive.
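Putting the steps together, a two-sample permutation test for the difference in means might look like the sketch below. The function name `permutation_test`, its parameters, and the example data are assumptions made for illustration. The p-value is two-sided and adds 1 to both the count and the number of permutations, a common convention that keeps the estimate away from exactly zero.

```python
import numpy as np

def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Approximate two-sided permutation test for a difference in means.

    Returns the observed difference, the sampled null distribution,
    the p-value, and the central 95% interval of the null distribution.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    null_diffs = np.empty(n_permutations)
    for i in range(n_permutations):
        shuffled = rng.permutation(pooled)  # random relabeling of the pooled data
        null_diffs[i] = shuffled[:len(a)].mean() - shuffled[len(a):].mean()

    # Share of permuted differences at least as extreme as the observed one
    n_extreme = np.sum(np.abs(null_diffs) >= abs(observed))
    p_value = (n_extreme + 1) / (n_permutations + 1)

    # Central 95% of the null distribution
    lower, upper = np.percentile(null_diffs, [2.5, 97.5])
    return observed, null_diffs, p_value, (lower, upper)

# Usage with made-up values for groups A and B
observed, null_diffs, p_value, interval = permutation_test(
    [12.0, 15.5, 9.8, 14.2, 11.1], [10.4, 8.9, 13.0, 9.5]
)
print(f"observed difference: {observed:.2f}, p-value: {p_value:.3f}")
```

If the observed difference lies outside `interval`, the data are unusual under the null hypothesis at roughly the 5% level.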
4. Discussion
There are multiple advantages of permutation testing. First of all, permutation tests are remarkably flexible, even for complicated test statistics. For instance, you could calculate the null distribution of the ratio of the maximum value to the minimum value if you wanted. Additionally, you're not assuming that the data follow any particular probability distribution. This makes permutation testing quite widely applicable.
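To illustrate this flexibility, the sketch below accepts any function of the two groups as the test statistic. As one possible reading of the max-to-min example, it uses the difference between the two groups' max/min ratios; that choice, the names `max_min_ratio_diff` and `null_distribution`, and the data are all assumptions for illustration. The ratio requires strictly positive values.

```python
import numpy as np

def max_min_ratio_diff(a, b):
    """Difference between the groups' max/min ratios (values must be positive)."""
    return a.max() / a.min() - b.max() / b.min()

def null_distribution(a, b, statistic, n_permutations=5_000, seed=0):
    """Sample the null distribution of an arbitrary two-group statistic."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled = np.concatenate([a, b])

    values = np.empty(n_permutations)
    for i in range(n_permutations):
        shuffled = rng.permutation(pooled)  # random relabeling of the pooled data
        values[i] = statistic(shuffled[:len(a)], shuffled[len(a):])
    return values

# Made-up positive values for two groups
a = np.array([1.0, 2.0, 4.0, 3.5, 2.2])
b = np.array([2.0, 3.0, 6.0, 2.5])
null_values = null_distribution(a, b, max_min_ratio_diff)
observed = max_min_ratio_diff(a, b)
```

Swapping in a different statistic only means passing a different function; the permutation machinery stays the same.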
The drawbacks, however, are that permutation tests tend to get computationally quite expensive, especially as the size of the data gets bigger. This is why, practically speaking, only a random sample of all possible permutations is used. Furthermore, you will need to write custom code to compute the test statistic and perform this test for various scenarios. It's not as easy as running the t-test. Even so, in my opinion, this is a very powerful statistical tool to have in your back pocket.
5. Donation website
In the next three exercises, we will code up a permutation test from scratch. Suppose that you are in charge of a non-profit and are testing two different web page designs for your donation website. You are interested in seeing whether there is any difference in the donations received from each of these designs - design A and design B. You have data for donations from each of these website designs.
6. Let's practice!
Let's work through the examples and see if there's a difference.