Calculating sample size

1. Calculating sample size

Congratulations, you have almost finished your first chapter on A/B testing.

2. Calculating the sample size of our test

Let’s finish covering what we need to know to calculate the sample size our test requires.

3. Null hypothesis

First, let’s discuss the null hypothesis. This is the hypothesis that our control and treatment, that is, our two phrases, have the same impact on the response, so any observed difference is due to randomness alone. If we can reject this hypothesis, we say our results are statistically significant and conclude that there is a difference.

4. Types of error & confidence level

Rejecting the null hypothesis when it is true is called a type I error, and retaining a false null hypothesis is called a type II error. We define the probability of not making a type I error, when the null hypothesis is true, as the confidence level. We will not go into great detail, but intuitively it should make sense that the higher we set this value, the larger the sample we will need. A common value is 0.95.
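To make the link between confidence level and sample size a little more concrete, here is a minimal sketch that converts a confidence level into the critical value a result must exceed. It assumes a two-sided z-test, which the transcript does not state, and `critical_value` is a hypothetical helper name used only for illustration.

```python
from statistics import NormalDist


def critical_value(confidence_level):
    """Two-sided z critical value for a confidence level (assumes a z-test)."""
    alpha = 1 - confidence_level  # probability of a type I error
    return NormalDist().inv_cdf(1 - alpha / 2)


# A higher confidence level raises the bar a result must clear,
# which is why it pushes the required sample size up.
for cl in (0.90, 0.95, 0.99):
    print(cl, round(critical_value(cl), 3))
```

The critical value grows as the confidence level rises, so a fixed true difference needs more data to clear it.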

5. Statistical power

Related to this is the idea of Statistical Power. Power is the probability of finding statistically significant results when the Null hypothesis is false.

6. Connecting the Different Components

Power and confidence level are connected to the standard error and sensitivity of our test. To estimate our needed sample size, we choose our desired sensitivity, that is, the smallest difference we want to be able to detect, set our desired confidence level and power, and then use these values to work out the standard error, and hence the sample size, that the test requires.

7. Power formula

Here is a formula for power. The details are out of scope for this course. Suffice it to say that Phi represents the cumulative distribution function of the normal distribution and v is our variance. The key takeaway is the relation between power and n, our sample size: as n goes up, so does our power. Additionally, with everything else held fixed, as our confidence level goes up our power goes down.
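The slide itself is not reproduced in this transcript. As a hedged reference, one common normal-approximation form of the power of a two-sided test comparing two conversion rates is sketched below. The symbols p_1 and p_2 (control and treatment rates), \bar{p} (their average), \alpha (one minus the confidence level) and z_{1-\alpha/2} (the standard normal quantile) are assumptions introduced here; only \Phi, v and n come from the narration.

```latex
\text{Power} \approx
  \Phi\!\left(\frac{\sqrt{n}\,\lvert p_2 - p_1\rvert - z_{1-\alpha/2}\sqrt{2\bar{p}(1-\bar{p})}}{\sqrt{v}}\right)
  + 1 -
  \Phi\!\left(\frac{\sqrt{n}\,\lvert p_2 - p_1\rvert + z_{1-\alpha/2}\sqrt{2\bar{p}(1-\bar{p})}}{\sqrt{v}}\right),
\qquad
v = p_1(1-p_1) + p_2(1-p_2), \quad \bar{p} = \frac{p_1 + p_2}{2}
```

In this form n appears only through \sqrt{n}, so a larger sample raises power, while a higher confidence level raises z_{1-\alpha/2} and lowers it.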

8. Sample size function

Here is that function implemented in Python, now solving for n rather than power. Again, the details can be explored on your own; all that is important is understanding the relations between these various values.
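The code from the slide is not included in this transcript, so the following is only a sketch of what such a function could look like. It assumes a two-sided, normal-approximation test on two conversion rates, and the names `get_power` and `get_sample_size` are hypothetical. The search for n uses bisection, which is a choice made here and may differ from the course's implementation.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def get_power(n, p1, p2, cl):
    """Approximate power of a two-sided test with n users per group."""
    alpha = 1 - cl
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    diff = abs(p2 - p1)                    # sensitivity: the gap to detect
    pooled = (p1 + p2) / 2
    v = p1 * (1 - p1) + p2 * (1 - p2)      # variance term of the difference
    pooled_sd = (2 * pooled * (1 - pooled)) ** 0.5
    upper = _STD_NORMAL.cdf((n ** 0.5 * diff - z_crit * pooled_sd) / v ** 0.5)
    lower = 1 - _STD_NORMAL.cdf((n ** 0.5 * diff + z_crit * pooled_sd) / v ** 0.5)
    return upper + lower


def get_sample_size(power, p1, p2, cl, max_n=1_000_000):
    """Smallest per-group n (at most max_n) whose approximate power reaches the target."""
    if get_power(max_n, p1, p2, cl) < power:
        raise ValueError("target power not reached; increase max_n")
    lo, hi = 1, max_n
    # Power increases with n, so bisection finds the smallest adequate n.
    while lo < hi:
        mid = (lo + hi) // 2
        if get_power(mid, p1, p2, cl) >= power:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

Calling `get_power` with a larger n or a lower confidence level should return a larger value, which is the relationship the narration asks you to remember.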

9. Calculating our needed sample size

Let us now return to our example and apply this function to find the sample size needed for our test. In the previous chapter we found a baseline conversion rate of 0.03468. Let us choose 0.95 as our confidence level and 0.8 as our desired power. Plugging these into our sample size function, we see that testing at these levels requires a sample of size 45788 for each group.
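As a rough cross-check of this kind of calculation, the sketch below uses a closed-form approximation instead of the course's function. It is a swapped-in technique that drops the small opposite-tail term of the power formula. The transcript does not say what sensitivity was used, so the 10% relative lift here is purely an assumption, and `approx_sample_size` is a hypothetical name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def approx_sample_size(p1, p2, cl, power):
    """Closed-form per-group n for a two-sided test on two conversion rates.

    Ignores the opposite-tail term of the power formula, which is
    negligible at typical power levels. Requires p1 != p2.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - (1 - cl) / 2)   # quantile for the confidence level
    z_power = z.inv_cdf(power)             # quantile for the desired power
    pooled = (p1 + p2) / 2
    v = p1 * (1 - p1) + p2 * (1 - p2)
    numerator = z_crit * sqrt(2 * pooled * (1 - pooled)) + z_power * sqrt(v)
    return ceil((numerator / (p2 - p1)) ** 2)


baseline = 0.03468      # baseline conversion rate from the text
assumed_lift = 0.10     # ASSUMPTION: relative lift to detect, not given in the text
target = baseline * (1 + assumed_lift)

n_per_group = approx_sample_size(baseline, target, cl=0.95, power=0.8)
print(n_per_group)
```

Because n scales with the inverse square of the difference, halving the assumed lift roughly quadruples the required sample, so the chosen sensitivity matters as much as the confidence level and power.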

10. Generality of this function

In the exercise, you will further explore this function to gain a deeper intuition. Note that this function is specific to calculations with conversion rates. The functions and calculations for other classes of response variables are analogous and, with knowledge of this case, should be easy to unpack on your own.

11. Decreasing the needed sample size

It is important to note that there are various ways to decrease the needed sample size. One is to switch the unit of observation in a way that reduces variability in the data, such as moving from revenue to conversion. Another is to exclude users who are irrelevant to the process. For example, if we did not exclude users who never saw a paywall, we would have a more variable set of users and thus a higher sample size requirement. More of these relationships can be explored by thinking carefully about how each quantity enters the equations shown earlier.

12. Let's practice!

Now you are ready to start designing an A/B test. Let’s practice!