Calculating lift & significance testing

1. Calculating lift & significance testing

In this lesson, we will talk about calculating lift and statistical significance.

2. Treatment performance compared to the control

The first question you'll want to answer when running a test is, "What's the lift?" In other words, "Was the conversion rate higher for the treatment, and by how much?" Lift is calculated by taking the difference between the treatment conversion rate and the control conversion rate, divided by the control conversion rate. The result is the relative difference of treatment compared to control, usually expressed as a percentage.
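Written out, the definition above looks like this. The 10% and 15% rates in the worked line are made-up numbers for illustration only, not results from the test in this lesson.

```latex
\text{lift} = \frac{\text{treatment conversion rate} - \text{control conversion rate}}{\text{control conversion rate}}

% Hypothetical example: control converts at 10%, treatment at 15%
\text{lift} = \frac{0.15 - 0.10}{0.10} = 0.5 = 50\%
```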

3. Calculating lift

To calculate the lift, we calculate the conversion rates of the control and the personalization groups. Then, we calculate lift using the equation from the previous slide, and we have our result! As you can see, the personalization variant improved on the control conversion rate by 194%. That's a huge improvement and a very good signal that we should consider running personalized emails again in the future. But before we get ahead of ourselves, let's talk statistical significance.
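As a rough sketch of those steps, assuming each group is a pandas Series of 1s (converted) and 0s (did not convert): the data below is invented for illustration and is not the course dataset, and the helper name `lift` is hypothetical.

```python
import pandas as pd

# Hypothetical outcomes: 1 = converted, 0 = did not convert
control = pd.Series([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])          # 1 of 10 converted
personalization = pd.Series([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])  # 3 of 10 converted


def lift(control, treatment):
    """Relative difference of the treatment conversion rate vs. the control."""
    # The mean of a 0/1 Series is the share of users who converted
    control_rate = control.mean()
    treatment_rate = treatment.mean()
    return (treatment_rate - control_rate) / control_rate


result = lift(control, personalization)
print(f"lift: {result:.1%}")
```

With this toy data the treatment rate is three times the control rate, so the lift comes out at about 200%.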

4. T-distribution

One way to calculate statistical significance is by conducting a two-sample t-test. A t-test uses the mean and the sample variance of each group to estimate how likely it is that a difference this large between the two samples would occur by chance alone. The image on the slide shows two overlapping sample distributions. The smaller the overlap between the two distributions, the more likely it is that there is a true difference between the two samples. I'm not going to explain the details of the t-test, but I highly recommend you do further research if you plan to run these tests at work.

5. P-values

The t-test gives us a t-statistic and a p-value. The p-value estimates the likelihood of finding a result at least as extreme as the one in our test if there were no true difference between treatment and control. While it depends on sample size and the test, for a two-sided test on a large sample a t-statistic of 1.96 typically corresponds to a p-value of 0.05. That is a 5% significance level, or equivalently a 95% confidence level, a commonly used threshold for significance tests.
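To see how a t-statistic maps to a p-value, here is a small sketch using scipy's distribution objects. It assumes a two-sided test, and the choice of 20 degrees of freedom for the small-sample case is arbitrary, picked only to show that the mapping depends on sample size.

```python
from scipy import stats

t_stat = 1.96

# Large-sample case: the t-distribution approaches the standard normal.
# sf() is the upper-tail probability; doubling it gives the two-sided p-value.
p_large = 2 * stats.norm.sf(t_stat)

# Small-sample case (hypothetical 20 degrees of freedom): heavier tails,
# so the same t-statistic yields a larger p-value.
p_small = 2 * stats.t.sf(t_stat, df=20)

print(f"large sample: p = {p_large:.4f}")
print(f"df = 20:      p = {p_small:.4f}")
```

The large-sample value comes out at roughly 0.05, while the same t-statistic on the small sample falls short of that threshold.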

6. T-test in Python

To run a t-test in Python, you can use the ttest_ind() function from the stats module of the scipy package. The function takes an array of outcomes for each variant. In this case, the "outcomes" are whether or not each user converted. We can use the control and personalization Series we created in the previous lesson as the outcomes. This conveniently gives us both a t-statistic and a p-value. Remember, a p-value less than 0.05 is typically considered statistically significant at the 95% confidence level. Since the p-value here is indeed less than 0.05, we can be confident that the difference in conversion rates is statistically significant.
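A minimal sketch of that call is shown here. The two Series are stand-ins built from invented counts (50 of 1,000 control users converting versus 150 of 1,000 personalization users), not the Series from the previous lesson.

```python
import pandas as pd
from scipy import stats

# Hypothetical outcomes: 1 = converted, 0 = did not convert
control = pd.Series([1] * 50 + [0] * 950)            # 5% conversion
personalization = pd.Series([1] * 150 + [0] * 850)   # 15% conversion

# Two-sample t-test on the per-user outcomes
t_stat, p_value = stats.ttest_ind(control, personalization)

print(f"t-statistic: {t_stat:.2f}")
print(f"p-value:     {p_value:.2e}")

# Compare against the conventional 0.05 threshold
if p_value < 0.05:
    print("Difference is statistically significant at the 95% confidence level")
else:
    print("Difference is not statistically significant at the 95% confidence level")
```

The t-statistic is negative here only because the control group is passed first and has the lower conversion rate. By default ttest_ind() assumes equal variances in the two groups; passing equal_var=False runs Welch's t-test instead, which drops that assumption.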

7. Let's practice!

It's time for you to calculate lift and run a t-test!