
Preparing to run an A/B test

1. Preparing to run an A/B test

Awesome job! Let’s now prepare the A/B test we will follow for the remainder of the course.

2. A/B testing example - paywall variants

Imagine our consumable paywall currently says “I hope you are enjoying the relaxing benefits of our app. Consider making a purchase,” and we want to see whether the phrase “Don’t miss out! Try one of our new products!” will increase revenue. Additionally, there are three different consumable price points, which may factor into our test.

3. Considerations in test design

There are two primary concerns in test design: ensuring that our test can be run practically, and ensuring that we can derive meaningful results from it. These two objectives are strongly connected.

4. Test sensitivity

A good starting point is to ask what percentage change in your response variable it would be meaningful to detect. 1%? 20%? It makes sense that smaller changes are more difficult to detect, as they can more easily be overshadowed by randomness. The minimum level of change we want to detect is called __sensitivity__. A good exercise is to look at what different sensitivities would mean for your experimental unit of choice. For example, let’s look at what different changes mean for our revenue per user over the period of our test.

5. Revenue per user

We can calculate our revenue per user in this period by first merging our datasets on uid. Then we can group by uid and aggregate to find the number of users and the average revenue per user.
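
A minimal sketch of this step, assuming two DataFrames with a shared `uid` column and a `price` column for purchase amounts (the names here are illustrative, not necessarily the course’s):

```python
import pandas as pd

# Illustrative stand-ins for the course data (names assumed)
customer_data = pd.DataFrame({'uid': [1, 2, 3], 'country': ['US', 'US', 'CA']})
purchase_data = pd.DataFrame({'uid': [1, 1, 2, 3],
                              'price': [4.99, 0.99, 1.99, 4.99]})

# Merge purchases with user info on uid
merged = customer_data.merge(purchase_data, how='inner', on='uid')

# Revenue per user over the period: sum each user's purchases
revenue_per_user = merged.groupby('uid')['price'].sum()

# Number of users and the average revenue per user
print(revenue_per_user.shape[0], revenue_per_user.mean())
```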

6. Evaluating different sensitivities

Finding a 1%, 10%, and 20% change in revenue per user, it seems that 10% is a good number to land on. 1% seems too low to be easily measured, and 20% seems like an unrealistic goal for a wording change. Determining this is something you must do by combining views of the data like this with experience and intuition.
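
As a quick sketch, continuing with the `revenue_per_user` Series from the previous snippet, we can see what each candidate sensitivity means in absolute terms:

```python
# Translate each candidate sensitivity into an absolute revenue target
avg_revenue = revenue_per_user.mean()

for sensitivity in (0.01, 0.10, 0.20):
    lifted = avg_revenue * (1 + sensitivity)
    print(f"{sensitivity:.0%} lift: {avg_revenue:.2f} -> {lifted:.2f}")
```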

7. Data variability

While understanding the desired change in the data due to the treatment is important, it is also important to understand the latent variability in the data. In this case, it makes sense to understand if the purchase amount is consistent across all users, or if it varies widely. A change due to the treatment will be more easily captured in the former case.

8. Standard deviation

We can find the standard deviation of our data by calling the pandas `std()` method on a Series of our per-user statistics. Typically, we will rely on the standard deviation of the test results when evaluating our test, but the value from our initial data is important for planning, as we will see.
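
For instance, continuing with the `revenue_per_user` Series from above:

```python
# Standard deviation of revenue per user in the pre-test data
rev_std = revenue_per_user.std()
print(rev_std)
```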

9. Variability of revenue per user

Interesting. It seems there is a lot of variability in our data. One way to contextualize this is by comparing it to our mean. We see that our standard deviation is over 100% of our mean.
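
This comparison is a one-liner on top of the previous snippet:

```python
# Contextualize the spread by comparing it to the mean; a ratio above
# 1.0 means the std exceeds 100% of the mean, as in the course data
rev_mean = revenue_per_user.mean()
print(rev_std / rev_mean)
```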

10. Variability of purchases per user

Here we have updated our data to look only at the number of purchases per user. Let’s calculate the standard deviation and mean of this quantity. Interestingly, the standard deviation seems to be much lower relative to the mean. This makes sense, because we have removed one source of variation: which price point users choose when purchasing.
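
A sketch of the same comparison for purchase counts, reusing the assumed `merged` DataFrame from earlier:

```python
# Count purchases per user instead of summing revenue
purchases_per_user = merged.groupby('uid')['price'].count()

# In the course data, this ratio is much smaller than for revenue
print(purchases_per_user.std() / purchases_per_user.mean())
```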

11. Choosing experimental unit & response variable

Revenue is our ultimate goal, but paywall-view-to-purchase conversion is a better response variable, as it is more granular and more directly related to the change. Additionally, we will use the paywall view as our experimental unit for simplicity, though other choices are reasonable.

12. Finding our baseline conversion rate

To finish, let’s calculate the baseline for this metric. To do this, we simply divide the number of conversions by the total number of views. We see we have a baseline of 0.347. Calculating the variance of this quantity is outside our scope for now.
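
A minimal sketch, assuming a DataFrame `paywall_views` with one row per view and a 0/1 `purchase` flag (both names are assumptions):

```python
import pandas as pd

# Illustrative stand-in: one row per paywall view, purchase flag 0/1
paywall_views = pd.DataFrame({'purchase': [1, 0, 0, 1, 0, 0]})

# Baseline conversion rate: conversions divided by total views
baseline = paywall_views['purchase'].sum() / paywall_views['purchase'].count()
print(baseline)  # the course data lands around 0.347
```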

13. Let's practice!

Now good luck on the exercises!