1. Initial A/B test design
Great job on the exercises! Let’s begin exploring A/B testing in detail.
2. Increasing our app's revenue with A/B testing
For our app, we are looking to A/B test some aspects of our consumable purchase paywall with the goal of increasing our revenue.
We will discuss several approaches to this specific problem, but the math behind this could easily be applied in many similar situations. Be sure to think about how you would apply this knowledge in your own work.
3. Paywall views & Demographics data
Here is our demographics dataset, along with an additional dataset, paywall_views, which contains a timestamp of when the user viewed the paywall and a purchase column that is 0 or 1, marking whether they purchased.
4. Chapter 3 goals
For the remainder of this chapter, we will lay the foundation of an A/B test analysis by introducing some key terminology and walking through the related pandas code when applicable.
5. Response variable
In an A/B test, we must define a response variable that we will use to measure our impact. This should be either a KPI or something directly related to a KPI.
Additionally, you should select a response that is directly measurable like purchases rather than something difficult to measure.
6. Factors & variants
Next, we have factors: the things we change that may impact our response, such as the color of a paywall. Related are variants, which are particular manifestations of a factor, such as a red paywall and a blue paywall.
7. Experimental unit of our test
Next we have our experimental unit. This is the unit over which metrics are measured before aggregating over the control or treatment group overall.
For example, if we were looking at purchases of a consumable as our response, we could use users as our experimental unit and compare the average number of purchases per user across our two groups.
8. Calculating experimental units
For our dataset, let us find the average number of purchases per user, which is useful to know before beginning an A/B test.
First we can join our demographics data to our paywall view data. Next we can group by uid, and sum up the number of purchases. Finally, we can take the average of this value.
This is straightforward to calculate, and it could be compared across groups if we assigned users to them randomly, but it isn't very meaningful.
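The join, group, and average steps just described can be sketched roughly as follows. This is a minimal illustration on made-up data: the `country` and `date` column names and all of the values are assumptions, and only `uid`, `purchase`, and `paywall_views` come from the lesson.

```python
import pandas as pd

# Hypothetical toy data; only uid and purchase are named in the lesson.
demographics = pd.DataFrame({
    "uid": [1, 2, 3],
    "country": ["USA", "BRA", "USA"],
})
paywall_views = pd.DataFrame({
    "uid": [1, 1, 1, 2, 2, 3],
    "date": pd.to_datetime([
        "2018-01-01 10:00", "2018-01-01 18:00", "2018-01-02 09:00",
        "2018-01-01 12:00", "2018-01-03 12:00", "2018-01-02 08:00",
    ]),
    "purchase": [1, 0, 1, 0, 1, 0],
})

# Join the demographics data to the paywall views on the shared user id.
purchase_data = demographics.merge(paywall_views, how="inner", on="uid")

# Sum purchases for each user, then average that total across users.
purchases_per_user = purchase_data.groupby("uid")["purchase"].sum()
mean_purchases_per_user = purchases_per_user.mean()
print(mean_purchases_per_user)
```

Here each user contributes exactly one value to the average, no matter how long they have been on the platform.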
9. Calculating experimental units
Looking at the min and max of this value we see that it varies a lot, which makes sense since the amount of time on our platform varies widely between users.
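One way to inspect that spread, as a small sketch, is to aggregate the per-user totals with several summary functions at once. The totals below are invented for illustration and are not the course's data.

```python
import pandas as pd

# Hypothetical per-user purchase totals (uid -> number of purchases).
purchases_per_user = pd.Series(
    [0, 1, 2, 40],
    index=pd.Index([1, 2, 3, 4], name="uid"),
    name="purchase",
)

# Compute the minimum, maximum, and mean in one call.
summary = purchases_per_user.agg(["min", "max", "mean"])
print(summary)
```

A single long-tenured user with many purchases is enough to pull the maximum far from the mean.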
10. Experimental unit of our test
Another experimental unit that can be used is the user-day, where we treat each user’s actions on a given day as a unique unit. In some sense, this gives us more meaningful raw data.
11. Calculating user-days
We can find this value by once again joining our data and then grouping by day _and_ uid. Then we follow the same steps as before to aggregate and average our data. Looking at the min and max highlights the smaller range of this data.
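A rough sketch of the user-day calculation is shown below, again on invented data. The `date` and `day` column names are assumptions, and truncating the timestamp with `dt.floor("D")` is just one way to obtain a calendar day.

```python
import pandas as pd

# Hypothetical toy data; only uid and purchase are named in the lesson.
demographics = pd.DataFrame({"uid": [1, 2, 3], "country": ["USA", "BRA", "USA"]})
paywall_views = pd.DataFrame({
    "uid": [1, 1, 1, 2, 2, 3],
    "date": pd.to_datetime([
        "2018-01-01 10:00", "2018-01-01 18:00", "2018-01-02 09:00",
        "2018-01-01 12:00", "2018-01-03 12:00", "2018-01-02 08:00",
    ]),
    "purchase": [1, 0, 1, 0, 1, 0],
})

# Join as before, then truncate each timestamp to its calendar day.
purchase_data = demographics.merge(paywall_views, how="inner", on="uid")
purchase_data["day"] = purchase_data["date"].dt.floor("D")

# One row per (user, day): total purchases by that user on that day.
purchases_per_user_day = purchase_data.groupby(["uid", "day"])["purchase"].sum()

# Average over user-days, plus the range for comparison with the per-user metric.
mean_per_user_day = purchases_per_user_day.mean()
range_per_user_day = (purchases_per_user_day.min(), purchases_per_user_day.max())
print(mean_per_user_day, range_per_user_day)
```

Because a single day bounds how much any one unit can accumulate, the range of this metric tends to be narrower than the per-user totals.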
12. Randomness of experimental units
A few notes about experimental units before wrapping up our chapter. In almost any circumstance, we want to randomize by user, regardless of our experimental unit. Otherwise, the user could have an inconsistent experience, which may impact the results.
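To make the idea of randomizing by user concrete, here is a minimal sketch in which the group is drawn once per uid and then attached to every user-day row. The data, the `group` column, the seed, and the use of NumPy's random generator are all assumptions for illustration, not the course's own procedure.

```python
import numpy as np
import pandas as pd

# Hypothetical user-day rows: the same user appears on several days.
user_days = pd.DataFrame({
    "uid": [1, 1, 2, 2, 3],
    "day": pd.to_datetime(
        ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-03", "2018-01-02"]
    ),
    "purchase": [1, 1, 0, 1, 0],
})

# Draw one group per unique user, not per row (fixed seed for repeatability).
rng = np.random.default_rng(seed=42)
unique_uids = user_days["uid"].unique()
assignment = pd.Series(
    rng.choice(["control", "treatment"], size=len(unique_uids)),
    index=unique_uids,
    name="group",
)

# Attach each user's group to all of their user-day rows.
user_days["group"] = user_days["uid"].map(assignment)

# Sanity check: every user belongs to exactly one group across all days.
groups_per_user = user_days.groupby("uid")["group"].nunique()
print(groups_per_user)
```

Drawing the assignment at the row level instead could place the same user in both groups on different days, which is the inconsistent experience described above.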
13. Designing your A/B test
While calculating these quantities is similar to some of our KPI calculations, it is worth exploring the type of data we are working with now. Additionally, viewing these metrics, which measure the same underlying thing, and examining their properties relative to each other is important practice. It will build intuition for when you are designing a test of your own.
14. Let's practice!