A/B testing and design

1. A/B testing and design

Hi, my name is Lauryn, your instructor for this course.

2. A/B testing

A/B testing is a statistical technique for making data-driven decisions by testing differences and relationships in data collected from a controlled A/B design experiment. For example, A/B tests can be used to optimize marketing campaigns or to compare different versions of a website or advertisement. Together, we will learn to analyze A/B data in order to make decisions and predictions.

3. A/B design

Imagine we are interested in determining whether one condition is more impactful than another, such as whether a group takes longer to eat Cheese or Pepperoni pizza. We need to use a between-subjects design, where each participant contributes data to only one group. We end up with two groups, A and B, with participants randomly assigned to eat either Cheese or Pepperoni pizza.
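As a minimal sketch of how a between-subjects split could be set up in R, the snippet below randomly assigns participants to group A or B. The participant IDs, the sample size of 10, and the seed are all assumptions made for illustration; none of them come from the course.

```r
# Hypothetical participant IDs; the sample size is an assumption.
participants <- paste0("P", 1:10)

set.seed(42)  # fixed seed so the random assignment is reproducible

# Shuffle a balanced vector of group labels so that each participant
# lands in exactly one group (between-subjects design).
group <- sample(rep(c("A", "B"), length.out = length(participants)))

assignment <- data.frame(
  participant = participants,
  group = group,
  topping = ifelse(group == "A", "Cheese", "Pepperoni")
)

print(assignment)
```

Shuffling a balanced label vector, rather than drawing each label independently, keeps the two groups the same size.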

4. Control group

Suppose some friends eat pizza with cheese as the topping, group A, and other friends eat pizza with pepperoni, group B. Ideally, the A/B design will have a control group, cheese, against which the experimental group, pepperoni, is compared.

5. Data frame organization

When running A/B tests, the data frame format may vary. Wide format presents each group in a different column: the time to eat the pizza for group A, cheese, is in one column, and the time for group B, pepperoni, is in another. In wide format, NAs appear because the A/B design is between subjects, so each participant has a value in only one of the group columns. Long format presents the measure in a single column and denotes the group in a separate column: the groups, cheese or pepperoni, are in one column and the time to eat the pizza is in another. Because wide format contains NAs, long format is better suited to A/B test analysis.
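To make the two layouts concrete, here is a small sketch that builds the same data in wide and in long format. The eating times (in minutes), the ID column, and the column name Time are made up for illustration.

```r
# Hypothetical eating times in minutes; all values are invented.
# Wide format: one column per group, so each row has an NA in the
# column of the group that participant was not part of.
pizza_wide <- data.frame(
  ID = 1:6,
  Cheese = c(12.5, 10.1, 11.8, NA, NA, NA),
  Pepperoni = c(NA, NA, NA, 9.4, 8.7, 10.2)
)

# Long format: one column for the group, one column for the measure.
pizza_long <- data.frame(
  ID = 1:6,
  Topping = rep(c("Cheese", "Pepperoni"), each = 3),
  Time = c(12.5, 10.1, 11.8, 9.4, 8.7, 10.2)
)

print(pizza_wide)
print(pizza_long)

# Count the NA cells in each layout.
na_wide <- sum(is.na(pizza_wide))
na_long <- sum(is.na(pizza_long))
```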

6. Wide to long format

The tidyr package can be used to transform wide format to long format. First call the Pizza data frame, piping it into the pivot_longer() function. In this function, we specify the argument cols with the columns that hold the data points, names_to with the name of the new column that will hold the group labels, and values_to with the name of the new column that will hold the data points. To remove the rows containing NA, pipe again into na.omit().
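The steps above can be sketched as follows. The contents of Pizza, the ID column, and the output column name Time are assumptions for illustration; only the function and argument names come from the description. The native pipe used here requires R 4.1 or later.

```r
library(tidyr)

# Hypothetical wide-format data; values are invented for illustration.
Pizza <- data.frame(
  ID = 1:6,
  Cheese = c(12.5, 10.1, 11.8, NA, NA, NA),
  Pepperoni = c(NA, NA, NA, 9.4, 8.7, 10.2)
)

Pizza_long <- Pizza |>
  pivot_longer(
    cols = c(Cheese, Pepperoni),  # columns holding the data points
    names_to = "Topping",         # new column for the group labels
    values_to = "Time"            # new column for the measurements
  ) |>
  na.omit()                       # drop the rows created by the NA cells

print(Pizza_long)
```

As an alternative to the final na.omit() step, pivot_longer() also accepts values_drop_na = TRUE, which drops those rows during the pivot.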

7. Visualize groupings

We can visualize the data by plotting the results in a histogram. For example, the histogram here shows the eating times of a group eating Cheese pizza and a different group eating Pepperoni pizza. We can see that Pepperoni pizza, shown in blue, is eaten faster.
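A histogram like the one described could be drawn with ggplot2 roughly as below. The data values, the column name Time, the bin count, and the color used for Cheese are assumptions; only blue for Pepperoni follows the description.

```r
library(ggplot2)

# Hypothetical long-format data; values are invented for illustration.
Pizza_long <- data.frame(
  Topping = rep(c("Cheese", "Pepperoni"), each = 5),
  Time = c(12.5, 10.1, 11.8, 13.0, 11.2, 9.4, 8.7, 10.2, 9.9, 8.1)
)

# Overlay both groups in one panel, colored by topping.
p <- ggplot(Pizza_long, aes(x = Time, fill = Topping)) +
  geom_histogram(bins = 10, alpha = 0.6, position = "identity") +
  scale_fill_manual(values = c(Cheese = "orange", Pepperoni = "blue"))

print(p)
```

Setting position = "identity" overlays the two groups instead of stacking their bars, and the alpha value keeps both visible where they overlap.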

8. Separate groupings

We can also separate the plots by specifying a formula in the function facet_grid(). The formula Topping ~ . places each Topping group in its own row, so the panels are displayed one on top of the other. Since no column variable is specified, a dot is used on the right side of the tilde.
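A short sketch of the faceted version follows, assuming the same kind of hypothetical long-format data with columns Topping and Time; the values and bin count are invented.

```r
library(ggplot2)

# Hypothetical long-format data; values are invented for illustration.
Pizza_long <- data.frame(
  Topping = rep(c("Cheese", "Pepperoni"), each = 5),
  Time = c(12.5, 10.1, 11.8, 13.0, 11.2, 9.4, 8.7, 10.2, 9.9, 8.1)
)

# Topping ~ . puts one Topping level per row; the dot means
# no variable is used to split the panels into columns.
p <- ggplot(Pizza_long, aes(x = Time)) +
  geom_histogram(bins = 10) +
  facet_grid(Topping ~ .)

print(p)
```

Reversing the formula to . ~ Topping would instead place the panels side by side in columns.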

9. A/B test hypotheses

So, what hypotheses can A/B tests assess? A/B tests can assess hypotheses that compare a measure between the groups, or that investigate the relationship, or trend, between measures in the groups. Using our pizza data, hypotheses that compare groups could ask whether Cheese is enjoyed more than Pepperoni, or whether Cheese pizza is eaten faster than Pepperoni pizza. These types of hypotheses determine whether there is a difference between the toppings, or conditions. A hypothesis assessing the relationship between measures could investigate whether greater enjoyment of pizza is associated with the pizza being eaten faster. Such hypotheses can be specified across both groups, for example investigating enjoyment and speed using both the Cheese and Pepperoni groups, or within one group, for example investigating enjoyment and speed only in the Cheese group. Hypotheses regarding a relationship determine whether there is a trend between the measures.
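The two kinds of hypotheses map onto two kinds of summaries, sketched below with base R. This is a descriptive illustration only, not a statistical test, and the data, the Enjoyment rating scale, and the column names are all assumptions.

```r
# Hypothetical data: eating time in minutes and an enjoyment rating.
Pizza_long <- data.frame(
  Topping = rep(c("Cheese", "Pepperoni"), each = 5),
  Time = c(12.5, 10.1, 11.8, 13.0, 11.2, 9.4, 8.7, 10.2, 9.9, 8.1),
  Enjoyment = c(6, 8, 7, 5, 7, 8, 9, 7, 8, 9)
)

# Difference hypothesis: compare a measure between the groups.
group_means <- tapply(Pizza_long$Time, Pizza_long$Topping, mean)
print(group_means)

# Relationship hypothesis across both groups.
r_all <- cor(Pizza_long$Enjoyment, Pizza_long$Time)

# Relationship hypothesis within the Cheese group only.
cheese <- subset(Pizza_long, Topping == "Cheese")
r_cheese <- cor(cheese$Enjoyment, cheese$Time)

print(c(all = r_all, cheese = r_cheese))
```

A negative correlation here would mean that higher enjoyment goes with shorter eating times, that is, with faster eating.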

10. Let's practice!

Let's practice!