Simpson's paradox in action

Generalizing A/B test results to different segments of the population can be very important to the business. Sometimes we want to avoid the cost of running additional tests in different cities, on different devices, and so on. Checking that our results are consistent across subpopulations increases our confidence in making such generalizations.

Examine the simp_balanced and simp_imbalanced datasets for Simpson's paradox to get a good sense of how this phenomenon can occur in A/B testing scenarios.
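As a reminder of what to look for, Simpson's paradox occurs when a trend that holds within every subgroup reverses once the subgroups are pooled. The sketch below illustrates this with made-up counts. The column names (variant, browser, users, conversions) and all numbers are assumptions for illustration and are not taken from the course datasets.

```python
import pandas as pd

# Hypothetical aggregated counts; not the course data.
counts = pd.DataFrame({
    "variant":     ["A", "A", "B", "B"],
    "browser":     ["chrome", "safari", "chrome", "safari"],
    "users":       [200, 800, 800, 200],
    "conversions": [20, 240, 120, 70],
})

# Conversion rate within each (variant, browser) segment
counts["rate"] = counts["conversions"] / counts["users"]
segment_rates = counts.pivot(index="browser", columns="variant", values="rate")

# Pooled conversion rate per variant: total conversions / total users
totals = counts.groupby("variant")[["users", "conversions"]].sum()
overall_rates = totals["conversions"] / totals["users"]

print(segment_rates)
print(overall_rates)
```

In this toy table, variant B converts better than A within each browser, yet worse overall, because most of B's traffic comes from the lower-converting browser.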

This exercise is part of the course

A/B Testing in Python

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Calculate the conversion rate per variant, then per variant and browser
imbalanced_variant_rate = simp_imbalanced.____('____')['____'].____()
imbalanced_variant_browser_rate = simp_imbalanced.____(['____','____'])['____'].____()

print(imbalanced_variant_rate)
print(imbalanced_variant_browser_rate)
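
If the segment-level and overall rates disagree, a useful follow-up is to check how traffic was split between variants within each segment. The following is a minimal sketch on made-up user-level data. The column names variant and browser are hypothetical, and pd.crosstab is just one convenient way to do this, not necessarily the approach the course uses.

```python
import pandas as pd

# Hypothetical user-level data with an uneven variant split per browser
users = pd.DataFrame({
    "variant": ["A"] * 200 + ["A"] * 800 + ["B"] * 800 + ["B"] * 200,
    "browser": ["chrome"] * 200 + ["safari"] * 800
             + ["chrome"] * 800 + ["safari"] * 200,
})

# Share of each variant within every browser (each row sums to 1)
allocation = pd.crosstab(users["browser"], users["variant"], normalize="index")
print(allocation)
```

With a balanced allocation, each row would be close to the intended split (for example 0.5 and 0.5). Rows that deviate strongly, as they do here, indicate the kind of imbalance that can produce Simpson's paradox.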

Skill level: Intermediate

In Chapter 1, you’ll learn the foundations of A/B testing. You’ll explore clear steps and use cases, learn why designing and running A/B tests is valuable, and discover the most commonly used metrics design and estimation frameworks.

Exercise 1: What is A/B testing?
Exercise 2: When an A/B test is not best
Exercise 3: A/B testing steps
Exercise 4: Randomization effects
Exercise 5: Why run experiments?
Exercise 6: Correlation visualization
Exercise 7: Correlation or causation?
Exercise 8: Metrics design and estimation
Exercise 9: Means and proportions
Exercise 10: Ad impressions metrics

In Chapter 2, you’ll cover the experiment design process. You’ll start by learning how to formulate strong A/B testing hypotheses, then cover statistical concepts such as power, error rates, and minimum detectable effects. You’ll finish the chapter by learning to estimate the sample size needed to yield conclusive results and by tackling scenarios with multiple comparisons.

Exercise 1: Hypothesis formulation and distributions
Exercise 2: Strong hypothesis formulation
Exercise 3: Plotting distributions
Exercise 4: Central limit theorem for means
Exercise 5: Experimental design: setting up testing parameters
Exercise 6: Interpreting p-values
Exercise 7: Error rates in the wild
Exercise 8: Experimental design: power analysis
Exercise 9: Plotting power curves
Exercise 10: Sample size for means
Exercise 11: Sample size for proportions
Exercise 12: Multiple comparisons tests
Exercise 13: Is a multiple comparisons correction needed?
Exercise 14: Corrected p-values

In Chapter 3, you’ll discover a concrete workflow for cleaning, preprocessing, and exploring A/B testing data, and learn the sanity checks needed to ensure valid results. You’ll also work through a detailed explanation and example of analyzing difference-in-proportions A/B tests.

Exercise 1: Data cleaning and exploratory analysis
Exercise 2: Proportions EDA
Exercise 3: A/B test data cleaning
Exercise 4: Sanity checks: Internal validity
Exercise 5: SRM
Exercise 6: Distributions balance
Exercise 7: Sanity checks: external validity
Exercise 8: Novelty effects detection
Exercise 9: Simpson's paradox in action (current exercise)
Exercise 10: Analyzing difference in proportions A/B tests
Exercise 11: Difference in proportions A/B test
Exercise 12: Interpretation of confidence intervals
Exercise 13: Confidence intervals for proportions

In the final chapter, you’ll develop frameworks for analyzing differences in means and for using non-parametric tests when key assumptions aren't met. You’ll also learn how to apply the Delta method when analyzing ratio metrics, and discover best practices and some advanced topics to continue building your A/B testing skills.

Exercise 1: Analyzing difference in means A/B tests
Exercise 2: T-test for difference in means
Exercise 3: Pairwise t-tests
Exercise 4: Non-parametric statistical tests
Exercise 5: Parametric or non-parametric?
Exercise 6: Mann-Whitney U test
Exercise 7: Chi-square test for independence
Exercise 8: Ratio metrics and the delta method
Exercise 9: Delta or not?
Exercise 10: Delta method
Exercise 11: A/B Testing best practices and advanced topics intro
Exercise 12: Best practices
Exercise 13: Day-of-the-week effect
Exercise 14: Wrap-up: A/B testing in python