Sample size for proportions

Real datasets can be messy. As an analytics engineer working with real-world data, you will encounter situations where the variance in the data is too high to detect a meaningful difference in a metric. This problem is more likely with continuous metrics, such as the average order value in the previous exercise. There are several ways to tackle it; one workaround is to find a metric that has lower variance but still aligns with the business goals.

Here you will calculate the sample size for a binary metric: the signup rate, which represents whether a user signed up for the service or not, as opposed to the paid price, which may vary more between users. The homepage DataFrame and the pandas and numpy libraries are already loaded for you, as well as proportion_effectsize from statsmodels.stats.proportion and power from statsmodels.stats.
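
The idea above can be sketched by hand with the standard library only. This is a rough illustration, not the exercise solution: the signup flags, the 2-percentage-point minimum detectable effect, and the helper names cohens_h and sample_size_per_group are all assumptions made for this example. It uses Cohen's h as the effect size (the quantity proportion_effectsize computes) and the usual normal-approximation formula, so the result should be close to, but not necessarily identical to, what a statsmodels power solver returns.

```python
import math
from statistics import NormalDist


def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


def sample_size_per_group(p_baseline, p_target, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided test of two
    proportions with equal group sizes (normal approximation)."""
    h = abs(cohens_h(p_target, p_baseline))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_power) / h) ** 2)


# Hypothetical signup flags for group A (1 = signed up, 0 = did not)
signups_A = [1] * 10 + [0] * 90

# Baseline signup rate: the mean of a 0/1 column is a proportion
p_A = sum(signups_A) / len(signups_A)

# Assumed minimum detectable effect of 2 percentage points
p_B = p_A + 0.02

n = sample_size_per_group(p_A, p_B)
print('Group A mean signup rate:', p_A)
print('Approximate sample size per group:', n)
```

Because the required sample size scales with the inverse square of the effect size, halving the detectable lift roughly quadruples the number of users needed per group.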

This exercise is part of the course

A/B Testing in Python

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Calculate the baseline signup rate for group A
p_A = ____
print('Group A mean signup rate:', ____)

Skill level: Intermediate

In Chapter 1, you’ll learn the foundations of A/B testing. You’ll explore clear steps and use cases, learn the reasons for and value of designing and running A/B tests, and discover the most commonly used frameworks for metrics design and estimation.

Exercise 1: What is A/B testing?
Exercise 2: When an A/B test is not best
Exercise 3: A/B testing steps
Exercise 4: Randomization effects
Exercise 5: Why run experiments?
Exercise 6: Correlation visualization
Exercise 7: Correlation or causation?
Exercise 8: Metrics design and estimation
Exercise 9: Means and proportions
Exercise 10: Ad impressions metrics

In Chapter 2, you’ll cover the experiment design process. Starting with learning how to formulate strong A/B testing hypotheses, you’ll also cover statistical concepts such as power, error rates, and minimum detectable effects. You’ll finish the chapter by learning to estimate the appropriate sample size needed to yield conclusive results and tackle scenarios with multiple comparisons.

Exercise 1: Hypothesis formulation and distributions
Exercise 2: Strong hypothesis formulation
Exercise 3: Plotting distributions
Exercise 4: Central limit theorem for means
Exercise 5: Experimental design: setting up testing parameters
Exercise 6: Interpreting p-values
Exercise 7: Error rates in the wild
Exercise 8: Experimental design: power analysis
Exercise 9: Plotting power curves
Exercise 10: Sample size for means
Exercise 11: Sample size for proportions (current exercise)
Exercise 12: Multiple comparisons tests
Exercise 13: Is a multiple comparisons correction needed?
Exercise 14: Corrected p-values

In Chapter 3, you’ll discover a concrete workflow for cleaning, preprocessing, and exploring A/B testing data, and learn the sanity checks needed to ensure valid results. You’ll also work through a detailed explanation and example of analyzing difference-in-proportions A/B tests.

Exercise 1: Data cleaning and exploratory analysis
Exercise 2: Proportions EDA
Exercise 3: A/B test data cleaning
Exercise 4: Sanity checks: Internal validity
Exercise 5: SRM
Exercise 6: Distributions balance
Exercise 7: Sanity checks: external validity
Exercise 8: Novelty effects detection
Exercise 9: Simpson's paradox in action
Exercise 10: Analyzing difference in proportions A/B tests
Exercise 11: Difference in proportions A/B test
Exercise 12: Interpretation of confidence intervals
Exercise 13: Confidence intervals for proportions

In the final chapter, you’ll develop frameworks for analyzing differences in means and for using non-parametric tests when the assumptions of parametric tests aren't met. You’ll also learn how to apply the Delta method when analyzing ratio metrics, and discover best practices and some advanced topics for continuing your A/B testing journey.

Exercise 1: Analyzing difference in means A/B tests
Exercise 2: T-test for difference in means
Exercise 3: Pairwise t-tests
Exercise 4: Non-parametric statistical tests
Exercise 5: Parametric or non-parametric?
Exercise 6: Mann-Whitney U test
Exercise 7: Chi-square test for independence
Exercise 8: Ratio metrics and the delta method
Exercise 9: Delta or not?
Exercise 10: Delta method
Exercise 11: A/B Testing best practices and advanced topics intro
Exercise 12: Best practices
Exercise 13: Day-of-the-week effect
Exercise 14: Wrap-up: A/B testing in Python