Power Analysis - Part I
Now we turn to power analysis. You typically want to ensure that any experiment or A/B test you run has at least 80% power. One way to ensure this is to calculate the sample size required to achieve 80% power.
Suppose that you are in charge of a news media website and you are interested in increasing the amount of time users spend on your website. Currently, the time users spend on your website is normally distributed with a mean of 1 minute and a standard deviation of 0.5 minutes. Suppose that you are introducing a feature that loads pages faster and want to know the sample size required to measure a 5% increase in time spent on the website.
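To make the numbers concrete: a 5% increase on a 1-minute mean is 0.05 minutes, which is 0.1 standard deviations given the 0.5-minute spread. The sketch below turns that into a rough per-group sample size using the standard normal-approximation formula for a two-sided, two-sample comparison of means. It is an analytical cross-check, not the simulation approach this exercise builds; the 5% significance level and the names `alpha`, `target_power`, `absolute_lift`, `standardized_effect` and `n_per_group` are assumptions introduced for illustration.

```python
import math

import scipy.stats as st

# Values taken from the scenario above
control_mean, control_sd = 1.0, 0.5
effect_size = 0.05

# Assumed testing conventions: two-sided test at 5% significance, 80% power
alpha, target_power = 0.05, 0.80

# Absolute lift in minutes and its size in standard-deviation units
absolute_lift = control_mean * effect_size
standardized_effect = absolute_lift / control_sd

# Normal-approximation sample size per group for comparing two means
z_alpha = st.norm.ppf(1 - alpha / 2)
z_power = st.norm.ppf(target_power)
n_per_group = math.ceil(2 * (z_alpha + z_power) ** 2 / standardized_effect ** 2)

print("Standardized effect: {:.2f}".format(standardized_effect))
print("Approximate sample size per group: {}".format(n_per_group))
```

Under these assumptions the requirement comes out to roughly 1,570 users per group, which shows how demanding a small standardized effect is.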
In this exercise, we will set up the framework to run one simulation, run a t-test, and calculate the p-value.
This exercise is part of the course Statistical Simulation in Python.
Exercise instructions
- Initialize effect_size to 5%, control_mean to 1 and control_sd to 0.5.
- Using np.random.normal(), simulate one drawing of control_time_spent and treatment_time_spent using the values you initialized.
- Run a t-test on treatment_time_spent and control_time_spent using st.ttest_ind(), where st is scipy.stats, which is already imported.
- Statistical significance stat_sig should be True if p_value is less than 0.05, otherwise it should be False.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Initialize effect_size, control_mean, control_sd
effect_size, sample_size, control_mean, control_sd = ____, 50, ____, ____
# Simulate control_time_spent and treatment_time_spent, assuming equal variance
control_time_spent = np.random.normal(loc=control_mean, scale=____, size=sample_size)
treatment_time_spent = np.random.normal(loc=____*(1+effect_size), scale=control_sd, size=____)
# Run the t-test and get the p_value
t_stat, p_value = st.ttest_ind(____, ____)
stat_sig = p_value < ____
print("P-value: {}, Statistically Significant? {}".format(p_value, stat_sig))
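For reference, one possible completion of the template is sketched below as a standalone script. It follows the exercise instructions; the explicit imports and the fixed random seed are assumptions added so that the script runs outside the course environment and gives repeatable draws.

```python
import numpy as np
import scipy.stats as st

# Assumption: fix the seed so repeated runs give the same draws
np.random.seed(123)

# Initialize effect_size, control_mean, control_sd
effect_size, sample_size, control_mean, control_sd = 0.05, 50, 1, 0.5

# Simulate control_time_spent and treatment_time_spent, assuming equal variance
control_time_spent = np.random.normal(loc=control_mean, scale=control_sd, size=sample_size)
treatment_time_spent = np.random.normal(
    loc=control_mean * (1 + effect_size), scale=control_sd, size=sample_size
)

# Run the t-test and get the p_value
t_stat, p_value = st.ttest_ind(treatment_time_spent, control_time_spent)
stat_sig = p_value < 0.05
print("P-value: {}, Statistically Significant? {}".format(p_value, stat_sig))
```

A single run like this yields only one p-value. With 50 users per group and an effect of 0.1 standard deviations, a non-significant result is the likely outcome, so one simulation on its own says little about power.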