Hypothesis test: did earthquake frequency change?
Obviously, there was a massive increase in earthquake frequency once wastewater injection began. Nonetheless, you will still do a hypothesis test for practice. You will not test the hypothesis that the interearthquake times have the same distribution before and after 2010, since wastewater injection may affect the distribution. Instead, your null hypothesis is only that they have the same mean. So, compute the p-value associated with the hypothesis that the pre- and post-2010 interearthquake times have the same mean, using the mean of pre-2010 time gaps minus the mean of post-2010 time gaps as your test statistic.
This exercise is part of the course
Case Studies in Statistical Thinking
Exercise instructions
- Compute the observed test statistic. The variables mean_dt_pre and mean_dt_post from previous exercises are in your namespace.
- Shift the post-2010 data to have the same mean as the pre-2010 data. Store the result as dt_post_shift.
- Draw 10,000 bootstrap replicates each of the mean of dt_pre and of dt_post_shift. Store the respective results in bs_reps_pre and bs_reps_post.
- Compute replicates of the difference of means by subtracting bs_reps_post from bs_reps_pre.
- Compute and print the p-value. Consider "at least as extreme as" to be that the test statistic is greater than or equal to what was observed.
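The shifting step is what enforces the null hypothesis of equal means without forcing the two distributions to be identical. A minimal sketch of that idea follows, assuming NumPy is available. The arrays here are synthetic stand-ins generated for illustration, not the real earthquake data, and the scale and size values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for the interearthquake times (in days).
# These are NOT the course data; scale and size are arbitrary.
dt_pre = rng.exponential(scale=200.0, size=50)
dt_post = rng.exponential(scale=10.0, size=400)

# Shift the post-2010 gaps so their mean equals the pre-2010 mean
dt_post_shift = dt_post - np.mean(dt_post) + np.mean(dt_pre)

# The two means now agree, while the spread of the post-2010 data is unchanged
print(np.mean(dt_pre), np.mean(dt_post_shift))
print(np.std(dt_post), np.std(dt_post_shift))
```

Because only a constant is added, the shape of the post-2010 distribution is preserved; only its location moves.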
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute the observed test statistic
mean_dt_diff = ____ - ____
# Shift the post-2010 data to have the same mean as the pre-2010 data
dt_post_shift = ____ - ____ + ____
# Compute 10,000 bootstrap replicates from arrays
bs_reps_pre = ____
bs_reps_post = ____
# Get replicates of difference of means
bs_reps = ____ - ____
# Compute and print the p-value
p_val = ____(____ >= ____) / 10000
print('p =', p_val)
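
One possible way to carry out the whole procedure is sketched below, assuming NumPy. The function draw_bs_reps is a hypothetical helper written for this illustration; it stands in for whatever bootstrap function your environment provides. The data are synthetic stand-ins, so the printed p-value says nothing about the real earthquake record.

```python
import numpy as np

def draw_bs_reps(data, func, size, rng):
    """Hypothetical helper: bootstrap replicates of func on resampled data."""
    n = len(data)
    reps = np.empty(size)
    for i in range(size):
        # Resample with replacement, same length as the original array
        reps[i] = func(rng.choice(data, size=n, replace=True))
    return reps

rng = np.random.default_rng(0)

# Synthetic stand-ins for the time gaps (days); NOT the course data
dt_pre = rng.exponential(scale=200.0, size=50)
dt_post = rng.exponential(scale=10.0, size=400)
mean_dt_pre = np.mean(dt_pre)
mean_dt_post = np.mean(dt_post)

# Observed test statistic: pre-2010 mean minus post-2010 mean
mean_dt_diff = mean_dt_pre - mean_dt_post

# Shift the post-2010 data to have the same mean as the pre-2010 data
dt_post_shift = dt_post - mean_dt_post + mean_dt_pre

# Bootstrap replicates of the mean for each data set
n_reps = 10000
bs_reps_pre = draw_bs_reps(dt_pre, np.mean, n_reps, rng)
bs_reps_post = draw_bs_reps(dt_post_shift, np.mean, n_reps, rng)

# Replicates of the difference of means under the null hypothesis
bs_reps = bs_reps_pre - bs_reps_post

# Fraction of replicates at least as large as the observed difference
p_val = np.sum(bs_reps >= mean_dt_diff) / n_reps
print('p =', p_val)
```

Dividing by n_reps rather than a hard-coded 10000 keeps the p-value correct if the number of replicates is changed.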