Exercise

Visualizing Test Statistics

In this exercise, you will approach the null hypothesis by comparing the distributions of a test statistic computed in two different ways.

First, you will examine two "populations", grouped by early and late times, and compute the test statistic distribution. Second, you will shuffle the two populations so that the data is no longer time-ordered and each population contains a mix of early and late times, and then recompute the test statistic distribution.

To get you started, we've pre-loaded the two time duration groups, group_duration_short and group_duration_long, and two functions, shuffle_and_split() and plot_test_statistic().

Instructions

  • Use np.random.choice() to resample group_duration_short and group_duration_long, then take the difference of the two resamples to compute test_statistic_unshuffled.
  • Use shuffle_and_split() on the original group_duration_short and group_duration_long (specified in this order) to create two new mixed populations.
  • Resample the shuffled populations, then subtract resample_short from resample_long to compute a new test_statistic_shuffled.
  • Use plot_test_statistic() to plot both test statistic distributions, and compare them visually.
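
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the course solution: the two groups are replaced by synthetic data, shuffle_and_split_sketch() is a hypothetical stand-in for the pre-loaded shuffle_and_split() (whose implementation is not shown here), the resample size of 500 is an arbitrary choice, and the call to plot_test_statistic() is left out because its signature is not given. Summary statistics are printed in place of the plots.

```python
import numpy as np

np.random.seed(42)

# ASSUMPTION: synthetic stand-ins for the pre-loaded groups, not the course data.
group_duration_short = np.random.normal(loc=10.0, scale=2.0, size=500)
group_duration_long = np.random.normal(loc=12.0, scale=2.0, size=500)


def shuffle_and_split_sketch(group_a, group_b):
    """Hypothetical stand-in for shuffle_and_split():
    pool both groups, permute the pool, and split it back into two halves."""
    pooled = np.concatenate([group_a, group_b])
    shuffled = np.random.permutation(pooled)
    return shuffled[:len(group_a)], shuffled[len(group_a):]


# Step 1: resample the original (unshuffled) groups and take the difference.
resample_short = np.random.choice(group_duration_short, size=500, replace=True)
resample_long = np.random.choice(group_duration_long, size=500, replace=True)
test_statistic_unshuffled = resample_long - resample_short

# Step 2: shuffle the two groups together so each half mixes both populations.
shuffled_half1, shuffled_half2 = shuffle_and_split_sketch(
    group_duration_short, group_duration_long
)

# Step 3: resample the shuffled halves and take the same difference.
resample_short = np.random.choice(shuffled_half1, size=500, replace=True)
resample_long = np.random.choice(shuffled_half2, size=500, replace=True)
test_statistic_shuffled = resample_long - resample_short

# Step 4 (plotting omitted): compare the two distributions numerically.
mean_unshuffled = test_statistic_unshuffled.mean()
mean_shuffled = test_statistic_shuffled.mean()
print("unshuffled mean:", mean_unshuffled, "std:", test_statistic_unshuffled.std())
print("shuffled mean:  ", mean_shuffled, "std:", test_statistic_shuffled.std())
```

With data like this, the unshuffled distribution should be centered away from zero, while the shuffled one should be centered near zero. That shuffled distribution is what the test statistic looks like when the grouping carries no information, which is the situation the null hypothesis describes.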