How to do the permutation test

Based on our EDA and parameter estimates, it is tough to discern improvement from the semifinals to finals. In the next exercise, you will test the hypothesis that there is no difference in performance between the semifinals and finals. A permutation test is fitting for this. We will use the mean value of f as the test statistic. Which of the following simulates getting the test statistic under the null hypothesis?

Strategy 1
Take an array of semifinal times and an array of final times for each swimmer for each stroke/distance pair.
Go through each array, and for each index, swap the entry in the respective final and semifinal array with a 50% probability.
Use the resulting final and semifinal arrays to compute f and then the mean of f.
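The steps of Strategy 1 can be sketched as follows. This is a minimal illustration rather than the course's own code: the swim times are made up, the helper name swap_pairs is hypothetical, and f is assumed to be the fractional improvement (semi - final) / semi, which this exercise does not define.

```python
import numpy as np

def compute_f(semi_times, final_times):
    """Fractional improvement from semifinal to final (assumed definition of f)."""
    return (semi_times - final_times) / semi_times

def swap_pairs(semi_times, final_times):
    """Swap each semifinal/final pair with 50% probability (hypothetical helper)."""
    swap = np.random.random(len(semi_times)) < 0.5
    new_semi = np.where(swap, final_times, semi_times)
    new_final = np.where(swap, semi_times, final_times)
    return new_semi, new_final

np.random.seed(42)

# Made-up times in seconds, one entry per swimmer and event
semi_times = np.array([25.10, 53.42, 118.30, 61.05])
final_times = np.array([24.95, 53.60, 117.80, 60.90])

new_semi, new_final = swap_pairs(semi_times, final_times)
replicate = np.mean(compute_f(new_semi, new_final))
```

Each swimmer's two times stay paired at the same index; only their labels can change.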
Strategy 2
Take an array of semifinal times and an array of final times for each swimmer for each stroke/distance pair and concatenate them, giving a total of 96 entries.
Scramble the concatenated array using the np.random.permutation() function. Assign the first 48 entries in the scrambled array to be "semifinal" and the last 48 entries to be "final."
Compute f from these new semifinal and final arrays, and then compute the mean of f.
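The steps of Strategy 2 can be sketched as follows. The times are made up (4 swimmers instead of 48), the helper name scramble_all is hypothetical, and f is assumed to be the fractional improvement (semi - final) / semi, which this exercise does not define.

```python
import numpy as np

def compute_f(semi_times, final_times):
    """Fractional improvement from semifinal to final (assumed definition of f)."""
    return (semi_times - final_times) / semi_times

def scramble_all(semi_times, final_times):
    """Pool all times, shuffle them, and split back into two halves (hypothetical helper)."""
    pooled = np.concatenate((semi_times, final_times))
    scrambled = np.random.permutation(pooled)
    n = len(semi_times)
    return scrambled[:n], scrambled[n:]

np.random.seed(42)

# Made-up times in seconds, one entry per swimmer and event
semi_times = np.array([25.10, 53.42, 118.30, 61.05])
final_times = np.array([24.95, 53.60, 117.80, 60.90])

new_semi, new_final = scramble_all(semi_times, final_times)
replicate = np.mean(compute_f(new_semi, new_final))
```

After scrambling, the entry at a given index of the new "semifinal" array and the entry at the same index of the new "final" array may come from different swimmers and different events.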
Strategy 3
Take the array f we used in the last exercise.
Multiply each entry of f by either 1 or -1 with equal probability.
Compute the mean of this new array to get the test statistic.
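The steps of Strategy 3 can be sketched as follows. The values of f here are made up for illustration; in the exercise, f would be the array computed previously.

```python
import numpy as np

np.random.seed(42)

# Made-up values standing in for the array f from the last exercise
f = np.array([0.0060, -0.0034, 0.0042, 0.0025])

# Draw +1 or -1 for each entry with equal probability
signs = np.random.choice([-1, 1], size=len(f))

# Flip the signs and take the mean as the replicate of the test statistic
flipped = f * signs
replicate = np.mean(flipped)
```

Only the sign of each entry can change; the magnitudes of f are left as they are.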
Strategy 4
Define a function with signature compute_f(semi_times, final_times) that computes f from the input arrays of swim times.
Draw a permutation replicate using dcst.draw_perm_reps(semi_times, final_times, compute_f).

Possible answers

Strategy 1

Strategy 2

Strategy 3

Strategy 4
