1. How does the performance of swimmers decline over long events?
Even though the competitors in the World Championships are elite athletes, they tire over the course of longer events. In the next set of exercises, you will investigate how their performance declines during long swims. In this video, we develop a clear question and set up the analysis.
2. More swimming background
Recall that the swimming pool is 50 meters long. So, during the course of an 800 meter swim, a swimmer swims 16 lengths of the pool.
3. More swimming background
The time it takes him or her to swim a length is called a *split*. An 800 meter event has 16 splits, a 1500 meter event has 30 splits, and a 50 meter event has just one.
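The split counts above follow directly from dividing the event distance by the pool length. As a minimal sketch (the function name `n_splits` and the constant `POOL_LENGTH_M` are made up for illustration), the arithmetic looks like this:

```python
POOL_LENGTH_M = 50  # pool length in meters, as stated above


def n_splits(event_distance_m, pool_length_m=POOL_LENGTH_M):
    """Return the number of pool lengths (splits) in an event."""
    if event_distance_m % pool_length_m != 0:
        raise ValueError("event distance must be a multiple of the pool length")
    return event_distance_m // pool_length_m


# Check the three events mentioned in the text
for distance in (50, 800, 1500):
    print(f"{distance} m event: {n_splits(distance)} split(s)")
```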
4. More swimming background
Here is a plot of Katie Ledecky's splits for her heat in the 800 meter freestyle in the 2015 World Championships.
Perhaps the most striking feature of this plot is the very fast first split.
5. More swimming background
This is because the first split is different from the rest. The swimmers start the race from starting blocks, which gives them an extra boost on the first split only.
6. More swimming background
The last split is also fast. The swimmer knows she can push hard, because once the last split is done, the race is over.
Finally, we notice that over the course of the race, the splits get longer, meaning that Ledecky is slowing down as the race goes on.
Despite slowing down, she still had the fastest time in the heats.
7. Slowing down
Let's look at the second-fastest performance in the heats, from Australian Jessica Ashwood. Unlike Ledecky, Ashwood swam faster in the second half of the race.
So, just by looking at individual athletes, it is hard to quantify the general effect of fatigue as a long race wears on.
8. Quantifying slowdown
In the next set of exercises, you will use the splits for all swimmers in the heats of the women's 800 meter freestyle. You will use the heats only, so that you do not overcount the elite swimmers who made it to the finals. In doing the analysis, you will not use the first two splits, nor the last two splits, because of the effects I just discussed. That leaves splits 3 through 14 of the 16.
From the splits of all swimmers, you will compute the mean split time for each split number. Finally, you will perform a linear regression on the mean split times versus split number. This will give you the slowdown per split.
After quantifying the slowdown, you will do a hypothesis test to determine if the observed slowdown can be chalked up to random variation.
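The averaging and regression steps described above can be sketched as follows. This is only an illustration under stated assumptions: the `splits` array is synthetic (randomly generated with a built-in slowdown of 0.05 seconds per split), not real race data, and the variable names are invented here. The slope is estimated with `np.polyfit()`, which is one reasonable way to do a linear regression in NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the sketch is reproducible

# Synthetic stand-in for the real data: one row per swimmer, one column per split.
# The base time, slowdown, and noise level are assumptions, not measured values.
n_swimmers, n_splits_total = 40, 16
split_number_all = np.arange(1, n_splits_total + 1)
assumed_slowdown = 0.05  # seconds per split, chosen only for illustration
splits = (31.0 + assumed_slowdown * split_number_all
          + rng.normal(0, 0.3, size=(n_swimmers, n_splits_total)))

# Drop the first two and the last two splits (keeps splits 3 through 14)
split_number = split_number_all[2:-2]
mean_splits = splits[:, 2:-2].mean(axis=0)  # mean over swimmers for each split

# Linear regression of mean split time versus split number;
# the slope is the slowdown per split
slowdown, intercept = np.polyfit(split_number, mean_splits, 1)

print(f"estimated slowdown: {slowdown:.3f} s per split")
```

With real data, `splits` would be replaced by the measured split times from the heats, and the slope would be the quantity of interest.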
9. Hypothesis tests for correlation
Before sending you off to the exercises, I want to remind you about how to do a hypothesis test for correlation. The null hypothesis is that the two variables, in this case split time and split number, are completely uncorrelated. To simulate the data under the null hypothesis, you scramble the order of the split numbers using `np.random.permutation()`, leaving the split times in place. You then compute the Pearson correlation between the scrambled split numbers and the split times; this test statistic is your permutation replicate. Finally, the p-value is the fraction of replicates that have a Pearson correlation at least as large as the observed one.
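Here is a minimal sketch of that permutation test. The data are made up for illustration (a straight-line trend plus a small deterministic wiggle), and the helper name `pearson_r` and the number of replicates are assumptions rather than anything fixed by the exercises.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]


np.random.seed(42)  # fixed seed so the sketch is reproducible

# Made-up illustration data, not real race results
split_number = np.arange(3, 15)
mean_splits = 31.0 + 0.05 * split_number + 0.02 * np.sin(split_number)

# Observed test statistic
r_obs = pearson_r(split_number, mean_splits)

# Permutation replicates: scramble the split numbers, keep the split times fixed
n_reps = 10000
perm_reps = np.empty(n_reps)
for i in range(n_reps):
    scrambled_split_number = np.random.permutation(split_number)
    perm_reps[i] = pearson_r(scrambled_split_number, mean_splits)

# p-value: fraction of replicates at least as large as the observed correlation
p_value = np.sum(perm_reps >= r_obs) / n_reps

print(f"observed r = {r_obs:.3f}, p-value = {p_value:.4f}")
```

A small p-value means that a correlation as strong as the observed one rarely arises when the split numbers are shuffled, which is evidence against the slowdown being random variation.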
10. Let's practice!
Ok, have at it!