Linear regression of average split time
We will assume that the swimmers slow down in a linear fashion over the course of the 800 m event. The slowdown per split is then the slope of the mean split time versus split number plot. Perform a linear regression to estimate the slowdown per split and compute a pairs bootstrap 95% confidence interval on the slowdown. Also show a plot of the best fit line.
Note: We can compute error bars for the mean split times and use those in the regression analysis, but we will not take those into account here, as that is beyond the scope of this course.
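As a minimal sketch of the regression step, the snippet below fits a first-degree polynomial with np.polyfit(), which returns the coefficients from highest degree to lowest, so the slope comes first and the intercept second. The arrays here are synthetic stand-ins invented for illustration, not the actual swim data, and the name intercept is a placeholder.

```python
import numpy as np

# Synthetic stand-in data (assumption: NOT the real split times).
# The points lie exactly on a line with slope 0.05 s/split and intercept 31 s.
split_number = np.arange(3, 15)
mean_splits = 31.0 + 0.05 * split_number

# A degree-1 fit returns (slope, intercept), in that order.
slowdown, intercept = np.polyfit(split_number, mean_splits, 1)

print("slowdown per split: {0:.3f} s, intercept: {1:.3f} s".format(slowdown, intercept))
```

Because the synthetic points lie exactly on a line, the fit recovers the slope and intercept used to generate them, up to floating-point error.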
This exercise is part of the course
Case Studies in Statistical Thinking
Exercise instructions
- Use np.polyfit() to perform a linear regression to get the slowdown per split. The variables split_number and mean_splits are already in your namespace. Store the slope and intercept in slowdown and split_3, respectively.
- Use dcst.draw_bs_pairs_linreg() to compute 10,000 pairs bootstrap replicates of the slowdown per split. Store the result in bs_reps. The bootstrap replicates of the intercept are not relevant for this analysis, so you can store them in the throwaway variable _.
- Compute the 95% confidence interval of the slowdown per split.
- Plot the mean split time (mean_splits) versus the split number (split_number) as dots, along with the best-fit line.
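To make the pairs bootstrap concrete, here is a rough NumPy-only sketch of the idea: resample (x, y) pairs with replacement, refit the line each time, and take percentiles of the resulting slopes. The function pairs_bootstrap_linreg is a hypothetical helper written for this illustration, and it is only assumed to behave similarly to dcst.draw_bs_pairs_linreg(). The data are synthetic, and 1,000 replicates are used instead of 10,000 to keep the example fast.

```python
import numpy as np


def pairs_bootstrap_linreg(x, y, size=1, seed=None):
    """Hypothetical helper: pairs bootstrap replicates of slope and intercept."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inds = np.arange(len(x))

    slopes = np.empty(size)
    intercepts = np.empty(size)
    for i in range(size):
        # Resample indices with replacement so each (x, y) pair stays together.
        bs_inds = rng.choice(inds, size=len(inds))
        slopes[i], intercepts[i] = np.polyfit(x[bs_inds], y[bs_inds], 1)
    return slopes, intercepts


# Synthetic stand-in data (assumption: NOT the real split times).
rng = np.random.default_rng(42)
x = np.arange(1, 13)
y = 30.0 + 0.05 * x + rng.normal(0.0, 0.01, size=len(x))

bs_reps, _ = pairs_bootstrap_linreg(x, y, size=1000, seed=0)

# The 2.5th and 97.5th percentiles bound the 95% confidence interval.
conf_int = np.percentile(bs_reps, [2.5, 97.5])
print("95% conf int of slope: [{0:.4f}, {1:.4f}]".format(*conf_int))
```

Resampling indices rather than x and y separately is what keeps each split number paired with its own mean split time.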
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Perform regression
____, ____ = ____
# Compute pairs bootstrap
bs_reps, _ = ____
# Compute confidence interval
conf_int = ____
# Plot the data with regression line
_ = ____(____, ____, marker='.', linestyle='none')
_ = ____(____, ____ * ____ + ____, '-')
# Label axes and show plot
_ = plt.xlabel('split number')
_ = plt.ylabel('split time (s)')
plt.show()
# Print the slowdown per split
print("""
mean slowdown: {0:.3f} sec./split
95% conf int of mean slowdown: [{1:.3f}, {2:.3f}] sec./split""".format(
slowdown, *conf_int))