A function to do pairs bootstrap
As discussed in the video, pairs bootstrap involves resampling pairs of data. Each collection of pairs is fit with a line, in this case using np.polyfit(). We do this again and again, getting bootstrap replicates of the parameter values. To have a useful tool for doing pairs bootstrap, you will write a function to perform pairs bootstrap on a set of x, y data.
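To make the idea concrete, here is a minimal sketch of a single pairs bootstrap replicate. The data values and the seed are made up for illustration and are not from the course; the key point is that the same resampled indices are used for both arrays, so each x stays with its own y.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, only to make the sketch repeatable

# Hypothetical data for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

# Indices 0, 1, ..., len(x) - 1
inds = np.arange(len(x))

# Resample the indices with replacement (the default for np.random.choice)
bs_inds = np.random.choice(inds, size=len(inds))

# Index both arrays with the same resampled indices to keep pairs together
bs_x, bs_y = x[bs_inds], y[bs_inds]

# Fit a degree-1 polynomial; np.polyfit returns the slope first, then the intercept
bs_slope, bs_intercept = np.polyfit(bs_x, bs_y, 1)
```

Repeating the last three steps many times yields the bootstrap replicates of the slope and intercept.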
This exercise is part of the course
Statistical Thinking in Python (Part 2)
Exercise instructions
- Define a function with call signature draw_bs_pairs_linreg(x, y, size=1) to perform pairs bootstrap estimates on linear regression parameters.
  - Use np.arange() to set up an array of indices going from 0 up to, but not including, len(x). These are what you will resample and use to pick values out of the x and y arrays.
  - Use np.empty() to initialize the slope and intercept replicate arrays to be of size size.
  - Write a for loop to:
    - Resample the indices inds. Use np.random.choice() to do this.
    - Make new \(x\) and \(y\) arrays bs_x and bs_y using the resampled indices bs_inds. To do this, slice x and y with bs_inds.
    - Use np.polyfit() on the new \(x\) and \(y\) arrays and store the computed slope and intercept.
  - Return the pairs bootstrap replicates of the slope and intercept.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def draw_bs_pairs_linreg(x, y, size=1):
    """Perform pairs bootstrap for linear regression."""
    # Set up array of indices to sample from: inds
    inds = ____
    # Initialize replicates: bs_slope_reps, bs_intercept_reps
    bs_slope_reps = ____
    bs_intercept_reps = ____
    # Generate replicates
    for i in range(size):
        bs_inds = np.random.choice(____, size=____)
        bs_x, bs_y = x[____], y[____]
        bs_slope_reps[i], bs_intercept_reps[i] = ____
    return bs_slope_reps, bs_intercept_reps
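
One possible way to fill in the blanks is sketched below. It follows the instructions above but is not necessarily the course's reference solution. The synthetic data, the seed, and the 95% percentile interval in the usage part are assumptions added for illustration.

```python
import numpy as np


def draw_bs_pairs_linreg(x, y, size=1):
    """Perform pairs bootstrap for linear regression."""
    # Set up array of indices to sample from: inds
    inds = np.arange(len(x))

    # Initialize replicates: bs_slope_reps, bs_intercept_reps
    bs_slope_reps = np.empty(size)
    bs_intercept_reps = np.empty(size)

    # Generate replicates
    for i in range(size):
        # Resample indices with replacement, same length as the data
        bs_inds = np.random.choice(inds, size=len(inds))
        bs_x, bs_y = x[bs_inds], y[bs_inds]
        # Degree-1 fit returns (slope, intercept)
        bs_slope_reps[i], bs_intercept_reps[i] = np.polyfit(bs_x, bs_y, 1)

    return bs_slope_reps, bs_intercept_reps


# Usage on synthetic data (hypothetical: true slope 2, true intercept 1)
np.random.seed(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + np.random.normal(0.0, 1.0, size=len(x))

slopes, intercepts = draw_bs_pairs_linreg(x, y, size=1000)

# Percentile-based 95% interval for the slope
slope_ci = np.percentile(slopes, [2.5, 97.5])
```

The function assumes x and y are NumPy arrays of equal length, since indexing with an array of indices does not work on plain Python lists.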