Get startedGet started for free

A function to do pairs bootstrap

As discussed in the video, pairs bootstrap involves resampling pairs of data. Each collection of pairs fit with a line, in this case using np.polyfit(). We do this again and again, getting bootstrap replicates of the parameter values. To have a useful tool for doing pairs bootstrap, you will write a function to perform pairs bootstrap on a set of x,y data.

This exercise is part of the course

Statistical Thinking in Python (Part 2)

View Course

Exercise instructions

  • Define a function with call signature draw_bs_pairs_linreg(x, y, size=1) to perform pairs bootstrap estimates on linear regression parameters.
    • Use np.arange() to set up an array of indices going from 0 to len(x). These are what you will resample and use them to pick values out of the x and y arrays.
    • Use np.empty() to initialize the slope and intercept replicate arrays to be of size size.
    • Write a for loop to:
      • Resample the indices inds. Use np.random.choice() to do this.
      • Make new \(x\) and \(y\) arrays bs_x and bs_y using the the resampled indices bs_inds. To do this, slice x and y with bs_inds.
      • Use np.polyfit() on the new \(x\) and \(y\) arrays and store the computed slope and intercept.
    • Return the pair bootstrap replicates of the slope and intercept.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def draw_bs_pairs_linreg(x, y, size=1):
    """Perform pairs bootstrap for linear regression."""

    # Set up array of indices to sample from: inds
    inds = ____

    # Initialize replicates: bs_slope_reps, bs_intercept_reps
    bs_slope_reps = ____
    bs_intercept_reps = ____

    # Generate replicates
    for i in range(size):
        bs_inds = np.random.choice(____, size=____)
        bs_x, bs_y = x[____], y[____]
        bs_slope_reps[i], bs_intercept_reps[i] = ____

    return bs_slope_reps, bs_intercept_reps
Edit and Run Code