
Bootstrapping a confidence interval

A useful tool for assessing the variability of some data is the bootstrap. In this exercise, you'll write your own bootstrapping function that can be used to return a bootstrapped confidence interval.

This function takes three parameters: a 2-D array of numbers (data), a list of percentiles to calculate (percentiles), and the number of bootstrap iterations to use (n_boots). It uses the resample function to generate a bootstrap sample and takes the mean of that sample, then repeats this many times. The requested percentiles of those bootstrapped means form the confidence interval.
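To make "bootstrap sample" concrete, here is a minimal sketch of one resampling step written in plain NumPy. It is an illustration under assumptions, not the exercise's solution: the names `rng`, `toy`, `idx`, and `boot_sample` are hypothetical, and the toy values are made up. With its default arguments, `sklearn.utils.resample` performs an equivalent draw (same number of rows, with replacement).

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

# Toy 2-D data: 5 observations (rows) by 2 series (columns); values are made up
toy = np.arange(10, dtype=float).reshape(5, 2)

# Draw as many row indices as there are rows, allowing repeats
idx = rng.integers(0, toy.shape[0], size=toy.shape[0])
boot_sample = toy[idx]

# The bootstrap sample has the same shape as the original data,
# and its column means are one "bootstrapped mean" per series
print(boot_sample.shape)
print(boot_sample.mean(axis=0))
```

Because rows are drawn with replacement, some original rows typically appear more than once and others not at all, which is what makes the mean vary from one bootstrap sample to the next.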

This exercise is part of the course

Machine Learning for Time Series Data in Python


Exercise instructions

  • The function should loop over the number of bootstraps (given by the parameter n_boots) and, on each iteration:
    • Take a random sample of the data, with replacement, and calculate the mean of this random sample
  • After the loop, compute the percentiles of bootstrap_means and return them
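The steps above can be sketched end to end as follows. This is a hedged alternative that draws row indices with NumPy's random generator instead of calling `resample`, so it is not the exercise's intended answer. The function name `bootstrap_interval_sketch`, the `seed` parameter, and the synthetic `demo` data are assumptions introduced for illustration.

```python
import numpy as np

def bootstrap_interval_sketch(data, percentiles=(2.5, 97.5), n_boots=100, seed=None):
    """Bootstrap a percentile interval for the column means of 2-D data.

    Illustrative variant: resamples rows via random indices (hypothetical
    helper, not the exercise's resample-based solution).
    """
    rng = np.random.default_rng(seed)
    n_rows = data.shape[0]
    # One row of column means per bootstrap iteration
    bootstrap_means = np.zeros([n_boots, data.shape[-1]])
    for ii in range(n_boots):
        # Sample row indices with replacement, then average the sampled rows
        idx = rng.integers(0, n_rows, size=n_rows)
        bootstrap_means[ii] = data[idx].mean(axis=0)
    # Percentiles across iterations: shape (len(percentiles), n_columns)
    return np.percentile(bootstrap_means, percentiles, axis=0)

# Synthetic data (assumed): 200 observations of 2 series
demo_rng = np.random.default_rng(42)
demo = demo_rng.normal(loc=[0.0, 5.0], scale=1.0, size=(200, 2))

interval = bootstrap_interval_sketch(demo, n_boots=500, seed=0)
lower, upper = interval  # one lower and one upper bound per column
print(lower, upper)
```

The returned array has one row per requested percentile and one column per series, so with the default percentiles the first row is the lower bound and the second row is the upper bound of a 95% interval.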

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import numpy as np
from sklearn.utils import ____

def bootstrap_interval(data, percentiles=(2.5, 97.5), n_boots=100):
    """Bootstrap a confidence interval for the mean of columns of a 2-D dataset."""
    # Create our empty array to fill the results
    bootstrap_means = np.zeros([n_boots, data.shape[-1]])
    for ii in range(____):
        # Draw a random sample of the data *with* replacement, then take the sample mean
        random_sample = ____
        bootstrap_means[ii] = random_sample.mean(axis=0)
        
    # Compute the percentiles of choice for the bootstrapped means
    percentiles = ____(bootstrap_means, percentiles, axis=0)
    return percentiles