Bootstrapping a confidence interval

A useful tool for assessing the variability of some data is the bootstrap. In this exercise, you'll write your own bootstrapping function that can be used to return a bootstrapped confidence interval.

This function takes three parameters: a 2-D array of numbers (data), a list of percentiles to calculate (percentiles), and the number of bootstrap iterations to use (n_boots). It uses the resample function to generate a bootstrap sample, and repeats this n_boots times to calculate the confidence interval.
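As a rough illustration of the resampling step, the sketch below assumes that the resample function mentioned above is the one in sklearn.utils (the import line in the exercise code suggests this). The array values and the random_state seed are made up for the example.

```python
import numpy as np
from sklearn.utils import resample

# Made-up 2-D data: 5 observations (rows), 2 variables (columns)
data = np.arange(10).reshape(5, 2)

# By default, resample draws as many rows as the input has, with replacement,
# so some rows may appear more than once and others not at all
sample = resample(data, random_state=0)

# The bootstrap sample keeps the input's shape; its column means are one
# bootstrap estimate of the mean of each column
sample_means = sample.mean(axis=0)
```

Each call with a different random state gives a different sample, which is why the mean is recomputed on every bootstrap iteration.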

This exercise is part of the course

Machine Learning for Time Series Data in Python

Instructions

  • The function should loop over the number of bootstraps (given by the parameter n_boots) and, on each iteration:
    • Take a random sample of the data, with replacement, and calculate the mean of this random sample
  • After the loop, compute the percentiles of bootstrap_means and return them
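The percentile step can be sketched on its own. The example below assumes NumPy's np.percentile is used for it, and it fills bootstrap_means with random stand-in values rather than real bootstrap results.

```python
import numpy as np

# Stand-in for the results array: 100 bootstrap iterations x 3 data columns
rng = np.random.default_rng(0)
bootstrap_means = rng.normal(size=(100, 3))

# axis=0 takes percentiles across iterations, separately for each column,
# giving one lower and one upper bound per column
lower, upper = np.percentile(bootstrap_means, (2.5, 97.5), axis=0)
```

With percentiles of 2.5 and 97.5, the two returned rows bound the central 95% of the bootstrapped means for each column.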

Hands-on interactive exercise

Try this exercise by completing this sample code.

import numpy as np
from sklearn.utils import ____

def bootstrap_interval(data, percentiles=(2.5, 97.5), n_boots=100):
    """Bootstrap a confidence interval for the mean of columns of a 2-D dataset."""
    # Create our empty array to fill the results
    bootstrap_means = np.zeros([n_boots, data.shape[-1]])
    for ii in range(____):
        # Generate random indices for our data *with* replacement, then take the sample mean
        random_sample = ____
        bootstrap_means[ii] = random_sample.mean(axis=0)
        
    # Compute the percentiles of choice for the bootstrapped means
    percentiles = ____(bootstrap_means, percentiles, axis=0)
    return percentiles
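For reference, here is one possible way the blanks could be filled in. It is a sketch, not necessarily the course's own solution. It assumes sklearn.utils.resample for the sampling and np.percentile for the percentiles. The name bootstrap_interval_sketch, the random_state parameter, and the toy data are hypothetical additions for illustration.

```python
import numpy as np
from sklearn.utils import resample


def bootstrap_interval_sketch(data, percentiles=(2.5, 97.5), n_boots=100,
                              random_state=None):
    """Bootstrap a confidence interval for the mean of columns of a 2-D dataset."""
    # random_state is a hypothetical extra parameter, added for reproducibility
    rng = np.random.RandomState(random_state)
    # One row of column means per bootstrap iteration
    bootstrap_means = np.zeros([n_boots, data.shape[-1]])
    for ii in range(n_boots):
        # Draw rows *with* replacement, then take the sample mean per column
        random_sample = resample(data, random_state=rng)
        bootstrap_means[ii] = random_sample.mean(axis=0)
    # Percentiles across iterations, computed separately for each column
    return np.percentile(bootstrap_means, percentiles, axis=0)


# Toy data (made up): 200 observations of 3 variables centred near 5
toy = np.random.RandomState(42).normal(loc=5.0, scale=1.0, size=(200, 3))
interval = bootstrap_interval_sketch(toy, n_boots=200, random_state=0)
# interval[0] holds the lower bounds, interval[1] the upper bounds, per column
```

The result has one row per requested percentile and one column per data column, so it can be plotted directly as lower and upper bounds around each column's mean.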