Running a simple bootstrap
Welcome to the first exercise in the bootstrapping section. We will work through an example where we learn to run a simple bootstrap. As we saw in the video, the main idea behind bootstrapping is sampling with replacement.
Suppose you own a factory that produces wrenches. You want to be able to characterize the average length of the wrenches and ensure that they meet some specifications. Your factory produces thousands of wrenches every day, but it's infeasible to measure the length of each wrench. However, you have access to a representative sample of 100 wrenches. Let's use bootstrapping to get a 95% confidence interval (CI) for the average length.
Examine the list wrench_lengths, which has 100 observed lengths of wrenches, in the shell.
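To make "sampling with replacement" concrete, here is a minimal sketch on a tiny made-up list. The name toy_lengths and its values are hypothetical and are not the course's wrench_lengths data; the seed is arbitrary and only there so the run is repeatable.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for repeatability

# Hypothetical toy data, NOT the course's wrench_lengths.
toy_lengths = [9.8, 10.0, 10.1, 10.3, 9.9]

# One bootstrap resample: same size as the data, drawn with replacement,
# so some values may appear more than once and others not at all.
resample = np.random.choice(toy_lengths, size=len(toy_lengths), replace=True)
print(resample)
```

Because each draw puts the value "back", every resample is a slightly different version of the original data, and that variation is what the bootstrap uses to estimate uncertainty.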
This exercise is part of the course Statistical Simulation in Python
Exercise instructions
- Draw a random sample with replacement from wrench_lengths and store it in temp_sample. Set size = len(wrench_lengths).
- Calculate the mean length of each sample, assign it to sample_mean, and then append it to mean_lengths.
- Calculate the bootstrapped mean (boot_mean) and bootstrapped 95% confidence interval (boot_95_ci) by using np.percentile().
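As a quick illustration of how np.percentile() can return both ends of a 95% interval in one call, here is a small sketch on a simple array of integers. The array values is an assumption made up for this illustration and has nothing to do with the wrench data.

```python
import numpy as np

# Illustrative data only: the integers 0, 1, ..., 100.
values = np.arange(101)

# Passing a list of percentiles returns one value per percentile.
# The 2.5th and 97.5th percentiles bracket the middle 95% of the data.
lower, upper = np.percentile(values, [2.5, 97.5])
print(lower, upper)
```

Applied to a list of bootstrapped means, the same call gives the lower and upper bounds of a percentile-based 95% CI.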
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Draw random samples with replacement and append each sample mean to mean_lengths.
mean_lengths, sims = [], 1000
for i in range(sims):
    temp_sample = ____(____, replace=____, size=____)
    sample_mean = ____
    mean_lengths.append(sample_mean)
# Calculate bootstrapped mean and 95% confidence interval.
boot_mean = np.mean(____)
boot_95_ci = ____(mean_lengths, [2.5, 97.5])
print("Bootstrapped Mean Length = {}, 95% CI = {}".format(boot_mean, boot_95_ci))
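One possible way to complete the template is sketched below. It assumes the blanks are filled with np.random.choice, np.mean, and np.percentile, which fits the instructions but is not confirmed as the course's reference solution. Since the real wrench_lengths data is not shown here, the sketch substitutes a synthetic stand-in (100 draws from a normal distribution with made-up parameters), so the printed numbers will not match the course's.

```python
import numpy as np

np.random.seed(123)  # arbitrary seed, only for repeatability

# Synthetic stand-in for the course data (assumption): 100 lengths
# centered on 10 with a small spread. Replace with the real list.
wrench_lengths = np.random.normal(loc=10.0, scale=0.2, size=100)

# Draw random samples with replacement and append each sample mean.
mean_lengths, sims = [], 1000
for i in range(sims):
    temp_sample = np.random.choice(
        wrench_lengths, replace=True, size=len(wrench_lengths)
    )
    sample_mean = np.mean(temp_sample)
    mean_lengths.append(sample_mean)

# Bootstrapped mean and percentile-based 95% confidence interval.
boot_mean = np.mean(mean_lengths)
boot_95_ci = np.percentile(mean_lengths, [2.5, 97.5])
print("Bootstrapped Mean Length = {}, 95% CI = {}".format(boot_mean, boot_95_ci))
```

The bootstrapped mean should land very close to the plain mean of the original sample; the added value of the bootstrap is the interval, which comes from the spread of the 1000 resampled means.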