
Bootstrap replicates of the mean and the SEM

In this exercise, you will compute a bootstrap estimate of the probability density function (PDF) of the mean annual rainfall at the Sheffield Weather Station. Remember, we are estimating the mean annual rainfall we would get if the Sheffield Weather Station could repeat all of the measurements from 1883 to 2015 over and over again. This is a probabilistic estimate of the mean. You will plot the PDF as a histogram, and you will see that it is approximately Normal.

In fact, it can be shown theoretically (via the central limit theorem) that under not-too-restrictive conditions, the mean is approximately Normally distributed when the number of data points is large. (This does not hold for statistics in general, just for the mean and a few others.) The standard deviation of this distribution, called the standard error of the mean, or SEM, is given by the standard deviation of the data divided by the square root of the number of data points. I.e., for a data set, sem = np.std(data) / np.sqrt(len(data)). Using hacker statistics, you get this same result without needing to derive it, and in this exercise you will verify it against your bootstrap replicates.
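
The comparison between the SEM formula and the spread of the bootstrap replicates can be sketched as follows. This is a minimal illustration rather than the course's reference solution: the signature of draw_bs_reps() shown here (including the extra rng parameter) is an assumption, and the rainfall array is synthetic stand-in data, not the actual Sheffield measurements.

```python
import numpy as np


def draw_bs_reps(data, func, size=1, rng=None):
    """Draw `size` bootstrap replicates of func(data).

    Assumed signature for illustration; the `rng` argument is hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    reps = np.empty(size)
    for i in range(size):
        # Resample with replacement, keeping the original sample size
        reps[i] = func(rng.choice(data, size=n))
    return reps


# Synthetic stand-in for the annual rainfall data (NOT the real measurements)
rng = np.random.default_rng(42)
rainfall = rng.normal(loc=800.0, scale=100.0, size=133)

# 10,000 bootstrap replicates of the mean
bs_replicates = draw_bs_reps(rainfall, np.mean, size=10000, rng=rng)

# SEM from the formula, and the standard deviation of the replicates
sem = np.std(rainfall) / np.sqrt(len(rainfall))
bs_std = np.std(bs_replicates)

print(sem)
print(bs_std)
```

The two printed values should agree closely, differing only by the random noise in the resampling.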

This exercise is part of the course

Statistical Thinking in Python (Part 2)


Exercise instructions

  • Draw 10,000 bootstrap replicates of the mean annual rainfall using your draw_bs_reps() function.
  • Compute and print the standard error of the mean.
  • Compute and print the standard deviation of your bootstrap replicates.
  • Make a histogram of the replicates using the density=True keyword argument (called normed=True in older versions of Matplotlib, which have since removed it) and 50 bins. Be sure to label the axes.
  • Show your plot.
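
To see what normalizing the histogram does, here is a small sketch that uses np.histogram, which takes the same bins and density keyword arguments as plt.hist. The replicates below are synthetic stand-in values chosen for illustration, not results from the rainfall data.

```python
import numpy as np

# Synthetic stand-in for bootstrap replicates of the mean (illustrative only)
rng = np.random.default_rng(0)
bs_replicates = rng.normal(loc=800.0, scale=9.0, size=10000)

# density=True rescales the bar heights so the histogram integrates to 1
heights, edges = np.histogram(bs_replicates, bins=50, density=True)

# Total area = sum of (bar height * bin width)
area = np.sum(heights * np.diff(edges))
print(area)
```

Because the total area is 1, the bar heights approximate a PDF, which is why the y-axis in this exercise is labeled 'PDF'.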

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Take 10,000 bootstrap replicates of the mean: bs_replicates
bs_replicates = ____

# Compute and print SEM
print(____ / np.sqrt(____))

# Compute and print standard deviation of bootstrap replicates
print(____)

# Make a histogram of the results
_ = plt.hist(____, ____=50, ____=True)
_ = plt.xlabel('mean annual rainfall (mm)')
_ = plt.ylabel('PDF')

# Show the plot
plt.show()