
Visualizing bootstrap samples

In this exercise, you will generate bootstrap samples from the set of annual rainfall data measured at the Sheffield Weather Station in the UK from 1883 to 2015. The data are stored in the NumPy array rainfall in units of millimeters (mm). By graphically displaying the bootstrap samples with an ECDF, you can get a feel for how bootstrap sampling allows probabilistic descriptions of data.
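As a rough illustration of what a single bootstrap sample is, the sketch below resamples an array with replacement using np.random.choice(). The array fake_rainfall is a hypothetical stand-in filled with synthetic values, not the Sheffield measurements, and the seed is chosen only to make the run repeatable.

```python
import numpy as np

# Hypothetical stand-in for the rainfall array (133 yearly values in mm).
# These are synthetic numbers, NOT the Sheffield Weather Station data.
np.random.seed(42)
fake_rainfall = np.random.normal(loc=800, scale=100, size=133)

# A bootstrap sample draws len(data) values from the data with replacement
# (np.random.choice samples with replacement by default).
bs_sample = np.random.choice(fake_rainfall, size=len(fake_rainfall))

# The resample has the same length as the original ...
print(len(bs_sample) == len(fake_rainfall))
# ... and contains only values that occur in the original array.
print(np.all(np.isin(bs_sample, fake_rainfall)))
```

Because the draw is with replacement, some measurements typically appear more than once in bs_sample and others not at all, which is why each bootstrap ECDF differs slightly from the original.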

This exercise is part of the course

Statistical Thinking in Python (Part 2)


Exercise instructions

  • Write a for loop to acquire 50 bootstrap samples of the rainfall data and plot their ECDF.
    • Use np.random.choice() to generate a bootstrap sample from the NumPy array rainfall. Be sure that the size of the resampled array is len(rainfall).
    • Use the function ecdf() that you wrote in the prequel to this course to generate the x and y values for the ECDF of the bootstrap sample bs_sample.
    • Plot the ECDF values. Specify color='gray' (to make gray dots) and alpha=0.1 (to make them semi-transparent, since we are overlaying so many) in addition to the marker='.' and linestyle='none' keyword arguments.
  • Use ecdf() to generate x and y values for the ECDF of the original rainfall data available in the array rainfall.
  • Plot the ECDF values of the original data.
  • Hit submit to visualize the samples!
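The instructions rely on an ecdf() function from the previous course, which is not reproduced on this page. A minimal sketch, assuming it follows the usual definition of an empirical cumulative distribution function (sorted values on the x axis, cumulative fraction of data points on the y axis), might look like this:

```python
import numpy as np

def ecdf(data):
    """Return x and y values for the ECDF of a one-dimensional array.

    Assumed implementation; the course's own ecdf() may differ in detail.
    """
    n = len(data)
    x = np.sort(data)               # measurements in ascending order
    y = np.arange(1, n + 1) / n     # fraction of points <= each x value
    return x, y

# Small usage example on three made-up values
x, y = ecdf(np.array([3.0, 1.0, 2.0]))
print(x)
print(y)
```

The returned pair is what the plotting calls in the exercise expect as their first two arguments.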

Interactive exercise

Try this exercise by completing this sample code.

for _ in range(50):
    # Generate bootstrap sample: bs_sample
    bs_sample = ____(____, size=____)

    # Compute and plot ECDF from bootstrap sample
    x, y = ____
    _ = plt.plot(____, ____, ____='.', ____='none',
                 ____='gray', ____=0.1)

# Compute and plot ECDF from original data
x, y = ____
_ = plt.plot(____, ____, ____='.')

# Make margins and label axes
plt.margins(0.02)
_ = plt.xlabel('yearly rainfall (mm)')
_ = plt.ylabel('ECDF')

# Show the plot
plt.show()