The bootstrap histogram
You are considering a vacation to Cincinnati in May, but you have a severe sensitivity to NO2. You pull a few years of pollution data from Cincinnati in May and look at a bootstrap estimate of the average NO2 levels. Because you only have one estimate to look at, the best way to visualize the results of your bootstrap estimates is with a histogram.
While you like the intuition of the bootstrap histogram by itself, your partner, who will be going on the vacation with you, likes seeing percent intervals. To accommodate them, you decide to highlight the 95% interval by shading the region.
This exercise is part of the course
Improving Your Data Visualizations in Python
Exercise instructions
- Provide the percentile() function with the upper and lower percentiles needed to get a 95% interval.
- Shade the background of the plot in the 95% interval.
- Draw a histogram of the bootstrap means with 100 bins.
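As a rough illustration of how percentile bounds map to a central interval, the sketch below uses a made-up array (the integers 0 to 1000) in place of real bootstrap means. The array and the variable names `interval`, `tail`, and `inside` are assumptions for illustration only; `np.percentile` is the NumPy function the exercise refers to.

```python
import numpy as np

# Hypothetical stand-in for bootstrap means: the integers 0..1000
values = np.arange(1001)

# A central interval leaves an equal share of the data in each tail
interval = 95
tail = (100 - interval) / 2  # percent left out on each side

# Percentiles are given on a 0-100 scale, lowest first
lower, upper = np.percentile(values, [tail, 100 - tail])

# Fraction of values that fall between the two bounds
inside = np.mean((values >= lower) & (values <= upper))
print(lower, upper, inside)
```

The same arithmetic works for other widths: a 90% interval, for instance, would leave 5% in each tail.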
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
cinci_may_NO2 = pollution.query("city == 'Cincinnati' & month == 5").NO2
# Generate bootstrap samples
boot_means = bootstrap(cinci_may_NO2, 1000)
# Get lower and upper 95% interval bounds
lower, upper = np.percentile(boot_means, [____, ____])
# Plot shaded area for interval
plt.axvspan(____, ____, color = 'gray', alpha = 0.2)
# Draw histogram of bootstrap samples
sns.histplot(____, ____ = 100)
plt.show()
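For readers working outside the course environment, here is one possible self-contained sketch of the same idea. It is not the course's own solution: the `pollution` data and the course's `bootstrap()` helper are not shown in the exercise, so the sketch assumes synthetic NO2-like values and a hypothetical `bootstrap()` that returns the means of resamples drawn with replacement. It also uses Matplotlib's `hist` rather than seaborn so that it depends only on NumPy and Matplotlib.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

def bootstrap(data, n_boots):
    """Hypothetical helper: means of n_boots resamples drawn with replacement."""
    data = np.asarray(data)
    return np.array([
        rng.choice(data, size=data.size, replace=True).mean()
        for _ in range(n_boots)
    ])

# Synthetic stand-in for the Cincinnati May NO2 readings (assumed values)
cinci_may_NO2 = rng.gamma(shape=4.0, scale=5.0, size=120)

# Generate bootstrap means
boot_means = bootstrap(cinci_may_NO2, 1000)

# Bounds of the central 95% interval: 2.5% left out in each tail
lower, upper = np.percentile(boot_means, [2.5, 97.5])

fig, ax = plt.subplots()
# Shade the interval first so the bars are drawn on top of it
ax.axvspan(lower, upper, color="gray", alpha=0.2)
# Histogram of the bootstrap means with 100 bins
ax.hist(boot_means, bins=100)
ax.set_xlabel("Bootstrap mean of NO2")
```

To view the figure interactively, drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end.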