
Basic confidence intervals

You are a data scientist for a fireworks manufacturer in Des Moines, Iowa. You need to make a case to the city that your company's large fireworks show has not harmed the city's air quality. To do this, you compare the average pollutant levels in the week after the Fourth of July with the readings taken after your last show. By drawing confidence intervals around the averages, you can show that the recent readings fall well within the normal range.

The data is loaded as average_ests, with one row for each measured pollutant.
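The reasoning in the scenario can be sketched in plain Python for a single pollutant. The numbers and the helper name interval_95 are hypothetical and chosen only for illustration; they are not taken from the exercise data.

```python
def interval_95(mean, std_err, z=1.96):
    """Return (lower, upper) bounds of an approximate 95% interval around a mean."""
    return mean - z * std_err, mean + z * std_err

# Hypothetical values for one pollutant (not from average_ests)
mean, std_err, seen = 0.30, 0.02, 0.31

lower, upper = interval_95(mean, std_err)

# A reading counts as "within the normal range" if it falls inside the interval
is_typical = lower <= seen <= upper
print(f"interval: ({lower:.4f}, {upper:.4f}); seen = {seen}; within: {is_typical}")
```

In the exercise itself, the same arithmetic is applied column-wise to average_ests rather than to single numbers.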

This exercise is part of the course

Improving Your Data Visualizations in Python

Exercise instructions

  • Create the lower and upper 95% interval boundaries:
    • Create the lower boundary by subtracting 1.96 standard errors ('std_err') from the 'mean' of estimates.
    • Create the upper boundary by adding 1.96 standard errors ('std_err') to the 'mean' of estimates.
  • Pass pollutant as the faceting variable to sns.FacetGrid() and unlink the x-axes of the plots so that each interval is drawn at a readable scale.
  • Pass the constructed interval boundaries to the mapped plt.hlines() function.
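The multiplier 1.96 used in the instructions is the standard normal quantile for a two-sided 95% interval. A minimal sketch using only the standard library shows where it comes from; the helper name z_multiplier is hypothetical, and the sketch assumes the averages are approximately normally distributed.

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided standard normal multiplier for a given confidence level."""
    tail = (1 - confidence) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)

z95 = z_multiplier(0.95)
z99 = z_multiplier(0.99)
print(round(z95, 2), round(z99, 2))
```

A wider confidence level gives a larger multiplier and therefore a wider interval around the same mean.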

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Construct CI bounds for averages
average_ests['lower'] = average_ests['____'] - 1.96*average_ests['____']
average_ests['upper'] = average_ests['____'] + 1.96*average_ests['____']

# Setup a grid of plots, with non-shared x axes limits
g = sns.FacetGrid(average_ests, row = '____', ____ = False)

# Plot CI for average estimate
g.map(plt.hlines, 'y', '____', '____')

# Plot observed values for comparison and remove axes labels
g.map(plt.scatter, 'seen', 'y', color = 'orangered').set_ylabels('').set_xlabels('') 

plt.show()