Confidence bands

1. Confidence bands

Now that we've covered the very basics of confidence intervals, we will move on to confidence bands for continuous functions.

2. Continuous estimation functions

When I say 'continuous estimation functions', I am referring to cases where we would plot a continuous line. For instance, we could be looking at the ratio of black to white sheep in a given pen over every hour of a day. Just like the point estimates in the last lesson, these estimates often have uncertainty, and that uncertainty should be represented in your visualizations.

3. Lots of confidence intervals

While you could plot an individual confidence interval at every place an estimate was taken, for instance, every hour for the sheep example, doing so can get cluttered very quickly.
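To see the clutter for yourself, here is a minimal sketch of the one-interval-per-hour approach using Matplotlib's errorbar(). The hourly ratios and the standard error are made-up values for illustration, and the 1.96 multiplier assumes an approximate 95% interval under a normal approximation.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data for illustration: one ratio estimate per hour of the day.
hours = np.arange(24)
rng = np.random.default_rng(0)
ratio = 1.0 + 0.3 * np.sin(hours / 24 * 2 * np.pi) + rng.normal(0, 0.03, size=24)
std_err = np.full(24, 0.08)  # assumed standard error of each estimate

# Half-width of an approximate 95% interval (normal approximation).
half_width = 1.96 * std_err

fig, ax = plt.subplots()
# One point and one vertical interval per hour.
container = ax.errorbar(hours, ratio, yerr=half_width, fmt="o", capsize=3)
ax.set_xlabel("Hour of day")
ax.set_ylabel("Black-to-white sheep ratio")
```

Calling plt.show() afterwards displays the figure. With 24 separate intervals the overall trend is already harder to follow than it would be with a single line.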

4. The confidence band

A more elegant and attractive way of representing the uncertainty in continuous functions is to draw a ribbon covering the confidence interval width of your choice. This is done by drawing two separate lines, one tracing the lower bounds of the intervals and one tracing the upper bounds, and then shading between them.
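The two lines that form the ribbon are just the lower and upper bounds collected across all estimates. As a rough sketch, assuming each estimate comes with a standard error and that a normal approximation is acceptable, the bounds for a 95% band could be computed like this. The numbers are invented for illustration.

```python
import numpy as np

# Invented point estimates and standard errors, one pair per time point.
estimate = np.array([1.00, 1.10, 1.25, 1.20, 1.05])
std_err = np.array([0.05, 0.05, 0.08, 0.08, 0.05])

# 1.96 is the normal critical value for an approximate 95% interval.
z = 1.96
lower = estimate - z * std_err  # the line along the bottom of the ribbon
upper = estimate + z * std_err  # the line along the top of the ribbon
```

A different interval width only changes the multiplier z; the ribbon is still the region between lower and upper.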

5. Plotting confidence bands

Plotting a confidence band in Python is just as simple as plotting a confidence interval. We can use the Matplotlib function fill_between() to draw the band. We simply pass it the x values, followed by the upper and lower bounds as y1 and y2 respectively, and a shaded ribbon is drawn between them. It's also common to add a simple line plot on top of the band with plt.plot() to show the point estimate as a reference.
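Here is a minimal sketch of that recipe. The hourly estimates and the fixed band half-width of 0.15 are placeholder values chosen for illustration, not real data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data: a smooth hourly estimate with a fixed-width band.
hours = np.arange(24)
estimate = 1.0 + 0.3 * np.sin(hours / 24 * 2 * np.pi)
lower = estimate - 0.15
upper = estimate + 0.15

fig, ax = plt.subplots()
# Shade the region between the upper and lower bounds.
band = ax.fill_between(hours, y1=upper, y2=lower, alpha=0.4)
# Draw the point-estimate line on top of the band as a reference.
(line,) = ax.plot(hours, estimate)
ax.set_xlabel("Hour of day")
ax.set_ylabel("Black-to-white sheep ratio")
```

Calling plt.show() afterwards displays the figure. The same calls are available as plt.fill_between() and plt.plot() if you prefer the pyplot interface over an explicit axes object.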

6. Separate if possible (a)

Showing confidence bands will almost always give you more information about the underlying data than showing a point-estimate line alone, but you do need to be careful when looking at multiple bands. When several bands share the same plot, the overlaps make it difficult to tell where one band starts and another ends, which effectively removes the benefit of showing the uncertainty in the first place.

7. Separate if possible (b)

Because of the problems with band overlap, whenever possible break multiple confidence bands into separate faceted plots. While you sacrifice some of the benefits you get from directly comparing overlaid lines, you allow the reader to more accurately read each band, which is almost always the better alternative.
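One way to facet with plain Matplotlib is to give each group its own axes via plt.subplots(). The sketch below assumes three hypothetical pens whose curves differ only by a vertical shift; the names and values are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

hours = np.arange(24)
# Hypothetical groups: each pen's curve is shifted by a made-up offset.
groups = {"Pen A": 0.0, "Pen B": 0.1, "Pen C": -0.1}

# One panel per group; a shared y-axis keeps the panels comparable.
fig, axes = plt.subplots(1, len(groups), sharey=True, figsize=(9, 3))

for ax, (name, shift) in zip(axes, groups.items()):
    estimate = 1.0 + shift + 0.3 * np.sin(hours / 24 * 2 * np.pi)
    # Each band gets a panel to itself, so nothing overlaps.
    ax.fill_between(hours, estimate - 0.15, estimate + 0.15, alpha=0.4)
    ax.plot(hours, estimate)
    ax.set_title(name)
    ax.set_xlabel("Hour of day")

axes[0].set_ylabel("Black-to-white sheep ratio")
```

Sharing the y-axis is a design choice: it preserves some ability to compare levels across panels even though the lines are no longer overlaid.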

8. Directly comparing two bands (a)

Sometimes, though, the direct comparison of classes is the most important part of your data science question. In this scenario, you should limit the number of bands you are comparing to two. Unfortunately, the default plot styles make for a poor band comparison. Whichever band happens to be on top will obscure the patterns of the band beneath it, and the colors of the bands, with their large amounts of filled-in space, will clash.

9. Directly comparing two bands (b)

Luckily, with just a couple of quick tweaks we can vastly improve the legibility of our overlapped bands. First, by lowering the opacity of the bands, you can see the behavior of both bands in areas of overlap. A value of around 40% opacity, or an alpha of 0.4, is usually good. Second, by swapping the default fills for well-paired colors you can reduce the eye-strain-inducing color clash. A good option is usually two colors from a ColorBrewer categorical palette. Just make sure you're not using two vastly different hues or intensities.
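Both tweaks can be sketched in a few lines. The two hex codes below are the first two colors of ColorBrewer's Set2 qualitative palette, picked here only as one reasonable pairing, and the two curves are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

hours = np.arange(24)
# Invented estimates for two classes that overlap for part of the day.
est_a = 1.0 + 0.3 * np.sin(hours / 24 * 2 * np.pi)
est_b = 1.1 + 0.3 * np.cos(hours / 24 * 2 * np.pi)

# First two colors of the ColorBrewer Set2 palette (an illustrative choice).
color_a, color_b = "#66c2a5", "#fc8d62"

fig, ax = plt.subplots()
for est, color, label in [(est_a, color_a, "Class A"), (est_b, color_b, "Class B")]:
    # alpha=0.4 keeps both bands visible where they overlap.
    ax.fill_between(hours, est - 0.15, est + 0.15, color=color, alpha=0.4)
    ax.plot(hours, est, color=color, label=label)

ax.set_xlabel("Hour of day")
ax.legend()
```

Reusing each band's color for its point-estimate line helps the reader match the line to its ribbon.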

10. Let's draw some bands!

Let's now get our hands dirty and put these tips to use in the exercises.