LoslegenKostenlos loslegen

90, 95, and 99% intervals

You are a data scientist for an outdoor adventure company in Fairbanks, Alaska. Recently, customers have been having issues with SO2 pollution, leading to costly cancellations. The company has sensors for CO, NO2, and O3 but not SO2 levels.

You've built a model that predicts SO2 values based on the values of pollutants with sensors (loaded as pollution_model, a statsmodels object). You want to investigate which pollutant's value has the largest effect on your model's SO2 prediction. This will help you know which pollutant's values to pay most attention to when planning outdoor tours. To maximize the amount of information in your report, show multiple levels of uncertainty for the model estimates.

Diese Übung ist Teil des Kurses

Improving Your Data Visualizations in Python

Kurs anzeigen

Anleitung zur Übung

  • Fill in the appropriate interval width percents (from 90,95, and 99%) according to the values list in alpha.
  • In the for loop, color the interval by its assigned color.
  • Pass the loop's width percentage value to plt.hlines() to label the legend.

Interaktive Übung

Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.

# Add interval percent widths
alphas = [     0.01,  0.05,   0.1] 
widths = [ '__% CI', '__%', '__%']
colors = ['#fee08b','#fc8d59','#d53e4f']

for alpha, color, width in zip(alphas, colors, widths):
    # Grab confidence interval
    conf_ints = pollution_model.conf_int(alpha)
    
    # Pass current interval color and legend label to plot
    plt.hlines(y = conf_ints.index, xmin = conf_ints[0], xmax = conf_ints[1],
               colors = ____, ____ = width, linewidth = 10) 

# Draw point estimates
plt.plot(pollution_model.params, pollution_model.params.index, 'wo', label = 'Point Estimate')

plt.legend()
plt.show() 
Code bearbeiten und ausführen