ComeçarComece de graça

90, 95, and 99% intervals

You are a data scientist for an outdoor adventure company in Fairbanks, Alaska. Recently, customers have been having issues with SO2 pollution, leading to costly cancellations. The company has sensors for CO, NO2, and O3 but not SO2 levels.

You've built a model that predicts SO2 values based on the values of pollutants with sensors (loaded as pollution_model, a statsmodels object). You want to investigate which pollutant's value has the largest effect on your model's SO2 prediction. This will help you know which pollutant's values to pay most attention to when planning outdoor tours. To maximize the amount of information in your report, show multiple levels of uncertainty for the model estimates.

Este exercício faz parte do curso

Improving Your Data Visualizations in Python

Ver curso

Instruções do exercício

  • Fill in the appropriate interval width percents (from 90,95, and 99%) according to the values list in alpha.
  • In the for loop, color the interval by its assigned color.
  • Pass the loop's width percentage value to plt.hlines() to label the legend.

Exercício interativo prático

Experimente este exercício completando este código de exemplo.

# Add interval percent widths
alphas = [     0.01,  0.05,   0.1] 
widths = [ '__% CI', '__%', '__%']
colors = ['#fee08b','#fc8d59','#d53e4f']

for alpha, color, width in zip(alphas, colors, widths):
    # Grab confidence interval
    conf_ints = pollution_model.conf_int(alpha)
    
    # Pass current interval color and legend label to plot
    plt.hlines(y = conf_ints.index, xmin = conf_ints[0], xmax = conf_ints[1],
               colors = ____, ____ = width, linewidth = 10) 

# Draw point estimates
plt.plot(pollution_model.params, pollution_model.params.index, 'wo', label = 'Point Estimate')

plt.legend()
plt.show() 
Editar e executar o código