Lots of bootstraps with beeswarms

As a current resident of Cincinnati, you're curious how its average NO2 values compare with those of Des Moines, Indianapolis, and Houston, a few other cities you've lived in.

To investigate, you decide to use bootstrap estimation of the mean NO2 value for each city. Because the comparisons between cities are of primary interest, you will use a swarm plot to compare the estimates.

The DataFrame pollution_may is provided along with the bootstrap() function seen in the slides for performing your bootstrap resampling.
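
The course's bootstrap() function is not reproduced on this page. As a rough idea of what such a helper does, here is a minimal standard-library sketch. The signature bootstrap(data, n_boots, seed=None) and the seed parameter are assumptions for illustration and may differ from the version in the slides.

```python
import random
from statistics import mean

def bootstrap(data, n_boots, seed=None):
    """Return n_boots bootstrap estimates of the mean of data.

    Assumed stand-in for the course helper, not the original code.
    """
    rng = random.Random(seed)   # local generator so runs can be reproduced
    values = list(data)         # accepts a list or a pandas Series
    # Each replicate resamples len(values) points WITH replacement,
    # then records the mean of that resample.
    return [mean(rng.choices(values, k=len(values))) for _ in range(n_boots)]
```

Calling bootstrap(city_NO2, 100) under this sketch would give 100 resampled means, one per swarm plot point.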

This exercise is part of the course

Improving Your Data Visualizations in Python

Instructions

  • Run bootstrap resampling on each city_NO2 vector.
  • Add city name as a column in the bootstrap DataFrame, cur_boot.
  • Color all swarm plot points 'coral' to avoid the color-size problem.
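
The "add a label column, then stack the pieces" pattern in these steps can be sketched on made-up data. The names fake_boots, avg, and group are hypothetical and are not part of the exercise. Collecting the pieces in a list and concatenating once is a common alternative to concatenating inside the loop.

```python
import pandas as pd

# Hypothetical bootstrap means per group (made-up numbers, not course data)
fake_boots = {'A': [1.0, 1.2, 0.9], 'B': [2.1, 1.9, 2.0]}

frames = []
for label, means in fake_boots.items():
    # A scalar passed for 'group' is broadcast to every row of the frame
    frames.append(pd.DataFrame({'avg': means, 'group': label}))

# Stack all per-group frames into one long DataFrame
all_boots = pd.concat(frames, ignore_index=True)
```

The resulting long-format frame has one row per bootstrap estimate, which is the shape a swarm plot with a categorical axis expects.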

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Initialize a holder DataFrame for bootstrap results
city_boots = pd.DataFrame()

for city in ['Cincinnati', 'Des Moines', 'Indianapolis', 'Houston']:
    # Filter to city
    city_NO2 = pollution_may[pollution_may.city == city].NO2
    # Bootstrap city data & put in DataFrame
    cur_boot = pd.DataFrame({'NO2_avg': bootstrap(____, 100), 'city': ____})
    # Append to the other cities' bootstraps
    city_boots = pd.concat([city_boots, cur_boot])

# Beeswarm plot of averages with cities on the y axis
sns.swarmplot(y="city", x="NO2_avg", data=city_boots, ____='____')

plt.show()