Choosing the right variable to encode with color

You're tasked with visualizing pollution values for Long Beach and nearby cities over time. The supplied code produces the hard-to-read plot below, which shows maximum pollution values (provided as max_pollutant_values) with the bars colored by city.

Multicolored, busy bar plots in four rows, one for each of the four pollutants in the dataset

You can quickly improve this with a few tweaks. Restricting the cities shown to those in the western half of the country reduces clutter. Next, swapping the color encoding from city to year lets you use a sequential palette suited to an ordinal variable, so the reader no longer has to keep referring to the legend to check which color corresponds to which city.
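To see why a sequential palette suits an ordinal variable such as year, the short sketch below samples the 'BuGn' palette with seaborn's color_palette function and pairs each color with a year. The list of years is a hypothetical placeholder, not taken from the exercise data.

```python
import seaborn as sns

# Hypothetical years, used only for illustration
years = [2012, 2013, 2014, 2015]

# Sample one color per year from the sequential 'BuGn' palette
colors = sns.color_palette('BuGn', n_colors=len(years))

# Pair each year with its color; later years get darker shades
year_to_color = dict(zip(years, colors))
for year, rgb in year_to_color.items():
    print(year, tuple(round(channel, 2) for channel in rgb))
```

Because the shades run from light to dark in order, a reader can infer which bar is earlier or later without consulting the legend, which is not possible with a qualitative palette such as 'muted'.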

This exercise is part of the course

Improving Your Data Visualizations in Python

Exercise instructions

  • Remove 'Indianapolis', 'Des Moines', 'Cincinnati', and 'Houston' from the cities list.
  • Swap the encodings of the city and year variables, so that city is on the x-axis and year sets the bar color.
  • Use the 'BuGn' ColorBrewer palette to map colors appropriately for the now-ordinal color variable.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Reduce to just cities in the western half of US
cities = ['Fairbanks', 'Long Beach', 'Vandenberg Air Force Base', 'Denver', 
          'Indianapolis', 'Des Moines', 'Cincinnati', 'Houston']

# Filter data to desired cities
city_maxes = max_pollutant_values[max_pollutant_values.city.isin(cities)]

# Swap city and year encodings
sns.catplot(x = 'year', hue = 'city',
              y = 'value', row = 'pollutant',    
              # Change palette to one appropriate for ordinal categories
              data = city_maxes, palette = 'muted',
              sharey = False, kind = 'bar')
plt.show()
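One possible way to complete the exercise is sketched below. Since max_pollutant_values is preloaded in the exercise environment and not shown here, the sketch builds a small synthetic stand-in with made-up numbers; the column names are taken from the sample code above, while the pollutant labels, years, and values are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Western cities kept after removing the four cities named in the instructions
cities = ['Fairbanks', 'Long Beach', 'Vandenberg Air Force Base', 'Denver']

# Synthetic stand-in for max_pollutant_values (made-up numbers, illustration only)
rows = [
    {'city': city, 'year': year, 'pollutant': pollutant,
     'value': 1.0 + 0.1 * i + 0.05 * (year - 2012)}
    for i, city in enumerate(cities + ['Houston'])
    for year in (2012, 2013, 2014, 2015)
    for pollutant in ('CO', 'O3')
]
max_pollutant_values = pd.DataFrame(rows)

# Filter data to desired cities
city_maxes = max_pollutant_values[max_pollutant_values.city.isin(cities)]

# City moves to the x-axis, year moves to hue, and the palette becomes sequential
g = sns.catplot(x='city', hue='year',
                y='value', row='pollutant',
                data=city_maxes, palette='BuGn',
                sharey=False, kind='bar')
```

In an interactive session, the matplotlib.use("Agg") line can be dropped and plt.show() called as in the sample code.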