
Dealing with too many categories

Sometimes you may be short on figure space and need to show a lot of data at once. Here you want to show the year-long trajectory of every pollutant for every city in the pollution dataset. Each pollutant trajectory will be plotted as a line, with the y-value giving the number of standard deviations from the year's average. This means you will have many lines on your plot at once, far more than you could clearly separate with color.
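As a rough illustration of what "standard deviations from the year's average" means, the sketch below standardizes each trajectory against its own yearly mean and spread. The DataFrame, the `raw_value` column, and the numbers are made up for illustration, and it is an assumption that the exercise data was standardized per city-pollutant combination in this way.

```python
import pandas as pd

# Hypothetical monthly readings for two city/pollutant combinations (made-up numbers)
readings = pd.DataFrame({
    "city_pol": ["Long Beach CO"] * 4 + ["Cincinnati SO2"] * 4,
    "month": [1, 2, 3, 4] * 2,
    "raw_value": [0.2, 0.4, 0.6, 0.8, 10.0, 14.0, 12.0, 16.0],
})

# Express each reading as standard deviations from its own series' average
grouped = readings.groupby("city_pol")["raw_value"]
readings["value"] = (
    readings["raw_value"] - grouped.transform("mean")
) / grouped.transform("std")

print(readings)
```

After this step, pollutants measured on very different scales can share one y-axis, because every series is centered on zero with unit spread.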

To deal with this, you have decided to highlight a small subset of city-pollutant combinations (wanted_combos). This subset is the most important to you, and the other trajectories will provide valuable context for comparison. To focus attention, you will set all the non-highlighted trajectory lines to the same 'other' color.
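The idea of collapsing every non-highlighted category into a single 'other' label can be sketched with pandas on toy data. This is one possible vectorized approach using `Series.isin` and `Series.where`, not necessarily the one the exercise expects; the `labels` DataFrame and the `keep` list are hypothetical.

```python
import pandas as pd

# Hypothetical labels; only the ones in `keep` should retain their identity
keep = ["Long Beach CO", "Cincinnati SO2"]
labels = pd.DataFrame({
    "city_pol": ["Long Beach CO", "Denver O3", "Cincinnati SO2", "Denver O3"],
})

# Keep the label where it is in `keep`, otherwise replace it with 'other'
labels["color_cats"] = labels["city_pol"].where(
    labels["city_pol"].isin(keep), "other"
)

print(labels)
```

The new column has at most `len(keep) + 1` distinct values, so a small qualitative palette is enough to color it.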

This exercise is part of the course

Improving Your Data Visualizations in Python

Exercise instructions

  • Modify the list comprehension to isolate the desired combinations of city and pollutant (wanted_combos).
  • Tell the line plot to color the lines by the newly created color_cats column in your DataFrame.
  • Use the units argument to determine how, i.e., from which column, the data points should be connected to form each line.
  • Disable the aggregation (binning) of points that share an x-value with the estimator argument.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Choose the combos that get distinct colors
wanted_combos = ['Vandenberg Air Force Base NO2', 'Long Beach CO', 'Cincinnati SO2']

# Assign a new column to DataFrame for isolating the desired combos
city_pol_month['color_cats'] = [x if x in ____ else 'other' for x in city_pol_month['city_pol']]

# Plot lines with color driven by new column and lines driven by original categories
sns.lineplot(x = "month",
             y = "value",
             hue = '____',
             units = '____',
             estimator = ____,
             palette = 'Set2',
             data = city_pol_month)
plt.show()