Programmatically creating a highlight
You are continuing your work for the city of Houston. Now you want to look at the behavior of both NO2 and SO2 when the un-plotted ozone (O3) value was at its highest.
To do this, replace the logic in the current list comprehension with one that compares a row's O3 value with the highest observed O3 in the dataset. Note: use sns.scatterplot() instead of sns.regplot(). This is because sns.scatterplot() can take a non-color vector as its hue argument and colors the points automatically while providing a helpful legend.
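As a rough illustration of the hue behavior described above, the sketch below passes a column of plain text labels to sns.scatterplot() and reads back the legend seaborn builds. The `toy` DataFrame and its values are hypothetical stand-ins, not the course's pollution data, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this line in a notebook
import pandas as pd
import seaborn as sns

# Hypothetical toy readings, not the course's pollution data
toy = pd.DataFrame({
    "NO2": [10.0, 14.0, 18.0, 22.0],
    "SO2": [2.0, 3.5, 3.0, 5.0],
    "point_type": ["Others", "Others", "Highest O3 Day", "Others"],
})

# hue receives category labels rather than colors; seaborn chooses the palette
ax = sns.scatterplot(x="NO2", y="SO2", hue="point_type", data=toy)

# Collect the legend entries that seaborn generated from the labels
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

The legend entries come from the label values themselves, so no color has to be assigned by hand.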
This exercise is part of the course
Improving Your Data Visualizations in Python
Exercise instructions
- Find the value corresponding to the highest observed O3 value in the houston_pollution DataFrame. Make sure to type the letter O and not the number zero!
- Append the column 'point_type' to the houston_pollution DataFrame to mark if the row contains the highest observed O3.
- Pass this newly created column to the hue argument of sns.scatterplot() to color the points.
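The first two steps follow a general pattern: take a column's maximum, then compare each row against it in a list comprehension. A minimal sketch on made-up data is shown below; the `toy` DataFrame, the `is_peak` column, and the labels are hypothetical names chosen for illustration, not the exercise's own.

```python
import pandas as pd

# Hypothetical toy readings, not the course's pollution data
toy = pd.DataFrame({
    "day": [1, 2, 3, 4],
    "O3": [0.021, 0.047, 0.033, 0.047],
})

# Largest value observed in the column
peak = toy["O3"].max()

# One label per row: rows equal to the maximum are flagged, the rest are not
toy["is_peak"] = ["Peak" if o3 == peak else "Other" for o3 in toy["O3"]]

print(toy)
```

Because the maximum is itself one of the column's values, the equality test is exact, and any tied rows (days 2 and 4 here) are all flagged.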
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
houston_pollution = pollution[pollution.city == 'Houston'].copy()
# Find the highest observed O3 value
max_O3 = houston_pollution.O3.____
# Make a column that denotes which day had highest O3
houston_pollution['____'] = ['Highest O3 Day' if ____ == ____ else 'Others' for O3 in houston_pollution.O3]
# Encode the hue of the points with the O3 generated column
sns.scatterplot(x = 'NO2',
                y = 'SO2',
                hue = '____',
                data = houston_pollution)
plt.show()