Improving your KDEs
One way of enhancing KDEs is with the addition of a rug plot. Rug plots are little dashes drawn beneath the density that show precisely where each data point falls. Adding a rug plot is particularly useful when you don't have a ton of data.
With a small dataset, there are often gaps along the support where nothing was observed, and it can be hard to tell whether a non-zero KDE value reflects actual data or is just the result of a wide kernel. A rug plot resolves this by marking every observation.
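To make the wide-kernel problem concrete, here is a minimal sketch that evaluates a hand-rolled Gaussian KDE in the middle of a gap. The sample values, the bandwidths, and the helper name `make_gaussian_kde` are all made up for illustration. This is the textbook Gaussian KDE formula, not seaborn's implementation (seaborn picks its bandwidth automatically).

```python
import math


def make_gaussian_kde(points, bandwidth):
    """Return a function that evaluates a Gaussian KDE built on `points`."""
    # Normalizing constant so the estimated density integrates to 1.
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return density


# Hypothetical sample with a gap: nothing was observed between 2 and 8.
sample = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]

narrow = make_gaussian_kde(sample, bandwidth=0.3)
wide = make_gaussian_kde(sample, bandwidth=2.0)

gap_center = 5.0  # a point where no data exists
print(f"narrow kernel density at {gap_center}: {narrow(gap_center):.6f}")
print(f"wide kernel density at {gap_center}:   {wide(gap_center):.6f}")
```

With the narrow kernel the density in the gap is effectively zero, while the wide kernel reports a clearly non-zero value there even though no observation is nearby. The KDE curve alone cannot tell you which situation you are in; the rug marks can.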
Let's return to the sns.kdeplot() function to draw two KDEs: one looking at the data for Vandenberg Air Force Base and the other looking at all the other cities in the pollution data. Since there is much less data contributing to the shape of the Vandenberg plot, add a rug plot beneath it.
This exercise is part of the course
Improving Your Data Visualizations in Python
Exercise instructions
- Make the Vandenberg plot 'steelblue'.
- Turn on rug plot functionality in the Vandenberg plot.
- Set the color of the non-Vandenberg plot to 'gray'.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
sns.kdeplot(pollution[pollution.city == 'Vandenberg Air Force Base'].O3,
            label = 'Vandenberg',
            # Turn the color blue to stand out
            color = '____')

# Turn on rugplot
sns.____(pollution[pollution.city == 'Vandenberg Air Force Base'].O3,
         label = 'Vandenberg',
         color = 'steelblue')

sns.kdeplot(pollution[pollution.city != 'Vandenberg Air Force Base'].O3,
            label = 'Other cities',
            # Turn the color gray
            color = '____')
plt.show()