1. Categorical Plot Types
In the first two chapters of this course, we covered the basics of how to use the Seaborn API for creating and customizing plots using different Seaborn and matplotlib approaches. These chapters provide the foundation for exploring additional plot types. This lesson will focus on the many different categorical plots that Seaborn supports.
2. Categorical Data
In our earlier exercises, we looked at distribution and linear regression plots, which are used with numerical values. Seaborn also supports many plot types for categorical data. Categorical data takes on a limited or fixed number of values and is most useful when combined with numeric data. For the rest of this lesson, we will be looking at US healthcare reimbursement data related to Renal Failure category codes and their associated reimbursement values. In these examples, the codes are the categorical variable and the average hospital charge is the numerical value we will analyze in our plots.
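To make the pairing of a categorical variable with a numeric one concrete, here is a minimal sketch. The column names "DRG Code" and "Average Covered Charges" and every value in the table are made up for illustration; the real dataset may be laid out differently.

```python
import pandas as pd

# Hypothetical stand-in for the reimbursement data; all values are invented.
df = pd.DataFrame({
    "DRG Code": ["DRG 682", "DRG 683", "DRG 684",
                 "DRG 682", "DRG 683", "DRG 684"],
    "Average Covered Charges": [61250.0, 34800.0, 21900.0,
                                58400.0, 36150.0, 23300.0],
})

# Mark the code column as categorical: it takes only a few distinct values.
df["DRG Code"] = df["DRG Code"].astype("category")
n_codes = df["DRG Code"].nunique()

# The numeric column is what we summarize within each category.
mean_by_code = df.groupby("DRG Code", observed=True)["Average Covered Charges"].mean()
print(n_codes)
print(mean_by_code)
```

Every plot in this lesson is some way of drawing the numeric column separately for each value of the categorical column.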
3. Plot types - show each observation
Seaborn breaks categorical data plots into three groups. The first group includes the stripplot() and the swarmplot(), which show all of the individual observations on the plot.
4. Plot types - abstract representations
The second group contains the familiar boxplot(), as well as the violinplot() and the boxenplot(). These plots show an abstract representation of the categorical data.
5. Plot types - statistical estimates
The final group of plots shows statistical estimates for the categorical variables. The barplot() and pointplot() contain useful summaries of the data. The countplot() shows the number of observations in each category.
6. Plots of each observation - stripplot
Seaborn's stripplot() shows every observation in the dataset. When many observations have similar values, the points overlap and it can be difficult to see individual data points. We can use the jitter parameter to spread the points slightly along the categorical axis, which makes it easier to see how the Average Covered Charges vary by Diagnostic Reimbursement Code.
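A call along these lines would draw the strip plot with jitter. This is only a sketch: the data is synthetic, and the column names "DRG Code" and "Average Covered Charges" are assumptions rather than the names in the course dataset.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the reimbursement data (all values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "DRG Code": np.repeat(["DRG 682", "DRG 683", "DRG 684"], 30),
    "Average Covered Charges": np.concatenate([
        rng.normal(60000, 15000, 30),
        rng.normal(35000, 9000, 30),
        rng.normal(22000, 6000, 30),
    ]),
})

# jitter=True nudges each point along the categorical axis to reduce overlap.
ax = sns.stripplot(data=df, x="Average Covered Charges", y="DRG Code", jitter=True)
# In a script, call matplotlib.pyplot.show() to display the figure.
```

Because the numeric column is on x and the codes are on y, Seaborn draws the strips horizontally.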
7. Plots of each observation - swarmplot
We can plot a more sophisticated visualization of all the data using a swarmplot(). This plot uses a complex algorithm to place the observations so that they do not overlap. The downside of this approach is that the swarmplot() does not scale well to large datasets.
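The call looks almost identical to the strip plot. Here is a sketch on a small synthetic dataset, kept small on purpose since the swarm layout slows down as the number of points grows. The column names are assumptions.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the reimbursement data (all values are made up).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "DRG Code": np.repeat(["DRG 682", "DRG 683", "DRG 684"], 30),
    "Average Covered Charges": np.concatenate([
        rng.normal(60000, 15000, 30),
        rng.normal(35000, 9000, 30),
        rng.normal(22000, 6000, 30),
    ]),
})

# Each observation is drawn once; points are shifted sideways to avoid overlap.
ax = sns.swarmplot(data=df, x="Average Covered Charges", y="DRG Code")
# In a script, call matplotlib.pyplot.show() to display the figure.
```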
8. Abstract representations - boxplot
The next category of plots shows abstract representations of the data. A boxplot() is the most common plot of this type. It shows several measures related to the distribution of the data, including the median, the upper and lower quartiles, and outliers.
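The sketch below draws a box plot and, alongside it, computes the quartiles with pandas so we can see the numbers the boxes are built from. The data is synthetic and the column names are assumptions.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the reimbursement data (all values are made up).
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "DRG Code": np.repeat(["DRG 682", "DRG 683", "DRG 684"], 30),
    "Average Covered Charges": np.concatenate([
        rng.normal(60000, 15000, 30),
        rng.normal(35000, 9000, 30),
        rng.normal(22000, 6000, 30),
    ]),
})

# One box per code: the box spans the quartiles, the inner line is the median.
ax = sns.boxplot(data=df, x="Average Covered Charges", y="DRG Code")

# The same quartiles computed directly, one row per code.
summary = (
    df.groupby("DRG Code")["Average Covered Charges"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(summary)
```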
9. Abstract representations - violinplot
The violinplot() is a combination of a kernel density plot and a box plot, and it provides an alternative view of the distribution of the data. Because the plot uses a kernel density calculation, it does not show all data points. This can be useful for displaying large datasets, but it can be computationally intensive to create.
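As a sketch, a violin plot of the same kind of data could be drawn as follows. The data is synthetic and the column names are assumptions.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the reimbursement data (all values are made up).
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "DRG Code": np.repeat(["DRG 682", "DRG 683", "DRG 684"], 30),
    "Average Covered Charges": np.concatenate([
        rng.normal(60000, 15000, 30),
        rng.normal(35000, 9000, 30),
        rng.normal(22000, 6000, 30),
    ]),
})

# Each violin is a kernel density estimate of the charges for one code.
ax = sns.violinplot(data=df, x="Average Covered Charges", y="DRG Code")
# In a script, call matplotlib.pyplot.show() to display the figure.
```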
10. Abstract representations - boxenplot
The final plot in this grouping is the boxenplot(), which is an enhanced box plot. Its API is the same as that of the boxplot() and violinplot(), but it scales more effectively to large datasets. The boxenplot() is a hybrid between a boxplot() and a violinplot() and is relatively quick to render and easy to interpret.
11. Statistical estimates - barplot
The final category of plots consists of statistical estimates of the data. The barplot() shows an estimate of the value as well as a confidence interval. In this example, we include the hue parameter described in Chapter 1, which provides another useful way for us to look at this categorical data.
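A bar plot with hue might be drawn as in the sketch below. The data is synthetic, and the "Region" column used for hue, like the other column names, is an assumption made for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the reimbursement data (all values are made up).
rng = np.random.default_rng(5)
df = pd.DataFrame({
    "DRG Code": np.repeat(["DRG 682", "DRG 683", "DRG 684"], 30),
    "Region": np.tile(["Northeast", "South"], 45),  # hypothetical grouping column
    "Average Covered Charges": np.concatenate([
        rng.normal(60000, 15000, 30),
        rng.normal(35000, 9000, 30),
        rng.normal(22000, 6000, 30),
    ]),
})

# Bar length is the mean charge per code; hue splits each code by region.
# The error bars are bootstrapped, so seed makes them reproducible.
ax = sns.barplot(data=df, x="Average Covered Charges", y="DRG Code",
                 hue="Region", seed=0)
hue_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(hue_labels)
```

By default the bar shows the mean of each group, with the error bar giving a confidence interval around it.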
12. Statistical estimates - pointplot
The pointplot() is similar to the barplot() in that it shows a summary measure and a confidence interval. A pointplot() can be very useful for observing how values change across categories.
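Here is a sketch of a point plot on synthetic data. The codes are placed on the x axis so the connecting line shows the change from one category to the next. Column names are assumptions.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the reimbursement data (all values are made up).
rng = np.random.default_rng(6)
df = pd.DataFrame({
    "DRG Code": np.repeat(["DRG 682", "DRG 683", "DRG 684"], 30),
    "Average Covered Charges": np.concatenate([
        rng.normal(60000, 15000, 30),
        rng.normal(35000, 9000, 30),
        rng.normal(22000, 6000, 30),
    ]),
})

# Each point is the mean charge for one code, with an interval drawn around it.
ax = sns.pointplot(data=df, x="DRG Code", y="Average Covered Charges", seed=0)
# In a script, call matplotlib.pyplot.show() to display the figure.
```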
13. Statistical estimates - countplot
The final categorical plot is the countplot(), which displays the number of observations in each category of a variable.
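A count plot needs only the categorical column. The sketch below uses a tiny made-up table with an assumed column name and compares the bars with the counts from pandas.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data: five, three and two rows for the three codes.
df = pd.DataFrame({
    "DRG Code": ["DRG 682"] * 5 + ["DRG 683"] * 3 + ["DRG 684"] * 2,
})

# The same counts the plot will show, computed directly.
counts = df["DRG Code"].value_counts()
print(counts)

# One horizontal bar per code; bar length is the number of rows with that code.
ax = sns.countplot(data=df, y="DRG Code")
bar_lengths = sorted(p.get_width() for p in ax.patches)
```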
14. Let's practice!
Now that we have gone through all of the categorical plots available in Seaborn, let's practice making some of our own.