
Seaborn bar plots

1. Seaborn bar plots

Bar charts may be one of the most common visualizations, but creating them using a categorical Series can be a whole new animal.

2. Traditional bar chart

Let's look at a typical bar chart. We have provided the code for clarity, but this type of visual creation will not be covered in this course. This bar chart shows the number of reviews in the dataset by traveler type. Couples were the most common, with over 200 reviews, while solo travelers were the least common, with fewer than 30 reviews. This is a simple and useful summary of this variable. However, Seaborn bar charts serve a different purpose. Our goal is to summarize a numerical variable across the different levels of a categorical variable.
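The slide's code is not reproduced in this transcript. As a rough sketch of how such a count-based bar chart might be built with pandas, assuming a DataFrame named `reviews` with a `Traveler type` column (both names and all values here are made up for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical stand-in for the reviews dataset; values are invented
reviews = pd.DataFrame({
    "Traveler type": ["Couples", "Couples", "Couples", "Families", "Families", "Solo"],
})

# Count how many reviews fall into each traveler type (sorted from most to least common)
counts = reviews["Traveler type"].value_counts()

# Draw one bar per category; the bar height is the number of reviews
ax = counts.plot.bar()
ax.set_ylabel("Number of reviews")
```

Here the bar height is a simple count, which is different from the Seaborn bar plot that summarizes a numerical variable.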

3. The syntax

It's probably no surprise, but the syntax is almost identical to that of creating a boxplot. The only difference is that the kind parameter is set to bar, instead of box. The resulting bar chart looks a little different, and has some funny black lines. The height of each bar is a point estimate for the mean of the data, while the black line represents a confidence interval for that value. Confidence intervals are common in statistics, and in this context the intervals roughly represent a range of values that we are 95% confident contains the true mean of the data. For example, if we looked at the distribution of the Score among those with a traveler type of friends, the estimated mean of that data would be just above four, and the confidence interval would be quite small. The solo mean, by contrast, is below four and has a fairly large confidence interval.
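A minimal sketch of the call described above, assuming a DataFrame named `reviews` with `Traveler type` and `Score` columns (the names and the tiny dataset are assumptions for illustration, not the course data):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the reviews dataset; values are invented
reviews = pd.DataFrame({
    "Traveler type": ["Friends", "Friends", "Friends", "Solo", "Solo", "Solo"],
    "Score": [4, 5, 4, 2, 5, 3],
})

# kind="bar" draws one bar per category; by default the bar height is the mean of y
# and the line on each bar is a bootstrapped 95% confidence interval
g = sns.catplot(x="Traveler type", y="Score", data=reviews, kind="bar")

# The same point estimates computed directly, for comparison with the bar heights
means = reviews.groupby("Traveler type")["Score"].mean()
print(means)
```

With only a few rows per group, as in this toy data, the confidence intervals will be wide; more reviews in a category generally give a narrower interval.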

4. Ordering your categories

We previously learned how to create categorical Series, and we can use this to our advantage when creating visualizations. If we set the data type of the traveler type Series to category, the order of the categories displayed in our visualization will be updated. Note that the traveler type categories have been placed in alphabetical order.
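As a small sketch of the conversion described above, assuming the column is called `Traveler type` in a DataFrame named `reviews` (both names are assumptions), converting string values to the category dtype sorts the categories alphabetically by default:

```python
import pandas as pd

# Hypothetical stand-in for the reviews dataset; values are invented
reviews = pd.DataFrame({
    "Traveler type": ["Solo", "Couples", "Friends", "Business", "Families"],
})

# Convert the column to a categorical Series
reviews["Traveler type"] = reviews["Traveler type"].astype("category")

# The categories are stored in alphabetical order, regardless of row order
print(list(reviews["Traveler type"].cat.categories))
```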

5. Updated visualization

Here is the same visualization, but with traveler type displayed in the order of the categories. Note that the catplot function has a parameter called order, but not all visualization methods have this parameter. It's best practice for us to order our categories outside of the catplot function so that all of our visualizations use the same ordering.
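If an order other than alphabetical is wanted, one way to set it outside of catplot is to reorder the categories on the Series itself. This is only a sketch: the DataFrame `reviews`, its column names, its values, and the `custom_order` list are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the reviews dataset; values are invented
reviews = pd.DataFrame({
    "Traveler type": ["Solo", "Couples", "Friends", "Business", "Families", "Couples"],
    "Score": [3, 5, 4, 4, 5, 4],
})
reviews["Traveler type"] = reviews["Traveler type"].astype("category")

# Illustrative ordering; it must contain exactly the existing categories
custom_order = ["Families", "Couples", "Friends", "Business", "Solo"]
reviews["Traveler type"] = reviews["Traveler type"].cat.reorder_categories(custom_order)

# Anything that respects category order now follows custom_order,
# for example a grouped summary
means = reviews.groupby("Traveler type", observed=True)["Score"].mean()
print(means)
```

Because the order lives on the column rather than in a plotting call, other summaries and plots built from the same column can pick it up without repeating the list.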

6. The hue parameter

Sometimes visualizing the data across one variable isn't enough, and we want to split the data a second time. The hue parameter can be used for this. Hue is set to a variable in the dataset and is used to split the data a second time. It also tells Seaborn to color the graphic by this variable. In this example we want to look at the Score variable across traveler type and tennis court.
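A minimal sketch of adding hue to the bar plot, assuming columns named `Traveler type`, `Score`, and `Tennis court` in a DataFrame named `reviews` (the names, the YES/NO coding, and the values are assumptions for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the reviews dataset; values are invented
reviews = pd.DataFrame({
    "Traveler type": ["Business"] * 4 + ["Couples"] * 4,
    "Tennis court": ["YES", "YES", "NO", "NO"] * 2,
    "Score": [3, 4, 4, 5, 5, 4, 3, 4],
})

# hue splits each traveler type into one colored bar per tennis court value
g = sns.catplot(
    x="Traveler type", y="Score", data=reviews, kind="bar", hue="Tennis court"
)

# The estimated means behind each bar, computed directly
means = reviews.groupby(["Traveler type", "Tennis court"])["Score"].mean()
print(means)
```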

7. Bar plot across two variables

It looks like tennis courts may not persuade business travelers to give a high rating, but most of the other traveler types gave hotels higher scores if they had tennis courts. We generally won't use a bar plot to compare distributions across categories like we would with a box plot, but this approach gives us a quick way to compare the estimated mean of the Score across different categories.

8. Bar plot practice

Time to practice what we have learned in this video.