Using FacetGrid, catplot and lmplot

1. Using FacetGrid, catplot and lmplot

One of Seaborn's most powerful features is its ability to combine multiple smaller plots into a larger visualization that can help identify trends in data with many variables. This video will discuss the concepts behind this visualization tool and the differences between FacetGrid(), catplot(), and lmplot().

2. Multiple plots of data

The concept of small multiples is useful for analyzing data with many variables. The idea is that you can quickly identify trends in data by comparing multiple plots side by side using the same scales and axes. These plots are referred to as trellis or lattice plots. In data science, this concept is also frequently called faceting. In this specific example, we can look at the college tuition data and how it varies across the type of degree the school provides, the region, and whether the school is a public or private institution.

3. Tidy data

One very important requirement for Seaborn to create these plots is that the data must be in tidy format. This means that each row of the data is a single observation and the columns contain the variables. Once the data is in this format, Seaborn can perform a lot of the heavy lifting needed to create these small multiple plots.
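As a minimal sketch of what tidy format means in practice, the snippet below reshapes a small wide table into one row per observation using pandas `melt()`. The table and its column names (`school`, `tuition_2019`, `tuition_2020`) are made up for illustration and are not part of the college data set used in this video.

```python
import pandas as pd

# Hypothetical wide table: one row per school, one column per year.
wide = pd.DataFrame({
    "school": ["A", "B"],
    "tuition_2019": [10000, 20000],
    "tuition_2020": [11000, 21000],
})

# Tidy (long) form: each row is a single observation (school, year, tuition).
tidy = wide.melt(id_vars="school", var_name="year", value_name="tuition")
print(tidy)
```

After reshaping, `year` is an ordinary column, so it could be used as a row, column, or hue variable when faceting.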

4. FacetGrid

Seaborn's FacetGrid() manages the back-end data manipulations to make sure that the data is split across rows, columns, and hue, and then used to make the appropriate plot type. The key point to remember is that FacetGrid() provides a lot of flexibility, but you must use a two-step process of defining the facets and mapping the plot type.

5. FacetGrid Categorical Example

This example shows how to map a boxplot onto a data-aware FacetGrid(). The first step is to set up FacetGrid() with the column defined as the Highest Degree awarded by the school. The next step is to plot a boxplot of the Tuition values. In this case, we also define the order we want the degrees to be displayed in. This example could be expanded to include other variables to divide the data by rows.
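The two steps described above might look roughly like the sketch below. It uses a small synthetic DataFrame rather than the real college data, and the column names `Degree_Type` and `Tuition` are assumptions chosen for illustration. The display order is set here with FacetGrid's `col_order` argument, which is one way to control it; the code on the slide may pass the order differently.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the college data (hypothetical column names and values).
rng = np.random.default_rng(0)
degrees = ["Associates", "Bachelors", "Graduate"]
df = pd.DataFrame({
    "Degree_Type": np.repeat(degrees, 30),
    "Tuition": rng.uniform(5_000, 40_000, size=90),
})

# Step 1: define the facets, one column per degree type, in a fixed order.
g = sns.FacetGrid(df, col="Degree_Type", col_order=degrees)

# Step 2: map a plot type onto each facet.
g.map_dataframe(sns.boxplot, x="Tuition")
```

Passing `row=` as well in step 1 would divide the data by a second variable.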

6. catplot()

Seaborn's FacetGrid() is very powerful and flexible but involves multiple steps to create. The catplot() function is a shortcut for creating FacetGrids. The returned value is still a FacetGrid, but the process for creating one is much simpler. The single catplot() function takes care of the two-step process for you.
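A hedged sketch of the same kind of faceted box plot in a single catplot() call follows. The data is synthetic and the column names `Degree_Type` and `Tuition` are assumptions, not the actual columns of the college data set.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the college data (hypothetical column names and values).
rng = np.random.default_rng(0)
degrees = ["Associates", "Bachelors", "Graduate"]
df = pd.DataFrame({
    "Degree_Type": np.repeat(degrees, 30),
    "Tuition": rng.uniform(5_000, 40_000, size=90),
})

# One call defines the facets and draws the box plots.
g = sns.catplot(data=df, x="Tuition", kind="box",
                col="Degree_Type", col_order=degrees)

# The returned object is a FacetGrid, so it can be customized further.
print(type(g))
```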

7. FacetGrid for regression

FacetGrid() also supports standard matplotlib plotting functions. In this example, we can look at a simple scatter plot of Tuition compared to SAT Average across the different degree categories. We can use the same two-step setup and mapping process as we did for the box plot.

8. lmplot

The lmplot() function is similar to the catplot() function. It provides a shortcut for plotting regression and scatter plots on FacetGrids. In this example, we create a plot that is similar to the FacetGrid() scatter plot. We have also disabled the regression lines by passing fit_reg=False.
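A minimal sketch of that lmplot() call is shown below, again on synthetic data with assumed column names (`Degree_Type`, `Tuition`, `SAT_AVG_ALL`).

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the college data (hypothetical column names and values).
rng = np.random.default_rng(0)
degrees = ["Associates", "Bachelors", "Graduate"]
sat = rng.uniform(800, 1500, size=90)
df = pd.DataFrame({
    "Degree_Type": np.repeat(degrees, 30),
    "SAT_AVG_ALL": sat,
    "Tuition": 20 * sat + rng.normal(0, 3000, size=90),
})

# fit_reg=False draws only the scatter points, with no regression line.
g = sns.lmplot(data=df, x="Tuition", y="SAT_AVG_ALL",
               col="Degree_Type", col_order=degrees, fit_reg=False)
```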

9. lmplot with regression

Here, the data is filtered to look only at Regions 4 and 5 and only at schools whose highest degree is a Bachelor's or Graduate degree. The example shows how to assign the Highest Degree offered to the columns and the Region to the rows. Behind the scenes, Seaborn splits the data so that each plot shows only the subset that belongs to its row and column.
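One way to get this kind of row and column layout is sketched below. It restricts the facets with lmplot()'s `col_order` and `row_order` arguments, which is an assumption about how the filtering is done; the slide's code may filter the DataFrame beforehand instead. The data is synthetic and the column names `Degree_Type`, `REGION`, `Tuition`, and `SAT_AVG_ALL` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # optional: headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in: 20 schools for every (region, degree) combination.
rng = np.random.default_rng(0)
degrees = ["Associates", "Bachelors", "Graduate"]
n = 180
sat = rng.uniform(800, 1500, size=n)
df = pd.DataFrame({
    "Degree_Type": np.tile(np.repeat(degrees, 20), 3),
    "REGION": np.repeat([3, 4, 5], 60),
    "SAT_AVG_ALL": sat,
    "Tuition": 20 * sat + rng.normal(0, 3000, size=n),
})

# Only the levels listed in col_order and row_order get a facet,
# so each of the 2 x 2 plots shows one degree/region subset.
g = sns.lmplot(data=df, x="Tuition", y="SAT_AVG_ALL",
               col="Degree_Type", col_order=["Bachelors", "Graduate"],
               row="REGION", row_order=[4, 5])
```

Regression fitting is on by default, so each facet gets its own fitted line.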

10. Let's practice!

In this video, we discussed how FacetGrid(), catplot(), and lmplot() can be very powerful tools for creating many small plots of data. In the following exercises, you will get a chance to create some of your own.