
Reordering categories

1. Reordering categories

We have talked about the ordering of categories several times already, so let's take a look at how to reorder categories in a pandas Series.

2. Why would you reorder?

You might reorder categories in pandas for a few different reasons. First, if the variable wasn't set as an ordinal variable upon creation, you might reorder its categories and set the Series as ordinal at this time. You might also set the order so that an analysis is displayed in a specific order, making the results easier to understand. And finally, it's good to remember that converting a Series to categorical can save on memory.
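To make the memory point concrete, here is a minimal sketch that compares the same values stored as plain strings and as a categorical. The data is made up for illustration, and the exact byte counts depend on your pandas version, so treat the comparison as indicative only.

```python
import pandas as pd

# Hypothetical data: four coat labels repeated many times
coat_obj = pd.Series(["short", "medium", "long", "wirehaired"] * 10_000)

# The same values stored as a categorical Series
coat_cat = coat_obj.astype("category")

# deep=True counts the memory used by the string values themselves
obj_bytes = coat_obj.memory_usage(deep=True)
cat_bytes = coat_cat.memory_usage(deep=True)

print(f"strings:     {obj_bytes} bytes")
print(f"categorical: {cat_bytes} bytes")
```

The saving comes from storing each distinct label once and keeping small integer codes for the rows, so it is largest when a column has few unique values repeated many times.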

3. Reordering example

Let's look at an example. The coat variable has four values. If the wirehaired value falls somewhere between medium and long, we might want to reorder the categories to short, medium, wirehaired, and long. Notice that we also specified that ordered is true here, as these are lengths of a dog's coat, and length has a natural order. As a quick aside, here is an example of using the inplace parameter. By setting this to true, the coat variable is updated without needing to set the variable equal to an updated version of itself. Several functions and methods in pandas have this as a parameter, and it is generally used to reduce the amount of typed code. Note that pandas 2.0 and later no longer accept inplace for reorder_categories, so in those versions you assign the result back to the column instead.
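The reordering described here might look something like the sketch below. The `dogs` DataFrame and its values are assumptions made up for illustration, since the slide's data is not part of this transcript. The sketch assigns the result back to the column rather than passing inplace, which works across pandas versions.

```python
import pandas as pd

# Hypothetical sample data with the four coat values from the example
dogs = pd.DataFrame({
    "coat": ["short", "medium", "long", "wirehaired", "short", "medium"],
})
dogs["coat"] = dogs["coat"].astype("category")

# Reorder the categories and mark the column as ordinal
dogs["coat"] = dogs["coat"].cat.reorder_categories(
    new_categories=["short", "medium", "wirehaired", "long"],
    ordered=True,
)

print(dogs["coat"].cat.categories)
print(dogs["coat"].cat.ordered)
```

Note that `new_categories` must contain exactly the existing categories, just in a different order; reorder_categories raises an error if a category is missing or a new one is added.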

4. Grouping when ordered=True

Here we have borrowed the reorder_categories setup that we used on the previous slide. Now that we have reordered our categories, several methods and visualizations will use this order when printing output. Note that the order of the printout is based on the order given in the new_categories parameter. Whichever order is specified using this parameter will be used. Take a look at this groupby statement. The average age for each group will be shown in the order of the categories of the coat column.
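As a rough illustration of that groupby behavior, the sketch below uses a small made-up `dogs` DataFrame; the `age` column name and all values are assumptions, not the course's data. Passing `observed=False` explicitly is a choice made here to keep the behavior the same across pandas versions.

```python
import pandas as pd

# Hypothetical data: coat length and age for a handful of dogs
dogs = pd.DataFrame({
    "coat": ["long", "short", "wirehaired", "medium", "short", "long"],
    "age": [4.0, 2.0, 6.0, 3.0, 4.0, 8.0],
})

# Ordered categorical: short < medium < wirehaired < long
dogs["coat"] = dogs["coat"].astype("category").cat.reorder_categories(
    new_categories=["short", "medium", "wirehaired", "long"],
    ordered=True,
)

# Groups appear in category order, not alphabetical or row order
mean_age = dogs.groupby("coat", observed=False)["age"].mean()
print(mean_age)
```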

5. Grouping when ordered=False

Let's use another reorder_categories call. In this example, we want the output of our summary statistics to be short, medium, long, and then wirehaired. However, this isn't the natural order because wirehaired is shorter than long. In this context, we will set the ordered parameter equal to false because we don't want the coat column to be treated as an ordinal categorical variable. This means that you can still reorder categories for display purposes without the category being ordinal. Here is the groupby statement followed by the output. We see the order we specified: short, medium, long, and then wirehaired.
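A sketch of this display-only reordering, again with a made-up `dogs` DataFrame and an assumed `age` column, could look like this. The last few lines also show one practical consequence of ordered being false: less-than comparisons on the column are rejected.

```python
import pandas as pd

# Hypothetical data: coat length and age for a handful of dogs
dogs = pd.DataFrame({
    "coat": ["long", "short", "wirehaired", "medium", "short", "long"],
    "age": [4.0, 2.0, 6.0, 3.0, 4.0, 8.0],
})

# Display order only: the column is not treated as ordinal
dogs["coat"] = dogs["coat"].astype("category").cat.reorder_categories(
    new_categories=["short", "medium", "long", "wirehaired"],
    ordered=False,
)

# The groupby output still follows the category order we specified
mean_age = dogs.groupby("coat", observed=False)["age"].mean()
print(mean_age)

# Unordered categoricals only support equality comparisons
try:
    dogs["coat"] < "long"
    comparison_allowed = True
except TypeError:
    comparison_allowed = False
print(comparison_allowed)
```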

6. Reordering practice

Let's practice reordering categories.