Get startedGet started for free

Pivoting and unpivoting

1. Pivoting and unpivoting

In our final video, we'll learn to reshape data by pivoting and unpivoting DataFrames.

2. Long vs. wide format

Data can come in different formats which we can think of as long or wide format. In long format, each row is a single observation. In this example, each row is a count of a specific bedroom type per property. In wide format, rows contain multiple observations across columns. In this case, we have the bedroom counts by type in its own column. We may need to reshape our data from one format to another for different tasks, such as visualizations.

3. Long vs. wide format

Let's see how to transform from long to wide format with pivot.

4. Pivoting from long to wide

To transform from long to wide format, we use the .pivot() method. This reshapes data so values from one column become new column names.

5. Pivoting from long to wide

The "on" parameter in .pivot specifies which column's values will become our new column names - in this case, "bedroom_type".

6. Pivoting from long to wide

The "index" parameter defines which column identifies unique rows in our output - here we use "name".

7. Pivoting from long to wide

The "values" parameter indicates which column contains the values for our table cells - in this case, "count". Our result is now in wide format. Each row now represents one property, each bedroom type is a column, and the counts fill the cells. This makes comparing bedroom types across properties much easier.

8. Pivoting with aggregations

What if multiple values exist for the same index and column combination? Imagine that we had one row per bedroom for each property. Here we see that the Waves property has two rows where the bedroom_type is single.

9. Pivoting with aggregations

In this case, we add the "aggregate_function" parameter in .pivot to combine multiple values. Common options include "sum", "mean", "min" and "max". Here we use "sum" to add all bedroom counts for each property and type. Notice, for example, that Waves has 2 single beds in the output.

10. Unpivoting from wide to long

Now we do the reverse operation which converts wide data back to long format, suitable for visualizations.

11. Unpivoting from wide to long

Polars provides the .unpivot method for this operation. In Pandas, this method is called .melt.

12. Unpivoting from wide to long

The "index" parameter in unpivot specifies which columns remain unchanged - here the name column stays as our identifier.

13. Unpivoting from wide to long

The "on" parameter lists which columns to convert into rows - here we're unpivoting the "doubles" and "singles" columns. We then get the output in long format. The variable column shows which column the data came from and the value column shows the data.

14. Unpivoting from wide to long

We can also choose to set the name of the variable and value columns in the output with the variable_name and value_name arguments to unpivot. Here we set them to bedroom_type and count respectively.

15. Let's practice!

Now it's time for you to practice reshaping your data.

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.