
Wide to long function

1. Wide to long function

In addition to melt, there is another function that can help us transform the data from wide to long, the wide_to_long function.

2. Wide to long transformation

Let's look at the following DataFrame. We can see that the names of some columns are similar. There are two columns that start with age, and two that start with weight. Each pair of columns holds the same variable for different years.

3. Wide to long transformation

If we would like to transform it into a long DataFrame like the one we see in the slide, we cannot do it with melt alone. We need another function, the wide_to_long function. Notice that this is a pandas function, and not a DataFrame method.

4. Wide to long function

This function takes several arguments. The first one is the DataFrame we want to transform.

5. Wide to long function

The next one is the stubnames argument. With it, we can specify the prefix, which is how the names of the wide columns start. In our example, we know our columns start with age and weight.

6. Wide to long function

The j argument tells pandas how we want to name the column that contains the suffix or the end of the wide columns. In our case, we will call it year.

7. Wide to long function

Finally, the i argument takes the column or list of columns we will use as unique identifiers. In our case, it's the name. Notice that this column, together with the new year column, will form the index of the long DataFrame.
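The four arguments described above can be put together in a short sketch. The DataFrame below is made up for illustration (the names and values are not from the slides); only the column naming pattern, age and weight followed by a year, follows the example.

```python
import pandas as pd

# Hypothetical data: names and values are invented for illustration.
people = pd.DataFrame({
    "name": ["Ana", "Ben"],
    "age2019": [30, 41],
    "age2020": [31, 42],
    "weight2019": [60, 82],
    "weight2020": [62, 80],
})

long_people = pd.wide_to_long(
    people,                       # the DataFrame to transform
    stubnames=["age", "weight"],  # prefixes of the wide columns
    i="name",                     # column(s) that uniquely identify each row
    j="year",                     # name of the new column holding the suffix
)

# name and year form the index; age and weight are the value columns.
print(long_people)
```

Because the suffixes here are numeric, pandas converts the year values to integers, so a row can be selected with something like long_people.loc[("Ana", 2020)].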

8. Reshaping data

Let's see an example. We have the following dataset.

9. Reshaping data

We will apply the wide_to_long function, passing in the books DataFrame,

10. Reshaping data

telling pandas our columns have the prefixes ratings and sold,

11. Reshaping data

that we want to call the new column holding the suffix year,

12. Reshaping data

and that the title column should be the unique index. We can see in the output our new long DataFrame. Now, title and year are indexes, while the columns ratings and sold contain the values for each year.
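As a rough sketch of this call, assuming a books DataFrame whose wide columns are ratings and sold followed directly by a year (the titles and numbers below are placeholders, not the data from the slide):

```python
import pandas as pd

# Hypothetical stand-in for the books dataset shown on the slide.
books = pd.DataFrame({
    "title": ["Book A", "Book B"],
    "ratings2019": [4.1, 3.8],
    "ratings2020": [4.3, 3.9],
    "sold2019": [1200, 800],
    "sold2020": [1500, 950],
})

long_books = pd.wide_to_long(
    books,
    stubnames=["ratings", "sold"],  # prefixes of the wide columns
    i="title",                      # unique identifier
    j="year",                       # new column built from the suffix
)
print(long_books)

# If regular columns are preferred over an index, reset_index() moves
# title and year back into the body of the DataFrame.
flat_books = long_books.reset_index()
```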

13. DataFrame with index

It is important to mention that if we have a DataFrame with a named index, as you see in the example, and we apply the wide_to_long function, the resulting DataFrame will not keep the original index.

14. DataFrame with index

If we want to keep it, we modify the original DataFrame by resetting the index without dropping it, and then apply the transformation including the new column. As we can see in the output, the title is now part of the long DataFrame.
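A minimal sketch of both cases follows. It assumes the named index is title and that a separate author column serves as the identifier; the author column and all values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: title is a named index, author is a regular column.
books_with_index = pd.DataFrame(
    {
        "author": ["Author X", "Author Y"],
        "ratings2019": [4.1, 3.8],
        "ratings2020": [4.3, 3.9],
        "sold2019": [1200, 800],
        "sold2020": [1500, 950],
    },
    index=pd.Index(["Book A", "Book B"], name="title"),
)

# Applied directly, the original title index is not carried over.
lost = pd.wide_to_long(
    books_with_index, stubnames=["ratings", "sold"], i="author", j="year"
)

# Resetting the index (without dropping it) turns title into a column,
# which can then be listed among the identifiers.
flat = books_with_index.reset_index(drop=False)
kept = pd.wide_to_long(
    flat, stubnames=["ratings", "sold"], i=["title", "author"], j="year"
)
print(kept)
```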

15. sep argument

This new DataFrame is very similar to the previous one, but the column names contain an underscore between the prefix, ratings or sold, and the suffix, the year.

16. sep argument

If we apply the transformation as before, we'll get an empty DataFrame. This happens because pandas doesn't recognize the column names. By default, it assumes that the prefix is immediately followed by a numeric suffix.

17. sep argument

To overcome this, we can use the sep argument. We specify that the separator element is an underscore. Now, pandas understands that the prefix ratings or sold is separated by an underscore from the year, and returns the correct DataFrame.
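A small sketch of the sep argument, again with placeholder data whose columns follow the prefix, underscore, year pattern described above:

```python
import pandas as pd

# Hypothetical data: an underscore separates the prefix from the year.
books_sep = pd.DataFrame({
    "title": ["Book A", "Book B"],
    "ratings_2019": [4.1, 3.8],
    "ratings_2020": [4.3, 3.9],
    "sold_2019": [1200, 800],
    "sold_2020": [1500, 950],
})

long_sep = pd.wide_to_long(
    books_sep,
    stubnames=["ratings", "sold"],
    i="title",
    j="year",
    sep="_",  # character(s) between the prefix and the suffix
)

# The separator is stripped: the value columns are named ratings and sold.
print(long_sep)
```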

18. suffix argument

Finally, if the names of the wide columns do not end in a number, like in the DataFrame you see in the slide,

19. suffix argument

and we apply the same transformation as before, we'll get an empty DataFrame, since by default pandas assumes the suffixes are numeric.

20. suffix argument

To solve this, we use the suffix argument. We pass the following regular expression: backslash w plus (\w+). This expression tells pandas that the column name ends in one or more word characters rather than in digits only. Now, pandas recognizes the names of the columns and the correct DataFrame is returned.
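The suffix argument might be used as in the sketch below. The column names (ending in first and second), the edition label, and the values are all assumptions chosen for illustration, since the slide's exact data is not reproduced here.

```python
import pandas as pd

# Hypothetical data: the suffixes are words, not numbers.
books_words = pd.DataFrame({
    "title": ["Book A", "Book B"],
    "ratings_first": [4.1, 3.8],
    "ratings_second": [4.3, 3.9],
    "sold_first": [1200, 800],
    "sold_second": [1500, 950],
})

long_words = pd.wide_to_long(
    books_words,
    stubnames=["ratings", "sold"],
    i="title",
    j="edition",     # hypothetical name for the suffix column
    sep="_",
    suffix=r"\w+",   # regular expression: one or more word characters
)

# The suffix values stay as strings because they are not numeric.
print(long_words)
```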

21. Let's practice!

You have learned about using the wide to long function. It's time to practice!