Grouping and summing: the beginner's pivot table
1. Grouping and summing: the beginner's pivot table
Let's explore another common spreadsheet component, pivot tables, and how to achieve this same view in Python.2. Fruit stores
For this, we'll use an example dataset of purchases from different fruit stores. There are about 30 rows in this dataset.3. Fruit stores
Each row tells us at which store the fruit was purchased,4. Fruit stores
the name of the fruit purchased,5. Fruit stores
the quantity purchased,6. Fruit stores
and the revenue generated from that sale.7. Pivoting in spreadsheets
In spreadsheets, to pivot the data, we'd highlight the raw table, insert a pivot table, then use the pivot table editor to create a summarized view of our raw data.8. Pivoting in spreadsheets
Here, we've dragged our numerical quantity-underscore-purchased and revenue fields into the Values section of our Pivot Table Editor. This, by default, results in a pivot table that sums up each field, a total of 57 products purchased and about 133 dollars in revenue. How would we achieve the same result in Python?9. .sum()
To summarize our dataset, represented here in Python as fruit-underscore-sales, we would just use the dot-sum method. We pass the optional argument numeric-underscore-only-equals-True, within the parentheses, so that we only sum up numerical data.10. Pure summary with .sum()
Compared to our spreadsheet pivot table, you can see we've achieved the same result. A total of 57 products purchased, and about 133 dollars in revenue. But often times when pivoting in spreadsheets, we care about more than just a pure summary. In these cases,11. Using pivot table rows in spreadsheets
it's most common to drag some qualitative field into the rows section of the Pivot Table Editor. Here, we've added store to the rows section of the Pivot Table Editor to give us a summary of our data by store. We can now see Pete's Discount Fruit achieved 75 dollars and 90 cents in revenue, outperforming Derek's Fruit Stand, which only achieved 57 dollars and 22 cents. To achieve this same view in Python,12. A simple pivot table in Python - .groupby().sum()
we must group by store using the dot-groupby method, then follow it with the dot-sum method. Dot-groupby tells Python to apply any subsequent operation to each unique group within the specified column. The process of placing one method next to another is called method-chaining, which is common in Python. Method-chaining allows us to create these powerful one-liners to do things like summarize our datasets.13. A simple pivot table in Python - .groupby().sum()
Practically, this one-liner is the equivalent of dragging store into the rows section of a Pivot Table Editor, then inserting all numerical fields into the values section of the Pivot Table Editor.14. A simple pivot table in Python - .groupby().sum()
Note that the dot-sum method does not require any arguments, and the dot-groupby method is also passed the argument as-underscore-index-equals-False in order to preserve our simple index. In the exercises, this will be done for you.15. Your turn!
Now it's your turn to create some summary tables using groupby and sum. Good luck!Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.