1. More useful summary tools
Grouping and summing is a technique you will use often. Let's explore other tools at your disposal, and how to combine them to answer questions.
2. .mean()
First, instead of using .sum(), we can use the .mean() method to generate tables with numeric columns averaged instead of summed. Recall our fruit_sales table, where every line item is one sale. If we group by store and use .mean() this time, we get the average quantity of fruit purchased per sale at each store, and the average revenue per sale at each store. We can see that Pete's Discount Fruit generates, on average, over one dollar more of revenue per sale than Derek's Fruit Stand, a helpful bit of context as we explore our data.
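To make the group-and-average step concrete, here is a minimal sketch. The column names (store, fruit, quantity, revenue) and every value in the sample table are assumptions invented for illustration; they are not the course's actual data.

```python
import pandas as pd

# Hypothetical sample data: column names and values are invented for illustration.
fruit_sales = pd.DataFrame({
    "store": ["Pete's Discount Fruit", "Derek's Fruit Stand",
              "Pete's Discount Fruit", "Derek's Fruit Stand"],
    "fruit": ["blueberry", "dragonfruit", "apple", "banana"],
    "quantity": [4, 2, 6, 3],
    "revenue": [8.00, 6.50, 5.00, 3.50],
})

# Group by store, then average the numeric columns (one row per store).
# Selecting the numeric columns explicitly keeps the text column out of the mean.
store_means = fruit_sales.groupby("store")[["quantity", "revenue"]].mean()
print(store_means)
```

Each row of the result is a per-sale average for one store, so the numbers describe a typical transaction rather than the store's total business.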
3. Steps to an answer
We can also systematically reduce our data to just the information we need. The next few steps take us from transactional data to a table that answers a question such as: which fruit brings in the most revenue for each store?
4. 1: Fruit store transactions
First, let's take a quick look at our raw data, where each row is a sale of fruit at a given store.
5. 2: Total sales by fruit for each store
Second, a summary we've seen before, the total number of each fruit sold and its overall revenue for each store, stored here as totals.
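A summary like this might be built as follows. This is only a sketch: the column names and sample rows are assumed for illustration, and passing as_index=False is one possible way to keep store and fruit as ordinary columns.

```python
import pandas as pd

# Hypothetical transactions: one row per sale (values invented for illustration).
fruit_sales = pd.DataFrame({
    "store": ["Pete's Discount Fruit"] * 3 + ["Derek's Fruit Stand"] * 3,
    "fruit": ["blueberry", "blueberry", "apple",
              "dragonfruit", "dragonfruit", "banana"],
    "quantity": [4, 3, 6, 2, 1, 3],
    "revenue": [8.00, 6.00, 5.00, 6.50, 3.25, 3.50],
})

# Group by store and fruit, then sum quantity and revenue within each pair.
# as_index=False keeps "store" and "fruit" as regular columns (an assumption here).
totals = fruit_sales.groupby(["store", "fruit"], as_index=False)[
    ["quantity", "revenue"]
].sum()
print(totals)
```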
6. 3: Sorted sales by fruit for each store
Third, another step we saw earlier: sorting all entries in the table by revenue in descending order and assigning the result back to totals. No .groupby() is needed for this step.
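The sorting step could look something like this. The totals table below is a hypothetical stand-in with invented values, not the course's data.

```python
import pandas as pd

# Hypothetical summary table: one row per store and fruit (values invented).
totals = pd.DataFrame({
    "store": ["Derek's Fruit Stand", "Derek's Fruit Stand",
              "Pete's Discount Fruit", "Pete's Discount Fruit"],
    "fruit": ["banana", "dragonfruit", "apple", "blueberry"],
    "quantity": [3, 3, 6, 7],
    "revenue": [3.50, 9.75, 5.00, 14.00],
})

# Sort every row by revenue, largest first, and assign the result back to totals.
totals = totals.sort_values("revenue", ascending=False)
print(totals)
```

Notice that the rows keep their original index labels after sorting, so the index is no longer in sequential order.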
7. 4: Top row for each store
Finally, something new: we take our totals, group by store, then use the .head() method with the optional argument 1 to keep only the first row for each store. Because totals is already sorted by revenue, that first row is each store's top-earning fruit.
You'll also notice we used the .reset_index() method after .head(). The rows we pull out keep the index labels they had in totals, so resetting gives us a simple, sequential index again. This will be done for you in the exercises.
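Here is a minimal sketch of taking the top row per store. The data is invented for illustration, and using drop=True with the index reset is an assumption; the original slide may have reset the index differently.

```python
import pandas as pd

# Hypothetical summary table, sorted by revenue in descending order (values invented).
totals = pd.DataFrame({
    "store": ["Derek's Fruit Stand", "Derek's Fruit Stand",
              "Pete's Discount Fruit", "Pete's Discount Fruit"],
    "fruit": ["banana", "dragonfruit", "apple", "blueberry"],
    "revenue": [3.50, 9.75, 5.00, 14.00],
}).sort_values("revenue", ascending=False)

# Keep the first row of each store. The rows still carry their old index labels.
top_rows = totals.groupby("store").head(1)

# Replace the leftover labels with a simple 0, 1, ... index.
# drop=True discards the old labels instead of adding them as a column (assumed here).
top_rows = top_rows.reset_index(drop=True)
print(top_rows)
```

Unlike .sum() or .mean(), .head() on a grouped table does not collapse each group into one aggregated row; it returns selected rows from the original table in their existing order.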
8. Steps to an answer
Here's all the code together. Consider why this works: we start with a set of data that contains more information than we actually need, in the form of transactions,
9. Steps to an answer
so we condense this data down to a format that's useful to us, in the form of total sales for each fruit within each store.
10. Steps to an answer
Next, we sort this information in a more meaningful and useful order, descending by revenue, and finally
11. Steps to an answer
we take the top row of our sorted data, using .groupby() first so that we get the top row for each store. Now we know Pete's store could benefit from some more blueberries and Derek's store could use more dragonfruits.
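Putting the condense, sort, and top-row steps together might look like the sketch below. The helper name top_fruit_per_store, the column names, and the sample transactions are all hypothetical, chosen only to illustrate the sequence of steps.

```python
import pandas as pd


def top_fruit_per_store(sales):
    """Hypothetical helper: return the highest-revenue fruit for each store."""
    totals = (
        sales.groupby(["store", "fruit"], as_index=False)[["quantity", "revenue"]]
        .sum()                                    # condense transactions to totals
        .sort_values("revenue", ascending=False)  # most revenue first
    )
    # First row per store is that store's top earner; then tidy the index.
    return totals.groupby("store").head(1).reset_index(drop=True)


# Hypothetical transactions (values invented for illustration).
fruit_sales = pd.DataFrame({
    "store": ["Pete's Discount Fruit"] * 3 + ["Derek's Fruit Stand"] * 3,
    "fruit": ["blueberry", "blueberry", "apple",
              "dragonfruit", "dragonfruit", "banana"],
    "quantity": [4, 3, 6, 2, 1, 3],
    "revenue": [8.00, 6.00, 5.00, 6.50, 3.25, 3.50],
})

top = top_fruit_per_store(fruit_sales)
print(top)
```

The order of operations matters: sorting has to happen before the grouped .head(1), because .head() only knows about row position, not about which revenue is largest.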
12. Your turn!
Now it's your turn to practice these techniques in the exercises.