1. Analyzing coffee sales by store
I’m working with my Chief Revenue Officer on an important analysis. In this scenario, we want to compare the performance of our stores, taking into account differences in size, sales volume, and revenue. Our goal is to find a way to evaluate stores fairly, despite these differences.
To start, I’ll navigate to my schema containing our coffee data and take a quick look at the sales table. Since we need to analyze sales at the store level, I’ll perform some aggregations. Specifically, I’ll calculate the total number of sales and the total revenue, grouping the results by store_id. For this, I’ll use the COUNT() and SUM() functions, with store_id in the GROUP BY clause.
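The aggregation query might look something like the sketch below; note that the revenue column name, amount, is an assumption about the sales table rather than something confirmed in the data.

```sql
-- Aggregate the sales table to the store level.
-- The column name "amount" is assumed for illustration.
SELECT
    store_id,
    COUNT(*)    AS total_sales,    -- number of sales per store
    SUM(amount) AS total_revenue   -- revenue per store
FROM sales
GROUP BY store_id;
```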
With this aggregated data ready, I’ll join it with the stores table, which contains details about each store’s size. I’ll use a sub-query for the sales data, placing it within a larger query and performing a LEFT JOIN to bring in the store information. By using a sub-query, I can treat the aggregated sales data as a virtual dataset without creating a new table or view. This approach keeps the process efficient while combining all the necessary data in one place.
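Here is a rough sketch of that structure, with the aggregated sales wrapped in a sub-query and the stores table joined on; the column names amount and store_size are assumptions for illustration.

```sql
-- Use the aggregated sales data as a virtual dataset (sub-query),
-- then LEFT JOIN the stores table to bring in each store's details.
-- Column names "amount" and "store_size" are assumed for illustration.
SELECT
    s.store_id,
    st.store_size,
    s.total_sales,
    s.total_revenue
FROM (
    SELECT
        store_id,
        COUNT(*)    AS total_sales,
        SUM(amount) AS total_revenue
    FROM sales
    GROUP BY store_id
) AS s
LEFT JOIN stores st
    ON s.store_id = st.store_id;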
Now that the dataset is complete, I’m ready to apply ranking logic. I’ll create three different ranks: one for store size, one for total sales, and one for total revenue. Using the RANK() function, I’ll generate each rank by ordering over the corresponding metric, allowing me to see how each store compares across these dimensions. Once the query runs, I’ll have a clear view of each store’s performance rankings.
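Building on the joined dataset above, the ranking step could look roughly like this; again, amount and store_size are assumed column names.

```sql
-- Rank each store on size, sales volume, and revenue.
-- Ordering each metric in descending order means rank 1 is the
-- largest or best-performing store on that dimension.
SELECT
    s.store_id,
    st.store_size,
    s.total_sales,
    s.total_revenue,
    RANK() OVER (ORDER BY st.store_size   DESC) AS size_rank,
    RANK() OVER (ORDER BY s.total_sales   DESC) AS sales_rank,
    RANK() OVER (ORDER BY s.total_revenue DESC) AS revenue_rank
FROM (
    SELECT
        store_id,
        COUNT(*)    AS total_sales,
        SUM(amount) AS total_revenue
    FROM sales
    GROUP BY store_id
) AS s
LEFT JOIN stores st
    ON s.store_id = st.store_id;
```

Comparing a store's sales or revenue rank against its size rank then gives a fairer picture of performance: a small store that ranks high on revenue is outperforming its footprint.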
These insights will help us make data-driven decisions, such as where to invest more resources or which stores might need additional support.
This scenario highlights how sub-queries and window functions can simplify complex calculations and provide deeper insights. Now, let’s apply these techniques to our insurance data!
2. Let's practice!