1. Analyzing coffee sales by store
I’m working with my Chief Revenue Officer on an important analysis. In this scenario, we want to compare the performance of our stores, taking into account differences in size, sales volume, and revenue. Our goal is to find a way to evaluate stores fairly, despite these differences.
To start, I’ll navigate to my schema containing our coffee data and take a quick look at the sales table. Since we need to analyze sales at the store level, I’ll perform some aggregations. Specifically, I’ll calculate the total number of sales and the total revenue, grouping the results by store_id. For this, I’ll use the COUNT() and SUM() functions, with store_id in the GROUP BY clause.
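The aggregation query might look something like the sketch below; note that the revenue column name, amount, is an assumption about the sales table rather than something confirmed in the data.

```sql
-- Aggregate the sales table to the store level.
-- The column name "amount" is assumed for illustration.
SELECT
    store_id,
    COUNT(*)    AS total_sales,    -- number of sales per store
    SUM(amount) AS total_revenue   -- revenue per store
FROM sales
GROUP BY store_id;
```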
With this aggregated data ready, I’ll join it with the stores table, which contains details about each store’s size. I’ll use a sub-query for the sales data, placing it within a larger query and performing a LEFT JOIN to bring in the store information. By using a sub-query, I can treat the aggregated sales data as a virtual dataset without creating a new table or view. This approach keeps the process efficient while combining all the necessary data in one place.
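Here is a rough sketch of that structure, with the aggregated sales wrapped in a sub-query and the stores table joined on; the column names amount and store_size are assumptions for illustration.

```sql
-- Use the aggregated sales data as a virtual dataset (sub-query),
-- then LEFT JOIN the stores table to bring in each store's details.
-- Column names "amount" and "store_size" are assumed for illustration.
SELECT
    s.store_id,
    st.store_size,
    s.total_sales,
    s.total_revenue
FROM (
    SELECT
        store_id,
        COUNT(*)    AS total_sales,
        SUM(amount) AS total_revenue
    FROM sales
    GROUP BY store_id
) AS s
LEFT JOIN stores st
    ON s.store_id = st.store_id;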
Now that the dataset is complete, I’m ready to apply ranking logic. I’ll create three different ranks: one for store size, one for total sales, and one for total revenue. Using the RANK() function, I’ll generate each rank by ordering over the corresponding metric, allowing me to see how each store compares across these dimensions. Once the query runs, I’ll have a clear view of each store’s performance rankings.
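Building on the joined dataset above, the ranking step could look roughly like this; again, amount and store_size are assumed column names.

```sql
-- Rank each store on size, sales volume, and revenue.
-- Ordering each metric in descending order means rank 1 is the
-- largest or best-performing store on that dimension.
SELECT
    s.store_id,
    st.store_size,
    s.total_sales,
    s.total_revenue,
    RANK() OVER (ORDER BY st.store_size   DESC) AS size_rank,
    RANK() OVER (ORDER BY s.total_sales   DESC) AS sales_rank,
    RANK() OVER (ORDER BY s.total_revenue DESC) AS revenue_rank
FROM (
    SELECT
        store_id,
        COUNT(*)    AS total_sales,
        SUM(amount) AS total_revenue
    FROM sales
    GROUP BY store_id
) AS s
LEFT JOIN stores st
    ON s.store_id = st.store_id;
```

Comparing a store's sales or revenue rank against its size rank then gives a fairer picture of performance: a small store that ranks high on revenue is outperforming its footprint.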
These insights will help us make data-driven decisions, such as where to invest more resources or which stores might need additional support.
This scenario highlights how sub-queries and window functions can simplify complex calculations and provide deeper insights. Now, let’s apply these techniques to our insurance data!
2. Let's practice!