1. Comparing dates
While each business is unique and has their own metrics to analyze, there is one common factor across all businesses: dates.
2. Questions to answer
When given a data set with a date field, there are many questions you can ask. In this lesson, we'll answer two questions in particular. First, what does our month-over-month performance look like? Second, what does our rolling 7 day performance look like? We'll be using this simple table that shows revenue by date.
3. Month-over-month comparison
First off, let's look into the revenue from last month vs the previous month, also known as a month-over-month comparison. We do this by using one of two window functions, either the lag or lead function. The lag function returns a value from a previous row, while the lead function returns a value from a later row. Either can work as long as the report is ordered correctly. In our example, let's use the lag function.
4. Month-over-month comparison
The first step when building this report is to set up a query that pulls revenue by month. To do this, we need to turn each date into a month. We can use the date part function to accomplish this. The second step is to include last month's revenue in the report. This is where we add the lag function. Since lag outputs a previous row, we must order the report in ascending order. We also must order it by the date part calculation, not date, since we should order it by month. We set the offset number, or the second argument of the lag function, to 1, which indicates we only want to offset one row. Note that if you omit this argument, it defaults to one, which is what we do here.
5. Month-over-month comparison
Lastly, we can run a ratio of the previous two steps to get month-over-month change. Note that if you subtract 1 after the ratio, it gives an intuitive number centered around 0, where negative numbers indicate a drop in revenue while positive numbers indicate growth.
6. Month-over-month comparison
We can clearly see from the results that we dropped 6% revenue in February and increased 50% revenue in March.
7. Rolling calculations
Next question is to look at a 7-day rolling revenue value. This requires us to change our window. Previously, we have only taken into account the entire table or the entire partition, but in this case we only want to take into account exactly 7 rows. To specify how many rows should be calculated, we must use the rows between clause. This is an additional clause that comes after the order statement in a window function and identifies how many rows should be included.
8. Rolling calculations
To view the rolling 7 day revenue for each date, we can add a new window function using the rows between clause mentioned in the previous slide. In this case, we want to include the current row and take the 6 rows prior, since the current row counts towards our 7 rows. Again, since we have a group by in the query, we need a SUM of a SUM.
9. New table: web_data
To explain this concept, you are going to practice on a new table called web_data. Here's how this works. Your company has built a website with a dedicated page for each country to help advertise the Olympics. This table tracks views to each page by date.
10. Let's practice!
So there are just a few ways you can compare dates, but there are many other approaches as well. Let's get some practice comparing dates in our queries.