1. Survey of useful functions
Welcome back!
In the first three chapters of this course, you wrote queries to calculate many key performance indicators, or KPIs.
In this last chapter, you'll write queries to produce reports of those KPIs. In this video, you'll learn two important functions for producing reports.
2. Dealing with dates
Dates are everywhere in business data; you've dealt with dates throughout this course.
You've truncated dates using DATE_TRUNC, and you've cast the output to DATE to drop the hours, minutes, and so on.
However, human-readable dates are important in reporting, and the default date format isn't very readable. How do you get from the default date format to something more readable?
TO_CHAR is a function that formats dates based on the format patterns passed to the function. Let's explore how TO_CHAR works.
3. TO_CHAR()
TO_CHAR takes two inputs, a date and a format string. Its output is a string of the date formatted according to the patterns in the format string.
For example, the pattern Dy represents the abbreviated day name. Take the first two days of June 2018. June 1 is a Friday and June 2 is a Saturday. Passing the first of June 2018 with a format string of Dy returns Fri, the abbreviation for Friday. Passing the next day returns Sat, and so on.
Patterns in the format string will automatically be replaced with what they represent, extracted from the date. All other characters will remain as-is in the output.
DD represents the date's day number. Dash is not a pattern. Adding dash and DD to the format string adds a dash and the day's number to the output.
4. TO_CHAR Patterns
Here are some patterns and what they represent. Placing a pattern in the format string will automatically replace the pattern by what it represents, extracted from the date. Read the cited documentation for many more patterns.
5. TO_CHAR
Let's see TO_CHAR in action.
This query selects the first three dates of Delivr's operations, the first three days of June. These dates in the order date column are passed to TO_CHAR with two different format strings.
In the output, the two produce different results. For example, in the first format, the full month pattern is used, so the full month is in the output, whereas in the second only the abbreviated month is used.
6. Window functions revisited
Recall window functions from Chapter 2.
You calculated the registrations running total with SUM, and you used LAG to fetch the previous month's active users, or MAU, to calculate the MAU growth rate.
A third useful window function is RANK, which assigns a rank to each row based on that row's position in a sorted partition. For example, ranking by revenue ranks users, eateries, or months by how much revenue they generated.
Let's see RANK in action.
7. RANK()
This query calculates the revenue from each user, sorts users by revenue in descending order, then selects the top 3 users by revenue. Ranked by revenue, User ID 18's rank is 1, 76's rank is 2, and so on.
8. RANK (II)
After wrapping the previous query in a CTE, selecting user ID and using RANK over the CTE ordered by revenue in descending order provides the top rank to the highest-revenue user, the second-top rank to the second-highest revenue generating user, and so on.
Be careful to sort by the correct column when using RANK, because it uses the sort order to assign ranks to rows.
9. Survey of useful functions
You'll practice using these two functions in the following exercises to use them later when writing queries for reports.