1. Ranking functions
Let's learn how to apply ranking functions to time series data using window function SQL.
2. Row number gives sequential numbering
ROW_NUMBER can be used as a ranking function. It numbers all rows sequentially based on the order encountered. For rows that have duplicate values and where the order is not clear the row numbers are assigned arbitrarily. Here it is in a query where we have added a number for each row in the table and named that field, rank.
3. Rank
The RANK function is similar to ROW_NUMBER.
RANK assigns a rank to each row within each partition according to the ordering specified by the ORDER BY clause. It starts with one and if a value is repeated, it gets the same ranking.
4. Rank vs. row
Here is a comparison of the RANK and ROW_NUMBER functions looking at station three. Both show the same number of results.
Notice the result of RANK has two rows for the RANK of one because temperatures were the same. It then also skipped RANK two.
That may not be ideal in all cases, perhaps we don't want to skip RANKs.
5. Dense rank
In that case, we can use the DENSE_RANK function.
The syntax is the same as we've been seeing so far.
6. Rank vs. dense rank
As we saw before, the RANK function repeated a rank when it encountered the same value again, but skipped the next one.
Dense rank also repeats a rank but does not skip.
Reviewing the outputs here shows that dense rank included a rank of two after the two values ranked at one.
We can also see that the maximum rank of the rank function visible here is eight, while the maximum for dense rank is seven.
7. Percent rank
Let's review one more ranking function. PERCENT_RANK. It works the same way as the other ranking functions we have encountered, except it returns a ranking percentage.
The ranking percentage is calculated by taking the rank and subtracting one, the total rows in the partition and subtracting one, and dividing the two. The values will be a float ranging from zero to one.
The syntax is the same as the other ranking functions.
8. Percent rank output
Here is an example of the output. The decimals have been shortened to fit the screen.
There are more rows in the final output than what is visible on the screen, but the last row will always have a percent rank of one.
9. Let's practice!
Let's practice!