1. Analytic functions
SQL Server provides a useful set of analytic functions.
Similar to the aggregate functions, these are used to calculate an aggregate value on a group of rows. The difference is that the analytic functions are computed on each row, instead of working on groups of data.
Let's take them one by one.
2. FIRST_VALUE()
FIRST_VALUE() returns the first value in an ordered set of data.
If you want to show in a query the minimum salary from each department per row, this is the function you can use.
FIRST_VALUE() is used in combination with the OVER() clause.
OVER() has the following components:
PARTITION BY, used to split the output of the query into separate result sets. The function will then return a value for each partition. Using PARTITION BY is optional.
ORDER BY is mandatory, because analytical functions are applied on an ordered result set.
With the row or range frame, you establish the limits for the partitions on which the analytical function will be applied.
3. LAST_VALUE()
LAST_VALUE() is similar to FIRST_VALUE(), returning the last value in an ordered set.
For calculating the last value for different groups of data, you need to make sure you know how each group or partition is created.
4. Partition limits
The limits of each partition can be explicitly specified after ORDER BY from the OVER clause.
You need to mention the start and end boundaries for the partitions.
Discussing the boundaries in more detail is outside the scope of this course. What you need to remember for now is that analytic functions are applied by default from the first row of the partition until the current row. If you want to apply the function on the entire subset of data, you need to explicitly add this clause: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
This means that all rows will be taken into consideration, from the first one in the partition until the last.
5. FIRST_VALUE() and LAST_VALUE() example
Let's see an example.
You want to compare the total votes of each person with the minimum number of votes recorded by a person and with the maximum. You want to divide the voters per gender.
This is how the query looks like.
These are the results returned.
First, information is shown for all female voters.
On the same row, you see the votes for each person, the minimum number of votes registered by a female participant and the maximum number. Then, the same info is shown for the male voters. Notice that the minimum and maximum values are different.
6. LAG() and LEAD()
Another pair of useful functions is LAG() and LEAD().
You can use LAG() in SELECT statements for comparing values from the current row with values from the previous one. For example, you can compare the profit made by a company in a year with the profit from the previous year.
With LEAD(), you can access values from a subsequent row in the same result set.
7. LAG() and LEAD() example
Let's look at a simple example from the "ratings" table.
In this query, I select information about the different types of chocolates from the company "Felchlin", their cocoa percentage and the rating received.
I want to compare the percentage of each bar with the one of the bar that received the nearest lower rating and also higher rating.
Using the LAG() function, you can compare a value with one from the previous row. With LEAD(), you do the same comparison with a value from the next row.
What's nice is that you can see these values per each row.
8. Let's practice!
Are you ready to test your knowledge of the analytic functions? I'm sure you are.