Get startedGet started for free

WINDOW Functions

1. WINDOW Functions

Window functions, also known as analytic functions, enable us to perform calculations over a dynamic set of rows, known as a window.

2. What are WINDOW functions

Window functions are powerful tools for extracting hidden patterns across a range of data. Think of it as a 'window' moving through the dataset, enabling calculations like running totals, moving averages, and rank assignments. Consider it like a magnifying glass focusing on specific rows for detailed analysis while considering the broader dataset context. In this example, our window finds the sum of the items in the current row and the prior two rows.

3. When to use WINDOW functions

Window functions are particularly valuable when dealing with data that exhibits trends or patterns over time or across groups. For instance, they can calculate moving averages, rank items based on local criteria, or identify outliers within a specific context. This image illustrates the type of window functions, including aggregates, rankings, and values.

4. WINDOW structure, PARTITION and ORDER BY

When structuring window functions in BigQuery, we specify the function, define the window frame with OVER, apply partitioning or grouping within the window frame, and order as needed. This simplified example shows us the core elements of a WINDOW function. This window function will return a row number in sequential order (1, 2, 3, etc.). It will group, or partition, these results by the "customer-id" and order them by the "order-date". This will return a number that represents the sequence of orders by that customer.

5. RANK and PERCENT_RANK

RANK and RANK-PERCENT functions are useful for assigning rankings to rows based on their values within a specified window. The RANK() function assigns an ordinal rank to each row within the window frame. On the other hand, the RANK_PERCENT() function assigns a percentile rank, which represents the percentage of rows that fall below the current row in terms of the specified order. This query shows the RANK and PERCENT_RANK of products based on the number of photos in the product listing.

6. LAG and LEAD

LAG and LEAD are quite simple overall. LAG returns the value from the previous row, and LEAD returns the value from the next row. This query shows how LAG and LEAD work and the values from the previous and next rows in the query.

7. RANGE BETWEEN and CURRENT ROW

The RANGE BETWEEN clause allows us to define window frames that adapt to the CURRENT ROW, providing a flexible approach to analyzing data within a local context. This query will show us the sum of the item price in the current row and the two preceding rows. As we can see on the right, we can add bounds on the number of rows in our window function and unbounded windows, which will consider all values before or after the current row.

8. QUALIFY

The QUALIFY clause acts as a filter in the WHERE clause to filter values based on a WINDOW condition. For example, this query uses QUALIFY to return just the values with a ranking of 3 or less based on the number of photos for the product listing. HAVING would not be an option here since we are not aggregating the data, but assigning a rank by row.

9. Let's practice!

Now that we have reviewed some common WINDOW functions let's practice this with our data to understand the common use cases for WINDOW functions.