Get Started

Using CTEs with Redshift

1. Using CTEs with Redshift

Let's look at using common table expressions, often called CTEs.

2. Common table expressions (CTEs)

Common Table Expressions, often referred to as CTEs, are a crucial feature in SQL. CTEs act like temporary result sets. They help simplify queries and significantly improve the readability of our SQL code. They provide an elegant alternative to subqueries, making our queries more intuitive and easier to maintain.

3. Subquery and CTE structures

Here is a typical subquery usage. The query has a subquery, top-ten-divisions-by-rev, that selects the top ten divisions by revenue, and the outer query refines that further by selecting from those results where the total revenue is greater than one hundred thousand. Now, let's look at that same query using a CTE. We start with the CTE at the top starting with a WITH clause the name of the CTE, top-ten-divisions-by-rev, AS the same query we used in the subquery on the left. Next, we write the main query to select from the CTE and perform the additional filtering. As we can see, the CTE version enhances code readability and separates the logic into manageable sections, making it easier to follow and maintain. I often think of the joke about who is the only person who can debug a SQL query... the person who originally wrote it.

4. Multiple CTEs

We can also use multiple CTEs in a single query by separating them with a comma and using the alias as syntax again. In the example here, we define the same top-ten-divisions-by-rev as in the previous query and follow it with a division-names CTE that selects all the division IDs and names from the divisions table. Combining these CTEs in our main query allows us to leverage their individual results to create a comprehensive and clear SQL statement. Notice that we're doing all the work of joining within the main query. This method of using the CTEs helps keep the lineage of all the fields from where they started to where they ended up, which can be critical to debugging complex SQL statements.

5. Multiple CTEs, CTE joins

Often, people are joining between the CTEs themselves, and it can make keeping track of what data is available where and from where very difficult. While we joked about SQL debugging, Kernighan's Law states, "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"

6. CTE performance

The order in which we write SQL, as shown on the left, is different from the order in which the database executes SQL, as illustrated on the right. Since the SQL engine will plan the query execution best for the entire query chain, CTEs, and subqueries perform very similarly on Redshift. However, a CTE will be far more effective if we reuse a particular subquery more than once.

7. Let's practice!

Now that we've explored the power of CTEs, let's put your knowledge into practice. It's time for some hands-on exercises to reinforce our understanding of using CTEs with Amazon Redshift.