Get startedGet started for free

Filtering in the WHERE clause

1. Filtering in the WHERE clause

We just learned how different SQL clauses affect the result columns and rows. Let's focus on the WHERE clause to improve query performance.

2. Limit the data

Recall that in the SQL logical order, the WHERE clause occurs second. The data is brought in with the FROM clause. The WHERE clause is the first place to limit the number of returned records. Just like large tables are slow to query, large datasets are slow to return data. Since the WHERE clause occurs early in the SQL execution and results in fewer records, it speeds the query. In order to understand how let's return to the query planner.

3. EXPLAIN

Recall that SQL is the order you provide to the query planner. The planner, or cooks, use the available information to plan your meal execution. EXPLAIN provides a window into that execution plan. You can add EXPLAIN before any query in order to see the ordered steps of the execution plan. The EXPLAIN output shows one step and includes a cost estimate, rows, and width information. EXPLAIN only provides estimates. The estimate units are not intuitive, and we will explore them in depth in chapter four. Right now, we will focus on the execution steps. This query has one, the sequential scan.

4. EXPLAIN with WHERE

Adding a WHERE clause to the query changes the query plan. There is now an additional instruction in the sequential step. This filter instruction will be the focus of the remainder of this lesson.

5. Good - Filtering for similar values with LIKE OR

This example shows a list of countries with their international phone codes. Suppose you want to find information for Chad, China, India, and Indonesia. You try to save time by writing a query using fuzzy matching with the LIKE operator. The query plan filter instruction shows how the LIKE OR conditions translate into two distinct filters. Each country filter is executed independently.

6. Better - Filtering for similar values with LIKE ANY

You now change the WHERE clause to use an ARRAY. An ARRAY is a fancy way to say a list. The query plan shows a more efficient query. The filter shows how the ARRAY operator functions as one filter condition — the filter searches for a match of ANY item in the list.

7. Good - Filtering for exact values with OR

You now only need the information for two countries, Chad and China, so decide to type out the country names. You write a query using OR to select each country. The filter line in the query plan shows how the OR operator translates into two distinct conditions. Each country filter is executed independently.

8. Better - Filtering for exact values with IN

You want the filter to function as one condition so change the WHERE clause to use the IN operator. The query plan shows how the IN operator searches for a match of ANY item in the one list condition. The structure is equivalent to the ARRAY that you used with LIKE.

9. Best - Filtering for numbers

Finally, you decide to filter the table using the numeric phone codes. The filter again shows how the IN operator functions as one filter condition. The filter searches for a match of ANY number in the one list condition. This filter is more efficient than the equivalent country list because numbers are easier to search than text.

10. Summarizing the best WHERE filters

Numeric fields are shorter than text fields. For instance, the phone code for Indonesia is 62, whereas the country name is nine characters. One number requires less computer storage space than one character. The shorter length and smaller storage size result in numeric fields speeding performance.

11. Summarizing the best WHERE filters

To summarize, numeric filters are better than text filters. IN and ARRAY operators function as one filter, which then performs faster than OR operators.

12. Let's practice!

Now it's your turn to practice optimizing with WHERE filtering.