Get startedGet started for free

Filtering with WHERE

1. Filtering with WHERE

Welcome to chapter two. In this first lesson, we'll discuss how to use WHERE correctly in a query.

2. How WHERE works

WHERE is used in a query to filter out individual rows from a data source specified in the FROM statement. In this example, we are using a WHERE filter condition to restrict the PlayersStats table to only rows where Position equals SG, for Shooting Guard.

3. How WHERE works

The PlayerStats table is scanned, and only the rows that meet the WHERE filter condition are extracted.

4. WHERE processing order

Remember from the previous chapter WHERE is processed after FROM, but before SELECT. Let's look at an example of this. Here, we apply a filtering condition to a new calculated column, TotalRebounds, and WHERE is used to filter the results where TotalRebounds is greater than or equal to 1000. The query looks right, but when run, returns an error. The TotalRebounds column is created in the SELECT statement. But remember, WHERE is processed before SELECT, so WHERE does not know about the new column yet.

5. Using a sub-query

We can still achieve the desired results by creating the new calculated column in a sub-query first. Here, we have the sub-query, after FROM, that creates the new column TotalRebounds, which then becomes available to be used in the WHERE filter condition in the outer query.

6. Using a sub-query

Now the query runs with no error message. Using sub-queries will be discussed in more detail later in this course.

7. Calculations on columns

But why use a sub-query to get the TotalRebounds column, in the previous example, when we could've just used the same calculation in the WHERE filtering condition. This is still valid and will return the same results. However, WHERE will run that calculation on every row to check if the row meets the condition or not. For a simple calculation, like the one in the example, this is not a big issue. But if the calculation is complex, it could increase the time it takes the query to run. Note, also, the unnecessary repeating of the calculation in the SELECT statement, and the WHERE filtering condition.

8. Functions on columns

Similarly, functions applied to a column in the WHERE filtering condition will also run on every row. Simple functions on small data sets will not show any noticeable effect on performance. However, applying complex or multiple functions to a column, or columns, in the WHERE filtering condition, could noticeably increase the time it takes the query to run, especially on large data sets.

9. WHERE simplified

You should always try and make your query as simple as possible, while still getting what you want. One way to do this is by avoiding using calculations and functions in the WHERE filtering condition. Here, simply applying a wildcard filter avoided using a function, while still returning the same result.

10. Summary

Let's summarize what we've learned. WHERE is processed before SELECT and will return an error if the syntax order is not correct. Calculations or functions on columns in the WHERE filtering condition could increase the time it takes the query to run. And remember, there may be a simpler way to apply a WHERE filter condition.

11. Let's practice!

Let's practice.