1. Filtering with WHERE
Welcome to chapter two. In this first lesson, we'll discuss how to use WHERE correctly in a query.
2. How WHERE works
WHERE is used in a query to filter out individual rows from a data source specified in the FROM statement. In this example, we are using a WHERE filter condition to restrict the PlayersStats table to only rows where Position equals SG, for Shooting Guard.
3. How WHERE works
The PlayerStats table is scanned, and only the rows that meet the WHERE filter condition are extracted.
4. WHERE processing order
Remember from the previous chapter WHERE is processed after FROM, but before SELECT. Let's look at an example of this. Here, we apply a filtering condition to a new calculated column, TotalRebounds, and WHERE is used to filter the results where TotalRebounds is greater than or equal to 1000. The query looks right, but when run, returns an error. The TotalRebounds column is created in the SELECT statement. But remember, WHERE is processed before SELECT, so WHERE does not know about the new column yet.
5. Using a sub-query
We can still achieve the desired results by creating the new calculated column in a sub-query first. Here, we have the sub-query, after FROM, that creates the new column TotalRebounds, which then becomes available to be used in the WHERE filter condition in the outer query.
6. Using a sub-query
Now the query runs with no error message. Using sub-queries will be discussed in more detail later in this course.
7. Calculations on columns
But why use a sub-query to get the TotalRebounds column, in the previous example, when we could've just used the same calculation in the WHERE filtering condition. This is still valid and will return the same results. However, WHERE will run that calculation on every row to check if the row meets the condition or not. For a simple calculation, like the one in the example, this is not a big issue. But if the calculation is complex, it could increase the time it takes the query to run. Note, also, the unnecessary repeating of the calculation in the SELECT statement, and the WHERE filtering condition.
8. Functions on columns
Similarly, functions applied to a column in the WHERE filtering condition will also run on every row. Simple functions on small data sets will not show any noticeable effect on performance. However, applying complex or multiple functions to a column, or columns, in the WHERE filtering condition, could noticeably increase the time it takes the query to run, especially on large data sets.
9. WHERE simplified
You should always try and make your query as simple as possible, while still getting what you want. One way to do this is by avoiding using calculations and functions in the WHERE filtering condition. Here, simply applying a wildcard filter avoided using a function, while still returning the same result.
10. Summary
Let's summarize what we've learned. WHERE is processed before SELECT and will return an error if the syntax order is not correct. Calculations or functions on columns in the WHERE filtering condition could increase the time it takes the query to run. And remember, there may be a simpler way to apply a WHERE filter condition.
11. Let's practice!
Let's practice.