1. Filtering while joining
You have already learned one method of filtering using the WHERE clause. Let's look at another option; filtering results while joining.
2. Joins revisited
Recall that a join brings together tables through the use of at least one common, linking field. For instance, TABLE_A and TABLE_B are joined using the common field, "NAME".
Joins are one method of combining data from multiple tables. Both INNER and OUTER JOINS can combine data.
Joins are also one method of filtering data. Broadly speaking, inner joins limit data through the join itself. Outer joins limit data by using an extra non-linking join condition.
3. Patient and appointments data
Suppose you have patient data from a health center and want to see which patients had an appointment today.
You have two tables. One table lists all of today's appointments. Another table lists all the patient demographic information.
You want to join them together in order to see the name and sex of today's patients.
4. Inner joins to filter
An INNER JOIN is one way to limit data to today's patients. It is an alternative to a WHERE clause filter.
You use an INNER JOIN, linking on the patient_id field.
Amelia Hernandez did not have an appointment today, so her patient_id is not in the Appointments table.
5. Inner joins to filter
Amelia does not show in the query results. The INNER JOIN serves as a filter.
6. Outer joins to filter
Suppose instead you want to find the information for a particular male patient that came in today.
You change the query to a LEFT JOIN and add a sex filter.
As a LEFT JOIN, the query keeps all the appointments records. The sex filter limits the patients table only to Zhang. Only Zhang's information joins to the appointments table.
7. Outer joins to filter
The query retains all of the day's appointments but only has patient information for the male patient, for Zhang.
8. Filter pitfalls
Instead of adding sex as a join condition, you may have added it in the WHERE clause to serve as a filter. However, your query would have returned different results.
What are they, and why would they be different?
Recall the SQL logical order of operations. The query first executes the FROM clause. The FROM clause includes the joins meaning this step returns records from both the Appointments and Patients tables. As a LEFT JOIN, the Patients information is appended to the corresponding Appointments table record.
Next, the WHERE clause filters the results so that only records with sex equal to M remain. This removes the entire record for Lotte Smith, including her Appointments information.
9. Filter pitfalls
The only remaining record is the appointment and patient information for Zhang Wei.
10. Filter pitfalls
This differs from the prior query where sex was a non-linking join condition. That query kept all of the day's appointments but only brought over patient information for the male patient, Zhang.
11. Filter pitfalls improved
To improve the query, you can keep sex in the WHERE clause but should change the JOIN to an INNER JOIN. Because sex comes from the Patients table, the joined table, it effectively turns the LEFT JOIN into an INNER JOIN. Rewriting the query as an explicit INNER JOIN better represents the functionality and overall improves readability.
As an INNER JOIN, the sex filter could be in the WHERE clause as it is now.
Or the sex filter could be in the FROM clause as an additional join condition.
The final results would be the same with one total record, the appointment and patient information of Zhang Wei.
12. Let's practice!
Now it's your turn to practice filtering while joining.