Filtering grouped data
1. Filtering grouped data
In our final chapter, we'll take conditional logic a step further; moving from filtering individual rows to filtering grouped results, and then creating new categories.2. WHERE or HAVING
Until now, we have filtered ungrouped data. Sometimes, we have questions about groups of data. For example, "Which languages have more than 20 films?" filters based on the language count, an aggregated and grouped result. This reveals an important SQL concept. There are two different types of filtering: individual rows and grouped results. Think of it this way: WHERE asks, "Which individual films meet this condition?" while HAVING asks, "Which groups meet this condition?"3. Filtering grouped data
Our prompt is requesting a filter on language groups and generates a query using HAVING, which filters after grouping and aggregation are complete and the counts are available.4. Prompting for group filters
The language of the prompt gives us and the AI clues to recognize the filtering requirements. "Film titles and budgets with a budget above 5 million" would refer to individual records; that's WHERE. "Languages and budget with rounded average budget above 5 million" refers to grouped results. Since average can't be calculated from a single row, SQL must group rows first, so that's HAVING.5. Combining filters
Sometimes we need both. Take the prompt "Show languages with over five films from 2000 onwards, where average duration exceeds 80" that combines individual and group filtering.6. Combining filters
"From 2000 onwards" applies to individual films, so it's handled by WHERE before grouping.7. Combining filters
"Over five films" and8. Combining filters
"average duration" apply after grouping. HAVING handles these. WHERE always handles the row-level filters first, and HAVING always handles the group-level filters afterwards.9. Order of operations
This happens because SQL processes queries in a specific order, like an assembly line. It starts with FROM to pick the tables.10. Order of operations
Next, WHERE filters the rows.11. Order of operations
After that, GROUP BY forms groups, and aggregate functions calculate values for those groups.12. Order of operations
HAVING then filters the grouped results.13. Order of operations
Only then does SELECT finalize what's shown,14. Order of operations
and ORDER BY15. Order of operations
and LIMIT handle the sorting and trimming.16. Filter prompts
Let's review these prompting strategies before heading to the exercises. When a prompt combines both, for example, films from 2000 onwards where the average duration is above 80, the conditions for individual films are handled with WHERE first, and the group-level filters are handled with HAVING afterwards. If the prompt asks about individual records, like films released after 2024 or budgets above five million, that points to WHERE because we're filtering rows before grouping. If the prompt mentions an aggregate, such as a language with an average budget, or a count of films greater than five, that signals HAVING because SQL has to group the data and calculate those values first.17. Iterative prompting
Sometimes the clearest way to get what we want is to iterate. Start with a smaller prompt, like selecting the right field or doing one filter first, check the results, and then prompt again to add more detail. Repeat until you get the desired results.18. Time to practice!
With these language patterns in mind, it's time to practice.Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.