Conditional analysis
1. Conditional analysis
In our final video lesson, we'll learn how to use conditional logic with aggregates and combine the techniques we've learned to supercharge our analysis.2. Conditional aggregates
Remember grouping films by release year? That gave us total counts per year. But what if we need more detail? For example, how many films had budgets recorded versus how many didn't, for each year? This requires conditional aggregates. When we prompt "Show the number of films with a budget and without a budget, for each year", the AI may generate a query that uses COUNT() with a CASE statement.3. Conditional aggregates
This works row by row, returning 1 if the condition matches and NULL if it doesn't.4. Conditional aggregates
Since COUNT() ignores NULLs, it only counts the rows where 1 was returned. That gives us the total number of films that meet the condition.5. Conditional aggregates
Sometimes SQL uses an alternative pattern, wrapping the CASE inside a SUM. At first these look different, but they give the same result. This happens because, with COUNT, the unmatched rows become NULLs and are ignored. With SUM, the unmatched rows become 0s, and adding them up produces the same result.6. Conditional aggregates with NULLs
This works well, but NULL values often need careful handling. In our earlier example, we wrote one condition to count films with a budget and another to count films without, which included NULLs. Sometimes including missing values gives misleading results. In those cases, we need to filter out the NULL values as we've previously learned, before applying our CASE logic. Let's explore this idea with budgets and see how all of the techniques we've learned can come together in a single query.7. Combining techniques
In practice, analysts often face layered questions. We don't just want totals; we want them broken down by year, filtered for reliable data, split into categories, and trimmed to a readable size. The following prompt asks for quite a lot: "For each year after 2000 with at least 5 films that have budget data, categorize films as High Budget if the budget is over 1 million and Low Budget if it is 1 million or less, and show the counts for each category, ordered by year and showing the top 10 results". Remember, we can always break this down with iterative prompts. But let's look at the resulting query. It uses WHERE to filter years and exclude NULL budgets, COUNT with CASE to split High Budget and Low Budget films, GROUP BY to organize results by year, HAVING to show only years with enough films, ORDER BY to sort chronologically, and LIMIT to show the top 10 results.8. Beyond counting
Beyond counting, these conditional patterns work with other aggregates too. The approach remains the same. The fun part of analysis is experimenting with different conditions and thresholds to uncover insights that basic queries would miss entirely. These techniques let us slice data in ways that reveal hidden insights, like discovering which decades produced the most high-budget films, or whether shorter movies generate better profit margins than epics.9. Let's practice!
Let's head to some final exercises to practice what we've learned.Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.