1. Identify favorite actors of customer groups
In this lesson, we will combine different SQL statements to complex queries and explore who the favorite actors for certain customer groups are.
2. Combining SQL statements in one query
In one query, we will combine now LEFT JOIN, WHERE, GROUP BY, HAVING and ORDER BY statements.
3. From renting records to customer and actor information
First, we need to prepare all information we need to answer the following question: "Who is the favorite actor for a certain customer group?"
We take the table 'renting' and use a LEFT JOIN to augment it with information about actors and customers. Since the table 'actors' cannot be matched with 'renting' directly, we need to JOIN 'renting' with 'actsin' first by 'movie_id', then we can join actors by 'actor_id'.
This joined table will be the basis for the next queries.
4. Male customers
From the joined table, we select only the customers with the gender 'male' as you can see in the second to last row. Then we group by the actors' names with 'GROUP BY a-dot-name'. In the first line, you can see that we report the actors' names and, for each actor, count how often their movies are watched by male customers.
5. Who is the favorite actor?
We can now ask, "What makes an actor the favorite actor?" Is it being watched most often as we assumed in the last query? An alternative definition could be that the movies in which they appear get the best ratings.
This can be easily included in the table by adding the third line in this query, that is average of rating.
6. Add HAVING and ORDER BY
There are actors who only appear in movies without a rating. These actors will get 'NULL' as an average rating. If we want to exclude these actors, we can do so by using the HAVING clause: HAVING average rating IS NOT NULL.
Note the difference between the WHERE clause, which is applied to the table before the aggregation, and the HAVING clause which is applied after the aggregation with GROUP BY.
The final line orders the result table such that the highest average rating comes first and for the same average rating the highest number of views comes first.
7. Add HAVING and ORDER BY
In our fictional movie database, Ray Romano is the favorite actor among male customers, receiving the best possible rating of 10 - but only with 3 views.
8. Let's practice!
Here you saw how to combine several SQL statements to answer the question "Who is the favorite actor?". Now it's your turn to answer the question "Which is the favorite movie?".