Putting it all together again
Now that we have a better understanding of how merges can enrich our data, let's revisit a summary table.
Currently loaded are two DataFrames:
- transactions: A full list of each ticket sale transaction, but no information on movie genre.
- movies: A table of our movie titles and genre.
Let's put these two tables together to create a view we took for granted before - ticket quantity sold for each genre.
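To see what the merge contributes before working on the real data, here is a minimal sketch on made-up tables. The titles, genres, and quantities below are hypothetical stand-ins, not the course data; only the column names movie_title, movie_genre, and ticket_quantity come from the exercise.

```python
import pandas as pd

# Hypothetical stand-ins for the exercise's preloaded DataFrames
transactions = pd.DataFrame({
    "movie_title": ["Alpha", "Beta", "Alpha", "Gamma"],
    "ticket_quantity": [2, 1, 3, 5],
})
movies = pd.DataFrame({
    "movie_title": ["Alpha", "Beta", "Gamma"],
    "movie_genre": ["Comedy", "Drama", "Drama"],
})

# Each transaction row picks up the genre of its matching movie_title
merged = transactions.merge(movies, on="movie_title")
print(merged)
```

Because every title in this toy transactions table also appears in movies exactly once, the merged table keeps one row per transaction and simply gains a movie_genre column.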
This exercise is part of the course
Python for Spreadsheet Users
Exercise instructions
- Merge transactions with movies on the movie_title column.
- Group by movie_genre and sum. Store the result in genre_summary.
- Sort genre_summary by ticket_quantity. Store the result as genre_summary_sorted.
- Print genre_summary_sorted (this has been done for you).
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Merge transaction data with the movie data on movie_title
transactions_with_genres = ____
# Group by movie_genre and call the sum method
genre_summary = transactions_with_genres.groupby(____, as_index=False).____()
# Sort the genre summary by ticket_quantity
genre_summary_sorted = genre_summary.____('ticket_quantity', ascending=False).reset_index(drop=True)
# View the summary
print(genre_summary_sorted)
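As a rough illustration of the group, sum, and sort steps, the sketch below starts from an already merged table built from hypothetical values (the real exercise data will differ). It assumes a reasonably recent pandas and passes numeric_only=True so that the text column movie_title is left out of the sum; the exercise template itself calls the sum method without arguments.

```python
import pandas as pd

# Hypothetical merged table: one row per transaction, genre already attached
transactions_with_genres = pd.DataFrame({
    "movie_title": ["Alpha", "Beta", "Alpha", "Gamma"],
    "movie_genre": ["Comedy", "Drama", "Comedy", "Drama"],
    "ticket_quantity": [2, 1, 3, 5],
})

# One row per genre; as_index=False keeps movie_genre as a regular column
genre_summary = transactions_with_genres.groupby(
    "movie_genre", as_index=False
).sum(numeric_only=True)

# Largest ticket totals first, then renumber the rows from 0
genre_summary_sorted = genre_summary.sort_values(
    "ticket_quantity", ascending=False
).reset_index(drop=True)

print(genre_summary_sorted)
```

Resetting the index after sorting is optional, but it makes the row labels match the ranking, which is closer to how a sorted summary table looks in a spreadsheet.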