Get startedGet started for free

Performing a semi join

Some of the tracks that have generated the most significant amount of revenue are from TV-shows or are other non-musical audio. You have been given a table of invoices that include top revenue-generating items. Additionally, you have a table of non-musical tracks from the streaming service. In this exercise, you'll use a semi join to find the top revenue-generating non-musical tracks.

The tables non_mus_tcks, top_invoices, and genres have been loaded for you.

This exercise is part of the course

Joining Data with pandas

View Course

Exercise instructions

  • Merge non_mus_tcks and top_invoices on tid using an inner join. Save the result as tracks_invoices.
  • Use .isin() to subset the rows of non_mus_tcks where tid is in the tid column of tracks_invoices. Save the result as top_tracks.
  • Group top_tracks by gid and count the tid rows. Save the result to cnt_by_gid.
  • Merge cnt_by_gid with the genres table on gid and print the result.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Merge the non_mus_tcks and top_invoices tables on tid
tracks_invoices = ____.merge(____)

# Use .isin() to subset non_mus_tcks to rows with tid in tracks_invoices
top_tracks = _____[non_mus_tcks['tid'].isin(____)]

# Group the top_tracks by gid and count the tid rows
cnt_by_gid = top_tracks.groupby(['gid'], as_index=False).agg({'tid':____})

# Merge the genres table to cnt_by_gid on gid and print
print(____)
Edit and Run Code