Aan de slagGa gratis aan de slag

Performing a semi join

Some of the tracks that have generated the most significant amount of revenue are from TV-shows or are other non-musical audio. You have been given a table of invoices that include top revenue-generating items. Additionally, you have a table of non-musical tracks from the streaming service. In this exercise, you'll use a semi join to find the top revenue-generating non-musical tracks.

The tables non_mus_tcks, top_invoices, and genres have been loaded for you.

Deze oefening maakt deel uit van de cursus

Joining Data with pandas

Cursus bekijken

Oefeninstructies

  • Merge non_mus_tcks and top_invoices on tid using an inner join. Save the result as tracks_invoices.
  • Use .isin() to subset the rows of non_mus_tcks where tid is in the tid column of tracks_invoices. Save the result as top_tracks.
  • Group top_tracks by gid and count the tid rows. Save the result to cnt_by_gid.
  • Merge cnt_by_gid with the genres table on gid and print the result.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Merge the non_mus_tcks and top_invoices tables on tid
tracks_invoices = ____.merge(____)

# Use .isin() to subset non_mus_tcks to rows with tid in tracks_invoices
top_tracks = _____[non_mus_tcks['tid'].isin(____)]

# Group the top_tracks by gid and count the tid rows
cnt_by_gid = top_tracks.groupby(['gid'], as_index=False).agg({'tid':____})

# Merge the genres table to cnt_by_gid on gid and print
print(____)
Code bewerken en uitvoeren