Analyzing songs on Spotify
You have a list of CSV files that you want to aggregate to investigate the Spotify music catalog. You want to do this quickly, using all of your available computing power.
Each CSV file contains all the songs released in a given year, and each row gives information about an individual song.
dask and delayed() have been imported for you, and the list of filenames is available in your environment as filenames. pandas has been imported as pd.
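To illustrate the setup described above, here is a minimal sketch of lazy loading with delayed(). It is not the exercise's reference solution. The two tiny CSV files, the column name key, the value "C", and the variable fraction_c are hypothetical stand-ins, since the real files and their columns are not shown here.

```python
import os
import tempfile

import dask
import pandas as pd
from dask import delayed

# Hypothetical stand-in data: two tiny per-year CSV files in a temp directory.
# The real exercise provides `filenames`; the `key` column is an assumption.
tmpdir = tempfile.mkdtemp()
filenames = []
for year, keys in [(2005, ["C", "G", "C"]), (2006, ["A", "C"])]:
    path = os.path.join(tmpdir, f"{year}.csv")
    frame = pd.DataFrame(
        {"name": [f"song_{i}" for i in range(len(keys))], "key": keys}
    )
    frame.to_csv(path, index=False)
    filenames.append(path)

n_songs_in_c, n_songs = 0, 0
for file in filenames:
    # delayed(pd.read_csv) wraps the function; calling the wrapper records
    # a task in the graph instead of reading the file immediately
    df = delayed(pd.read_csv)(file)
    # Operations on a delayed object are lazy as well
    n_songs_in_c += (df["key"] == "C").sum()
    n_songs += df.shape[0]

# Nothing has been read so far; dask.compute runs the whole task graph,
# and the independent file reads can be executed in parallel
n_songs_in_c, n_songs = dask.compute(n_songs_in_c, n_songs)
fraction_c = n_songs_in_c / n_songs
```

Passing both totals to a single dask.compute() call lets Dask share the file-reading tasks between the two results, so each file is loaded once rather than once per total.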
This exercise is part of the course
Parallel Programming with Dask in Python
Interactive exercise
Try this exercise by completing this sample code.
n_songs_in_c, n_songs = 0, 0
for file in filenames:
    # Load in the data
    df = ____(____)(____)