Creating Dask dataframes from CSVs
Previously, you analyzed the Spotify song data using loops and delayed functions. A Dask DataFrame lets you accomplish the same thing with less code. Let's see how much easier those earlier tasks become when you use DataFrame methods instead of loops. First, however, you will need to load the dataset into a Dask DataFrame.
This exercise is part of the course
Parallel Programming with Dask in Python
Exercise instructions
- Import the dask.dataframe subpackage as dd.
- Read all the CSV files in the data/spotify folder using a maximum blocksize of 1MB.
- Use the dd.to_datetime() function to convert the strings in the 'release_date' column to datetimes.
- Use the DataFrame's .head() method to show 5 rows of the table.
Interactive exercise
Try this exercise by completing the sample code.
# Import dask dataframe as dd
____
# Load in the DataFrame
df = ____
# Convert the release_date column from string to datetime
____
# Show 5 rows of the DataFrame
print(____)