Exercise

Comparing Dask & pandas execution times

The function you created in the last exercise can be used with either Dask or pandas DataFrames. The only difference is that Dask evaluates lazily, so after the function is run on a Dask DataFrame, .compute() must be called on the result to perform the computation.

Your job is to run the by_region function separately on a pandas DataFrame and a Dask DataFrame read from the same CSV file. To see how much of the total time is spent reading the file, you'll compare the Dask execution time against two pandas timings: one that includes the call to pd.read_csv and one that ignores it.

Instructions 1/3

Time the execution of pd.read_csv() and by_region together on 'WDI.csv', and print the elapsed time in milliseconds.
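
One way to take these timings is sketched below. It is only an illustration under stated assumptions: elapsed_ms, pandas_read_and_run and dask_read_and_run are hypothetical helper names, and the by_region shown here is a stand-in (with assumed 'Region' and 'value' columns) for the function from the previous exercise, whose real body is not reproduced in this exercise.

```python
import time


def elapsed_ms(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


def by_region(df):
    # Hypothetical stand-in for the function from the previous exercise;
    # the 'Region' and 'value' column names are assumptions.
    return df.groupby("Region")["value"].mean()


def pandas_read_and_run(path):
    # Imported here so the timing helper itself has no pandas dependency.
    import pandas as pd

    # pandas is eager: the file is read and the result computed immediately.
    return by_region(pd.read_csv(path))


def dask_read_and_run(path):
    import dask.dataframe as dd

    # Dask is lazy: by_region only builds a task graph here, and
    # .compute() triggers the file read and the actual computation.
    return by_region(dd.read_csv(path)).compute()
```

With pandas installed, calling elapsed_ms(pandas_read_and_run, 'WDI.csv') returns the result together with the time in milliseconds, which can then be printed. To ignore the read time, load the DataFrame first and pass by_region and the DataFrame to elapsed_ms instead.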