Exercise

From task to subtasks

For this exercise, you will be using parallel computing to apply the function take_mean_age() that calculates the average athlete's age in a given year in the Olympics events dataset. The DataFrame athlete_events has been loaded for you and contains amongst others, two columns:

  • Year: the year the Olympic event took place
  • Age: the age of the Olympian

You will be using the multiprocessor.Pool API which allows you to distribute your workload over several processes. The function parallel_apply() is defined in the sample code. It takes in as input the function being applied, the grouping used, and the number of cores needed for the analysis. Note that the @print_timing decorator is used to time each operation.

Instructions

100 XP
  • Complete the code, so you apply take_mean_age with 1 core first, then 2 and finally 4 cores.