Exercise

Chaining operations

Now that you have loaded and cleaned the data, you can begin analyzing it. Your first task is to look at the birth dates of the politicians. The birth dates are stored as strings in the format 'YYYY-MM-DD', so the first four characters of each string are the year.
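As a quick illustration of the string format described above, the year can be taken with a slice and converted with int(). This is a minimal sketch on a single made-up value; the date shown is hypothetical and does not come from the dataset.

```python
# Hypothetical example value in 'YYYY-MM-DD' format (not from the dataset).
birth_date = "1975-06-23"

# The first four characters are the year; int() converts them to a number.
birth_year = int(birth_date[:4])

print(birth_year)
```

The same slice-and-convert expression is what the lambda function in this exercise needs to apply to every element of the bag.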

The filtered Dask bag you created in the last exercise, filtered_bag, is available in your environment.

Instructions

  • Use the bag's .pluck() method to extract 'birth_date' strings.
  • Write a lambda function to extract the year string from the 'birth_date' strings and convert it into an integer.
  • Use the new bag birth_year_bag to calculate the min, max, and mean birth years.
  • Pass all three aggregates to a single dask.compute() call, so that the work they share is computed only once.
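The steps above might be sketched as follows. This is one possible outline, not the official solution. Because filtered_bag only exists in the exercise environment, the sketch builds a small stand-in bag from made-up records; the names and dates in it are hypothetical. The scheduler="threads" argument is an assumption made so the sketch runs as a plain script, and it can be left out in the exercise environment.

```python
import dask
import dask.bag as db

# Hypothetical stand-in for filtered_bag: dictionaries with a 'birth_date' key.
filtered_bag = db.from_sequence(
    [
        {"name": "Politician A", "birth_date": "1950-03-14"},
        {"name": "Politician B", "birth_date": "1960-11-02"},
        {"name": "Politician C", "birth_date": "1970-07-30"},
    ],
    npartitions=2,
)

# Extract the 'birth_date' string from every record.
birth_date_bag = filtered_bag.pluck("birth_date")

# Slice off the year (first four characters) and convert it to an integer.
birth_year_bag = birth_date_bag.map(lambda x: int(x[:4]))

# These calls are lazy: they describe the aggregates without running them.
min_year = birth_year_bag.min()
max_year = birth_year_bag.max()
mean_year = birth_year_bag.mean()

# One compute call evaluates all three and shares the pluck/map work.
# scheduler="threads" is an assumption to keep this sketch self-contained.
results = dask.compute(min_year, max_year, mean_year, scheduler="threads")

print(results)
```

Calling .compute() on each aggregate separately would also work, but it would repeat the pluck and map steps three times, which is why a single dask.compute() call is preferred here.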