ComenzarEmpieza gratis

Chaining operations

Now that you have loaded and cleaned the data, you can begin analyzing it. Your first task is to look at the birth dates of the politicians. The birth dates are in string format like 'YYYY-MM-DD'. The first 4 characters in the string are the year.

The filtered Dask bag you created in the last exercise, filtered_bag, is available in your environment.

Este ejercicio forma parte del curso

Parallel Programming with Dask in Python

Ver curso

Instrucciones del ejercicio

  • Use the bag's .pluck() method to extract 'birth_date' strings.
  • Write a lambda function to extract the year string from the 'birth_date' strings and convert it into an integer.
  • Use the new bag birth_year_bag to calculate the min, max, and mean birth years.
  • Use the dask.compute() function to calculate the three aggregates efficiently.

Ejercicio interactivo práctico

Prueba este ejercicio y completa el código de muestra.

# Select the 'birth_date' from each dictionary in the bag
birth_date_bag = filtered_bag.____

# Extract the year as an integer from the birth_date strings
birth_year_bag = birth_date_bag.____(lambda x: ____)

# Calculate the min, max and mean birth years
min_year = ____
max_year = ____
mean_year = ____

# Compute the results efficiently and print them
print(____(____))
Editar y ejecutar código