
Filtering Dask bags

The politician data you are working with comes from different sources, so it isn't very clean. Many of the dictionaries are missing keys that you may need for your analysis. You will need to filter out the elements that are missing important keys.

A function named has_birth_date() is available in the environment. It checks the input dictionary to see if it contains the key 'birth_date'. It returns True if the key is in the dictionary and False if not.

def has_birth_date(dictionary):
    return 'birth_date' in dictionary
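
To make the behavior of has_birth_date() concrete, here is a minimal sketch that calls it on three dictionaries. The records and their field values are invented for illustration and are not taken from the course data.

```python
def has_birth_date(dictionary):
    return 'birth_date' in dictionary

# Hypothetical records, invented for illustration only
with_key = {'name': 'Example Person A', 'birth_date': '1970-01-01'}
without_key = {'name': 'Example Person B'}
null_value = {'name': 'Example Person C', 'birth_date': None}

# Apply the check to each record in turn
results = [has_birth_date(d) for d in (with_key, without_key, null_value)]
print(results)  # [True, False, True]
```

Note that the function tests only whether the key is present. A record whose 'birth_date' is None still passes the check, so filtering on key presence does not guarantee a usable value.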

The bag you created in the last exercise is available in your environment as dict_bag.

This exercise is part of the course

Parallel Programming with Dask in Python


Exercise instructions

  • Use dict_bag's .count() method to print out the number of elements it contains.
  • Use the has_birth_date() function to filter out the elements which do not have the 'birth_date' key.
  • Print out the number of elements filtered_bag contains.
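
The steps above can be sketched on a small stand-in bag. This is an illustration under assumptions, not the course solution: sample_records, sample_bag, and filtered_sample are hypothetical names, and the bag is built with db.from_sequence() because dict_bag itself is only available in the exercise environment. The synchronous scheduler is chosen here only so the snippet runs in a single process.

```python
import dask.bag as db

def has_birth_date(dictionary):
    return 'birth_date' in dictionary

# Hypothetical stand-in for dict_bag, invented for illustration only
sample_records = [
    {'name': 'Example Person A', 'birth_date': '1970-01-01'},
    {'name': 'Example Person B'},
    {'name': 'Example Person C', 'birth_date': '1985-06-15'},
]
sample_bag = db.from_sequence(sample_records, npartitions=2)

# .count() is lazy; .compute() triggers the actual calculation
total = sample_bag.count().compute(scheduler='synchronous')

# .filter() keeps only the elements for which the predicate returns True
filtered_sample = sample_bag.filter(has_birth_date)
kept = filtered_sample.count().compute(scheduler='synchronous')

print(total, kept)  # 3 2
```

Because bag operations are lazy, printing the result of .count() without calling .compute() shows a description of the pending computation rather than a number.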

Interactive exercise

Try this exercise by completing the sample code.

# Print the number of elements in dict_bag
print(____)

# Filter out records using the has_birth_date() function
filtered_bag = dict_bag.____(____)

# Print the number of elements in filtered_bag
print(____)