
Restructuring a dictionary

Now you want to clean up the politician data and move it into a Dask DataFrame. However, the politician data is nested, so you will need to process it some more before it fits into a DataFrame.

One particular piece of data you want to extract is buried a few layers inside the dictionary. This is a link to a website for each politician. The example below shows how it is stored inside the dictionary.

record = {
...
 'links': [{'note': '...',
            'url': '...'},],  # Stored here
...
}
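The nested layout above can be explored with plain Python before involving Dask. The sketch below uses a hypothetical record (the name and URL are placeholder values, not taken from the real politician data) and walks down to the link one level at a time.

```python
# Hypothetical record: the values are placeholders, not real data.
record = {
    "name": "Example Politician",
    "links": [
        {"note": "homepage", "url": "https://example.org/politician"},
    ],
}

# Step through the nesting one level at a time.
links = record["links"]      # a list of dictionaries
first_link = links[0]        # the dictionary at position zero
url = first_link["url"]      # the string stored under 'url'

# The same lookup written as a single chained expression.
assert url == record["links"][0]["url"]
print(url)
```

Note that a record whose 'links' list is empty would raise an IndexError at `links[0]`; whether such records occur in the real data is not stated here.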

The bag of politician data is available in your environment as dict_bag.

This exercise is part of the course

Parallel Programming with Dask in Python


Exercise instructions

  • Complete the extract_url() function so that it reads the value stored under the 'url' key of the dictionary at position zero of the list under the 'links' key, and assigns that value to a new top-level key 'url'.
  • Run the extract_url() function across all elements of the bag.
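As an illustration of the two steps above, here is a minimal sketch on a small stand-in bag. The records, the names `records`, `bag` and `with_urls`, and the choice to copy each record before modifying it are assumptions made for this example; the exercise's own `dict_bag` is not reproduced here. The synchronous scheduler is used only to keep the example lightweight.

```python
import dask
import dask.bag as db

# Hypothetical stand-in records; the real data lives in dict_bag.
records = [
    {"name": "A", "links": [{"note": "homepage", "url": "https://example.org/a"}]},
    {"name": "B", "links": [{"note": "homepage", "url": "https://example.org/b"}]},
]

def extract_url(x):
    # Work on a shallow copy so the input record is left unchanged
    # (a design choice for this sketch, not required by the exercise).
    x = dict(x)
    # Promote the nested url to a top-level key.
    x["url"] = x["links"][0]["url"]
    return x

bag = db.from_sequence(records, npartitions=1)

# map() is lazy: it only records that extract_url should be applied to each element.
with_urls = bag.map(extract_url)

# take(1) triggers computation and returns a tuple holding the first element.
with dask.config.set(scheduler="synchronous"):
    first = with_urls.take(1)

print(first)
```

Once every record has a flat 'url' key, the remaining nested fields can be handled the same way before converting the bag to a DataFrame.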

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def extract_url(x):
    # Extract the url and assign it to the key 'url'
    x['url'] = x[____][____][____]
    return x
  
# Run the function on all elements in the bag.
dict_bag = ____

print(dict_bag.take(1))