Get startedGet started for free

Working with multiple datasets

You can load as many different datasets as you’d like from data.world and work with them together. Here we’ve used the load_dataset method to bring in two separate datasets, assigning them each to a variable.

We’ll leave it to you to create a dataframe for each using the dataframes property, and then merge the two dataframes together on the state and stusab fields.

After merging them together, add a new citystate field to your merged dataset, populating it with the concatenated values of the city and state_name fields, separated by , resulting in a city, state format.

This exercise is part of the course

Intro to data.world in Python

View Course

Exercise instructions

  • Create a police_shootings dataframe from the fatal_police_shootings_data table in int_dataset
  • Create a state_abbrvs dataframe from the statesfipscodes table in fipsCodes_dataset
  • Merge the two dataframes together on the state and stusab fields using the merge() function. Assign to merged_dataframe.
  • Add a citystate column to your merged dataframe, populating it with the concatinated values from the city and state_name columns, separated by ', '.
  • Print first 5 rows of merged dataframe

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# datadotworld module has been imported as dw
import datadotworld as dw

# We've loaded two datasets to use 'int_dataset' and 'fipsCodes_dataset'
int_dataset = dw.load_dataset('https://data.world/jonloyens/intermediate-data-world')
fipsCodes_dataset = dw.load_dataset('https://data.world/uscensusbureau/fips-state-codes')

## Create two dataframes: police_shootings from the 'fatal_police_shootings_data' table of int_dataset and state_abbrvs, from the 'statesfipscodes' table of fipsCodes_dataset


## Merge the two datasets together on the state and stusab fields. Assign to a merged_dataframe variable.
merged_dataframe = ____.____(____, how = 'left', left_on = '____', right_on='____')


## Add a 'citystate' column to the merged_dataframe dataframe, populating it with the concatinated values from the 'city' and 'state_name' columns, separated by ', '. 


## Print first 5 rows of merged_dataframe

Edit and Run Code