Session Ready
Exercise

SQL: Query multiple tables

Not only can you create very specific queries against a single table on data.world, you can also write queries against multiple tables within a single dataset or across many datasets!

We call these cross-dataset queries 'federated queries' and they're just like same-dataset queries, but require you to specify not just the name of the tables, but also the unique dataset path that the remote table is in using dot notation.

We've been using the full URL for the dataset in our datadotworld method examples, but for federated queries, you'll need to use just the unique part of the URL path. Let's look at the dataset https://data.world/len/intelligence-of-dogs. The unique part of the URL is len/intelligence-of-dogs, where len is the owner id and intelligence-of-dogs is the dataset id.

Lets use that to build an example query that calls the dog_intelligence table from that dataset, which will be joined against the AKC Breed Info table of the 'local' dataset we'll pass as a parameter to the query method, https://data.world/len/dog-canine-breed-size-akc:

SELECT 
  breedSmarts.Classification, 
  AVG(breedSmarts.obey) AS obey, 
  AVG(weight_high_lbs) AS heavyWeightAvg, 
  AVG(weight_low_lbs) AS lowWeightAvg 
FROM `AKC Breed Info` AS breedSize
  JOIN len.`intelligence-of-dogs`.`dog_intelligence` 
    AS breedSmarts 
  ON breedSmarts.Breed = breedSize.breed
GROUP BY breedSmarts.Classification
ORDER BY obey

Notice on the 'secondary' table we used the dataset owner id and dataset id to reference the dog_intelligence table:

len.`intelligence-of-dogs`.`dog_intelligence

Now try a similar query on your own with a different set of datasets:

Instructions
100 XP
  • Complete the federated SQL query to return three columns: state, the count of farmers markets (fmid) per state, and the average adult obesity rate (adult_obese.value) per state from the export table in https://data.world/agriculture/national-farmers-markets, left joined with the adult_obese table in https://data.world/health/obesity-by-state-2014 on the export.state and adult_obese.location fields.
  • Execute the SQL query against https://data.world/agriculture/national-farmers-markets using the query() method
  • Create a stateStats dataframe from the results
  • Plot the stateStats results using State as the x-axis (matplotlib is already imported)