Session Ready
Exercise

SQL: Querying a table

Another way to pull data in from data.world is to use the query() method of the datadotworld module. query() lets you use SQL or SPARQL to query one or more datasets, and takes a dataset URL and the query string as parameters. This makes it easy for you to pull in exactly what fields and aggregations you need from the data, and even lets you get joined tables with a single call to data.world.

The query() method is a lot like load_dataset as it gives you access to three properties to access the resulting data: raw_data, table, and dataframe. Lets try out a couple of SQL queries and then we'll jump into a SPARQL example.

SQL on data.world is actually a dialect we've created called dwSQL. dwSQL does most everything you'd normally do in a SQL SELECT, and it also has some extended funcionality specific to data.world. Check out the full dwSQL documentation, and here we'll work with some basics.

Using the dataset at https://data.world/nrippner/refugee-host-nations, follow the instructions below:

Instructions
100 XP
  • Complete the SQL query to select all rows from the unhcr_all table where Year equals 2010. Assign the query string to a sql_query variable.
  • Use the query method of the datadotworld module to run the sql_query against the https://data.world/nrippner/refugee-host-nations dataset. Assign the results to a query2010 variable.
  • Use the dataframe property of the resulting query to create a dataframe variable named unhcr2010
  • Print the first 5 rows using the head method.