Get startedGet started for free

Querying relational databases in Python

1. Querying relational databases in Python

Now that you have figured out how to create a database engine and to list the tables of the database in question, it's time to connect to the engine and query the database. Once again, the term "querying" is just a fancy way of saying getting data out from the database.

2. Basic SQL query

THe HELLO WORLD of SQL queries is 'SELECT * FROM Table_Name', where 'Table_name' is the name of any of the tables in the database. This query returns all columns of all rows of the Table of interest. For example, I could query the Northwind database with 'SELECT * FROM Orders' and this would return all columns of all rows of the 'Orders' table. The star after SELECT means 'all columns'. Straightforward, right? Well, nearly! This is an SQL query and we need to figure out how to make such a query using python, SQLAlchemy and, in fact, we'll also use pandas to store the results of our queries.

3. Workflow of SQL querying

The workflow will be as follows. You'll import the required packages and functions, create the engine, connect to it, query the database save the results of the query to a dataframe, and close the connection.

4. Your first SQL query

Let's now check out how to do each of these steps! Create the engine using the function create_engine. To connect to the database after creating the engine, you create a connection object con by applying the method connect to the engine. To query the DB, apply the method execute to the connection con and pass it a single argument, the relevant SQL query; This creates a SQLAlchemy results object which we assign to the variable rs. To turn the results object rs into a dataframe, we apply the method fetchall to rs and save it as a dataframe using the pandas function DataFrame. fetchall fetches all rows, as you would expect. To close the connection, execute 'con dot close'. Don't forget to do this!

5. Printing your query results

You can then print the head of the dataframe, as we have done before, as a sanity check: all the rows look good but the column names aren't correct.

6. Set the DataFrame column names

To fix this, before closing the connection, you can set the dataframe's column names by executing 'df dot columns equals rs dot keys'.

7. Set the data frame column names

One last note: analogous to what you saw in chapter 1 when opening plain text files, you can use

8. Using the context manager

the context manager construct to open a connection, which will save you the trouble of closing the connection later, or save you the trouble of forgetting to close it! There are two other differences that you may have notice between this and the previous code: firstly, I no longer have 'SELECT *' in the SQL query; I now have column names of the table 'Orders'; all this does is it imports those particular columns and no others whereas 'SELECT *' imports all columns; secondly, instead of applying the method fetchall to the results rs, I apply the method fetchmany with the argument size equals 5; this imports 5 rows instead of all rows. You'll become better acquainted with these functions and arguments very soon.

9. Let's practice!

That's enough out of me! It's time for you to practice writing your own SQL queries to import data from your database, enjoy!