1. Connecting to your database
In the Python world, there are several great tools that we can use when working with databases.
2. Meet SQLAlchemy
One of those is SQLAlchemy that we will be using throughout this course. SQLAlchemy will allow us to generate SQL queries by writing Python code. You should still consider learning how to write queries in SQL as well.
SQLAlchemy has two main components. The part we will be focusing on is often referred to as "core" part of SQLAlchemy. It's really focused around the relational model of the database. Additionally, there is the Object Relational Model or ORM part of SQLAlchemy that is really focused around data models and classes that you as a programmer create.
3. There are many types of databases
There are many different types of databases, and each database type has its own quirks and unique capabilities. You'll commonly find SQLite, PostgreSQL, MySQL, Microsoft SQL Server, and Oracle when working with data. SQLAlchemy provides a way to operate across all of these database types in a consistent manner.
4. Connecting to a database
To connect to a database, we need a way to talk to it, and an engine provides that common interface. To create an engine, we import the create_engine function from sqlalchemy; we then use the create_engine function and supply it a connection string that provides the details needed to connect to a database. Finally once we have an engine, we are ready to make a connection using the connect method on the engine. It's worth noting that SQLAlchemy won't actually make the connection until we give it some work to execute. So to review, an engine is the common interface to the database, which requires a connection string to provide the details used to find and connect to the database.
5. A word on connection strings
Before we go any further, let's talk a bit more about connection strings. In their simplest form, they tell us what kind of database we are talking to and how we should access it. In this example, you can see that we are using the sqlite database driver
6. A word on connection strings
and the database file named census_nyc-dot-sqlite which is in the current directory.
7. What's in your database?
Now that we have an engine and a connection, we need to know what tables are in the database. We'll start again by importing the create_engine function and creating an engine to our database. Finally, we can use the table_names method of the engine which returns a list of tables.
8. Reflection
Once we know what table we want to work on, we need a way to access that table with python. To do that we are going to use a handy process called reflection, which reads the database and builds a Table object that we can use in our code. We already have created our engine, so we begin by importing the MetaData and Table objects needed for reflection. The MetaData object is a catalog that stores database information such as tables so we don't have to keep looking them up. To reflect the table, we initialize a MetaData object. Next, we use the SQLAlchemy Table object and provide the table name we got earlier from the table_names method. We also supply our metadata instance, and then instruct it to autoload the table using the engine. Finally, we can use the function repr to view the details of our table that we stored as census. This allows us to see the names of the columns, such as the 'state' and 'sex' columns, along with their types, such as VARCHAR and INTEGER.
This process of reflection may seem a bit of an overhead, but it will make understanding your databases and extracting information from them far easier downstream.
9. Let's practice!
Now it's your turn to practice writing connection strings, connecting to databases and reflecting tables. Then we'll be back here writing our first SQL queries.