Get startedGet started for free

Introduction to relational databases

1. Your first database

Welcome to this course on Introduction to Relational Databases. My name is Timo Grossenbacher, and I work as a data journalist in Switzerland. In this course, you will see why using relational databases has many advantages over using flat files like CSVs or Excel sheets. You'll learn how to create such databases, and bring into force their most prominent features.

2. Investigating universities in Switzerland

Let me tell you a little story first. As a data journalist, I try to uncover corruption, misconduct and other newsworthy stuff with data. A couple of years ago I researched secondary employment of Swiss university professors. It turns out a lot of them have more than one side job besides their university duty, being paid by big companies like banks and insurances. So I discovered more than 1500 external employments and visualized them in an interactive graphic, shown on the left. For this story, I had to compile data from various sources with varying quality. Also, I had to account for certain specialties, for example, that a professor can work for different universities; or that a third-party company can have multiple professors working for them. In order to analyze the data, I needed to make sure its quality was good and stayed good throughout the process. That's why I stored my data in a database, whose quite complex design you can see in the right graphic. All these rectangles were turned into database tables.

3. A relational database:

But why did I use a database? A database models real-life entities like professors and universities by storing them in tables. Each table only contains data from a single entity type. This reduces redundancy by storing entities only once – for example, there only needs to be one row of data containing the details of a certain company. Lastly, a database can be used to model relationships between entities. You can define exactly how entities relate to each other. For instance, a professor can work at multiple universities and companies, while a company can employ more than one professor.

4. Throughout this course you will:

Throughout this course, you will actually work with the same real-life data used during my investigation. You'll start from a single table of data and build a full-blown relational database from it, column by column, table by table. By doing so, you'll get to know constraints, keys, and referential integrity. These three concepts help preserve data quality in databases. By the end of the course, you'll know how to use them. In order to get going, you'll just need a basic understanding of SQL – which can also be used to build and maintain databases, not just for querying data.

5. Your first duty: Have a look at the PostgreSQL database

I've already created a single PostgreSQL database table containing all the raw data for this course. In the next few exercises, I want you to have a look at that table. For that, you'll need to retrieve your SQL knowledge and query the "information_schema" database, which is available in PostgreSQL by default. "information_schema" is actually some sort of meta-database that holds information about your current database. It's not PostgreSQL specific and also available in other database management systems like MySQL or SQL Server. This "information_schema" database holds various information in different tables, for example in the "tables" table.

6. Have a look at the columns of a certain table

"information_schema" also holds information about columns in the "columns" table. Once you know the name of a table, you can query its columns by accessing the "columns" table. Here, for example, you see that the system table "pg_config" has only two columns – supposedly for storing name-value pairs.

7. Let's do this.

Okay, let's have a look at your first database.