Get startedGet started for free

The keys to the database

1. The keys to the database

The second stage of exploring a database is understanding the formal relationships, or links, between tables. These explicit relationships are one of the benefits of having your data in a database rather than in a set of individual data files.

2. Foreign keys

Foreign keys are the formal way that database tables are linked together. In this example, the actor_id column in the film_actor table is a foreign key that references the id column of the actor table.

3. Foreign keys

A foreign key is a column that references a single, specific row in the database. The referenced row is usually in a different table, but foreign keys can reference rows in the same table as well. Foreign keys reference other rows using a unique identifier for the row. The unique ID often comes from a primary key column in the referenced table. Primary keys are specially designated columns where each row has a unique, non-null value. Foreign key columns are restricted to contain either a value that is in the referenced column, or null. If the value is null, it indicates that there's no relationship for that row.

4. ER diagram

Let's look at the entity relationship diagram for our database. In the diagram, foreign keys are indicated on the arrows between tables.

5. ER diagram

The value before the colon is the name of the column in the table from which the arrow originates. The value after the colon is the name of the referenced column in the table the arrow is pointing to. So the company_id column in the tag_company table refers to the id column in the company table.

6. ER diagram

When an arrow points from and to the same table, this is a self reference. parent_id in the company table references the id column in the same table.

7. ER diagram

Note that there's no foreign key linking the company table to the fortune500 table. But this doesn't prevent us from joining these tables. Both tables have ticker columns with comparable values that can be used to join the tables. The lack of a foreign key relationship just means that the values in the ticker columns aren't restricted to the set of values in the other table.

8. Primary Keys

The diagram also shows which columns are primary keys. Primary keys have a border around them at the top of each list of columns. Primary keys uniquely identify the rows in the table.

9. Coalesce function

Before you return to the exercises, let's add the coalesce function to your toolkit. coalesce takes two or more values or column names as arguments. The three dots in square brackets here indicate that additional values can be supplied as inputs. The coalesce function operates row-wise on the input. It returns the first non-NULL value in each row, checking the columns in the order they're supplied to the function.

10. Coalesce function

Here's an example. We have a table called prices with two columns. Remember that blanks are null values. We can use coalesce to combine these two columns. If column_1 is not null, coalesce returns that value. If column_1 is null, coalesce returns the value of column_2. In this example, the first value returned by coalesce is 10. This is because, in the first row of prices, the value of column_1 is NULL. So coalesce returns the value of column_2. Coalesce returned four values because there were four rows in the input. Coalesce is useful for specifying default or backup values when selecting a column that might contain NULL values.

11. Time to keep exploring!

OK, time to use your knowledge of foreign keys and the coalesce function to continue exploring the database.