Get Started

Working with hierarchical tables

1. Working with hierarchical tables

In addition to tables that join with other tables, there are also tables that join with themselves.

2. Hierarchical tables

We call these tables self-referential or hierarchical tables. These are commonly used to store organizational charts, geographic data, networks and relationship graphs.

3. Hierarchical tables - example

Here we have an employees table, which, in addition to having the employee's name and position, also contains an column for their manager. That manager is also an employee and has a record in that table. The table has an undefined relationship between the id column and the manager column.

4. Hierarchical tables - alias()

In order to use this relationship in a query, we need a way to refer to this table by another name. The alias method allows us to do just that by creating a way to refer to the same table with two unique names.

5. Querying hierarchical data

Let's get a list of managers and the employees that report to them. To join the employees table using the relationship, we start by using the alias method on the employees table and storing the alias as managers. Now we can use both the name managers and employees to refer to the table. Now we are ready to build our query. We start by selecting the name column from the managers alias and labeling that column as manager. Next we select the name column from the employees table and label it employee. Now we use the select_from method to wrap an explicit join from the employees table to the managers alias. We use the id column from the managers alias with the manager column of the employees table to form the join condition. Next, we order by the managers name. Finally, we execute the statement and review the results. For example, Taft's supervisor here is Harding.

6. group_by and func

Hierarchical tables can get tricky when performing group_bys or using functions. It's important to think of it as if it were two different tables. You should focus on having the table in the group_by and the alias in the function or vice versa. It's super important to make sure you are using both the alias and the table in the query when using the join otherwise you could cause the query to error or use a lot of resources.

7. Querying hierarchical data

To practice this let's pretend that we are making next years budgets and we need to know how much salary to allocate for each managers employees. We start by making the managers alias of the employees table. Then we begin building the select statement, we select the managers name and then sum all the employees salaries. Next, we use the same explicit join from the previous example in the select_from. Next, we group by the managers name and finally we execute the query. Notice that we applied the function to the employees table and grouped by the managers alias.

8. Let's practice!

It's now your turn to try your hand at this tricky challenge.