Get startedGet started for free

Joins and unions in a data model

1. Joins and unions in a data model

Welcome back. In this video we'll discuss creating a base table for a data model using joins and unions.

2. Purpose of a base table

The first step in creating a data model is adding a table. Though it's not a strict requirement in Sigma, data models in this course start with what we'll refer to as a base table.

3. Purpose of a base table

A base table represents the core unit of analysis for the model. It should have one row for each object or event that you're modeling. For example, in a data model for orders, each row of the base table should represent one order. This is sometimes referred to as defining the granularity of the data, and it's a great place to start when designing a data model.

4. Purpose of a base table

After determining the right granularity, consider if your base table needs information from one or more warehouse tables. If you were modeling order data, you might need data about the customers that placed those orders as well. So, you might combine the orders table from your warehouse with the customers table to create a base table with data from both.

5. Purpose of a base table

Two common ways to combine data are joins and unions. You might be familiar with joins and unions from SQL or other data tools, and they work just the same in Sigma. While we will do a brief conceptual review in this video, we assume some familiarity, and the focus in this course is on making use of these operations in Sigma.

6. Purpose of a base table

For more information on joins and unions, you can see additional resources on DataCamp, or refer to Sigma's documentation.

7. Joins

Let's start with joins.

8. Joins

Joins combine data by bringing columns from one table onto another, using a shared column as a key. Each row from one table can have additional fields added onto it from another. You can think of this as making the table "wider."

9. Joins

For example, you might join customer data onto an accounts table using a customer ID field, so that each row includes information on the customer that holds that account, like adding the email address of the account holder onto each row.

10. Unions

We can also combine data with unions.

11. Unions

A union combines data by adding rows from one table onto the end of another table. By matching columns to one another, you can create a "longer" table with more rows.

12. Unions

Unions are useful when similar data, with matching columns, is split across tables. The tables almost always have the same granularity, and a union allows you to treat them as one continuous dataset.

13. Unions

For example, if you had two tables of transactions from two different Oakmark Bank locations, you could union them together, to create a single table with all the transactions.

14. Purpose of combining data

While it is possible to build a base table with no joins or unions, consider the potential consequences. If the end users of your data model are frequently creating joins or unions themselves, that might be a sign to add that data to the model. That way, the base table remains consistent between users, and the chance of error from repeated manual joins is reduced.

15. Let's practice!

In the next video we'll create a data model and build a base table. Now, try an exercise.

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.