1. Types of joins
Hi. In this video we'll talk about three types of joins and the characteristics of each type.
2. Three types of joins
There are different ways to categorize joins.
For our purposes, it makes sense to first look at three major types.
The three types are known as
one-to-one,
one-to-many,
and many-to-many.
These names describe the relationship of one table to another based on a lookup column, also known as a key column or index.
In a spreadsheet environment, the values in the key column are like the lookup values in a VLOOKUP formula.
3. One-to-one
We can use our first type of join, the one-to-one join,
for datasets at the same level.
In other words, the lookup column of each table is the same for both tables.
In this example, both tables have a key column called GameKey and have a single row of data for each game.
Joining these tables based on GameKey results in rows matched one to one.
4. One-to-many
One-to-many joins are slightly more complex.
They apply when tables have unique rows of data at different levels. Look at the example here. As before, the table on the left has one row of data for each game. The table on the right has several rows for each game. The data in the right-hand table is unique to combinations of game and play. You can think of GameKey as the higher-level column and PlayId as the lower-level column.
The tables are joined by the higher-level column, GameKey,
and the resulting table contains data for each game and play combination.
In a spreadsheet environment, you would probably use the table with lower-level data, the yellow table, as the base table to avoid creating extra rows. You could then use VLOOKUP formulas to pull game data from the blue table. The result is the same.
5. Many-to-many
The last type of join, many-to-many, is often the most difficult to treat.
The many-to-many join involves datasets with repeated values in the key column for both tables. In this example, both tables have PlayerId in common. Note that each player appears in several rows of data for both tables.
Joining the tables results in a table with all possible combinations of data across the two tables.
A word of caution on these joins: when neither table has unique lookup values, the combination can be hard to interpret. For example, it's not likely that a player wore several numbers during a single game. That would confuse everyone!
So, we need another table, such as one with games and dates, to bridge these datasets. Using a bridge table like this turns the many-to-many join into a series of one-to-one or one-to-many joins. For this reason, we won't go into more detail on many-to-many joins in this course.
6. Let's practice!
OK then, it's time to practice what you just learned.