1. Joining with a one-to-many relationship
In the first lesson, you joined the sets table to the themes table.
2. Joining sets and themes
Each set has exactly one theme, so the joined table has exactly as many observations as the sets table: 4,977.
3. The inventories table
But not all joins work that way. Let's introduce a new table, inventories. An inventory represents a product that's made up of some combination of parts.
Notice that inventories has the variable set num, which suggests it links to the "set num" variable from our sets table.
4. Joining sets and inventories
Let's try using that variable to join sets to inventories. Something convenient is that when the variables you're joining by have the same name, you can just write by equals quote set-underscore-num unquote. This is in contrast to the way we earlier did c theme-underscore-id equals id.
Notice that while we started with 4,977 sets, our joined result now has 5,056. This is because each set can have multiple versions, each of which gets its own inventory item. For instance, we can see that Medium Gift Set from 1949 has three versions, and Single 2 by 4 brick has five versions.
5. Filtering the joined table
One way we can confirm our understanding is to filter this joined table for only the first version.
Notice this results in a table with 4,976 observations, compared to our 4,977. This means there's one set that doesn't have a version 1, which is probably a data quality issue. An inner join keeps an observation only if it has an exact match between the first and the second tables.
We'll return to how we could find a missing observation in later chapters, and you'll also see that paying close attention to the number of rows before and after a join is an important part of understanding your data.
6. Parts and pieces
In the exercises, you'll be returning to the parts table. But you'll also see each LEGO piece has another attribute besides its part: its color.
7. The inventory parts
The inventory parts table is what combines a part and a color. That combination describes a single LEGO piece, which in turn is a part of an inventory. You'll see how all of these connect in the rest of this chapter.
8. Let's practice!
Let's practice!