
Left and right joins

1. Left and right joins

There are two more joins you can perform using the merge() function: left joins and right joins.

2. Left joins

A left join keeps every observation in the data table on the left side of the join and adds matching information from the data table on the right. Observations that appear only in the right data table are dropped, and left-side observations with no match get NA in the columns that come from the right. This is useful when you have two data tables from different sources, but you're really only interested in the observations from one. The data table on the left side of the join is the data table given to the x argument of the merge() function. To perform a left join with the merge() function, you set the argument all.x to TRUE.
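As a minimal sketch of a left join, assuming the data.table package is installed: the demographics and shipping tables and their contents below are invented for illustration and are not the course data.

```r
library(data.table)

# Hypothetical example data (names and values are made up)
demographics <- data.table(name = c("Ana", "Ben", "Cy"),
                           age  = c(31, 45, 27))
shipping     <- data.table(name = c("Ben", "Cy", "Dee"),
                           city = c("Oslo", "Lima", "Rome"))

# Left join: keep every row of x (demographics) and add columns from y (shipping)
left <- merge(x = demographics, y = shipping, by = "name", all.x = TRUE)

# Ana has no shipping record, so her city is NA.
# Dee appears only in shipping, so she is dropped.
print(left)
```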

3. Right joins

Conversely, a right join keeps every observation in the data table on the right side of the join and adds matching information from the data table on the left. Observations that appear only in the left data table are dropped. The data table on the right side of the join is the data table given to the y argument of the merge() function. To perform a right join with the merge() function, you set the argument all.y to TRUE.
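A right join might look like the following sketch, again assuming the data.table package is installed and using invented demographics and shipping tables.

```r
library(data.table)

# Hypothetical example data (names and values are made up)
demographics <- data.table(name = c("Ana", "Ben", "Cy"),
                           age  = c(31, 45, 27))
shipping     <- data.table(name = c("Ben", "Cy", "Dee"),
                           city = c("Oslo", "Lima", "Rome"))

# Right join: keep every row of y (shipping) and add columns from x (demographics)
right <- merge(x = demographics, y = shipping, by = "name", all.y = TRUE)

# Dee has no demographics record, so her age is NA.
# Ana appears only in demographics, so she is dropped.
print(right)
```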

4. Right joins - Left joins

A right join gives the same result as swapping the order of the data tables in the merge() function and performing a left join; only the order of the columns differs. Most people find that one or the other fits more naturally with how they think about their data and stick to it. It's only in rare cases, such as joining multiple data tables in a sequence of joins, that you might need to use both left and right joins.
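The equivalence can be checked with a short sketch. It assumes the data.table package is installed and uses invented tables; setcolorder() is used only to line the columns up before comparing.

```r
library(data.table)

# Hypothetical example data (names and values are made up)
demographics <- data.table(name = c("Ana", "Ben", "Cy"),
                           age  = c(31, 45, 27))
shipping     <- data.table(name = c("Ben", "Cy", "Dee"),
                           city = c("Oslo", "Lima", "Rome"))

# Right join with demographics on the left and shipping on the right
right <- merge(demographics, shipping, by = "name", all.y = TRUE)

# Left join with the two tables swapped
left_swapped <- merge(shipping, demographics, by = "name", all.x = TRUE)

# The rows match; the non-key columns come out in a different order
names(right)
names(left_swapped)

# Reorder the columns of the swapped result, then compare the contents
setcolorder(left_swapped, names(right))
all.equal(right, left_swapped, check.attributes = FALSE)
```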

5. Default values

Any arguments you don't specify in a function call in R take on their default values. For example, the all argument of the merge() function has a default value of FALSE, and all.x and all.y default to the value of all, so they are also FALSE unless you set them. You can look up the default values for any function's arguments by using the help() function.
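To see what the defaults mean in practice, here is a small sketch with invented tables, assuming the data.table package is installed. It inspects the defaults of base R's merge.data.frame() method; the data.table method is assumed to use the same defaults, which its help page should confirm.

```r
library(data.table)

# Hypothetical example data (names and values are made up)
demographics <- data.table(name = c("Ana", "Ben", "Cy"),
                           age  = c(31, 45, 27))
shipping     <- data.table(name = c("Ben", "Cy", "Dee"),
                           city = c("Oslo", "Lima", "Rome"))

# all, all.x and all.y are not specified, so they take their default values.
# Only names present in both tables are kept.
default_join <- merge(demographics, shipping, by = "name")
print(default_join)

# Inspect the defaults of the base data.frame method:
# all is FALSE, while all.x and all.y refer back to all
formals(merge.data.frame)[c("all", "all.x", "all.y")]

# help("merge") opens the documentation in an interactive session
```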

6. Exercise instructions

In the code exercises throughout the rest of the course, you will be instructed to join one data table to another using the wording you see on the slide. Regardless of the type of join, the data table that you see after the word "to" should always be placed on the left side of the join in your code. So if we ask you to join the shipping data table to the demographics data table, the demographics data table should always be on the left side of the join: the first argument to the merge() function.
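For instance, "join shipping to demographics" could be coded as in the sketch below, whichever join type is requested. The tables are invented for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)

# Hypothetical example data (names and values are made up)
demographics <- data.table(name = c("Ana", "Ben", "Cy"),
                           age  = c(31, 45, 27))
shipping     <- data.table(name = c("Ben", "Cy", "Dee"),
                           city = c("Oslo", "Lima", "Rome"))

# "Left join shipping to demographics": demographics is the first argument (x)
left_result <- merge(demographics, shipping, by = "name", all.x = TRUE)

# "Right join shipping to demographics": demographics is still the first argument
right_result <- merge(demographics, shipping, by = "name", all.y = TRUE)

# In both results the columns of x (demographics) come before those of y (shipping)
names(left_result)
names(right_result)
```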

7. Let's practice!

Go ahead and code some left and right joins.