The semi_join and anti_join verbs

1. The semi_join and anti_join verbs

You've learned about four joining verbs so far: inner join, left join, right join, and full join.

2. Mutating verbs

All of these are what we call mutating verbs: they combine the variables from your two tables.

3. Review: left join

For example, when you left joined batmobile and batwing, you ended up with a new column that batmobile did not have: quantity_batwing.
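As a reminder of what "mutating" means here, the following is a minimal sketch of that left join. The rows are invented stand-ins for the course tables, and the join keys and suffix values are assumptions rather than the exact call from the earlier lesson.

```r
library(dplyr)

# Toy stand-ins for the course tables; the rows are invented for illustration.
batmobile <- tibble(
  part_num = c("3023", "3024", "3795"),
  color_id = c(0, 0, 72),
  quantity = c(22, 22, 2)
)
batwing <- tibble(
  part_num = c("3023", "3795", "4070"),
  color_id = c(0, 72, 0),
  quantity = c(22, 5, 4)
)

# Both tables have a quantity column, so the suffix argument disambiguates them.
joined <- batmobile %>%
  left_join(batwing,
            by = c("part_num", "color_id"),
            suffix = c("_batmobile", "_batwing"))

names(joined)
```

The result has every row of batmobile plus a quantity_batwing column, which is NA where the Batwing toy table has no matching part.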

4. Filtering joins

But let's talk about a different class of verbs: filtering joins. A filtering join keeps or removes observations from the first table, but it doesn't add new variables. The two filtering verbs you'll be learning are semi join and anti join.

5. Filtering joins

A semi join asks the question: what observations in X are also in Y?

6. Filtering joins

And an anti join asks the question: what observations in X are not in Y?
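The two questions can be sketched on a pair of tiny tables. The tables x and y and their columns below are hypothetical, made up only to show which rows each verb keeps.

```r
library(dplyr)

# Hypothetical tables sharing a single key column.
x <- tibble(key = c("a", "b", "c", "d"), value_x = 1:4)
y <- tibble(key = c("b", "d", "e"), value_y = c(10, 20, 30))

# Which observations in x are also in y?
in_both <- semi_join(x, y, by = "key")

# Which observations in x are not in y?
only_x <- anti_join(x, y, by = "key")
```

Both results have exactly the columns of x. Nothing from y, such as value_y, is added.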

7. The semi join

Let's start with semi join. What parts used in the Batmobile set are also used in the Batwing set? This semi join takes the 173 pieces in the Batmobile set and reduces them to the 45 that are also in the Batwing set. But notice that we still have the same three variables that the Batmobile set started with: part_num, color_id, and quantity. We kept Batmobile's quantity variable and didn't even have to specify a suffix. This is useful when we want to filter down a table without modifying it further.
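To make the contrast with a mutating join concrete, here is a small sketch that runs a semi join and an inner join on the same inputs. The rows are invented, and joining on part_num alone is an assumption made for the example.

```r
library(dplyr)

# Invented rows; in the Batwing toy table, part 3023 appears in two colors.
batmobile <- tibble(
  part_num = c("3023", "3024", "3795"),
  color_id = c(0, 0, 72),
  quantity = c(22, 22, 2)
)
batwing <- tibble(
  part_num = c("3023", "3023", "4070"),
  color_id = c(0, 72, 0),
  quantity = c(22, 5, 4)
)

# Filtering join: keeps matching batmobile rows, with batmobile's columns only.
kept <- batmobile %>% semi_join(batwing, by = "part_num")

# Mutating join: adds batwing's columns and repeats a row for every match.
matched <- batmobile %>% inner_join(batwing, by = "part_num")
```

Here kept holds part 3023 once, with the original three columns, while matched holds it twice and carries suffixed copies of color_id and quantity.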

8. The anti join

The opposite of a semi join is an anti join. Anti joins ask: what observations in the first table are not in the second table? In this case, what pieces are in the Batmobile but not in the Batwing? Notice again that we did not specify a suffix. The result tells us that there are 128 pieces in the Batmobile that are not in the Batwing.
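Because every row of the first table either has a match or does not, a semi join and an anti join on the same keys split that table into two non-overlapping pieces. That is why the counts above add up: 45 + 128 = 173. A minimal sketch with invented rows (the join key is an assumption):

```r
library(dplyr)

# Invented rows standing in for the course tables.
batmobile <- tibble(
  part_num = c("3023", "3024", "3795", "6141"),
  color_id = c(0, 0, 72, 36),
  quantity = c(22, 22, 2, 1)
)
batwing <- tibble(
  part_num = c("3023", "3795", "4070"),
  color_id = c(0, 72, 0),
  quantity = c(22, 5, 4)
)

in_batwing     <- batmobile %>% semi_join(batwing, by = "part_num")
not_in_batwing <- batmobile %>% anti_join(batwing, by = "part_num")

# The two pieces together account for every row of batmobile.
nrow(in_batwing) + nrow(not_in_batwing) == nrow(batmobile)
```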

9. Filtering with semi_join

These verbs aren't useful just for comparing Batman's rides. You could use them to filter down the other tables we've worked with. For example, you might want to know what themes ever appear in a set. A semi join tells you that 569 themes make at least one appearance.

10. Filtering with anti_join

Conversely, you could use anti join to find the themes that never appear in a set in our database. This filters down to the 96 themes that don't match any set.
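Both theme queries can be sketched as follows. The column names (id in themes, theme_id in sets) and all rows are assumptions made for illustration. The named by vector is how dplyr matches key columns that have different names in the two tables.

```r
library(dplyr)

# Hypothetical miniature versions of the themes and sets tables.
themes <- tibble(
  id   = c(1, 2, 3),
  name = c("Technic", "Space", "Castle")
)
sets <- tibble(
  set_num  = c("s-1", "s-2"),
  theme_id = c(1, 1)
)

# Themes that appear in at least one set.
used <- themes %>% semi_join(sets, by = c("id" = "theme_id"))

# Themes that never appear in a set.
unused <- themes %>% anti_join(sets, by = c("id" = "theme_id"))
```

Even though two sets share theme 1, used lists that theme only once, because a semi join filters rows rather than duplicating them.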

11. The joining verbs

In the exercises, you'll find other useful applications of the semi and anti joins. Before long, you'll have six joining verbs in your dplyr arsenal.

12. Let's practice!

Let's practice!