
Index-based joins

1. Index-based joins

In this lesson we'll talk about index-based joins. Joining data frames based on indexes is pretty much the same as joining on key columns. Even better, it's often easier to understand and easier to code.

2. Pandas indexing

We can join data frames based on either generic or tailored indexes. The data frame on the left has a generic index starting with 0, which is the default for pandas data frames. It is most easily joined to another data frame that also has a default index. The frame on the right has a customized multi-level index. Each X-Y coordinate in the data belongs to a unique combination of the values in the 3 index columns: GameKey, PlayID, and Display Name. This table is most easily joined with a table that has one or more of these index columns as part of its own index. By the way, you might have noticed these tables contain the same data. How you format the data frames as you work with them is often a matter of preference.
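As a rough illustration of the two layouts described above, here is a minimal sketch that converts a default-indexed frame into a multi-level-indexed one and back. The values are invented, and the column spellings (in particular `DisplayName` without a space) are assumptions; the lesson's actual data may differ.

```python
import pandas as pd

# Hypothetical tracking data: invented values, assumed column spellings.
flat = pd.DataFrame({
    "GameKey": [1, 1, 2],
    "PlayID": [10, 10, 20],
    "DisplayName": ["A", "B", "A"],
    "x": [5.0, 6.5, 7.2],
    "y": [20.1, 22.3, 18.4],
})

# Generic layout: the default index simply counts rows from 0.
print(flat.index)

# Tailored layout: move the three key columns into a multi-level index.
indexed = flat.set_index(["GameKey", "PlayID", "DisplayName"])
print(indexed)

# The same data can be turned back into the generic layout at any time.
restored = indexed.reset_index()
```

Both frames hold the same information; `set_index` and `reset_index` only change which columns act as row labels.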

3. Joining on index

To join frames based on index, we can use the pandas data frame join method. The join method is similar to the merge method, with a few differences. It joins frames on index by default, so you don't need to specify an 'on' column. Also, you can join multiple data frames at once with the join method. Just pass a list of data frames, in square brackets, as the 'other' argument. This feature is very useful when importing and joining many files with similar indexes.
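The behavior described above can be sketched as follows. The three small frames, their column names, and their values are hypothetical and exist only to show the calls; they are not taken from the lesson's data.

```python
import pandas as pd

# Three hypothetical frames that share the same kind of index (player name).
players = pd.DataFrame(
    {"position": ["QB", "WR", "RB"]},
    index=pd.Index(["A", "B", "C"], name="DisplayName"),
)
speeds = pd.DataFrame(
    {"max_speed": [8.1, 9.4]},
    index=pd.Index(["A", "B"], name="DisplayName"),
)
heights = pd.DataFrame(
    {"height_in": [75, 71, 69]},
    index=pd.Index(["A", "B", "C"], name="DisplayName"),
)

# join matches rows on the index and defaults to a left join,
# so player "C" is kept and gets NaN for max_speed.
pair = players.join(speeds)

# A comparable result with merge needs the index flags spelled out.
merged = players.merge(speeds, left_index=True, right_index=True, how="left")

# Passing a list joins several frames in one call.
combined = players.join([speeds, heights])
print(combined)
```

Note that the frames in this sketch have no column names in common. When columns overlap, a single-frame join needs the `lsuffix` or `rsuffix` argument to disambiguate them.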

4. Let's practice!

OK, you know the drill. It's time to finish the chapter with one last practice session.
