
Joining multiple tables

You now want to pursue a different avenue and map player positions during punts. You may recall that the NextGenStats (NGS) system captures player positions and orientations 10 times per second for all players, on every play. It's a lot of data!

You'll be joining three data frames to prepare for analysis. Below are their names and descriptions.

  • games: high-level data by GameKey
  • punts: play-level data by GameKey and PlayId
  • ngs: position data by GameKey, PlayId, GSISID (player id), and Time

A member of your team has provided you with a list comprehension on line 2 to print the index of each data frame in one line of code. For more information about list comprehensions, check out Python Data Science Toolbox Part 2.
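As a rough illustration of what that list comprehension prints, here is a minimal sketch on toy data. The frames below are small stand-ins, not the course data: the index levels follow the descriptions above, while the value columns (`Season`, `Quarter`, `x`) are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the three frames; value columns are made up for illustration.
games = pd.DataFrame({"GameKey": [1, 2], "Season": [2016, 2017]}).set_index("GameKey")
punts = pd.DataFrame(
    {"GameKey": [1, 1], "PlayId": [10, 11], "Quarter": [1, 2]}
).set_index(["GameKey", "PlayId"])
ngs = pd.DataFrame(
    {
        "GameKey": [1, 1],
        "PlayId": [10, 10],
        "GSISID": [100, 101],
        "Time": [0.0, 0.1],
        "x": [35.2, 36.0],
    }
).set_index(["GameKey", "PlayId", "GSISID", "Time"])

# The outer comprehension loops over the frames,
# the inner one collects the name of each index level.
index_names = [[n for n in df.index.names] for df in [games, punts, ngs]]
print(index_names)
# [['GameKey'], ['GameKey', 'PlayId'], ['GameKey', 'PlayId', 'GSISID', 'Time']]
```

Each inner list holds the index level names of one frame, which makes it easy to see which levels the frames have in common before joining.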

This exercise is part of the course

Pandas Joins for Spreadsheet Users


Instructions

  • Inner join the data frames on index using games as the primary data frame.
  • View the first 10 rows of the resulting data frame.
  • Ensure the index of the new frame has no duplicates.
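The general pattern behind these steps can be sketched on toy data. This is only an illustration under simplifying assumptions: the frames `weather` and `venues` and all column values are hypothetical, and all three toy frames share a single unique index level, unlike the exercise's frames, whose indexes have different numbers of levels.

```python
import pandas as pd

# Hypothetical toy frames that all share one unique index level, "GameKey".
games = pd.DataFrame(
    {"Season": [2016, 2016, 2017]}, index=pd.Index([1, 2, 3], name="GameKey")
)
weather = pd.DataFrame(
    {"Temperature": [70, 55]}, index=pd.Index([1, 2], name="GameKey")
)
venues = pd.DataFrame(
    {"Turf": ["Grass", "Synthetic"]}, index=pd.Index([2, 3], name="GameKey")
)

# Passing a list to DataFrame.join joins several frames on their index at once.
# how="inner" keeps only the index values present in every frame.
joined = games.join([weather, venues], how="inner")
print(joined.head(10))

# Index.duplicated() flags repeated index values; summing the flags counts them.
n_dupes = joined.index.duplicated().sum()
print(n_dupes)

# For contrast, an index that repeats a value yields a nonzero count.
n_dupes_demo = pd.Index([1, 1, 2]).duplicated().sum()
print(n_dupes_demo)
```

Only `GameKey` 2 appears in all three toy frames, so the inner join returns a single row, and a duplicate count of zero confirms that each index value identifies exactly one row.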

Hands-on interactive exercise

Try this exercise by completing this sample code.

# List the index of each data frame
print([[n for n in df.index.names] for df in [games, punts, ngs]])

# Inner join the data frames
games_all = ____.____([punts, ____], how=____)

# View first 10 rows of new frame
print(____.head(10))

# Check index for duplicates
print(____.index.____.sum())