Simple coding for complex merges
Great news! You have access to the league's Next Gen Stats (NGS) data. NGS captures the location and orientation of every player on every play. Data is recorded 10 times per second, which means there are over 1.5 million observations per week for punts alone! The data has already been loaded into a data frame called coords.
You also have general play data for every punt tracked by NGS. Rows in this data frame, called punts, are identified by unique combinations of GameKey and PlayId.
To join the data in a spreadsheet environment, you would create a column in each table combining GameKey and PlayId, then match the tables on the new column. Here, you can use a simple merge statement to join punts and coords.
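To make the comparison concrete, here is a minimal sketch that runs both approaches on toy data and checks that they agree. The values and the PuntDistance, x, and y columns are made up for illustration; only GameKey and PlayId come from the exercise, and the real punts and coords data frames will look different.

```python
import pandas as pd

# Toy stand-ins for the real data frames (hypothetical values and columns).
punts = pd.DataFrame({
    "GameKey": [1, 1, 2],
    "PlayId": [10, 20, 10],
    "PuntDistance": [45, 38, 52],
})
coords = pd.DataFrame({
    "GameKey": [1, 1, 1, 2, 2],
    "PlayId": [10, 10, 20, 10, 10],
    "x": [30.1, 30.4, 55.0, 12.2, 12.9],
    "y": [20.0, 20.3, 26.5, 40.1, 40.0],
})

def add_key(df):
    # Spreadsheet-style helper column: "GameKey-PlayId" as text.
    return df.assign(key=df["GameKey"].astype(str) + "-" + df["PlayId"].astype(str))

# Spreadsheet approach: build the combined key in each table, then match on it.
via_key = (
    add_key(punts)
    .merge(add_key(coords)[["key", "x", "y"]], on="key")
    .drop(columns="key")
)

# pandas approach: match on both columns directly, no helper column needed.
via_merge = punts.merge(coords, on=["GameKey", "PlayId"])

print(via_merge)
```

Each punt row is repeated once for every matching row in coords, so the merged frame has as many rows as there are matching coordinate observations.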
This exercise is part of the course
Pandas Joins for Spreadsheet Users
Exercise instructions
- View the first 10 rows of punts. Note that rows are unique to each GameKey-PlayId combination.
- View the first 10 rows of coords.
- Merge the two data frames with punts as the left data frame and coords as the right data frame.
- View the first 15 rows of the new data frame, punts_w_coords.
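The note about rows being unique to each GameKey-PlayId combination can be checked in code rather than by eye. The sketch below is one possible way to do so, using made-up toy data in which the same PlayId appears in two different games; the x column is hypothetical. It is not part of the exercise solution.

```python
import pandas as pd

# Toy data (hypothetical values): PlayId 10 occurs in both game 1 and game 2,
# so PlayId alone would not identify a punt.
punts = pd.DataFrame({"GameKey": [1, 1, 2], "PlayId": [10, 20, 10]})
coords = pd.DataFrame({
    "GameKey": [1, 1, 2, 2],
    "PlayId": [10, 20, 10, 10],
    "x": [30.1, 55.0, 12.2, 12.9],
})

# True when no GameKey-PlayId pair appears more than once in punts.
keys_unique = not punts.duplicated(subset=["GameKey", "PlayId"]).any()
print(keys_unique)

# validate="one_to_many" asks pandas to raise a MergeError if the keys
# are not unique in the left data frame.
merged = punts.merge(coords, on=["GameKey", "PlayId"], validate="one_to_many")
print(merged)
```

Passing validate is optional; it simply turns an assumption about the data into an error if that assumption does not hold.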
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# View punts
print(____.head(10))
# View coords
print(____.head(10))
# Merge data frames
punts_w_coords = ____.merge(____)
# View new data frame
print(____.head(15))