Normal joins
You've been given two DataFrames to combine into a single useful DataFrame. Your first task is to combine the DataFrames normally and view the execution plan.
The DataFrames flights_df and airports_df are available to you.
This exercise is part of the course Cleaning Data with PySpark.
Exercise instructions
- Create a new DataFrame normal_df by joining flights_df with airports_df.
- Determine which type of join is used in the query plan.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Join the flights_df and airports_df DataFrames
normal_df = flights_df.____(____, \
flights_df["Destination Airport"] == airports_df["IATA"] )
# Show the query plan
normal_df.____()