Joining flights with their destination airports
You've been hired as a data engineer for a global travel company. Your first task is to help the company improve its operations by analyzing flight data. Two datasets are already available in your workspace: one containing details about flights (flights) and another with information about destination airports (airports).
Your goal? Combine these datasets to create a powerful dataset that links each flight to its destination airport.
This exercise is part of the course
Introduction to PySpark
Exercise instructions
- Examine the airports DataFrame. Note which key column will let you join airports to the flights table.
- Join the flights with the airports DataFrame on the "dest" column. Save the result as flights_with_airports.
- Examine flights_with_airports again. Note the new information that has been added.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Examine the data
airports.____()
# .withColumnRenamed() renames the "faa" column to "dest"
airports = airports.withColumnRenamed("faa", "dest")
# Join the DataFrames
flights_with_airports = ____
# Examine the new DataFrame
flights_with_airports.____