Fix the broken query

This query runs without errors, but one row of its output is incorrect because something is missing from the OVER clause. Can you locate the bug? Can you modify the query so that it gives a reasonable result?

This exercise is part of the course

Introduction to Spark SQL in Python

Exercise instructions

  • Provide the row number of the erroneous row as an integer.
  • Provide the clause (as a string) that when added to the OVER clause fixes the problem.

Hands-on interactive exercise

Try this exercise by completing this sample code.

query = """
SELECT 
ROW_NUMBER() OVER (ORDER BY time) AS row,
train_id, 
station, 
time, 
LEAD(time,1) OVER (ORDER BY time) AS time_next 
FROM schedule
"""
spark.sql(query).show()

# Give the number of the bad row as an integer
bad_row = ____

# Provide the missing clause, SQL keywords in upper case
clause = '____ ____ ____'
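
As a hint rather than a solution, the sketch below emulates in plain Python what LEAD(time, 1) does over a window, with and without splitting the rows into groups first. The bus rows and the `lead_time` helper are hypothetical and exist only for this illustration; they are not the course's `schedule` table or a Spark API.

```python
from itertools import groupby

# Hypothetical toy rows (not the course's schedule table): (bus_id, stop, time)
rows = [
    (1, "A", "08:00"),
    (1, "B", "08:10"),
    (1, "C", "08:20"),
    (2, "A", "09:00"),
    (2, "B", "09:10"),
]

def lead_time(rows, partition_by_bus=False):
    """Emulate LEAD(time, 1) over rows ordered by time.

    With partition_by_bus=True the rows are first split per bus_id,
    so the "next" time is only ever taken from the same bus.
    """
    if partition_by_bus:
        keyed = sorted(rows, key=lambda r: (r[0], r[2]))
        groups = [list(g) for _, g in groupby(keyed, key=lambda r: r[0])]
    else:
        # One single window containing every row
        groups = [sorted(rows, key=lambda r: r[2])]

    out = []
    for group in groups:
        for i, row in enumerate(group):
            # The last row of a window has no following row, so LEAD yields NULL
            nxt = group[i + 1][2] if i + 1 < len(group) else None
            out.append(row + (nxt,))
    return out

single_window = lead_time(rows)
per_bus = lead_time(rows, partition_by_bus=True)

for row in single_window:
    print(row)
```

In the single-window result, the last stop of bus 1 picks up the first time of bus 2 as its "next" time. In the per-bus result that same row gets `None`, which is the more reasonable value for the final stop of a route.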