Fix the broken query

This query runs without error, but one row of its output is wrong because something is missing from the OVER clause. Can you locate the bug? Can you modify the query so that it gives a reasonable result?

This exercise is part of the course

Introduction to Spark SQL in Python

Instructions

  • Provide the row number of the erroneous row as an integer.
  • Provide the clause (as a string) that, when added to the OVER clause, fixes the problem.

Hands-on interactive exercise

Try this exercise by completing this sample code.

query = """
SELECT 
ROW_NUMBER() OVER (ORDER BY time) AS row,
train_id, 
station, 
time, 
LEAD(time,1) OVER (ORDER BY time) AS time_next 
FROM schedule
"""
spark.sql(query).show()

# Give the number of the bad row as an integer
bad_row = ____

# Provide the missing clause, SQL keywords in upper case
clause = '____ ____ ____'
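
As background on how the contents of an OVER clause change what LEAD returns, here is a minimal sketch on made-up data. It is not the course's schedule table and not the exercise solution: the table `toy_departures`, its columns (`route`, `stop`, `departs`), and the helper `next_departures` are all hypothetical. To keep it runnable without a Spark session, it uses Python's built-in `sqlite3` module instead of `spark.sql`, assuming the bundled SQLite is version 3.25 or later (the first release with window functions).

```python
import sqlite3

# Made-up rows for illustration only: two routes, times stored as HH:MM strings.
rows = [
    ("A", "North", "08:00"),
    ("A", "Central", "08:10"),
    ("A", "South", "08:25"),
    ("B", "North", "09:00"),
    ("B", "Central", "09:12"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_departures (route TEXT, stop TEXT, departs TEXT)")
conn.executemany("INSERT INTO toy_departures VALUES (?, ?, ?)", rows)


def next_departures(over_clause):
    """Return (route, stop, departs, departs_next) rows for a given OVER clause."""
    query = f"""
        SELECT route, stop, departs,
               LEAD(departs, 1) OVER ({over_clause}) AS departs_next
        FROM toy_departures
        ORDER BY route, departs
    """
    return conn.execute(query).fetchall()


# One window over the whole table: LEAD looks at the next row by time,
# regardless of which route that row belongs to.
unpartitioned = next_departures("ORDER BY departs")

# One window per route: LEAD never looks past the last row of a route.
partitioned = next_departures("PARTITION BY route ORDER BY departs")

for before, after in zip(unpartitioned, partitioned):
    print(before, after)
```

Comparing the two result lists row by row shows where a single table-wide window borrows a value from a different group, and where a per-group window returns NULL (`None` in Python) instead.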