Introduction to Spark SQL in Python

Exercise

Practice reading query plans 2

Three dataframes are available: part2_df, part3_df, and part4_df. The questions posed in this exercise can be answered by inspecting the explain() output of each dataframe.

Note that Spark tags each column name with a descriptor, delimited by a # symbol, for example word#0, id#1L, part#2, and title#3. For the purpose of this exercise, these descriptors can be ignored.
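If the descriptors make a plan harder to read, they can be stripped from the text before inspecting it. The following is a minimal sketch in plain Python, not part of the exercise itself. The helper name strip_descriptors and the sample plan line are hypothetical, made up for illustration; the sample is not the output of any dataframe in this exercise.

```python
import re

# Hypothetical plan line, invented for illustration only.
plan_line = "Project [word#0, id#1L, part#2, title#3]"

# Assumed descriptor shape: '#', one or more digits, and an optional trailing 'L',
# matching the examples word#0 and id#1L given above.
DESCRIPTOR = re.compile(r"#\d+L?\b")


def strip_descriptors(text: str) -> str:
    """Remove '#<number>' column descriptors from a query plan string."""
    return DESCRIPTOR.sub("", text)


print(strip_descriptors(plan_line))
# Project [word, id, part, title]
```

The column names are left untouched; only the # suffixes are removed, so the cleaned text can be read the same way as the original plan.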

Instructions 1/4

Question

  • What file was part2_df loaded from? Give only the filename and its extension, not the full path.

Possible answers