Bringing it all together II
Create a DataFrame, apply transformations, cache it, and check if it’s cached. Then, uncache it to release memory.
A Spark session has already been created for you. Read the output of the .explain() method carefully to see how Spark processes the aggregation.
This exercise is part of the course
Introduction to PySpark
Exercise instructions
- Cache the df DataFrame.
- Explain the processing of the agg_result DataFrame.
- Unpersist the cached df DataFrame after processing.
Hands-on interactive exercise
Try this exercise by completing the sample code below.
# Cache the DataFrame
df.____
# Perform aggregation
agg_result = df.groupBy("Department").sum("Salary")
agg_result.show()
# Analyze the execution plan
agg_result.____
# Uncache the DataFrame
df.____