
Practicing caching: putting it all together

What was the best approach to caching df1 and df2 and why?
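One way to gather timings for this question is sketched below. It is a minimal harness, not the exercise's own code: `timed_count` and `run_experiment` are hypothetical helper names, and the sketch assumes each DataFrame is exercised with two `count()` actions in the order df1, df1, df2, df2. It relies only on the `cache()`, `count()`, and `unpersist()` methods that PySpark DataFrames provide.

```python
import time


def timed_count(df, label):
    """Run the action df.count() and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    df.count()  # an action: forces Spark to evaluate the DataFrame
    elapsed = time.perf_counter() - start
    print(f"{label} : {elapsed:.1f}s")
    return elapsed


def run_experiment(df1, df2, cache_first=True):
    """Cache exactly one DataFrame, then time two actions on each.

    Hypothetical helper: the order and number of actions are assumptions.
    """
    overall_start = time.perf_counter()
    cached = df1 if cache_first else df2
    cached.cache()  # lazy: data is stored only when the first action runs

    results = {}
    for name, df in (("df1", df1), ("df2", df2)):
        results[f"{name}_1st"] = timed_count(df, f"{name}_1st")
        results[f"{name}_2nd"] = timed_count(df, f"{name}_2nd")

    cached.unpersist()  # release the cached data before the next experiment
    results["overall"] = time.perf_counter() - overall_start
    print(f"Overall elapsed : {results['overall']:.1f}")
    return results
```

Calling `run_experiment(df1, df2, cache_first=True)` and then `run_experiment(df1, df2, cache_first=False)` on the same pair of DataFrames gives one set of timings per approach. Because `cache()` is lazy, the first action on the cached DataFrame still pays the full computation cost, and only later actions benefit.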

Your results will vary, but here is one (arbitrary) result for each of the two approaches:

First answer (cache df1):

df1_1st : 2.4s
df1_2nd : 0.1s
df2_1st : 0.3s
df2_2nd : 0.2s
Overall elapsed : 3.9

Second answer (cache df2):

df1_1st : 2.3s
df1_2nd : 1.1s
df2_1st : 1.7s
df2_2nd : 0.1s
Overall elapsed : 6.4
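In this sample, caching df1 gave the lower overall time (3.9 versus 6.4). The short calculation below makes the comparison explicit using only the numbers reported above. It assumes the "Overall elapsed" figures are in seconds like the per-action timings; `summarize` is a hypothetical helper name.

```python
# Timings copied from the two sample runs above (seconds).
cache_df1 = {"df1_1st": 2.4, "df1_2nd": 0.1, "df2_1st": 0.3, "df2_2nd": 0.2, "overall": 3.9}
cache_df2 = {"df1_1st": 2.3, "df1_2nd": 1.1, "df2_1st": 1.7, "df2_2nd": 0.1, "overall": 6.4}


def summarize(run):
    """Return (sum of the four timed actions, remaining untimed overhead)."""
    timed = sum(v for k, v in run.items() if k != "overall")
    return round(timed, 1), round(run["overall"] - timed, 1)


for label, run in (("cache df1", cache_df1), ("cache df2", cache_df2)):
    timed, overhead = summarize(run)
    print(f"{label}: timed actions {timed}s, other overhead {overhead}s")

# Difference in overall elapsed time between the two approaches.
saved = round(cache_df2["overall"] - cache_df1["overall"], 1)
print(f"Caching df1 was faster overall by {saved}s in this sample")
```

With df1 cached, every action after the first is fast (0.1s to 0.3s). With df2 cached, only the second action on df2 is fast, while the second action on df1 (1.1s) and the first on df2 (1.7s) still pay a noticeable cost.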

This exercise is part of the course

Introduction to Spark SQL in Python
