Removing a DataFrame from cache
You've finished the analysis tasks with the departures_df DataFrame, but have some other processing to do. You'd like to remove the DataFrame from the cache to prevent any excess memory usage on your cluster.
The DataFrame departures_df is defined and has already been cached for you.
This exercise is part of the course
Cleaning Data with PySpark
Exercise instructions
- Check the caching status on the departures_df DataFrame.
- Remove the departures_df DataFrame from the cache.
- Validate the caching status again.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Determine if departures_df is in the cache
print("Is departures_df cached?: %s" % departures_df.____)
print("Removing departures_df from cache")
# Remove departures_df from the cache
____
# Check the cache status again
print("Is departures_df cached?: %s" % ____)
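One way the check, remove, and re-check sequence could look is sketched below. It relies on two members of the PySpark DataFrame API: the is_cached attribute and the unpersist() method. The helper name uncache_and_report and the StandInDataFrame class are hypothetical and are not part of the exercise; the stand-in only mimics those two members so that the sketch runs without a Spark session. This is an illustration under those assumptions, not necessarily the course's reference solution.

```python
def uncache_and_report(df, name="df"):
    """Print the cache status of df, remove it from the cache, print the status again.

    Works with any object exposing `is_cached` and `unpersist()`,
    which a PySpark DataFrame does. Returns the final cache status.
    """
    print("Is %s cached?: %s" % (name, df.is_cached))
    print("Removing %s from cache" % name)
    df.unpersist()
    print("Is %s cached?: %s" % (name, df.is_cached))
    return df.is_cached


class StandInDataFrame:
    """Hypothetical stand-in that mimics only the caching members of a DataFrame."""

    def __init__(self):
        self.is_cached = False

    def cache(self):
        # Mark the object as cached and return it, mirroring DataFrame.cache()
        self.is_cached = True
        return self

    def unpersist(self, blocking=False):
        # Mark the object as no longer cached, mirroring DataFrame.unpersist()
        self.is_cached = False
        return self


# Assumption: in the exercise, departures_df is a real, already cached DataFrame.
departures_df = StandInDataFrame().cache()
still_cached = uncache_and_report(departures_df, "departures_df")
```

With a real cached DataFrame, the same helper can be called directly as uncache_and_report(departures_df, "departures_df"); the stand-in class is then unnecessary.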