Removing a DataFrame from cache
You've finished the analysis tasks with the departures_df DataFrame, but have some other processing to do. You'd like to remove the DataFrame from the cache to prevent any excess memory usage on your cluster.
The DataFrame departures_df is defined and has already been cached for you.
This exercise is part of the course Cleaning Data with PySpark.
Exercise instructions
- Check the caching status on the departures_df DataFrame.
- Remove the departures_df DataFrame from the cache.
- Validate the caching status again.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Determine if departures_df is in the cache
print("Is departures_df cached?: %s" % departures_df.____)
print("Removing departures_df from cache")
# Remove departures_df from the cache
____
# Check the cache status again
print("Is departures_df cached?: %s" % ____)
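For reference, the three steps can be sketched as a small helper. This is one possible approach rather than the course's official solution: the function name remove_from_cache is hypothetical, and the sketch only assumes that the object passed in behaves like a PySpark DataFrame, which exposes the is_cached attribute and the unpersist() method.

```python
def remove_from_cache(df):
    """Report the cache status of df, unpersist it, and report again.

    Assumes df is a PySpark DataFrame (or anything exposing the same
    is_cached attribute and unpersist() method).
    Returns a (was_cached, still_cached) tuple.
    """
    # is_cached is an attribute, so it is read without parentheses
    was_cached = df.is_cached
    print("Is DataFrame cached?: %s" % was_cached)

    print("Removing DataFrame from cache")
    # unpersist() marks the DataFrame as non-persistent and frees its cached blocks
    df.unpersist()

    # Read the attribute again to confirm the DataFrame is no longer cached
    still_cached = df.is_cached
    print("Is DataFrame cached?: %s" % still_cached)
    return was_cached, still_cached
```

In the exercise environment, calling remove_from_cache(departures_df) should report True before the removal and False after it, given that departures_df was cached beforehand.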