Viewing tables
Once you've created a SparkSession, you can start poking around to see what data is in your cluster!
Your SparkSession has an attribute called catalog, which lists all the data inside the cluster. This attribute has a few methods for extracting different pieces of information.
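One of the most useful is the .listTables() method, which returns a list with one entry for each table in your cluster; every entry carries the table's name along with a few details, such as whether the table is temporary.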
This exercise is part of the course Foundations of PySpark.
Exercise instructions
- See what tables are in your cluster by calling spark.catalog.listTables() and printing the result!
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Print the tables in the catalog
print(spark.____.____())