Creating a SparkSession
We've already created a SparkSession for you called spark, but what if you're not sure one already exists? Creating multiple SparkSessions and SparkContexts can cause issues, so it's best practice to use the SparkSession.builder.getOrCreate() method. It returns the existing SparkSession if there's already one in the environment, or creates a new one if necessary!
This exercise is part of the course
Foundations of PySpark
Exercise instructions
- Import SparkSession from pyspark.sql.
- Make a new SparkSession called my_spark using SparkSession.builder.getOrCreate().
- Print my_spark to the console to verify it's a SparkSession.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import SparkSession from pyspark.sql
from ____ import ____
# Create my_spark
my_spark = ____
# Print my_spark
print(____)