Creating a SparkSession
We've already created a SparkSession for you called spark, but what if you're not sure there already is one? Creating multiple SparkSessions and SparkContexts can cause issues, so it's best practice to use the SparkSession.builder.getOrCreate() method. This returns an existing SparkSession if there's already one in the environment, or creates a new one if necessary!
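To see the "existing or new" behavior described above, here is a minimal sketch. It assumes a regular local PySpark installation (not Spark Connect); the helper name same_session_twice is hypothetical and used only for illustration. The call is wrapped in a function so that nothing starts a Spark session until you ask for it.

```python
def same_session_twice():
    """Call getOrCreate() twice and return both results.

    Hypothetical helper for illustration; assumes PySpark is installed
    and that a local session can be started.
    """
    # Imported here so that defining the helper does not itself require Spark
    from pyspark.sql import SparkSession

    # First call: reuses the active session, or creates one if none exists
    first = SparkSession.builder.getOrCreate()

    # Second call: a session now exists, so no new one should be created
    second = SparkSession.builder.getOrCreate()

    return first, second
```

Calling first, second = same_session_twice() and then checking first is second should, under the assumptions above, show that both names refer to the same session rather than to two separate ones.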
This exercise is part of the course Foundations of PySpark.
Exercise instructions
- Import SparkSession from pyspark.sql.
- Make a new SparkSession called my_spark using SparkSession.builder.getOrCreate().
- Print my_spark to the console to verify it's a SparkSession.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import SparkSession from pyspark.sql
from ____ import ____
# Create my_spark
my_spark = ____
# Print my_spark
print(____)