Exercise

Bringing it all together I

You've built a solid foundation in PySpark, explored its core components, and worked through practical scenarios involving Spark SQL, DataFrames, and advanced operations. Now it's time to bring it all together. Over the next two exercises, you'll create a SparkSession, build a DataFrame, cache that DataFrame, run some analytics, and explain the outcome!

Instructions

  • Import SparkSession from pyspark.sql.
  • Make a new SparkSession called final_spark using SparkSession.builder.getOrCreate().
  • Print final_spark to the console to verify it's a SparkSession.
  • Create a new DataFrame from a preloaded schema and column definition.