
Interactive Use of PySpark

Spark comes with an interactive Python shell in which PySpark is already installed. The PySpark shell is useful for basic testing and debugging, and it is quite powerful. The easiest way to demonstrate this is with an exercise: you'll load a simple list containing the numbers 1 to 100 in the PySpark shell.

The most important thing to understand here is that we do not create a SparkContext object ourselves: the PySpark shell automatically creates one named sc.

This exercise is part of the course

Big Data Fundamentals with PySpark


Exercise instructions

  • Create a Python list named numb containing the numbers 1 to 100.
  • Load the list into Spark using the SparkContext's parallelize() method and assign it to a variable spark_data.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Create a Python list of numbers from 1 to 100 
numb = range(____, ____)

# Load the list into PySpark  
spark_data = sc.____(numb)
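One way to fill in the blanks is sketched below. It assumes you are inside the PySpark shell, where the SparkContext is pre-created as sc; outside the shell, sc does not exist, so that line is shown commented out.

```python
# Create a Python list of numbers from 1 to 100.
# range() excludes its stop value, so the bounds are 1 and 101.
numb = list(range(1, 101))

# Load the list into PySpark as an RDD.
# `sc` is only pre-created inside the PySpark shell, so this line
# is commented out here:
# spark_data = sc.parallelize(numb)
```

In Python 3, range() returns a lazy sequence; sc.parallelize() accepts it directly, but wrapping it in list() makes the contents easy to inspect in the shell.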