
Map and Collect

The main method for manipulating data in PySpark is map(). The map() transformation takes in a function and applies it to each element of the RDD. It can be used for any number of tasks, from fetching the website associated with each URL in a collection to simply squaring numbers. In this simple exercise, you'll use the map() transformation to cube each number in the numbRDD RDD that you created earlier. Next, you'll store all the elements in a variable and finally print the output.

Remember, you already have a SparkContext sc and a numbRDD available in your workspace.
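
Before completing the exercise, here is a minimal, self-contained sketch of the map()/collect() pattern. It assumes a local PySpark installation and creates its own SparkContext purely for illustration; in the exercise workspace, sc and numbRDD are already provided.

from pyspark import SparkContext

# Create a local SparkContext for this standalone example only
sc = SparkContext("local", "map_collect_demo")

# Distribute a small Python list as an RDD
numbers = sc.parallelize([1, 2, 3, 4])

# map() applies the lambda to each element; collect() returns the results as a list
squares = numbers.map(lambda x: x ** 2).collect()
print(squares)  # [1, 4, 9, 16]

sc.stop()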

This exercise is part of the course Big Data Fundamentals with PySpark.

Exercise instructions

  • Create a map() transformation that cubes all of the numbers in numbRDD.
  • Collect the results into a variable called numbers_all.
  • Print the output from the numbers_all variable.

Hands-on interactive exercise

Finish this exercise by completing the sample code below.

# Create map() transformation to cube numbers
cubedRDD = numbRDD.map(lambda x: ____)

# Collect the results
numbers_all = cubedRDD.____()

# Print the numbers from numbers_all
for numb in ____:
	print(____)
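
For reference, one possible completion of the scaffold is sketched below; it assumes, as stated above, that sc and numbRDD already exist in the workspace.

# Create map() transformation to cube numbers
cubedRDD = numbRDD.map(lambda x: x ** 3)

# Collect the results into a Python list
numbers_all = cubedRDD.collect()

# Print each cubed number from numbers_all
for numb in numbers_all:
    print(numb)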