SortByKey and Collect

It is often useful to sort a pair RDD by its keys (for example, in the word count program you'll see later in the chapter). In this exercise, you'll sort the pair RDD Rdd_Reduced that you created in the previous exercise by key in descending order and print the final output.

Remember, you already have a SparkContext sc and Rdd_Reduced available in your workspace.

This exercise is part of the course

Big Data Fundamentals with PySpark

Exercise instructions

  • Sort the Rdd_Reduced RDD by key in descending order.
  • Collect the contents and iterate to print the output.
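The two steps above can be pictured without a Spark cluster. The following is a minimal plain-Python sketch of the behaviour, not PySpark code: the list `reduced_pairs` and the helper `sort_by_key_descending` are hypothetical stand-ins invented for illustration, and the sample values are assumptions rather than the actual contents of Rdd_Reduced. In PySpark itself, the corresponding RDD methods are `sortByKey(ascending=False)` and `collect()`.

```python
# Hypothetical (key, count) pairs standing in for the contents of Rdd_Reduced;
# the real values come from the previous exercise and are not shown here.
reduced_pairs = [(3, 10), (1, 2), (4, 5)]


def sort_by_key_descending(pairs):
    """Plain-Python analogue of sorting a pair RDD by key, largest key first."""
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# An ordinary Python list plays the role of the collected result on the driver.
sorted_pairs = sort_by_key_descending(reduced_pairs)

# Build the same message format used in the exercise, one line per pair.
lines = ["Key {} has {} Counts".format(key, count) for key, count in sorted_pairs]
for line in lines:
    print(line)
```

Each element is a two-item tuple, so index 0 holds the key and index 1 holds the aggregated value. Only the key decides the ordering; the values simply travel along with their keys.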

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Sort the reduced RDD by key in descending order
Rdd_Reduced_Sort = Rdd_Reduced.____(ascending=False)

# Iterate over the result and retrieve all the elements of the RDD
for num in Rdd_Reduced_Sort.____():
  print("Key {} has {} Counts".format(____, num[1]))