SortByKey and Collect
Sorting a pair RDD by key is often useful, for example in the word count program that you'll see later in the chapter. In this exercise, you'll sort the pair RDD Rdd_Reduced that you created in the previous exercise into descending order by key and print the final output.
Remember, you already have a SparkContext sc and Rdd_Reduced available in your workspace.
This exercise is part of the course Big Data Fundamentals with PySpark.
Exercise instructions
- Sort the Rdd_Reduced RDD using the key in descending order.
- Collect the contents and iterate to print the output.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Sort the reduced RDD with the key by descending order
Rdd_Reduced_Sort = Rdd_Reduced.____(ascending=False)
# Iterate over the result and retrieve all the elements of the RDD
for num in Rdd_Reduced_Sort.____():
    print("Key {} has {} Counts".format(____, num[1]))
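
For intuition about what the sorted output looks like, the same sort-then-iterate pattern can be sketched in plain Python without Spark. This is only a local analogue, not the exercise solution: the sample pairs are hypothetical stand-ins for the contents of Rdd_Reduced, and Python's built-in sorted() takes the place of the RDD sort.

```python
# Hypothetical (key, value) pairs standing in for the contents of Rdd_Reduced.
pairs = [(3, 10), (1, 2), (4, 5)]

# Local analogue of sorting a pair RDD by key in descending order:
# sort on the first element of each tuple, largest key first.
pairs_sorted = sorted(pairs, key=lambda kv: kv[0], reverse=True)

# Format one line per pair, mirroring the exercise's print statement.
lines = ["Key {} has {} Counts".format(key, count) for key, count in pairs_sorted]

for line in lines:
    print(line)
```

On an actual RDD, the sort runs across partitions and only the collected result is a local Python list, so collecting is best reserved for results small enough to fit in the driver's memory.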