SortByKey and Collect
It is often useful to sort a pair RDD by key (for example, in a word count, which you'll see later in the chapter). In this exercise, you'll sort the pair RDD Rdd_Reduced that you created in the previous exercise by key in descending order and print the final output.
Remember, you already have a SparkContext sc and Rdd_Reduced available in your workspace.
This exercise is part of the course
Big Data Fundamentals with PySpark
Exercise instructions
- Sort the Rdd_Reduced RDD using the key in descending order.
- Collect the contents and iterate to print the output.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Sort the reduced RDD with the key by descending order
Rdd_Reduced_Sort = Rdd_Reduced.____(ascending=False)
# Iterate over the result and retrieve all the elements of the RDD
for num in Rdd_Reduced_Sort.____():
    print("Key {} has {} Counts".format(____, num[1]))
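
To build intuition for what the sorted output should look like, here is a minimal plain-Python sketch of the same idea. It does not use Spark: the list `pairs` is hypothetical sample data standing in for Rdd_Reduced, and `sorted` with `reverse=True` only mimics ordering key-value pairs by key in descending order.

```python
# Hypothetical (key, value) pairs standing in for Rdd_Reduced;
# these values are illustrative and not taken from the exercise.
pairs = [(3, 10), (1, 2), (4, 5), (2, 12)]

# Order the pairs by key (the first tuple element), largest key first.
pairs_sorted = sorted(pairs, key=lambda kv: kv[0], reverse=True)

# Iterate over the ordered pairs and format each one,
# reading the key from index 0 and the value from index 1.
lines = ["Key {} has {} Counts".format(kv[0], kv[1]) for kv in pairs_sorted]
for line in lines:
    print(line)
```

On a real RDD, collecting the contents returns every element to the driver as a Python list, so it is best reserved for results small enough to fit in driver memory.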