
PySpark DataFrame visualization

Visualizing data graphically is essential for understanding and interpreting it. In this simple data visualization exercise, you'll first print the column names of the names_df DataFrame that you created earlier, then convert names_df to a Pandas DataFrame, and finally plot its contents as a horizontal bar plot, passing the names of the people as x and their ages as y. Note that in a horizontal bar plot the x column labels the bars along the vertical axis, while the y column sets the bar lengths.

Remember, you already have a SparkSession spark and a DataFrame names_df available in your workspace.

This exercise is part of the course Big Data Fundamentals with PySpark.

Exercise instructions

  • Print the names of the columns in names_df DataFrame.
  • Convert names_df DataFrame to df_pandas Pandas DataFrame.
  • Use the plot() method of df_pandas (which draws with matplotlib) to create a horizontal bar plot with 'Name' as x and 'Age' as y.
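
As a rough illustration of the plotting step, here is a minimal sketch on made-up data. The DataFrame df_example, its columns Label and Value, and the non-interactive Agg backend are assumptions for illustration only and are not part of the exercise; a hand-built Pandas DataFrame stands in for the result of converting a Spark DataFrame.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for a DataFrame already converted from Spark to Pandas
df_example = pd.DataFrame({"Label": ["A", "B", "C"], "Value": [3, 1, 2]})

# Column names are exposed as an attribute, not a method call
print("The column names of df_example are", list(df_example.columns))

# kind='barh' draws horizontal bars: the x column supplies the bar labels
# (shown on the vertical axis) and the y column supplies the bar lengths
ax = df_example.plot(kind="barh", x="Label", y="Value", colormap="winter_r")

# Read back what was drawn: one label and one bar per row
tick_labels = [t.get_text() for t in ax.get_yticklabels()]
bar_lengths = [p.get_width() for p in ax.patches]

# In an interactive session, plt.show() would display the figure at this point
plt.close("all")
```

The same call pattern applies to any Pandas DataFrame with one label column and one numeric column; only the column names passed as x and y change.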

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Check the column names of names_df
print("The column names of names_df are", names_df.____)

# Convert to Pandas DataFrame  
df_pandas = names_df.____()

# Create a horizontal bar plot
____.plot(kind='barh', x='____', y='____', colormap='winter_r')
plt.show()