Integers in PySpark UDFs

This exercise covers user-defined functions (UDFs), which let you apply your own Python functions to DataFrame columns in PySpark. As you work through it, think about which manual steps a UDF would replace in a data cleaning workflow.

Remember, there's already a SparkSession called spark in your workspace!

This exercise is part of the course Introduction to PySpark.

Exercise instructions

  • Register the function age_category as a UDF called age_category_udf.
  • Add a new column called "category" to the DataFrame age_category_df that applies the UDF to categorize people based on their age. The argument for age_category_udf() is provided for you.
  • Show the resulting DataFrame.
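
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the course's solution: the body of age_category and its thresholds are assumptions (the exercise does not show them), and add_category_column is a hypothetical helper name introduced only for this sketch.

```python
def age_category(age):
    """Hypothetical categorizer; the thresholds are assumptions, not from the course."""
    if age is None:
        # Spark passes SQL NULL to a Python UDF as None.
        return None
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "senior"


def add_category_column(df):
    """Return df with a "category" column derived from its integer "age" column."""
    # Imported here so age_category stays testable without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    # Wrap the plain Python function; StringType() declares the UDF's return type.
    age_category_udf = udf(age_category, StringType())

    # Apply the UDF row by row to the "age" column.
    return df.withColumn("category", age_category_udf(df["age"]))
```

With the workspace's spark session, a call such as add_category_column(spark.createDataFrame([(15,), (40,), (70,)], ["age"])).show() would print the DataFrame with the new column. Declaring the return type matters because Spark cannot infer it from the Python function.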

Hands-on interactive exercise

Complete the sample code below to finish this exercise.

# Register the function age_category as a UDF
age_category_udf = ____(age_category, StringType())

# Apply your udf to the DataFrame
age_category_df_2 = age_category_df.withColumn("category", ____(age_category_df["age"]))

# Show df
age_category_df_2.____