Exercise

Integers in PySpark UDFs

This exercise covers user-defined functions (UDFs), which let you apply your own Python function to the values in a DataFrame column. As you work through it, think about which steps of a data cleaning workflow a UDF could replace.

Remember, there's already a SparkSession called spark in your workspace!

Instructions

  • Register the function age_category as a UDF called age_category_udf.
  • Add a new column to the DataFrame df called "category" that applies the UDF to categorize people based on their age. The argument for age_category_udf() is provided for you.
  • Show the resulting DataFrame.
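
The three steps above can be sketched as follows. This is a minimal illustration, not the exercise's official solution: the body of age_category, its age thresholds and labels, the helper name add_category, and the "age" column name are all assumptions, since the exercise supplies its own function and DataFrame. The string return type is likewise assumed from the category labels used here.

```python
def age_category(age):
    """Hypothetical stand-in for the exercise's age_category function.

    The thresholds and labels below are assumptions for illustration only.
    """
    if age is None:
        return None  # Spark passes nulls to Python UDFs as None
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "senior"


def add_category(df):
    """Return df with a "category" column computed by the UDF.

    add_category is a hypothetical helper; the "age" column name is assumed.
    """
    # Imported here so age_category can be tried without a Spark install.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    # Register the plain Python function as a UDF that returns strings.
    age_category_udf = udf(age_category, StringType())

    # Apply the UDF to the "age" column and store the result in "category".
    return df.withColumn("category", age_category_udf(df["age"]))
```

In the exercise workspace, where df already exists, calling add_category(df).show() would display the DataFrame with the new column.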