Integers in PySpark UDFs

This exercise covers user-defined functions (UDFs) and shows how to create and register a function in PySpark. As you work through it, think about which steps of a data cleaning workflow a UDF could replace.

Remember, there's already a SparkSession called spark in your workspace!

This exercise is part of the course

Introduction to PySpark

Exercise instructions

  • Register the function age_category as a UDF called age_category_udf.
  • Add a new column called "category" to the DataFrame age_category_df that applies the UDF to categorize people based on their age, and store the result in age_category_df_2. The argument for age_category_udf() is provided for you.
  • Show the resulting DataFrame.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Register the function age_category as a UDF
age_category_udf = ____(age_category, StringType())

# Apply your udf to the DataFrame
age_category_df_2 = age_category_df.withColumn("category", ____(age_category_df["age"]))

# Show df
age_category_df_2.____
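
The steps in the sample code can be sketched end to end as follows. This is a minimal illustration, not the exercise's reference solution: the body of age_category and its age thresholds are assumptions (the exercise does not show them), and add_category_column is a hypothetical helper name introduced here. The Spark calls used are udf from pyspark.sql.functions, StringType from pyspark.sql.types, and DataFrame.withColumn.

```python
def age_category(age):
    """Hypothetical categorizer; the exercise's real thresholds are not shown."""
    if age is None:
        # Spark passes nulls to Python UDFs as None, so return None for them
        return None
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "senior"


def add_category_column(df):
    """Return df with a "category" column derived from its integer "age" column."""
    # Imported inside the function so age_category stays usable without Spark
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    # Wrap the plain Python function as a UDF whose return type is a string
    age_category_udf = udf(age_category, StringType())

    # Apply the UDF row by row to the "age" column
    return df.withColumn("category", age_category_udf(df["age"]))


# Possible usage with the workspace SparkSession (sample rows are made up):
# age_category_df = spark.createDataFrame([("Ana", 12), ("Ben", 40)], ["name", "age"])
# add_category_column(age_category_df).show()
```

Keeping the categorization logic in an ordinary Python function means it can be checked on plain integers before it is wrapped as a UDF and run on a DataFrame.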