Pandas UDFs
This exercise covers pandas UDFs so that you can practice their syntax. As you work through it, notice the differences between the PySpark UDF from the last exercise and this type of UDF.
Remember, there's already a SparkSession called spark in your workspace!
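To make the contrast with a row-at-a-time UDF concrete, here is a minimal sketch in plain pandas, with no Spark involved. It assumes that a pandas UDF's function receives a whole batch of values as a pandas Series, while a regular Python UDF is called once per value. The names add_ten_scalar, add_ten_series, and values are hypothetical and exist only for illustration.

```python
import pandas as pd


def add_ten_scalar(value):
    # Row-at-a-time style: called once for every single value
    return value + 10


def add_ten_series(column: pd.Series) -> pd.Series:
    # Vectorized style: called once for a whole batch held in a pandas Series
    return column + 10


# Hypothetical sample data standing in for one batch of a "value" column
values = pd.Series([1.0, 2.5, 4.0])

# One Python-level call per element
row_by_row = pd.Series([add_ten_scalar(v) for v in values])

# A single call that operates on the entire Series at once
vectorized = add_ten_series(values)

print(row_by_row.tolist())
print(vectorized.tolist())
```

Both approaches give the same numbers. The vectorized version makes one function call per batch rather than one per row, which is the main reason pandas UDFs tend to be faster on large DataFrames.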
This exercise is part of the course
Introduction to PySpark
Instructions
- Define the add_ten_pandas() function as a pandas UDF.
- Add a new column to the DataFrame called "10_plus" that applies the pandas UDF to the df column "value".
- Show the resulting DataFrame.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Define a pandas UDF that adds 10 to each element in a vectorized way
@____(DoubleType())
def add_ten_pandas(column):
    return column + 10

# Apply the UDF and show the result
# withColumn() returns a new DataFrame, so assign it back to df
df = df.withColumn("10_plus", ____)
df.____