Applying a UDF to vector data
A dataframe called df is available. It has a column named output of type vector. Its first five rows are shown in the console.
A UDF get_first_udf is available that selects the first element of a vector column.
This exercise is part of the course
Introduction to Spark SQL in Python
Exercise instructions
- Create a new dataframe called df_new by adding a new column to df. Call the new column label.
- Show the first five rows of df_new.
Hands-on interactive exercise
Try this exercise by completing the sample code below.
# Add label by applying the get_first_udf to output column
df_new = df.____('____', ____('____'))
# Show the first five rows
df_new.____
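
For context, the sketch below shows one way a UDF like get_first_udf might be defined and applied outside the exercise environment. It is an illustration under assumptions, not the course's actual implementation: the helper first_element, the function demo, the sample rows, and the local SparkSession settings are all hypothetical. The Spark imports are kept inside demo so the plain Python helper can be checked without Spark installed.

```python
def first_element(vector):
    """Return the first element of a vector-like value as a plain Python float."""
    # Indexing works for pyspark.ml.linalg vectors and for ordinary lists.
    # float() matters: a Spark UDF declared as DoubleType should return a
    # Python float, not a NumPy scalar.
    return float(vector[0])


def demo():
    # Hypothetical demo; requires a working PySpark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import DoubleType
    from pyspark.ml.linalg import Vectors

    spark = (
        SparkSession.builder.master("local[1]").appName("udf-sketch").getOrCreate()
    )

    # Stand-in for the exercise's df: one column named output, of type vector.
    df = spark.createDataFrame(
        [(Vectors.dense([1.0, 0.5]),), (Vectors.sparse(3, [1], [2.0]),)],
        ["output"],
    )

    # Wrap the Python helper as a UDF that returns a double.
    get_first_udf = udf(first_element, DoubleType())

    # Add the new column by applying the UDF to the vector column.
    df_new = df.withColumn("label", get_first_udf("output"))
    df_new.show(5)

    spark.stop()
```

Calling demo() on a machine with PySpark available builds the small dataframe, adds the label column, and prints the rows. The helper alone can be tried on a list, for example first_element([3.0, 1.0]).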