
Creating a UDF for vector data

A DataFrame df is available, with a column output of type vector. Its first five rows are shown in the console.

This exercise is part of the course Introduction to Spark SQL in Python.

Exercise instructions

  • Create a UDF called first_udf that selects the first element of a vector column. Return a default value of 0.0 for any item that is not a vector containing at least one element, and cast the output to a float.
  • Use the select operation on df to apply first_udf to the output column.
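
The default-value rule in the instructions above can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions: FakeVector is a hypothetical stand-in that mimics nothing but the toArray method of a Spark ML vector, and first_or_default is an illustrative name that does not appear in the exercise. The sketch reads the first element through toArray(), which is a different route from the indices attribute used in the exercise scaffold.

```python
class FakeVector:
    """Hypothetical stand-in for a Spark ML vector; it only mimics toArray()."""

    def __init__(self, values):
        self._values = list(values)

    def toArray(self):
        return self._values


def first_or_default(x, default=0.0):
    """Return the first element of a vector-like object as a float.

    Anything that is None, lacks toArray(), or is empty yields the default.
    """
    if x is not None and hasattr(x, "toArray"):
        values = x.toArray()
        if len(values) > 0:
            return float(values[0])
    return default


# Try the function on a non-empty vector, an empty one, and two non-vectors.
for item in (FakeVector([3, 1, 2]), FakeVector([]), "not a vector", None):
    print(first_or_default(item))
```

In Spark, a function like this could be wrapped with udf from pyspark.sql.functions, passing FloatType() as the return type, and then applied to a column inside df.select.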

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Selects the first element of a vector column
first_udf = ____(lambda x:
            ____(x.indices[0]) 
            if (x and hasattr(x, "toArray") and x.____())
            else 0.0,
            FloatType())

# Apply first_udf to the output column
df.select(____("output").alias("result")).show(5)