Binarizing Day of Week

In a previous video, we saw that it was very unlikely for a home to be listed on the weekend. Let's create a new field that says whether the house was listed for sale on a weekend or not. In this example there is a field called List_Day_of_Week in which Monday is labeled 1.0 and Sunday is labeled 7.0. Let's convert this to a binary field with weekday being 0 and weekend being 1. We can use the PySpark feature transformer Binarizer to do this.
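
To make the thresholding rule concrete, here is a minimal plain-Python sketch of the logic, not the PySpark API itself. It assumes the documented Binarizer behavior: values strictly greater than the threshold become 1.0, and values equal to or below it become 0.0. The helper name binarize and the list day_codes are hypothetical, introduced only for illustration.

```python
# Plain-Python sketch of the thresholding rule (not the PySpark API itself).
# `binarize` and `day_codes` are hypothetical names used for illustration.
def binarize(value: float, threshold: float = 5.0) -> float:
    """Return 1.0 when value is strictly greater than threshold, else 0.0."""
    return 1.0 if value > threshold else 0.0


# Day codes as described above: Monday = 1.0 through Sunday = 7.0
day_codes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
listed_on_weekend = [binarize(day) for day in day_codes]

# Print each day code next to its weekend flag
for day, flag in zip(day_codes, listed_on_weekend):
    print(day, flag)
```

Because the comparison is strict, a threshold of 5.0 leaves Friday (5.0) at 0 and marks only Saturday (6.0) and Sunday (7.0) as 1.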

This exercise is part of the course

Feature Engineering with PySpark

Exercise instructions

  • Import the feature transformer Binarizer from the pyspark.ml.feature module.
  • Create the transformer using Binarizer() with a threshold of 5.0 (Friday), so that any later day is set to 1; then set the input column to List_Day_of_Week and the output column to Listed_On_Weekend.
  • Apply the binarizer transformation on df using transform().
  • Verify the transformation worked correctly by selecting the List_Day_of_Week and Listed_On_Weekend columns with show().

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Import transformer
from pyspark.____.____ import ____

# Create the transformer
binarizer = ____(threshold=____, inputCol=____, outputCol=____)

# Apply the transformation to df
df = binarizer.____(____)

# Verify transformation
df[[____, ____]].____()