Binarizing Day of Week
In a previous video, we saw that homes were very unlikely to be listed on the weekend. Let's create a new field that says whether a house was listed for sale on a weekday or on the weekend. In this example, there is a field called List_Day_of_Week in which Monday is labeled 1.0 and Sunday is labeled 7.0. Let's convert this to a binary field, with weekdays being 0 and weekend days being 1. We can use the PySpark feature transformer Binarizer to do this.
This exercise is part of the course
Feature Engineering with PySpark
Instructions
- Import the feature transformer Binarizer from pyspark and the ml.feature module.
- Create the transformer using Binarizer() with the threshold for setting the value to 1 as anything after Friday, 5.0, then set the input column as List_Day_of_Week and the output column as Listed_On_Weekend.
- Apply the binarizer transformation on df using transform().
- Verify the transformation worked correctly by selecting the List_Day_of_Week and Listed_On_Weekend columns with show().
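To see why 5.0 is the right threshold, it may help to sketch the rule in plain Python first. Binarizer maps values strictly greater than the threshold to 1.0 and all other values to 0.0. The snippet below does not use PySpark, and binarize is a hypothetical helper written only for illustration.

```python
# Plain-Python sketch of the thresholding rule; this is NOT PySpark code.
# `binarize` is a hypothetical helper, not part of any library.
def binarize(value: float, threshold: float) -> float:
    """Return 1.0 if value is strictly greater than threshold, else 0.0."""
    return 1.0 if value > threshold else 0.0


# Day numbering from the exercise: Monday = 1.0 through Sunday = 7.0.
days = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

# With a threshold of 5.0 (Friday), only Saturday and Sunday map to 1.0.
listed_on_weekend = [binarize(day, threshold=5.0) for day in days]

for day, flag in zip(days, listed_on_weekend):
    print(day, flag)
```

Because the comparison is strict, Friday (5.0) itself stays at 0.0, which is why the threshold is set to Friday's value rather than Saturday's.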
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Import transformer
from pyspark.____.____ import ____
# Create the transformer
binarizer = ____(threshold=____, inputCol=____, outputCol=____)
# Apply the transformation to df
df = binarizer.____(____)
# Verify transformation
df[[____, ____]].____()