Exercise

Binarizing Day of Week

In a previous video, we saw that it was very unlikely for a home to list on the weekend. Let's create a new field that says if the house is listed for sale on a weekday or not. In this example there is a field called List_Day_of_Week that has Monday is labeled 1.0 and Sunday is 7.0. Let's convert this to a binary field with weekday being 0 and weekend being 1. We can use the pyspark feature transformer Binarizer to do this.

Instructions

100 XP
  • Import the feature transformer Binarizer from pyspark and the ml.feature module.
  • Create the transformer using Binarizer() with the threshold for setting the value to 1 as anything after Friday, 5.0, then set the input column as List_Day_of_Week and output column as Listed_On_Weekend.
  • Apply the binarizer transformation on df using transform().
  • Verify the transformation worked correctly by selecting the List_Day_of_Week and Listed_On_Weekend columns with show().