Ratios
Ratios are all around us, from miles per gallon to click-through rate. In this exercise, we'll create some ratios by dividing one column by another.
This exercise is part of the course
Feature Engineering with PySpark
Exercise instructions
- Create a new variable, ASSESSED_TO_LIST, by dividing ASSESSEDVALUATION by LISTPRICE to help us understand whether a high or low assessed value impacts our price.
- Create another new variable, TAX_TO_LIST, to help us understand the approximate tax rate by dividing TAXES by LISTPRICE.
- Lastly, create another variable, BED_TO_BATHS, to help us know how crowded our bathrooms might be by dividing BEDROOMS by BATHSTOTAL.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# ASSESSED_TO_LIST
df = ____
df[['ASSESSEDVALUATION', 'LISTPRICE', 'ASSESSED_TO_LIST']].show(5)
# TAX_TO_LIST
df = ____
df[['TAX_TO_LIST', 'TAXES', 'LISTPRICE']].show(5)
# BED_TO_BATHS
df = ____
df[['BED_TO_BATHS', 'BEDROOMS', 'BATHSTOTAL']].show(5)
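One possible way to fill in the blanks is sketched below. It assumes df is a PySpark DataFrame that already contains the five source columns named in the instructions. The helper name add_ratio_columns is hypothetical and is not part of the course; it only groups the three steps so they can be read together.

```python
def add_ratio_columns(df):
    """Return df with the three ratio columns from the instructions added.

    Assumes df is a PySpark DataFrame with the columns ASSESSEDVALUATION,
    LISTPRICE, TAXES, BEDROOMS and BATHSTOTAL.
    """
    # ASSESSED_TO_LIST: assessed valuation relative to the list price
    df = df.withColumn('ASSESSED_TO_LIST',
                       df['ASSESSEDVALUATION'] / df['LISTPRICE'])
    # TAX_TO_LIST: approximate tax rate
    df = df.withColumn('TAX_TO_LIST', df['TAXES'] / df['LISTPRICE'])
    # BED_TO_BATHS: bedrooms per bathroom
    df = df.withColumn('BED_TO_BATHS', df['BEDROOMS'] / df['BATHSTOTAL'])
    return df
```

In the template above, each withColumn call would take the place of one blank, and the .show(5) line after it displays the new column next to the columns it was built from.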