Ratios
Ratios are all around us, from miles per gallon to click-through rate. In this exercise, we'll create some ratios by dividing pairs of columns.
This exercise is part of the course Feature Engineering with PySpark.
Exercise instructions
- Create a new variable ASSESSED_TO_LIST by dividing ASSESSEDVALUATION by LISTPRICE to help us understand if having a high or low assessment value impacts our price.
- Create another new variable TAX_TO_LIST to help us understand the approximate tax rate by dividing TAXES by LISTPRICE.
- Lastly, create another variable BED_TO_BATHS to help us know how crowded our bathrooms might be by dividing BEDROOMS by BATHSTOTAL.
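To make the arithmetic behind the three instructions concrete, here is a minimal plain-Python sketch for a single listing. The listing values and the ratio helper are hypothetical and exist only for illustration. The guard against a missing or zero denominator is an assumption about how you might want to treat such rows, not something the exercise asks for.

```python
from typing import Optional


def ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


# Hypothetical listing; the numbers are made up for illustration.
listing = {
    "ASSESSEDVALUATION": 180000.0,
    "LISTPRICE": 240000.0,
    "TAXES": 3000.0,
    "BEDROOMS": 3,
    "BATHSTOTAL": 2,
}

ratios = {
    # Assessed value as a fraction of the list price.
    "ASSESSED_TO_LIST": ratio(listing["ASSESSEDVALUATION"], listing["LISTPRICE"]),
    # Approximate tax rate relative to the list price.
    "TAX_TO_LIST": ratio(listing["TAXES"], listing["LISTPRICE"]),
    # Bedrooms per bathroom: higher means more crowded bathrooms.
    "BED_TO_BATHS": ratio(listing["BEDROOMS"], listing["BATHSTOTAL"]),
}

print(ratios)
```

For this made-up listing, the assessed value is three quarters of the list price, taxes are 1.25% of the list price, and there are 1.5 bedrooms per bathroom.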
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# ASSESSED_TO_LIST
df = ____
df[['ASSESSEDVALUATION', 'LISTPRICE', 'ASSESSED_TO_LIST']].show(5)
# TAX_TO_LIST
df = ____
df[['TAX_TO_LIST', 'TAXES', 'LISTPRICE']].show(5)
# BED_TO_BATHS
df = ____
df[['BED_TO_BATHS', 'BEDROOMS', 'BATHSTOTAL']].show(5)