Arithmetical features
To practice creating new features, you will work with a subsample from the Kaggle competition "House Prices: Advanced Regression Techniques". The goal of this competition is to predict the price of a house based on its properties. It is a regression problem with Root Mean Squared Error (RMSE) as the evaluation metric.
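For reference, RMSE is the square root of the mean squared difference between predicted and actual values. Here is a minimal sketch of that calculation; the prices are made up for illustration and do not come from the competition data.

```python
import math

# Made-up actual and predicted prices, for illustration only
actual = [200000, 150000, 320000]
predicted = [210000, 140000, 300000]

# Squared difference for each house
squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]

# Mean of the squared errors, then the square root
rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
print("RMSE:", round(rmse, 2))
```

Because the errors are squared before averaging, one large miss raises the score more than several small ones.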
Your goal is to create new features and determine whether they improve your validation score. To get the validation score from 5-fold cross-validation, you're given the get_kfold_rmse() function. Use it with the train DataFrame, available in your workspace, as an argument.
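The body of get_kfold_rmse() is not shown in this exercise. As a rough sketch of what such a helper might do, the function below averages the RMSE over 5 folds. Everything in it is an assumption made for illustration: the name kfold_rmse_sketch, the target column SalePrice, the linear regression model, the random seed, and the synthetic data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold


def kfold_rmse_sketch(df, target="SalePrice", n_splits=5, seed=123):
    """Hypothetical helper: mean RMSE over n_splits cross-validation folds."""
    # Assumes every non-target column is numeric
    features = [col for col in df.columns if col != target]
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_scores = []
    for train_idx, valid_idx in kf.split(df):
        fold_train, fold_valid = df.iloc[train_idx], df.iloc[valid_idx]
        # Fit on the training part of the fold (the model choice is an assumption)
        model = LinearRegression()
        model.fit(fold_train[features], fold_train[target])
        # Score on the held-out part of the fold
        pred = model.predict(fold_valid[features])
        mse = mean_squared_error(fold_valid[target], pred)
        fold_scores.append(np.sqrt(mse))
    return float(np.mean(fold_scores))


# Synthetic data, made up for illustration only
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "AreaA": rng.uniform(500, 1500, 100),
    "AreaB": rng.uniform(0, 1000, 100),
})
toy["SalePrice"] = 100 * (toy["AreaA"] + toy["AreaB"]) + rng.normal(0, 5000, 100)

sketch_rmse = kfold_rmse_sketch(toy)
print("Sketch RMSE:", round(sketch_rmse, 2))
```

Scoring each fold on rows the model did not see during fitting is what makes the averaged RMSE a validation score rather than a training score.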
This exercise is part of the course Winning a Kaggle Competition in Python.
Have a go at this exercise by completing this sample code.
# Look at the initial RMSE
print('RMSE before feature engineering:', get_kfold_rmse(train))
# Find the total area of the house
train['TotalArea'] = ____[____] + ____[____] + ____[____]
# Look at the updated RMSE
print('RMSE with total area:', get_kfold_rmse(train))
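The pattern used in the template is element-wise arithmetic on DataFrame columns. The sketch below shows it on a small toy DataFrame. The column names FloorA, FloorB and Cellar are hypothetical and are not the columns of train; the values are made up.

```python
import pandas as pd

# Toy data with hypothetical column names, for illustration only
toy = pd.DataFrame({
    "FloorA": [800, 1200, 950],
    "FloorB": [0, 600, 400],
    "Cellar": [400, 1000, 0],
})

# Element-wise sum of three columns gives one new feature per row
toy["TotalToyArea"] = toy["FloorA"] + toy["FloorB"] + toy["Cellar"]

# Other arithmetic works the same way, for example a ratio of two columns
toy["FloorAShare"] = toy["FloorA"] / toy["TotalToyArea"]

print(toy)
```

Whether a feature like this helps is an empirical question, which is why the exercise compares the RMSE before and after adding it.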