Exercise

Boosting for predicted revenue

The initial model got an RMSE of around 7.34. Let's see if we can improve this using an iteration of boosting.

You'll build another linear regression, but this time the target values are the errors from the base model, calculated as follows:

y_train_error = pred_train - y_train
y_test_error = pred_test - y_test

For this model you'll also use 'popularity' as an additional feature, in the hope that it captures more informative patterns than the 'budget' feature alone. The resulting feature sets are available to you as X_train_pop and X_test_pop.

Instructions
  • Fit a linear regression model to the previous errors using X_train_pop and y_train_error.
  • Calculate the predicted errors on the test set, X_test_pop.
  • Calculate the RMSE, like in the previous exercise, using y_test_error and pred_error.
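The three steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: it assumes scikit-learn's LinearRegression and mean_squared_error, and it builds small synthetic stand-ins for X_train_pop, X_test_pop, y_train and y_test (with 'budget' in column 0 and 'popularity' in column 1) so that it runs on its own. The names reg_error, rmse_error and base_rmse are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic stand-in data (an assumption, not the exercise's dataset):
# column 0 plays the role of 'budget', column 1 the role of 'popularity'.
rng = np.random.default_rng(0)
X_train_pop = rng.uniform(0, 10, size=(200, 2))
X_test_pop = rng.uniform(0, 10, size=(50, 2))
y_train = 2.0 * X_train_pop[:, 0] + 1.5 * X_train_pop[:, 1] + rng.normal(0, 1, 200)
y_test = 2.0 * X_test_pop[:, 0] + 1.5 * X_test_pop[:, 1] + rng.normal(0, 1, 50)

# Base model: linear regression on 'budget' only.
base = LinearRegression().fit(X_train_pop[:, [0]], y_train)
pred_train = base.predict(X_train_pop[:, [0]])
pred_test = base.predict(X_test_pop[:, [0]])

# Errors of the base model, using the exercise's sign convention.
y_train_error = pred_train - y_train
y_test_error = pred_test - y_test

# Step 1: fit a linear regression to the previous errors.
reg_error = LinearRegression().fit(X_train_pop, y_train_error)

# Step 2: predict the errors on the test set.
pred_error = reg_error.predict(X_test_pop)

# Step 3: RMSE between the actual and predicted test errors.
rmse_error = np.sqrt(mean_squared_error(y_test_error, pred_error))

# For comparison: RMSE of the base model alone on the test set.
base_rmse = np.sqrt(mean_squared_error(y_test, pred_test))
print(rmse_error, base_rmse)
```

Because the errors are defined as prediction minus target, a corrected prediction under this convention would be pred_test - pred_error, and its RMSE against y_test equals rmse_error.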