Movie revenue prediction with CatBoost

Let's finish up this chapter on boosting by returning to the movies dataset! In this exercise, you'll build a CatBoostRegressor to predict the log-revenue. Remember that our best model so far is the AdaBoost model, with an RMSE of 5.15.

Will CatBoost beat AdaBoost? We'll try to use a similar set of parameters to have a fair comparison.

Recall that these are the features we have used so far: 'budget', 'popularity', 'runtime', 'vote_average', and 'vote_count'. The catboost library has been imported for you as cb.

Note: be careful to use a regressor and not a classifier, or your session might expire!

This exercise is part of the course

Ensemble Methods in Python

Exercise instructions

  • Build and fit a CatBoostRegressor using 100 estimators, a learning rate of 0.1, and a max depth of 3.
  • Calculate the predictions for the test set and print the RMSE.
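
As a reminder of what the RMSE in the second step measures, here is a minimal sketch that computes it by hand with NumPy. The arrays y_true and y_pred are hypothetical values chosen only to keep the arithmetic easy to follow; they are not taken from the movies dataset. The expression np.sqrt(mean_squared_error(y_test, pred)) used in the exercise should give the same quantity for your model's predictions.

```python
import numpy as np

# Hypothetical targets and predictions (not from the movies dataset)
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 11.0])

# Residuals: [1, 0, -1, -2]
errors = y_true - y_pred

# Mean squared error: (1 + 0 + 1 + 4) / 4 = 1.5
mse = np.mean(errors ** 2)

# Root mean squared error: sqrt(1.5), roughly 1.225
rmse = np.sqrt(mse)
print('RMSE: {:.3f}'.format(rmse))
```

Because the RMSE is expressed in the same units as the target (here, log-revenue), a lower value means the predictions are closer to the observed values on average.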

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import catboost as cb
import numpy as np
from sklearn.metrics import mean_squared_error

# Build and fit a CatBoost regressor
reg_cat = ____.____(____, ____, ____, random_state=500)
____

# Calculate the predictions on the test set
pred = ____

# Evaluate the performance using the RMSE
rmse_cat = np.sqrt(mean_squared_error(y_test, pred))
print('RMSE (CatBoost): {:.3f}'.format(rmse_cat))