1
Combining Multiple Models
Do you struggle to determine which of the models you built is the best for your problem? You should give up on that, and use them all instead! In this chapter, you'll learn how to combine multiple models into one using "Voting" and "Averaging". You'll use these to predict the ratings of apps on the Google Play Store, whether or not a Pokémon is legendary, and which characters are going to die in Game of Thrones!
2
Bagging
Bagging is the ensemble method behind powerful machine learning algorithms such as random forests. In this chapter you'll learn the theory behind this technique and build your own bagging models using scikit-learn.
3
Boosting
Boosting is a class of ensemble learning algorithms that includes award-winning models such as AdaBoost. In this chapter, you'll learn about this award-winning model, and use it to predict the revenue of award-winning movies! You'll also learn about gradient boosting algorithms such as CatBoost and XGBoost.
4
Stacking
Get ready to see how things stack up! In this final chapter you'll learn about the stacking ensemble method. You'll learn how to implement it using scikit-learn as well as with the mlxtend library! You'll apply stacking to predict the edibility of North American mushrooms, and revisit the ratings of Google apps with this more advanced approach.

Boosting for predicted revenue

The initial model got an RMSE of around 7.34. Let's see if we can improve this using an iteration of boosting.

You'll build another linear regression, but this time the target values are the errors from the base model, calculated as follows:

y_train_error = pred_train - y_train
y_test_error = pred_test - y_test

For this model you'll use the 'popularity' feature instead, hoping that it captures patterns the 'budget' feature alone missed. It is available to you as X_train_pop and X_test_pop. As in the previous exercise, the input features have been standardized for you.

Fit a linear regression model to the previous errors using X_train_pop and y_train_error.
Calculate the predicted errors on the test set, X_test_pop, and store them as pred_error.
Calculate the RMSE, like in the previous exercise, using y_test_error and pred_error.
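
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: the data is synthetic, the "base model" is a stand-in that simply predicts the training mean, and the names reg_error, rmse_error, pred_boosted, rmse_base and rmse_boosted are assumptions introduced here. Only X_train_pop, X_test_pop, y_train_error, y_test_error and pred_error come from the exercise text.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic stand-ins for the exercise data (assumption: NOT the real movie dataset).
n_train, n_test = 200, 50
X_train_pop = rng.normal(size=(n_train, 1))  # standardized 'popularity' feature
X_test_pop = rng.normal(size=(n_test, 1))
y_train = 3.0 * X_train_pop[:, 0] + rng.normal(scale=0.5, size=n_train)
y_test = 3.0 * X_test_pop[:, 0] + rng.normal(scale=0.5, size=n_test)

# Stand-in base model (assumption): predict the training mean everywhere.
pred_train = np.full(n_train, y_train.mean())
pred_test = np.full(n_test, y_train.mean())

# Errors of the base model, using the sign convention from the exercise.
y_train_error = pred_train - y_train
y_test_error = pred_test - y_test

# Step 1: fit a linear regression to the previous errors.
reg_error = LinearRegression()
reg_error.fit(X_train_pop, y_train_error)

# Step 2: predict the errors on the test set.
pred_error = reg_error.predict(X_test_pop)

# Step 3: RMSE between the true and predicted errors.
rmse_error = np.sqrt(mean_squared_error(y_test_error, pred_error))
print("RMSE on errors:", rmse_error)

# Because error = prediction - target, subtracting the predicted error
# from the base prediction gives the corrected (boosted) prediction.
pred_boosted = pred_test - pred_error
rmse_base = np.sqrt(mean_squared_error(y_test, pred_test))
rmse_boosted = np.sqrt(mean_squared_error(y_test, pred_boosted))
print("Base RMSE:", rmse_base, "Boosted RMSE:", rmse_boosted)
```

With this sign convention, the RMSE computed on the errors is the same number as the RMSE of the corrected prediction against y_test, so it can be compared directly with the base model's RMSE.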