1. Let's mlxtend it!
In this lesson, you'll be introduced to the Mlxtend library, which allows you to easily build stacking ensembles.
2. Mlxtend
Mlxtend stands for Machine Learning Extensions.
It is a third-party Python library which contains many utilities and tools for machine learning and Data Science tasks, including feature selection, ensemble methods, visualization, and model evaluation.
It has an intuitive API, and works well with scikit-learn estimators, which is very convenient for our purpose.
3. Stacking implementation from mlxtend
Mlxtend uses a slightly different stacking architecture from the one we've seen previously. As in the architecture we already know, the first-layer estimators are trained on the complete feature set.
However, it uses only the predictions as the input features for the second-layer meta-estimator, which makes it lighter and faster for both training and predicting.
An additional important property of this implementation is that the second-layer meta-estimator can be trained using either the predicted class labels or the class probabilities as input features. Using the class probabilities may allow you to solve more complex problems.
4. StackingClassifier with mlxtend
Mlxtend's stacking estimators are similar to the scikit-learn ensemble estimators you've seen throughout the course.
First, you need to import the StackingClassifier class from the mlxtend dot classifier module.
Then, you instantiate the first-layer classifiers you want to use without training them, as the Stacking classifier will take care of that.
In the same way, you must instantiate the second-layer meta classifier of your choice.
With this, you are ready to build the Stacking classifier.
The first parameter it expects is classifiers, which is a list of the first-layer classifiers.
The second parameter is meta_classifier, which is the meta-classifier you instantiated previously.
Another useful parameter is use_probas, which specifies whether to use predicted class probabilities instead of class labels as the input features for the meta-classifier. It is False by default.
An additional parameter you may be interested in is use_features_in_secondary. When set to True, the meta-classifier is trained on both the original input features and the first-layer predictions. By default it is False.
After instantiating the Stacking classifier, you can call the fit and predict methods just like you would for a scikit-learn estimator.
5. StackingRegressor with mlxtend
You can similarly build Stacking regressors with Mlxtend.
First, you import StackingRegressor from the mlxtend dot regressor module.
Then, instantiate the first-layer regressors to be used, as well as the meta regressor.
Now it's time to build the Stacking regressor.
The first parameter in this case is called regressors, a list of the first-layer regressors.
The second parameter, as you may have guessed, is the meta_regressor. Here you pass the reg_meta object instantiated before.
There is no use_probas parameter, as we're dealing with a regression problem.
Nevertheless, the use_features_in_secondary parameter is available to include both the original input features and the first-layer predictions.
Once the Stacking regressor is instantiated, you can fit it to the training set and use it to make predictions.
6. Let's mlxtend it!
Let's now practice applying mlxtend!