
Averaging

1. Averaging

In this lesson, you'll learn about another popular ensemble method: averaging.

2. Counting Jelly Beans

Say you're participating in a game of guessing how many jelly beans are in a jar. Each participant guesses the number of jelly beans, and whoever comes closest to the actual count wins the jar or another prize. Since you're not allowed to open the jar and count them individually, your best bet is to estimate the amount. But how can you provide a good estimate? The easiest option would be to pick a reasonable number at random. A more intelligent approach would be to approximate the volume of the jar, and you could take even more advanced approaches. But did you know that the average of all the individual guesses tends to be close to the actual value, and can be as good as or better than any single guess? It sounds counterintuitive, right? But it's the same principle we saw in the previous lesson: the wisdom of the crowd. The averaging ensemble takes advantage of this principle.

3. Averaging (Soft Voting)

This ensembling technique is known as averaging, or "soft" voting, and it can be applied to both classification and regression. In this technique, the combined prediction is the mean of the individual predictions. For regression, we average the predicted values; for classification, we average the predicted probabilities. Because the mean, unlike the mode, has no ambiguous cases, we can use any number of estimators, as long as we have at least two.
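As a quick illustration of the arithmetic (not the scikit-learn API, which comes next), here is a minimal sketch of soft voting done by hand, assuming three hypothetical classifiers that each output class probabilities for a binary problem:

import numpy as np

# Hypothetical predicted probabilities for classes [0, 1]
# from three individual classifiers on a single sample.
p1 = np.array([0.4, 0.6])
p2 = np.array([0.3, 0.7])
p3 = np.array([0.7, 0.3])

# Soft voting: average the probabilities, then pick the
# class with the highest mean probability.
mean_proba = np.mean([p1, p2, p3], axis=0)   # [0.467, 0.533]
predicted_class = np.argmax(mean_proba)      # 1

print(mean_proba, predicted_class)

For regression, the same idea applies directly to the predicted values: the ensemble's output is simply their (possibly weighted) mean.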

4. Averaging ensemble with scikit-learn

To build an averaging classifier, we use the same class as before: VotingClassifier. The main difference is that we specify an additional parameter, voting, with the value "soft"; the default value is "hard". We can also pass the optional parameter weights, which assigns a weight to each estimator. If specified, the combined prediction is a weighted average of the individual ones; otherwise, the weights are uniform. In a similar way, we can build an averaging regressor. For this purpose, we use the VotingRegressor class from the sklearn.ensemble module. The first parameter is again a list of (name, estimator) tuples, but instead of classifiers we use regressors.
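A minimal sketch of both constructors, using placeholder estimators chosen here just for illustration:

from sklearn.ensemble import VotingClassifier, VotingRegressor
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Averaging (soft voting) classifier: the combined prediction is
# the (optionally weighted) mean of the predicted probabilities.
clf_voting = VotingClassifier(
    estimators=[
        ('lr', LogisticRegression()),
        ('dt', DecisionTreeClassifier()),
    ],
    voting='soft',       # default is 'hard'
    weights=[1, 2],      # optional; uniform if omitted
)

# Averaging regressor: the combined prediction is the mean of
# the individual predicted values.
reg_voting = VotingRegressor(
    estimators=[
        ('lr', LinearRegression()),
        ('dt', DecisionTreeRegressor()),
    ],
)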

5. scikit-learn example

Let's see an averaging ensemble in action. Here we have a 5-nearest neighbors classifier, a decision tree, and a logistic regression. We create an averaging classifier by passing the list of estimators and setting the voting parameter to "soft". In addition, assuming we know that the decision tree has the best individual performance, we give it a higher weight. Ideally, the weights should be tuned while training the model, for example using grid search cross-validation. A sketch of this setup follows.
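The slide's code isn't reproduced in this transcript, so here is a sketch of what it likely looks like, with illustrative weights and assuming X_train, y_train, and X_test already exist:

from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Individual estimators: 5-nearest neighbors, decision tree,
# and logistic regression.
clf_knn = KNeighborsClassifier(n_neighbors=5)
clf_dt = DecisionTreeClassifier()
clf_lr = LogisticRegression()

# Averaging ensemble: soft voting with a higher weight on the
# decision tree, which we assume performs best individually.
clf_avg = VotingClassifier(
    estimators=[('knn', clf_knn), ('dt', clf_dt), ('lr', clf_lr)],
    voting='soft',
    weights=[1, 2, 1],
)

# Fit and predict like any other scikit-learn estimator.
clf_avg.fit(X_train, y_train)
y_pred = clf_avg.predict(X_test)

Because weights is a regular estimator parameter, it can be tuned like any other hyperparameter, for example by searching over candidate weight combinations with GridSearchCV.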

6. Game of Thrones deaths

For the exercises that follow, you'll be using a dataset consisting of characters from the popular series Game of Thrones. Your goal is to predict whether a character is alive or not. For that purpose, you'll use features like age, gender, the books in which the character appears, their popularity, and whether or not the character's relatives are alive.

7. Time to practice!

The TV show may be over, but there are still two books to go. It's your turn now to put averaging into practice and predict the deaths of characters from Game of Thrones!