
Bagged trees

1. Bagged trees

Decision trees are easy to understand and interpret, but often a small change in the data can result in a very different series of splits and a very different model. This is one of the main drawbacks of decision trees: the high variance that you have observed in the exercises.

2. Many heads are better than one

One solution to this is the wisdom of the crowd: the collective knowledge of many people typically exceeds the knowledge of any single individual. This brings up the idea of so-called "ensemble models", which means that you combine several models and their predictions instead of relying on one single model.

3. Bootstrap & aggregation

What does the term "bagged trees" mean? Bagging is an ensemble method and is shorthand for Bootstrap Aggregation. Bootstrapping simply means sampling rows at random from the training dataset, with replacement. When you draw samples with replacement, you may draw a single training example more than once. This results in a modified version of the training set where some rows are represented multiple times and some rows are absent. This lets you generate new data that's similar to the data you started with. By doing this, you can fit many different, but similar, models. Aggregation is done using the average prediction of all models as the final regression prediction, or the majority vote in classification.
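As a minimal sketch of what bootstrapping means in code (train_data is a placeholder name for a training data frame, not part of the course code), you can resample row indices with replacement in base R:

set.seed(42)  # for reproducibility
boot_indices <- sample(nrow(train_data), replace = TRUE)
boot_sample  <- train_data[boot_indices, ]

# Some rows are drawn multiple times, others not at all
sum(duplicated(boot_indices))

On average, roughly a third of the rows are left out of any single bootstrap sample, which is why each resampled training set differs from the original.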

4. Step 1: Bootstrap and train

Bagging works in the following way: First, you draw a sample with replacement from the original training set. Then, you train a decision tree model using that sampled training set. Repeat these steps as many times as you like - that could be 10, 100, or 1,000 times. Typically, the more trees, the better the model, but the more training time you need. This example shows an ensemble of three bagged trees.
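A hedged sketch of this step done by hand with the rpart package (train_data is again a placeholder; the formula mirrors the one used later in this lesson):

library(rpart)

n_trees <- 3
trees <- vector("list", n_trees)

for (i in seq_len(n_trees)) {
  # Step 1a: draw a bootstrap sample of the training data
  boot <- train_data[sample(nrow(train_data), replace = TRUE), ]
  # Step 1b: train one decision tree on that sample
  trees[[i]] <- rpart(still_customer ~ ., data = boot)
}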

5. Step 2: Aggregate

Now, let's say you have three bootstrapped trees that make up your ensemble. To generate a prediction using a bagged tree model, you generate predictions from each of the trees and then simply aggregate the predictions together to get a final prediction. The bagged, or "ensemble", prediction is the average prediction or majority vote across the bootstrapped trees. Bagging can dramatically reduce the variance of unstable models such as trees, leading to improved performance.
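Continuing the hand-rolled sketch above (test_data is a placeholder for new observations), aggregation by majority vote could look like this:

# Each column holds one tree's class predictions for the test set
votes <- sapply(trees, function(tree) {
  as.character(predict(tree, newdata = test_data, type = "class"))
})

# Majority vote across trees, row by row
ensemble_pred <- apply(votes, 1, function(row) names(which.max(table(row))))

For a regression tree, you would average instead, for example rowMeans(sapply(trees, predict, newdata = test_data)).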

6. Coding: Specify the bagged trees

Fitting a bagged decision tree model in R is very similar to fitting a decision tree. The baguette package offers a bag_tree() function that behaves similarly to the decision_tree() function. We specify the mode as "classification" and set the engine to "rpart". Additionally, we can specify how many bagged trees we want to create, using the times parameter. Let's specify an ensemble of 100 bagged trees.
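Roughly, the specification described here looks like the following sketch; in the baguette/parsnip interface, the times argument is passed through set_engine() rather than bag_tree() itself:

library(tidymodels)
library(baguette)

# Bagged decision tree specification: 100 bootstrapped trees
spec_bagged <- bag_tree() %>%
  set_mode("classification") %>%
  set_engine("rpart", times = 100)

spec_bagged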

7. Train all trees

Fitting works exactly as you already know: tidymodels takes care of drawing the bootstrap samples and training all of the trees behind the scenes. Simply use the fit() function as usual, with a formula and the training data. Here we use the formula still_customer ~ ., where the dot stands for all other variables in the data being used as predictors. The data is set to our credit customers training dataset. When you print the final model, you get a summary of the model ensemble: the fit time is 24 seconds, there are 100 members in the ensemble, that is, 100 trees, and finally a tibble showing the importance of the predictors for the final outcome. In our example, the count of total transactions is the most important predictor.
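A sketch of that call, building on the specification above (customers_train is a placeholder name for the credit customers training set):

model_bagged <- fit(spec_bagged,
                    formula = still_customer ~ .,
                    data = customers_train)

model_bagged  # prints fit time, number of members, and variable importance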

8. Let's bootstrap!

Now that you know how to create a bagged model ensemble, it's your turn to practice.
