
Optimize the boosted ensemble

1. Optimize the boosted ensemble

Welcome back! Now that you have created and trained a boosted classifier using the built-in hyperparameters, it's time to alter these hyperparameters to maximize performance. Let's tune the boosted ensemble!

2. Starting point: untuned performance

As a starting point, the out-of-the-box performance observed in the exercises was 95%. That's already fantastic given that only the default hyperparameters were used.

3. Tuning workflow

This overview shows the steps you learned for tuning a specification. First, use tune() to flag hyperparameters for tuning in your specification.

4. Tuning workflow

Then, create a grid of hyperparameters with grid_regular() or others.

5. Tuning workflow

Then, use vfold_cv() to create cross-validation folds.

6. Tuning workflow

You pass all that into the tune_grid() function, and go for coffee or a jog.

7. Tuning workflow

After you come back, call select_best() to select the best results.

8. Tuning workflow

and finalize your model specification with the winners.

9. Tuning workflow

As a last step, you fit the finalized model, with the optimal hyperparameters, to the training data to obtain your optimized full model.

10. Step 1: Create the tuning spec

Let's get to coding! As a first step, create the model specification. Let's fix the number of trees at 500 and flag learn_rate, tree_depth, and sample_size for tuning. The console output reflects these decisions.
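
A minimal sketch of what that specification could look like, assuming the xgboost engine and classification mode used in the earlier exercises:

```r
library(tidymodels)

# Specification with 500 trees fixed and three hyperparameters flagged for tuning
boost_spec <- boost_tree(
  trees = 500,           # fixed number of trees
  learn_rate = tune(),   # placeholder: learning rate
  tree_depth = tune(),   # placeholder: maximum tree depth
  sample_size = tune()   # placeholder: proportion of data sampled per tree
) %>%
  set_mode("classification") %>%
  set_engine("xgboost")

boost_spec
```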

11. Step 2: Create the tuning grid

Then, we need a grid containing all hyperparameter combinations that we want to try. You already know grid_regular(), which creates an evenly-spaced grid of all the hyperparameters. It takes the tuning parameters, which we extract by applying the function parameters() to our dummy specification, and the levels, which is the number of values each tuning parameter should get. Let's specify two levels for each of our three tuning parameters. The result is a tibble with eight rows, that is, eight possible combinations of the three hyperparameters with two levels each. Another possibility is grid_random(), which creates a random, not evenly-spaced, grid. The size argument specifies the number of random combinations in the result. Size equals 8 gives us eight random combinations of values.
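
A sketch of both grid types, assuming the boost_spec specification from the previous step:

```r
# Regular grid: 2 levels per tuned hyperparameter -> 2^3 = 8 combinations
tunegrid_boost <- grid_regular(
  parameters(boost_spec),
  levels = 2
)

# Alternative: a random grid with 8 combinations
tunegrid_random <- grid_random(
  parameters(boost_spec),
  size = 8
)
```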

12. Step 3: The tuning

Now for the actual tuning. The tune_grid() function takes the dummy specification, the model formula, the resamples, which are some cross-validation folds, a tuning grid, and a list of metrics. In our case, the dummy specification is boost_spec, the model formula is "still_customer is modeled as a function of all other parameters", resamples is six folds of the training data customers_train, the tuning grid is tunegrid_boost, which we created in the previous slide, and metrics is a metric_set containing only the roc_auc metric.
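
Putting those pieces together, a hedged sketch of the tuning call (the names folds and tune_results are illustrative; boost_spec, customers_train, still_customer, and tunegrid_boost come from the previous steps):

```r
# Six cross-validation folds of the training data
folds <- vfold_cv(customers_train, v = 6)

# Try every combination in the grid on every fold
tune_results <- tune_grid(
  boost_spec,                    # specification with tune() placeholders
  still_customer ~ .,            # model formula
  resamples = folds,
  grid = tunegrid_boost,
  metrics = metric_set(roc_auc)  # evaluate only the area under the ROC curve
)
```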

13. Visualize the result

It's always helpful and interesting to visualize the tuning results. The autoplot() function creates an overview of the tuning results. In our case, we see one plot per sample_size, tree_depth on the x axis, the AUC on the y axis, and different colors for different learning rates. The green line, corresponding to the smallest learning rate, achieves an area under the curve of only 50%, and there seems to be little difference between a tree_depth of 8 and 12: both reach AUC values from 95% to close to 100%.
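
Assuming the tuning results are stored in tune_results as in the sketch above, the plot is a one-liner:

```r
# One panel per sample_size, tree_depth on the x axis, AUC on the y axis
autoplot(tune_results)
```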

14. Step 4: Finalize the model

The optimal hyperparameter combination can be extracted using select_best(). This gives you a one-row tibble containing one column for every hyperparameter. We see that Model17, with a tree_depth of 8, a learn_rate of 0.1, and a sample_size of 55%, yields the best results. Then, plug these into the specification containing the placeholders using finalize_model(). This finalizes your specification after tuning.
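
A sketch of those two calls (best_params and final_spec are illustrative names):

```r
# Pick the hyperparameter combination with the highest AUC
best_params <- select_best(tune_results, metric = "roc_auc")

# Replace the tune() placeholders in the specification with the winners
final_spec <- finalize_model(boost_spec, best_params)
```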

15. Last step: Train the final model

Finally, you train the final model on the whole training set customers_train. Printing the model reveals information such as that it took 2.3 seconds to train and is 344 kilobytes in size.
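
Assuming the finalized specification is called final_spec as above, the last step looks like this:

```r
# Fit the finalized specification to the full training set
final_model <- final_spec %>%
  fit(still_customer ~ ., data = customers_train)

final_model
```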

16. Your turn!

Now it's your turn to apply that to your boosted ensemble!
