
Hyperparameter tuning in caret

1. Hyperparameter tuning in caret

Welcome back to Chapter 2. Let's dive deeper into how to perform hyperparameter tuning with caret.

2. Voter dataset from US 2016 election

The dataset we'll be working with is from a survey about the 2016 US presidential election. We will use the attribute "turnout16_2016" to predict whether or not a person voted in that election.
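Before modeling, it helps to glance at the outcome variable. A minimal sketch, assuming the survey data sits in a data frame called `voters_data` (a placeholder name; only the column name `turnout16_2016` comes from this lesson):

```r
# Class distribution of the outcome variable;
# `voters_data` is a hypothetical object name used for illustration.
table(voters_data$turnout16_2016)
```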

3. Let's train another model with caret

Let's train another machine learning model with caret, this time using gradient boosting with repeated cross-validation. Keep in mind that in reality you would want to address the problem of having unbalanced classes, but let's focus on hyperparameter tuning for now. Our model takes about 33 seconds to run.
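A hedged sketch of this step, assuming a training data frame called `voters_train_data` (a placeholder name) with the outcome column `turnout16_2016`; the resampling settings shown are illustrative:

```r
library(caret)

# Repeated cross-validation: 5 folds, repeated 3 times (illustrative values)
fit_control <- trainControl(method = "repeatedcv",
                            number = 5,
                            repeats = 3)

set.seed(42)
# Gradient boosting with caret's default hyperparameter tuning
gbm_model <- train(turnout16_2016 ~ .,
                   data = voters_train_data,
                   method = "gbm",
                   trControl = fit_control,
                   verbose = FALSE)
```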

4. Let's train another model with caret

When we explore our model, we see that caret tuned the interaction depth and the number of trees based on default values. For a quick baseline model this is fine, but what if we wanted to define the hyperparameters manually?
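Continuing the hypothetical objects from the sketch above, printing the fitted object shows the default tuning grid and the resampling results, and `bestTune` holds the winning combination:

```r
# Default tuning grid, accuracy/Kappa per combination, and the chosen model
gbm_model

# The hyperparameter combination caret selected
gbm_model$bestTune
```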

5. Cartesian grid search with caret

In the previous chapter, you used the `expand.grid()` function to manually define single values for every hyperparameter. The same function can be used to define a grid of hyperparameters, because it creates a grid of all possible combinations of the hyperparameters given! For the number of trees and the tree complexity, we'll compare different values, while shrinkage and the minimum number of observations per node are kept constant. We can now train our model just as before, but this time we will feed our grid to the tuneGrid argument. If we perform Cartesian grid search, every combination of hyperparameters in our grid will be evaluated. You'll notice that this model took much longer to train. As you will see in the following examples, hyperparameter tuning takes time and computational power, so be prepared to exercise some patience!
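A sketch of such a grid, reusing the hypothetical `voters_train_data` and `fit_control` objects from above; the n.trees values come from this lesson, while the interaction.depth values, shrinkage and n.minobsinnode are illustrative choices:

```r
# Cartesian grid: n.trees and interaction.depth vary,
# shrinkage and n.minobsinnode stay fixed
man_grid <- expand.grid(n.trees = c(100, 200, 250),
                        interaction.depth = c(1, 4, 6),
                        shrinkage = 0.1,
                        n.minobsinnode = 10)

set.seed(42)
# Every combination in man_grid is evaluated with repeated cross-validation
gbm_model_grid <- train(turnout16_2016 ~ .,
                        data = voters_train_data,
                        method = "gbm",
                        trControl = fit_control,
                        verbose = FALSE,
                        tuneGrid = man_grid)
```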

6. Cartesian grid search with caret

The output looks similar to before, when caret tuned the hyperparameters automatically. We again get a table with accuracy and Kappa values for all tested combinations of hyperparameters, with the final result written below this table. Our model performance did not improve compared to before, but we only tested a small range of hyperparameter values. In your real-world projects, you would test a much larger range of values, but here we will focus on learning the concepts behind hyperparameter tuning techniques.

7. Plot hyperparameter models

We can also plot our hyperparameters with the plot function and define the metric and plot type we want to visualize. By default, we get line plots of accuracy. Every line represents a different value of the maximum tree depth hyperparameter, and the colors of the lines correspond to these values. On the x-axis, we see the number of boosting iterations, which comes from the hyperparameter n.trees that we defined to be either 100, 200 or 250, and the y-axis shows the accuracy of the model given these hyperparameter combinations. Alternatively, we can plot the Kappa metric and show it as a heatmap. Kappa is another metric used to evaluate the performance of classification models; it compares an observed accuracy with an expected accuracy. Kappa values need to be considered in the context of the problem, but generally we want to achieve high Kappa values. The Kappa values are shown on the color scale, while the x-axis shows the number of trees and the y-axis the maximum tree depth. Here, our Kappa values don't look very good. The reason is that our data was strongly unbalanced, so the accuracy of always assigning the majority class is already very high. So we can conclude that, while having pretty good accuracy, our model did not in fact perform much better than random.
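The two plots described above could be produced roughly like this, again using the hypothetical `gbm_model_grid` object from the grid search sketch:

```r
# Default: accuracy as line plots over boosting iterations,
# one line per maximum tree depth value
plot(gbm_model_grid)

# Kappa as a heatmap (level plot): number of trees on the x-axis,
# maximum tree depth on the y-axis, Kappa on the color scale
plot(gbm_model_grid, metric = "Kappa", plotType = "level")
```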

8. Test it out for yourself!

Alright, enough theory - go test it out yourself!