
Adaptive resampling

1. Adaptive resampling

Good job completing the exercises! You have now seen how hyperparameter grids work when tuning your models. However, grid search and random search are neither very efficient nor fast! Adaptive Resampling is a technique that can be used instead.

2. What is Adaptive Resampling?

With grid search and random search, the performance of different hyperparameter combinations is evaluated, and the winning combination, i.e. the one with the highest accuracy, is determined only at the very end. As a result, many of the tested combinations perform badly, and testing continues even after the best combination has already been found. With Adaptive Resampling, hyperparameter combinations are resampled with values close to combinations that performed well, while sub-optimal combinations are not tested further at all. This way, each round of tested hyperparameters zeroes in on the optimal combination, which makes Adaptive Resampling faster and more efficient. A detailed explanation of Adaptive Resampling and how it is implemented in caret can be found in this paper by Max Kuhn.
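To make the filtering idea concrete, here is a toy sketch in R. Everything in it (the score function, the candidate values, the simple mean-based cutoff) is invented for illustration; caret's actual implementation instead uses a statistical model to decide which combinations to drop.

```r
# Toy sketch of the adaptive-resampling idea (NOT caret's internals).
# score() is a hypothetical stand-in that pretends performance peaks
# at a hyperparameter value of 0.5; higher scores are better.
score <- function(combo, resample) {
  1 - (combo - 0.5)^2 + rnorm(1, sd = 0.02)
}

set.seed(1)
combos  <- seq(0, 1, length.out = 10)  # candidate hyperparameter values
active  <- combos                       # combinations still in the race
results <- list()                       # per-combination score history

for (resample in 1:10) {
  # Evaluate only the combinations that are still active.
  for (combo in active) {
    key <- as.character(combo)
    results[[key]] <- c(results[[key]], score(combo, resample))
  }
  # After a minimum number of resamples, drop clearly inferior ones.
  if (resample >= 5) {
    means  <- sapply(results[as.character(active)], mean)
    best   <- max(means)
    # Keep combinations close to the best; caret instead uses a
    # model-based test (a linear model or Bradley-Terry model).
    active <- active[means > best - 0.05]
  }
}
active  # the surviving, near-optimal combinations
```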

3. Adaptive resampling in caret

Adaptive Resampling is implemented in caret, so it is very easy for us to use. We simply need to modify our trainControl function with the following settings (see the code sketch after this list): As method, we define adaptive_cv to use Adaptive Resampling with cross-validation. By default, a grid search would be performed, but here we define search as random. We then configure the adaptive resampling process itself:

- min determines the minimum number of resamples used for each hyperparameter combination. By default, caret uses a min value of 5. The larger we set min, the slower the resampling process will be, but the more likely we are to find the optimal hyperparameter combination.
- alpha defines the confidence level that we want to use to remove hyperparameter combinations. Usually, changing alpha does not influence the result much.
- method sets the model used to compare hyperparameter performance. It can be a simple linear model, as we use here with gls. Or we could use a Bradley-Terry model, which is advised if we have a large number of hyperparameters to test or if we expect our model accuracy to be close to one and to vary little between hyperparameter combinations. It is therefore useful for fine-tuning models that are already pretty good.
- And finally, complete lets us specify whether we want to generate a full resampling set if an optimal solution is found before resampling is completed. Setting complete to FALSE saves time, and we would still get the optimal combination of hyperparameters, but we won't know the final estimated performance measure for our model.

This is what the final trainControl function looks like for Adaptive Resampling.
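A sketch of such a trainControl call is below; the fold and repeat counts are illustrative choices, not values from the video:

```r
library(caret)

# Sketch of a trainControl setup for adaptive resampling.
fitControl <- trainControl(method = "adaptive_cv",
                           number = 10,          # illustrative fold count
                           repeats = 3,          # illustrative repeat count
                           adaptive = list(min = 5,         # minimum resamples per combination
                                           alpha = 0.05,    # confidence level for dropping combinations
                                           method = "gls",  # "gls" = linear model, "BT" = Bradley-Terry
                                           complete = TRUE),# compute the full resampling set at the end
                           search = "random")
```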

4. Adaptive resampling in caret

We can now use the trainControl object defined in this way with caret's train function, just as we did before. What we additionally need to define in train is again the tuneLength setting, which defines the maximum number of hyperparameter combinations we want to compare. Here I'll be using 7, which is, of course, again a rather low number; in your real-world experiments, you will most likely want to compare at least 100 combinations. But you see that even an efficient method like Adaptive Resampling still takes time to perform its magic.
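As a sketch, the call could look like the following; the iris data and the random forest learner are assumed stand-ins, not the model from the video:

```r
# Hypothetical train() call using the fitControl object from above;
# dataset and learner are stand-ins chosen for illustration.
# method = "rf" requires the randomForest package to be installed.
set.seed(42)
model_adaptive <- train(Species ~ .,
                        data = iris,
                        method = "rf",
                        trControl = fitControl,
                        tuneLength = 7)  # max number of combinations to compare
```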

5. Adaptive resampling

Here you see the lower part of the output of our model trained with Adaptive Resampling. It again has the same structure as our caret models from before. We also get the final values used, which give an accuracy of 96% but a low Kappa value.
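To see this output yourself, you would simply print the fitted object (continuing the hypothetical model_adaptive from above):

```r
# Printing the fitted object shows the resampling profile and,
# at the bottom, the final hyperparameter values that were selected.
model_adaptive
```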

6. Let's get coding!

Now it's your turn!