Advanced tuning with mlr
1. Advanced tuning with mlr
Now, I will show you some advanced functions for hyperparameter tuning in mlr.
2. Advanced tuning controls
In the previous lesson, you got to know grid search and random search. But we can also use more advanced methods:
- CMA Evolution Strategy, which creates variation from the hyperparameter values in each iteration and keeps those with the highest fitness for the next round. You can think of it as "survival of the fittest" for hyperparameters.
- We can also predefine a complete data frame of hyperparameter combinations to evaluate.
- Or use Generalized simulated annealing. The hyperparameter search space of our model can be thought of as a complex non-linear function whose global minimum contains the best hyperparameters, and GenSA aims to find that minimum.
- Another tuning control is iterated F-racing, an automated algorithm-configuration method that races candidate hyperparameter settings against each other to find the best-performing values.
- And we can use model-based or Bayesian optimization. As the name suggests, MBO uses Bayesian statistics to approximate the objective function. MBO works in conjunction with the functions `makeMBOControl` and `setMBOControlTermination`.
Check the help for each control function to find out which hyperparameters it can handle: some allow only discrete values, while others can't deal with dependencies. An example of a dependent hyperparameter is degree in Support Vector Machines, which only applies with a polynomial kernel. The sketch below shows how these controls are constructed.
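As a minimal sketch, here is how these control objects might be created in mlr. The specific argument values (the design data frame, the irace budget, the number of MBO iterations) are illustrative assumptions, not recommendations.

```r
# Minimal sketch: constructing the tuning controls mentioned above
library(mlr)
library(mlrMBO)  # provides makeMBOControl() and setMBOControlTermination()

# CMA Evolution Strategy (defaults; a budget can be passed if desired)
ctrl_cmaes <- makeTuneControlCMAES()

# A predefined data frame of hyperparameter combinations (values are made up)
design_df <- data.frame(cost = c(0.1, 1, 10), gamma = c(0.01, 0.1, 1))
ctrl_design <- makeTuneControlDesign(design = design_df)

# Generalized simulated annealing
ctrl_gensa <- makeTuneControlGenSA()

# Iterated F-racing; maxExperiments is passed through to irace
ctrl_irace <- makeTuneControlIrace(maxExperiments = 200L)

# Model-based (Bayesian) optimization
mbo_ctrl <- makeMBOControl()
mbo_ctrl <- setMBOControlTermination(mbo_ctrl, iters = 10L)
ctrl_mbo <- makeTuneControlMBO(mbo.control = mbo_ctrl)
```

Any of these control objects can then be passed to the control argument of `tuneParams`, just like the grid and random search controls from the previous lesson.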
3. Choosing evaluation metrics
Until now, we didn't define performance metrics and simply used the defaults; for classification, this is the mean misclassification error (mmce). But we can also define one or more metrics with the measures argument of tuneParams, which takes a single measure or a list of measures. If we pass a list, the first element is optimized during hyperparameter tuning, while the remaining elements are only evaluated and returned. For additional details, have a look at the Advanced Tuning section of the mlr package documentation. In the output of such a tuning run, you get information about the iteration number, the hyperparameters, and the metrics measured on the test (or, more accurately, the validation) data.
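A minimal sketch of such a call, assuming the built-in sonar.task, a decision tree learner, and made-up parameter ranges:

```r
# Minimal sketch: optimizing accuracy while also tracking mmce
library(mlr)

lrn <- makeLearner("classif.rpart")

param_set <- makeParamSet(
  makeIntegerParam("minsplit", lower = 2, upper = 30),
  makeNumericParam("cp", lower = 0.001, upper = 0.1)
)

ctrl  <- makeTuneControlRandom(maxit = 20L)
rdesc <- makeResampleDesc("CV", iters = 3L)

# acc (first element) is optimized; mmce is only evaluated and reported
tune_res <- tuneParams(lrn, task = sonar.task, resampling = rdesc,
                       par.set = param_set, control = ctrl,
                       measures = list(acc, mmce))
```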
4. Choosing evaluation metrics
We can also define more complex metrics with the setAggregation function, which lets us change how a measure is aggregated over the resampling iterations, for example to additionally return its standard deviation. In our example, accuracy aggregated by the mean performance on the training sets is used for optimization, while mmce is only evaluated. In the output, we then get performance information not only for the test sets but also for the training sets. If the available performance metrics are not suitable for your particular problem, you can construct custom measures with the makeMeasure function.
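A minimal sketch of this setup, with the same illustrative learner, task, and parameter set assumed as before:

```r
# Minimal sketch: aggregating accuracy over training-set performance
library(mlr)

lrn <- makeLearner("classif.rpart")
param_set <- makeParamSet(makeIntegerParam("minsplit", lower = 2, upper = 30))
ctrl <- makeTuneControlRandom(maxit = 10L)

# predict = "both" is required so performance can also be measured on the training sets
rdesc <- makeResampleDesc("CV", iters = 3L, predict = "both")

# Optimize mean training-set accuracy; mmce is only evaluated
# (setAggregation(mmce, test.sd) would instead report the standard deviation across folds)
acc_train <- setAggregation(acc, train.mean)
tune_res <- tuneParams(lrn, task = sonar.task, resampling = rdesc,
                       par.set = param_set, control = ctrl,
                       measures = list(acc_train, mmce))
```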
5. Nested cross-validation & nested resampling
Another advanced approach is nested cross-validation. Here, we use the makeTuneWrapper function instead of tuneParams to wrap our base learner together with a hyperparameter search strategy. We can use this wrapper directly with the train function, where tuning and resampling are performed and a final model is fit with the best hyperparameter combination; the tuned hyperparameters can be extracted with getTuneResult. Or we can add an outer layer of cross-validation with the resample function, where we pass a second resampling strategy, the outer loop, to the resampling argument.
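A minimal sketch of both variants, again assuming an illustrative rpart learner and the built-in sonar.task:

```r
# Minimal sketch: nested resampling with a tuning wrapper
library(mlr)

lrn <- makeLearner("classif.rpart")
param_set <- makeParamSet(makeIntegerParam("minsplit", lower = 2, upper = 30))
ctrl <- makeTuneControlRandom(maxit = 10L)

inner <- makeResampleDesc("CV", iters = 3L)   # inner loop: hyperparameter tuning
tuned_lrn <- makeTuneWrapper(lrn, resampling = inner,
                             par.set = param_set, control = ctrl)

# Option 1: train directly; tuning runs internally and the final model
# is fit with the best hyperparameter combination
model <- train(tuned_lrn, sonar.task)
getTuneResult(model)

# Option 2: add an outer cross-validation layer with resample()
outer <- makeResampleDesc("CV", iters = 5L)   # outer loop: performance estimation
nested_res <- resample(tuned_lrn, sonar.task, resampling = outer,
                       extract = getTuneResult)
```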
6. Choose hyperparameters from a tuning set
And finally, we can extract the hyperparameters of our learner object and use the setHyperPars function to explicitly define a set of hyperparameters. The resulting learner can then be used just as before with the train function, which returns a trained model that can be used for prediction on new data.
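A minimal sketch, where the learner, the hyperparameter values, and the task are illustrative assumptions:

```r
# Minimal sketch: setting hyperparameters explicitly and training a final model
library(mlr)

lrn <- makeLearner("classif.rpart")
getHyperPars(lrn)                      # inspect the current hyperparameter settings

# Set a specific combination, e.g. values found during tuning
# (with a tuning result, setHyperPars(lrn, par.vals = tune_res$x) does the same)
lrn_set <- setHyperPars(lrn, minsplit = 10, cp = 0.01)

model <- train(lrn_set, sonar.task)    # train with the chosen hyperparameters
pred  <- predict(model, newdata = getTaskData(sonar.task))
```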
7. It's your turn!
Great, now it's your turn again!