1. Introduction to hyperparameter tuning
Hello again - In this lesson, we are going to start applying the model validation techniques we have been practicing while introducing hyperparameter tuning.
2. Model parameters
To start, let's first review what model parameters are, as model parameters and model hyperparameters are quite different.
Model parameters are created as the result of fitting a model and are estimated from the input data. They are used to make predictions on new data and are not set manually by the modeler.
3. Linear regression parameters
For example, in a linear model, the coefficients and intercept are considered model parameters. We can print a linear model's parameters using lr.coef_ and lr.intercept_. Notice that these parameters are only created after the model has been fit.
4. Linear regression parameters
If we did not call .fit, the coefficients and intercept would not exist for the lr object.
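As a minimal sketch of this point, the snippet below fits a LinearRegression on toy data (the data itself is made up for illustration) and then prints the learned parameters, which only exist after the call to .fit():

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 2*x0 + 3*x1 + 1
X = np.array([[1, 2], [2, 1], [3, 3], [4, 5]], dtype=float)
y = 2 * X[:, 0] + 3 * X[:, 1] + 1

lr = LinearRegression()
# Before .fit(), lr has no coef_ or intercept_ attributes
lr.fit(X, y)

print(lr.coef_)       # estimated coefficients (model parameters)
print(lr.intercept_)  # estimated intercept (model parameter)
```

Because the toy data is exactly linear, the fitted parameters recover the generating coefficients.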
5. Model hyperparameters
So, if model parameters are the result of training a model, then what are hyperparameters?
Hyperparameters are the values that are set before training occurs. So anytime we refer to a parameter as being manually set, we are referring to hyperparameters.
We have already been working with some hyperparameters for scikit-learn's random forest models, such as n_estimators, and max_depth. Let's cover a few more of the basic hyperparameters for these models.
6. Random forest hyperparameters
The table above only has four of the 15 or so possible hyperparameters, and we have already discussed the first three: n_estimators, max_depth, and max_features during this course. I am adding min_samples_split to our list, which is the minimum number of samples a node must contain before it can be split into further leaves.
If a node in a decision tree has four observations in it, min_samples_split must be 4 or less in order for this node to be split into more leaves.
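These four hyperparameters are all set when the model is created, before any training occurs. A quick sketch with illustrative values (the specific numbers here are arbitrary, not recommendations):

```python
from sklearn.ensemble import RandomForestRegressor

# Hyperparameters are set up front, before calling .fit()
rf = RandomForestRegressor(
    n_estimators=50,       # number of trees in the forest
    max_depth=6,           # maximum depth of each tree
    max_features=4,        # number of features considered at each split
    min_samples_split=4,   # a node needs at least 4 samples to be split
    random_state=1111,
)

print(rf.min_samples_split)
```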
So what now? If these are hyperparameters, how do we tune them?
7. What is hyperparameter tuning?
Throughout this course, we have been hinting at various aspects of hyperparameter tuning. We have used various hyperparameters and altered the values of these hyperparameters to suit our specific model or data.
Hyperparameter tuning consists of selecting the hyperparameters to test and then running a specific type of model with various values for those hyperparameters. For each run of the model, we keep track of how well the model did on a specified accuracy metric, along with the hyperparameter values that were used.
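The process just described can be sketched as a simple loop. This is a hand-rolled illustration, not a full tuning framework; the data is randomly generated and the tested max_depth values are arbitrary:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Generated data, purely for illustration
X, y = make_regression(n_samples=200, n_features=10, random_state=1111)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1111)

results = []
for depth in [2, 4, 8]:
    rf = RandomForestRegressor(n_estimators=25, max_depth=depth, random_state=1111)
    rf.fit(X_train, y_train)
    mae = mean_absolute_error(y_test, rf.predict(X_test))
    # Track both the hyperparameter value and the resulting error
    results.append({"max_depth": depth, "mae": mae})

print(results)
```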
8. Specifying ranges
One of the hardest parts of this process is selecting the right hyperparameters to tune, and specifying the appropriate value ranges for each hyperparameter. For example, consider the three ranges of values specified in the example above.
When we run hyperparameter tuning, we run our random forest model at different values from the ranges specified.
We might select a random max_depth, a min_samples_split of 8, and a max_features of 4. We used the random.choice() function to select randomly from the max_depth list.
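A minimal sketch of that selection step, with placeholder value ranges (the specific lists are illustrative, not recommendations):

```python
import random

# Candidate value ranges for three hyperparameters
max_depth = [4, 8, 12]
min_samples_split = [2, 5, 8]
max_features = [4, 6, 8, 10]

# Randomly select one value from each range for a single tuning run
selected = {
    "max_depth": random.choice(max_depth),
    "min_samples_split": random.choice(min_samples_split),
    "max_features": random.choice(max_features),
}
print(selected)
```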
To review which parameters were used at any time, you can use the get_params() method on your model.
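For example, calling get_params() on a random forest returns a dictionary of every hyperparameter and its current value (the values set here are just for illustration):

```python
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(n_estimators=50, max_depth=6, random_state=1111)

# get_params() returns a dict of every hyperparameter and its current value
params = rf.get_params()
print(params["n_estimators"])
print(params["max_depth"])
```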
9. Too many hyperparameters!
If you do check out the contents of get_params(), you might feel overwhelmed by the number of options available. For this model, there are 16 different hyperparameters.
In practice, however, only a handful of these hyperparameters will be tuned at the same time. Tuning too many can take forever to train and might make reading the output difficult.
10. General guidelines
It's best to start with the basics and tune the hyperparameters you understand. Read through the documentation for the ones that you don't, and test values you have seen in other models. As you practice this technique more, you will become more comfortable with the process.
11. Let's practice!
Let's work on the basic steps for hyperparameter tuning.