1. Hyperparameter Values
In this lesson we will look more in depth at what values to set for different hyperparameters and begin automating our work.
2. Hyperparameter Values
Previously you learned that some hyperparameters are likely better to start your tuning with than others.
What we didn't discuss was which values you should try.
This is specific to the algorithm and to the hyperparameter itself, but there are best practices to follow.
Let's walk through some top tips for deciding ranges of values to try for different hyperparameters.
3. Conflicting Hyperparameter Choices
It is firstly important to know what values NOT to set as they may conflict.
You will see in the scikit-learn documentation for the LogisticRegression algorithm that some values of the 'penalty' hyperparameter conflict with some values of the 'solver' hyperparameter.
Another example, from the ElasticNet algorithm, demonstrates a softer conflict: it will not result in an error, but it may result in a model construction we had not anticipated.
Safe to say, close inspection of the scikit-learn documentation is important.
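As a minimal sketch of both situations (the exact combinations and error messages depend on your scikit-learn version, so treat them as illustrative):

import numpy as np
from sklearn.linear_model import LogisticRegression, ElasticNet

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# Hard conflict: the 'lbfgs' solver does not support the 'l1' penalty,
# so scikit-learn raises an error when we try to fit.
try:
    LogisticRegression(penalty='l1', solver='lbfgs').fit(X, y)
except ValueError as err:
    print(err)

# Softer conflict: alpha=0 fits without an error, but the model is effectively
# ordinary least squares, which the documentation advises against.
ElasticNet(alpha=0).fit(X, y)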
4. Silly Hyperparameter Values
There are also values for different hyperparameters that may be valid but are very unlikely to yield good results. Some examples of this are:
Having a random forest algorithm with a very low number of trees.
Would you consider it a forest if it had 2 trees? How about 5 or 10? Still probably not. But at 300, 500, 1,000 or more, that is definitely getting there!
Having only 1 neighbor in a k-nearest neighbors algorithm. This algorithm averages the votes of the 'neighbors' closest to your sample. Safe to say, averaging the vote of 1 person doesn't sound robust!
Finally, incrementing some hyperparameters by a small amount is unlikely to greatly improve the model. One more tree in a forest for example, isn't likely to have a large impact.
Researching and documenting sensible values for different hyperparameters and algorithms will be a very useful activity.
5. Automating Hyperparameter Choice
In the previous exercise you built several different models to test a single hyperparameter like so.
This was quite an inefficient way of writing code; I think we can do better when testing different values for the number of neighbors hyperparameter.
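To make the comparison concrete, here is a sketch of that repetitive style, using a small synthetic dataset in place of the course data (the dataset and the candidate n_neighbors values are assumptions):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# A small synthetic dataset stands in for the course data
X, y = make_classification(n_samples=2000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# One estimator per candidate value, each created and fit by hand
knn_5 = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
knn_10 = KNeighborsClassifier(n_neighbors=10).fit(X_train, y_train)
knn_20 = KNeighborsClassifier(n_neighbors=20).fit(X_train, y_train)

print(knn_5.score(X_test, y_test),
      knn_10.score(X_test, y_test),
      knn_20.score(X_test, y_test))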
6. Automating Hyperparameter Tuning
One thing we could try is using a for loop.
We create a list of values to test.
Then loop through the list, creating an estimator, fitting and predicting each time.
We append the accuracy to a list of accuracy scores to analyze after.
This method easily allows us to test more values than our previous work.
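Continuing with the train/test split from the sketch above, one version of that loop (the candidate values are an assumption) could be:

from sklearn.metrics import accuracy_score

neighbors_list = [3, 5, 10, 20, 50, 75]
accuracy_list = []

for n in neighbors_list:
    # Create, fit and predict with a fresh estimator for each candidate value
    knn = KNeighborsClassifier(n_neighbors=n)
    knn.fit(X_train, y_train)
    predictions = knn.predict(X_test)
    accuracy_list.append(accuracy_score(y_test, predictions))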
7. Automating Hyperparameter Tuning
We can store the results in a DataFrame to view the effect of this hyperparameter on the accuracy of the model.
It appears that adding more neighbors beyond 20 doesn't help.
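One way to do that with pandas, pairing each candidate value with the accuracy it produced:

import pandas as pd

results_df = pd.DataFrame({'neighbors': neighbors_list,
                           'accuracy': accuracy_list})
print(results_df)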
8. Learning Curves
A common tool used to analyze the impact of a single hyperparameter on an end result is called a 'learning curve'.
Firstly, let's create a list of many more values to test using Python's range function.
The rest of the code is the same as before.
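Sticking with the same split, a sketch might look like this (the exact range of values is an assumption):

# Same loop as before, just with far more candidate values
neighbors_list = list(range(5, 500, 5))
accuracy_list = []

for n in neighbors_list:
    knn = KNeighborsClassifier(n_neighbors=n).fit(X_train, y_train)
    accuracy_list.append(knn.score(X_test, y_test))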
9. Learning Curves
Since we tested so many values, we will use a graph rather than a table to analyze the results.
We plot the accuracy score on the Y axis and our hyperparameter value on our X axis.
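With matplotlib, a minimal version of that plot might look like:

import matplotlib.pyplot as plt

plt.plot(neighbors_list, accuracy_list)
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy score')
plt.title('Learning curve for n_neighbors')
plt.show()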
10. Learning Curves
We can see our suspicions confirmed: accuracy does not increase at all beyond the values we tested before.
11. A handy trick for generating values
One thing to be aware of is that Python's range function does not work with decimal steps, which matters for hyperparameters that take values on that scale.
A handy trick uses NumPy's linspace function that will create a number of values, evenly spread between a start and end value that you specify.
Here is a quick example.
5 values, evenly spaced between 1 and 2 inclusive.
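In code, that looks like this:

import numpy as np

# 5 values, evenly spaced between 1 and 2 inclusive
print(np.linspace(start=1, stop=2, num=5))
# [1.   1.25 1.5  1.75 2.  ]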
12. Let's practice!
Let's practice trying different hyperparameters and plotting some learning curves!