1. Hyperparameter tuning
You now know everything you need to perform hyperparameter tuning in neural networks!
Our aim is to identify the hyperparameter values that make our model generalize better.
2. Neural network hyperparameters
A neural network is full of hyperparameters that can be tweaked:
the number of layers, the number of neurons per layer,
the order of those layers,
the activation functions,
batch sizes,
learning rates, optimizers... a lot of things to keep in mind!
3. Sklearn recap
In sklearn we can perform hyperparameter search by using methods like RandomizedSearchCV.
We import RandomizedSearchCV from sklearn.model_selection.
We instantiate a model,
define a dictionary with a series of model parameters to try, and finally instantiate a RandomizedSearchCV object, passing our model, the parameters, and a number of cross-validation folds.
We fit it on our data and print the best resulting combination of parameters.
For this example, a min_samples_leaf of 1, 3 max_features and a max_depth of 3 gave us the best results.
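The steps above can be sketched as a minimal runnable example. The decision tree model and the toy dataset here are illustrative assumptions, since the original example does not show them:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Toy data so the sketch runs end to end
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Instantiate a model
model = DecisionTreeClassifier(random_state=42)

# Dictionary with a series of model parameters to try
params = {"min_samples_leaf": [1, 3, 5],
          "max_features": [2, 3, 4],
          "max_depth": [2, 3, 4]}

# Random search with 5-fold cross-validation
search = RandomizedSearchCV(model, param_distributions=params, cv=5,
                            random_state=42)
search.fit(X, y)

# Best resulting combination of parameters
print(search.best_params_)
```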
4. Turn a Keras model into a Sklearn estimator
We can do the same with our Keras models! But we first have to transform them into sklearn estimators. We do this by first defining a function that creates our model.
Then we import the KerasClassifier wrapper from tensorflow.keras's scikit-learn wrappers (tensorflow.keras.wrappers.scikit_learn).
We finish by simply instantiating a KerasClassifier object, passing create_model as the building function. Other parameters like epochs and batch_size are optional, but should be passed if we want to specify them.
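A sketch of the wrapping step; the architecture and input shape are placeholder assumptions. Note that recent TensorFlow releases removed this wrapper module, where the scikeras package offers a near drop-in replacement:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# Function that creates and compiles our model
def create_model():
    model = Sequential()
    model.add(Dense(16, input_shape=(10,), activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Wrap it as an sklearn estimator; epochs and batch_size are optional
model = KerasClassifier(build_fn=create_model, epochs=6, batch_size=16)
```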
5. Cross-validation
This is very cool! Our model is now just like any other sklearn estimator, so we can, for instance, perform cross-validation on it to see the stability of its predictions across folds.
We import cross_val_score and call it, passing in our recently converted Keras model, predictors, labels, and the number of folds.
We can then check the mean accuracy per fold or
the standard deviation.
Note that 6 epochs and a batch_size of 16 were used, since we specified them before.
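Put together, the cross-validation step might look like this sketch, with a toy dataset and placeholder architecture assumed:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def create_model():
    model = Sequential()
    model.add(Dense(16, input_shape=(10,), activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# epochs and batch_size were fixed when wrapping the model
model = KerasClassifier(build_fn=create_model, epochs=6, batch_size=16,
                        verbose=0)

# One accuracy score per fold
scores = cross_val_score(model, X, y, cv=3)
print(scores.mean())  # mean accuracy across folds
print(scores.std())   # stability of predictions across folds
```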
6. Tips for neural networks hyperparameter tuning
A good combination of parameters is much more likely to be found with random search than with an exhaustive grid search: grid search loops over all possible combinations of parameters, whilst random search tries a given number of random combinations.
Normally, not many epochs are needed to check how well your model is performing.
If you've got a huge dataset, using a smaller representative sample of it makes things faster.
It's easier to experiment with things like optimizers, batch_sizes, activations, and learning rates than with the architecture itself.
7. Random search on Keras models
To perform randomized search on a Keras model we just need to define the parameters to try.
We can try different optimizers, activation functions for the hidden layers and batch sizes.
The keys in the parameter dictionary must be named exactly as the parameters in our create_model function.
We then instantiate a RandomizedSearchCV object passing our model and parameters with 3 fold cross-validation.
We finish by fitting our random_search object to obtain the results. We can then print the best score and the parameters that were used.
We get an accuracy of 94% with the adam optimizer, 3 epochs, a batch_size of 10 and relu activation.
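As a sketch, with illustrative values for the search space and a placeholder architecture (the exact grids in the example may differ):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# Argument names here must match the keys of the parameter dictionary
def create_model(optimizer="adam", activation="relu"):
    model = Sequential()
    model.add(Dense(16, input_shape=(10,), activation=activation))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer=optimizer, loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

params = {"optimizer": ["adam", "sgd"],
          "activation": ["relu", "tanh"],
          "batch_size": [10, 16, 32],
          "epochs": [3]}

random_search = RandomizedSearchCV(
    KerasClassifier(build_fn=create_model, verbose=0),
    param_distributions=params, cv=3, n_iter=5)
random_search.fit(X, y)
print(random_search.best_score_)
print(random_search.best_params_)
```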
8. Tuning other hyperparameters
Parameters like the number of neurons per layer and the number of layers can also be tuned using the same method. We just need to make some smart changes to our create_model function.
The nl parameter determines the number of hidden layers and nn the number of neurons in those layers. With a loop inside our function, we can add as many layers to our sequential model as nl specifies, each with the given number of neurons.
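A sketch of such a create_model function; the input shape, defaults, and output layer are placeholder assumptions:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def create_model(nl=1, nn=256):
    model = Sequential()
    # First hidden layer also fixes the input shape
    model.add(Dense(nn, input_shape=(10,), activation="relu"))
    # Add the remaining nl - 1 hidden layers, nn neurons each
    for _ in range(nl - 1):
        model.add(Dense(nn, activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```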
9. Tuning other hyperparameters
Then we just need to use the exact same names in the parameter dictionary as we have in our function and repeat the process.
The best result is 87% accuracy with 2 hidden layers of 128 neurons each.
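The full search can be sketched as follows; the candidate values and the architecture details are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# nl and nn must match the keys of the dictionary below
def create_model(nl=1, nn=256):
    model = Sequential()
    model.add(Dense(nn, input_shape=(10,), activation="relu"))
    for _ in range(nl - 1):
        model.add(Dense(nn, activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

params = {"nl": [1, 2, 3], "nn": [64, 128, 256]}
random_search = RandomizedSearchCV(
    KerasClassifier(build_fn=create_model, epochs=3, batch_size=16, verbose=0),
    param_distributions=params, cv=3, n_iter=5)
random_search.fit(X, y)
print(random_search.best_params_)
```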
10. Let's tune some networks!
Let's go practice now!