Training and validation with Keras
1. Training with Keras
Earlier in the chapter, we defined neural networks in Keras. In this video, we will discuss how to train and evaluate them.
2. Overview of training and evaluation
Whenever we train and evaluate a model in TensorFlow, we typically use the same set of steps. First, we load and clean the data. Second, we define a model, specifying an architecture. Third, we train and validate the model. And fourth, we perform evaluation.
3. How to train a model
Let's see an example of how this works. We'll start by importing TensorFlow and defining a Keras sequential model. We'll then add a dense layer to the model with 16 nodes and a ReLU activation function. Note that our input shape is (784,), since our dataset consists of 28x28 images, reshaped into vectors. We next define the output layer, which has 4 nodes and a softmax activation function.
4. How to train a model
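A minimal sketch of the architecture just described, assuming TensorFlow 2.x (the layer sizes and input shape come from the description above):

```python
import tensorflow as tf

# Define a Keras sequential model.
model = tf.keras.Sequential()

# Hidden dense layer: 16 nodes with a ReLU activation.
# The input shape is (784,): 28x28 images reshaped into vectors.
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Output layer: 4 nodes with a softmax activation.
model.add(tf.keras.layers.Dense(4, activation='softmax'))
```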
We next compile the model, using the Adam optimizer and the categorical cross-entropy loss. Finally, we train the model using the fit() operation.
5. The fit() operation
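Compiling and fitting the model might look like the sketch below. The features and labels here are hypothetical stand-ins (random vectors and random one-hot labels) for the real image dataset:

```python
import numpy as np
import tensorflow as tf

# Rebuild the model from the previous slide.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Compile with the adam optimizer and the categorical crossentropy loss.
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Hypothetical stand-in data: 100 random feature vectors and one-hot labels.
features = np.random.random((100, 784)).astype('float32')
labels = tf.keras.utils.to_categorical(np.random.randint(0, 4, size=100), 4)

# Train the model with the fit() operation.
model.fit(features, labels)
```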
Notice that we only supplied two arguments to fit(): features and labels. These are the only two required arguments; however, there are also many optional arguments, including batch_size, epochs, and validation_split. We will cover each of these.
6. Batch size and epochs
Let's start with the difference between the batch_size and epochs parameters. The number of examples in each batch is the batch size, which is 32 by default. The number of times you train on the full set of batches is called the number of epochs. Here, the batch size is 5 and the number of epochs is 2. Using multiple epochs allows the model to revisit the same batches, but with different model weights and possibly different optimizer parameters, since both are updated after each batch.
7. Performing validation
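To make the batch size and epochs settings from the previous slide concrete, here is a sketch with 10 hypothetical examples, a batch size of 5, and 2 epochs:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))
model.add(tf.keras.layers.Dense(4, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Hypothetical data: 10 examples.
features = np.random.random((10, 784)).astype('float32')
labels = tf.keras.utils.to_categorical(np.random.randint(0, 4, size=10), 4)

# batch_size=5 splits the 10 examples into 2 batches per epoch;
# epochs=2 passes over those batches twice, with the weights
# updated after each batch.
model.fit(features, labels, batch_size=5, epochs=2)
```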
So what does the validation_split parameter do? It divides the dataset into two parts: the first part is the train set and the second part is the validation set.
8. Performing validation
Selecting a value of 0.2 will put 20% of the data in the validation set.
9. Performing validation
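A sketch of a validation split, again with hypothetical random data standing in for the real dataset:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))
model.add(tf.keras.layers.Dense(4, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Hypothetical data: 100 examples.
features = np.random.random((100, 784)).astype('float32')
labels = tf.keras.utils.to_categorical(np.random.randint(0, 4, size=100), 4)

# validation_split=0.2 holds out 20% of the data for validation;
# fit() then reports loss and val_loss separately for each epoch.
history = model.fit(features, labels, epochs=10, validation_split=0.2)
```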
The benefit of using a validation split is that you can see how your model performs both on the data it was trained on, the training set, and on a separate dataset it was not trained on, the validation set. Here, we can see the first 10 epochs of training. Notice that we can see the training loss and validation loss separately. If the training loss becomes substantially lower than the validation loss, this is an indication that we're overfitting. We should either terminate the training process before that point or add regularization or dropout.
10. Changing the metric
Another benefit of the high-level Keras API is that we can swap less informative metrics, such as the loss, for ones that are more easily interpretable, such as the share of accurately classified examples. We can do this by supplying accuracy to the metrics parameter of compile(). We then apply fit() to the model again with the same settings.
11. Changing the metric
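Supplying the accuracy metric might look like this; as before, the data is a hypothetical random stand-in, and the metric names assume TensorFlow 2.x:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Supply accuracy to the metrics parameter of compile().
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Hypothetical data: 100 examples.
features = np.random.random((100, 784)).astype('float32')
labels = tf.keras.utils.to_categorical(np.random.randint(0, 4, size=100), 4)

# Fit with the same settings as before; history.history now also
# tracks accuracy and val_accuracy per epoch, alongside the losses.
history = model.fit(features, labels, epochs=10, validation_split=0.2)
```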
Using the accuracy metric, we can see that the model performs quite well. In just 10 epochs, it goes from an accuracy of 42% to over 99%. Notice that the model performs equally well on the validation set, which means that we're unlikely to be overfitting.
12. The evaluate() operation
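A sketch of the evaluate() operation on a held-out test set; the test features and labels here are hypothetical random stand-ins:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))
model.add(tf.keras.layers.Dense(4, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train on hypothetical data, as in the earlier sketches.
features = np.random.random((100, 784)).astype('float32')
labels = tf.keras.utils.to_categorical(np.random.randint(0, 4, size=100), 4)
model.fit(features, labels, epochs=10, validation_split=0.2)

# Hypothetical held-out test set, split off before training.
test_features = np.random.random((20, 784)).astype('float32')
test_labels = tf.keras.utils.to_categorical(np.random.randint(0, 4, size=20), 4)

# evaluate() returns the loss plus any metrics passed to compile().
test_loss, test_accuracy = model.evaluate(test_features, test_labels)
```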
Finally, it is a good idea to split off a test set before you begin to train and validate. You can use the evaluate() operation to check performance on the test set at the end of the training process. Since you may tune model parameters in response to validation set performance, using a separate test set will provide you with further assurance that you have not overfitted.
13. Let's practice!
You now know how to streamline model training and validation in Keras, so let's practice doing it with a few exercises.