
Learning curves

1. Learning curves

Learning curves provide a lot of information about your model. Now that you know how to use the history callback to plot them, you will learn how to read them to get the most value out of them.
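As a quick refresher, here is a minimal sketch of producing such a plot from the History object that fit returns; the toy dataset and tiny network are hypothetical stand-ins for your own model and data.

    import numpy as np
    import matplotlib.pyplot as plt
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    # Toy binary-classification data (a stand-in for a real dataset).
    X = np.random.rand(500, 10)
    y = (X[:, 0] > 0.5).astype(int)

    model = Sequential([Dense(16, activation='relu', input_shape=(10,)),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])

    # fit() returns a History object; its .history dict maps metric
    # names ('loss', 'accuracy', ...) to per-epoch values.
    history = model.fit(X, y, epochs=50, verbose=0)

    plt.plot(history.history['loss'], label='training loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

Swapping 'loss' for 'accuracy' in the plotting call gives the accuracy curve discussed below.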

2. Learning curves

So far we've seen two types of learning curves: loss curves and accuracy curves.

3. Loss curve

Loss tends to decrease as epochs go by. This is expected, since our model is essentially learning to minimize the loss function. Epochs are shown on the X-axis and loss on the Y-axis. After a certain number of epochs, the value converges, meaning it no longer gets much lower: we've arrived at a minimum.

4. Accuracy curve

Accuracy curves are similar but opposite in tendency: since the Y-axis now shows accuracy, the curve tends to increase as epochs go by. This shows that our model makes fewer mistakes as it learns.

5. Overfitting

If we plot training versus validation data, we can identify overfitting: the training and validation curves start to diverge. Overfitting happens when our model starts learning particularities of the training data that don't generalize well to unseen data. The early stopping callback is useful to stop our model before it starts overfitting.
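As a hedged sketch of both ideas, the snippet below trains on toy data with a validation split, stops early once validation loss stalls, and plots the two loss curves together; the data, architecture, and patience value are illustrative assumptions.

    import numpy as np
    import matplotlib.pyplot as plt
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.callbacks import EarlyStopping

    # Toy data standing in for a real train/validation split.
    X = np.random.rand(600, 10)
    y = (X[:, 0] > 0.5).astype(int)

    model = Sequential([Dense(32, activation='relu', input_shape=(10,)),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])

    # Stop once validation loss has not improved for 5 epochs.
    early_stop = EarlyStopping(monitor='val_loss', patience=5)

    history = model.fit(X, y, epochs=200, validation_split=0.2,
                        callbacks=[early_stop], verbose=0)

    # Diverging curves are the signature of overfitting.
    plt.plot(history.history['loss'], label='training loss')
    plt.plot(history.history['val_loss'], label='validation loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()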

6. Unstable curves

But not all curves are smooth and pretty; many times we will find unstable curves. Many factors can lead to unstable learning curves: the chosen optimizer, learning rate, batch size, network architecture, weight initialization, etc. All these parameters can be tuned to improve our model's learning curves as we aim for better accuracy and generalization power. We will cover this in the following videos.
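As one hedged illustration, reusing the model, X, and y from the sketch above: lowering the learning rate and raising the batch size are common first attempts at smoothing a jagged curve. The values here are hypothetical starting points, not tuned recommendations.

    from tensorflow.keras.optimizers import Adam

    # Recompile with a smaller learning rate (illustrative value).
    model.compile(optimizer=Adam(learning_rate=1e-4),
                  loss='binary_crossentropy', metrics=['accuracy'])

    # A larger batch size averages gradients over more samples per
    # update, which often reduces epoch-to-epoch noise in the curves.
    history = model.fit(X, y, epochs=100, batch_size=128, verbose=0)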

7. Can we benefit from more data?

Neural networks are well known for surpassing traditional machine learning techniques as we increase the size of our datasets. We can check whether collecting more data would improve a model's accuracy and generalization.

8. Can we benefit from more data?

We aim to produce a graph like this one, where we fit our model with increasing amounts of training data and plot the training and test accuracies of each run.

9. Can we benefit from more data?

If, after using all our data, we see that the test accuracy curve still has a tendency to improve, that is, it's not parallel to the training curve and it's still increasing, then it's worth gathering more data, if possible, to allow the model to keep learning.

10. Coding train size comparison

How would we go about coding a graph like the previous one? Imagine we want to evaluate an already built and compiled model, and that we have partitioned our data into X_train, y_train, X_test, and y_test. We first store the model's initial weights by calling get_weights on the model; we then initialize two lists to store train and test accuracies, as sketched below.
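Since the slide code itself is not reproduced in this transcript, here is a hedged reconstruction of the setup step; the toy data and architecture are placeholders, while X_train, y_train, X_test, y_test, and the get_weights call follow the narration.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    # Toy stand-in data; substitute your own arrays.
    X = np.random.rand(1000, 10)
    y = (X[:, 0] > 0.5).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    model = Sequential([Dense(16, activation='relu', input_shape=(10,)),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])

    # Snapshot the untrained weights so every run starts from the same point.
    initial_weights = model.get_weights()

    # Accumulators for each run's train and test accuracy.
    train_accs, test_accs = [], []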

11. Coding train size comparison II

We loop over a predefined list of train sizes, and for each training size we get the corresponding fraction of the training data. Before any training, we make sure our model starts with the same set of weights by resetting them to the initial_weights with the set_weights function. After that, we can fit our model on the training fraction. We use an EarlyStopping callback that monitors loss; note that this is not validation loss, since we haven't provided the fit method with validation data. After training is done, we get the accuracy on the training fraction and the accuracy on the test set and append them to our lists of accuracies. Observe that the same set of test observations is used for evaluation at every iteration.
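A hedged sketch of the loop, continuing from the setup above; the list of train sizes and the patience value are illustrative assumptions.

    from tensorflow.keras.callbacks import EarlyStopping

    train_sizes = [125, 250, 500, 750]  # illustrative sizes

    for size in train_sizes:
        # Take this run's fraction of the training data.
        X_frac, y_frac = X_train[:size], y_train[:size]

        # Reset to the stored initial weights before each run.
        model.set_weights(initial_weights)

        # With no validation data passed to fit, EarlyStopping can only
        # monitor the training loss.
        model.fit(X_frac, y_frac, epochs=100,
                  callbacks=[EarlyStopping(monitor='loss', patience=2)],
                  verbose=0)

        # Evaluate on the training fraction and on the full, fixed test set.
        train_accs.append(model.evaluate(X_frac, y_frac, verbose=0)[1])
        test_accs.append(model.evaluate(X_test, y_test, verbose=0)[1])

Plotting train_accs and test_accs against train_sizes then yields the kind of graph shown on the earlier slides.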

12. Time to dominate all curves!

It's time for you to show that you can dominate learning curves!