
Using loss functions to assess model predictions

1. Using loss functions to assess model predictions

We've generated predictions by running a forward pass; the next step is to see how good those predictions are compared to the actual values.

2. Why do we need a loss function?

The loss function, another component of neural networks, tells us how good our model is at making predictions during training. It takes a model prediction, y-hat, and true label, or ground truth, y, as inputs, and outputs a float.

3. Why do we need a loss function?

Let's use our multi-class classification model that predicts whether an animal is a mammal (0), bird (1), or reptile (2). Our dataset contains animal characteristics; this example is a bear. Therefore, the class is zero (mammal). If our model predicts that the class equals zero, it is correct, and the loss value will be low. An incorrect prediction would make the loss value high. Our goal is to minimize loss.

4. One-hot encoding concepts

Loss is calculated using a loss function, F, which takes the ground truth y and the prediction y-hat as inputs and outputs a numerical loss value. In our animal example, the possible values of the true class y are the integers 0, 1, or 2. y-hat is the raw model prediction before the softmax function is applied: a tensor with one element per class, N elements in total. If N is three, y-hat is a tensor of shape 1 by 3.
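As a rough sketch of these shapes in PyTorch (the score values below are made up for illustration):

import torch

y = torch.tensor(0)                        # ground truth: integer class label (mammal)
scores = torch.tensor([[2.5, -1.1, 0.3]])  # raw model prediction before softmax
print(scores.shape)                        # torch.Size([1, 3]): one sample, N = 3 classes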

5. One-hot encoding concepts

We use one-hot encoding to convert the integer y into a tensor of zeros and ones, so it can be compared with the model's output when evaluating performance. For example, if y = 0 with three classes, the encoded form is [1, 0, 0].
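Written out by hand for our three animal classes, the encodings would look like this:

import torch

one_hot_mammal  = torch.tensor([1, 0, 0])  # y = 0
one_hot_bird    = torch.tensor([0, 1, 0])  # y = 1
one_hot_reptile = torch.tensor([0, 0, 1])  # y = 2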

6. Transforming labels with one-hot encoding

We can import torch.nn.functional as F to avoid manual one-hot encoding. In the first example, the ground truth is zero (the first class). We have three classes, so the function outputs a three-element tensor with a one at the first position and zeros otherwise. If y equals one (the second class), the output tensor has a one in the second position, and zeros otherwise. Lastly, if y equals two (the third class), the output tensor has a one at the last position, and zeros otherwise.
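In code, the three cases look like this:

import torch
import torch.nn.functional as F

F.one_hot(torch.tensor(0), num_classes=3)  # tensor([1, 0, 0])
F.one_hot(torch.tensor(1), num_classes=3)  # tensor([0, 1, 0])
F.one_hot(torch.tensor(2), num_classes=3)  # tensor([0, 0, 1])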

7. Cross entropy loss in PyTorch

With the encoding complete, we can pass it, along with our prediction y-hat, to a loss function. Here, y-hat is stored in the tensor called "scores". The most commonly used loss function for classification is cross-entropy loss. We start by defining our loss function as "criterion". We then call it with the scores tensor and the one_hot_target tensor, each converted with the .double() method; this ensures the tensors are in the correct format for the loss function. The output is the computed loss value.
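A minimal sketch of this step (the score values are made up; in practice, scores comes from the model's forward pass):

import torch
from torch.nn import CrossEntropyLoss

scores = torch.tensor([[-5.2, 4.6, 0.8]])   # raw model prediction (made-up values)
one_hot_target = torch.tensor([[1, 0, 0]])  # one-hot encoded ground truth, y = 0

criterion = CrossEntropyLoss()
loss = criterion(scores.double(), one_hot_target.double())
print(loss)  # a single float; large here, since the highest score is for the wrong class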

8. Bringing it all together

In summary, the loss function takes the scores tensor as input, which is the model prediction before the final softmax function, and the one-hot encoded ground truth label. It outputs a single float, the loss of that sample. Recall that our goal is to minimize this loss. In the next video, we'll see how to do that with backpropagation.
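Putting the pieces together, a minimal end-to-end sketch (the score values are again made up for illustration):

import torch
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss

y = 0                                      # ground truth class (mammal)
scores = torch.tensor([[-5.2, 4.6, 0.8]])  # model prediction before softmax

one_hot_target = F.one_hot(torch.tensor([y]), num_classes=3)  # shape (1, 3)
criterion = CrossEntropyLoss()
loss = criterion(scores.double(), one_hot_target.double())    # the float we want to minimize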

9. Let's practice!

For now, it's time for practice!
