Evaluation of multi-output models and loss weighting

1. Evaluation of multi-output models and loss weighting

Welcome back! In this final video of the course, we will discuss loss weighting and evaluation of multi-output models. Let's dive in!

2. Model evaluation

Let's start with the evaluation of a multi-output model. It's very similar to what we have done before. However, with two different outputs, we need to set up two accuracy metrics: one for alphabet classification and one for character classification. We iterate over the test DataLoader and get the model's predictions as usual. Finally, we update the accuracy metrics, and after the loop, we can calculate their final values. The accuracy is higher for alphabets than for characters, which is not surprising: predicting the alphabet is an easier task with just 30 classes to choose from; for characters, there are 964 possible labels. The difference in accuracy scores is not very large, however: 31 versus 24 percent. This is because learning to recognize the alphabets helped the model recognize individual characters: there is a combined positive effect from solving these two tasks at once.
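
Here is a minimal sketch of that evaluation loop. The names are assumptions carried over from earlier videos: net is the two-output model, and dataloader_test yields a batch of images plus alphabet and character labels.

    import torch
    from torchmetrics import Accuracy

    # One metric per output: 30 alphabets, 964 characters.
    acc_alpha = Accuracy(task="multiclass", num_classes=30)
    acc_char = Accuracy(task="multiclass", num_classes=964)

    net.eval()
    with torch.no_grad():
        for images, labels_alpha, labels_char in dataloader_test:
            outputs_alpha, outputs_char = net(images)
            # Calling a metric on (logits, labels) updates its running state.
            acc_alpha(outputs_alpha, labels_alpha)
            acc_char(outputs_char, labels_char)

    # After the loop, compute the final values over the whole test set.
    print(f"Alphabet accuracy: {acc_alpha.compute()}")
    print(f"Character accuracy: {acc_char.compute()}")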

3. Multi-output training loop revisited

Let's now take a look at the training loop for our last model, which predicts characters and alphabets. Because the model solves two classification tasks at the same time, we have two losses: one for alphabets and another for characters. However, since the optimizer can only handle a single objective, we had to combine the two losses somehow. We chose to define the final loss as the sum of the two partial losses. By doing so, we are telling the model that recognizing characters and recognizing alphabets are equally important to us. If that is not the case, we can combine the two losses differently.
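
A sketch of the loop, assuming the same net and dataloader naming as above, plus an optimizer and a cross-entropy criterion defined earlier:

    for images, labels_alpha, labels_char in dataloader_train:
        optimizer.zero_grad()
        outputs_alpha, outputs_char = net(images)
        # One cross-entropy loss per classification head.
        loss_alpha = criterion(outputs_alpha, labels_alpha)
        loss_char = criterion(outputs_char, labels_char)
        # Summing the losses treats both tasks as equally important.
        loss = loss_alpha + loss_char
        loss.backward()
        optimizer.step()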

4. Varying task importance

Let's say that correct classification of characters is twice as important to us as the classification of alphabets. To pass this information to the model, we can multiply the character loss by two, forcing the model to optimize it harder. Another approach is to assign the two losses weights that sum up to one. Up to an overall scaling of the gradients, this is equivalent from the optimization perspective, but it is arguably easier to read for humans, especially with more than two loss components.
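
In code, either variant is a one-line change to how the total loss is built, with loss_alpha and loss_char computed as in the loop above:

    # Characters count twice as much as alphabets:
    loss = loss_alpha + 2 * loss_char

    # The same preference expressed with weights that sum to one:
    loss = 0.33 * loss_alpha + 0.67 * loss_char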

5. Warning: losses on different scales

There is just one caveat: when assigning loss weights, we must be aware of the magnitudes of the loss values. If the losses are not on the same scale, one can dominate the other, causing the model to effectively ignore the smaller loss. Consider a scenario where we're building a model to predict house prices using MSE loss. If we also want the same model to provide a quality assessment of the house, categorized as "Low", "Medium", or "High", we would use cross-entropy loss for that output. Cross-entropy is typically in the single-digit range, while the MSE of raw house prices can reach tens of thousands. Naively summing the two would result in the model ignoring the quality assessment task almost completely. A solution is to scale each loss by dividing it by its maximum value in the batch. This brings both into the same range, allowing us to weight them if desired and add them together.
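
One way to read that recipe in code, as a sketch under my interpretation: compute the losses per sample (reduction="none") so each has a batch maximum to divide by. The tensors here are dummy stand-ins for the house example.

    import torch
    import torch.nn as nn

    price_pred = torch.rand(16) * 500_000    # dummy price predictions
    price_true = torch.rand(16) * 500_000    # dummy price targets
    quality_logits = torch.randn(16, 3)      # "Low", "Medium", "High"
    quality_true = torch.randint(0, 3, (16,))

    # Per-sample losses, so each loss tensor has a maximum over the batch.
    loss_price = nn.MSELoss(reduction="none")(price_pred, price_true)
    loss_quality = nn.CrossEntropyLoss(reduction="none")(quality_logits, quality_true)

    # Divide each loss by its own batch maximum; detaching treats the
    # scaling factor as a constant during backpropagation.
    loss_price = loss_price / loss_price.max().detach()
    loss_quality = loss_quality / loss_quality.max().detach()

    # Both losses now lie in [0, 1] and can be weighted and summed.
    loss = (0.5 * loss_price + 0.5 * loss_quality).mean()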

6. Let's practice!

Let's practice!