Multi-class model evaluation
Let's evaluate our cloud classifier with precision and recall to see how well it can classify the seven cloud types. In a multi-class classification task, how you average the scores over classes matters. Recall that there are four approaches:
- Not averaging, and analyzing the results per class;
- Micro-averaging, ignoring the classes and computing the metrics globally;
- Macro-averaging, computing metrics per class and averaging them;
- Weighted-averaging, just like macro but with the average weighted by class size.
Both Precision and Recall are already imported from torchmetrics, and the trained model is available as net. It's time to see how well our model is doing!
This exercise is part of the course
Intermediate Deep Learning with PyTorch
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Define metrics
metric_precision = Precision(task=____, num_classes=____, average=____)
metric_recall = ____

# Evaluate without tracking gradients
net.eval()
with torch.no_grad():
    for images, labels in dataloader_test:
        outputs = net(images)
        _, preds = torch.max(outputs, 1)
        # Update the metrics batch by batch
        metric_precision(preds, labels)
        metric_recall(preds, labels)

# Aggregate the metrics over all test batches
precision = metric_precision.compute()
recall = metric_recall.compute()
print(f"Precision: {precision}")
print(f"Recall: {recall}")