Multi-class model evaluation
Let's evaluate our cloud classifier with precision and recall to see how well it can classify the seven cloud types. In a multi-class classification task like this one, the way you average the scores over classes affects the result you report. Recall that there are four approaches:
- Not averaging, and analyzing the results per class;
- Micro-averaging, ignoring the classes and computing the metrics globally;
- Macro-averaging, computing metrics per class and averaging them;
- Weighted-averaging, just like macro but with the average weighted by class size.
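To make the difference between the four approaches concrete, here is a minimal pure-Python sketch that computes precision by hand. The labels and predictions are a hypothetical three-class toy example, not the cloud dataset, and the variable names are chosen only for illustration.

```python
# Hypothetical toy data for a 3-class problem (not the cloud dataset).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
preds = [0, 0, 0, 0, 1, 1, 1, 1, 2, 0]
num_classes = 3

# Count true positives and false positives per predicted class,
# and the number of true samples per class (the class size).
tp = [0] * num_classes
fp = [0] * num_classes
support = [0] * num_classes
for p, y in zip(preds, labels):
    support[y] += 1
    if p == y:
        tp[p] += 1
    else:
        fp[p] += 1

# 1. No averaging: one precision value per class.
per_class = [tp[c] / (tp[c] + fp[c]) for c in range(num_classes)]

# 2. Micro: pool the counts over all classes, then compute one score.
micro = sum(tp) / (sum(tp) + sum(fp))

# 3. Macro: unweighted mean of the per-class scores.
macro = sum(per_class) / num_classes

# 4. Weighted: mean of the per-class scores weighted by class size.
weighted = sum(s * p for s, p in zip(support, per_class)) / sum(support)

print(f"Per class: {per_class}")
print(f"Micro: {micro:.3f}, Macro: {macro:.3f}, Weighted: {weighted:.3f}")
```

On this toy data the three averaged scores all differ, because the classes are imbalanced and the per-class precision varies. Macro-averaging gives the small classes the same influence as the large one, while micro- and weighted-averaging are pulled toward the behavior on the largest class.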
Both Precision and Recall are already imported from torchmetrics. It's time to see how well our model is doing!
This exercise is part of the course Intermediate Deep Learning with PyTorch.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Define metrics
metric_precision = Precision(task=____, num_classes=____, average=____)
metric_recall = ____
net.eval()
with torch.no_grad():
    for images, labels in dataloader_test:
        outputs = net(images)
        _, preds = torch.max(outputs, 1)
        metric_precision(preds, labels)
        metric_recall(preds, labels)
precision = metric_precision.compute()
recall = metric_recall.compute()
print(f"Precision: {precision}")
print(f"Recall: {recall}")
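The loop above updates each metric once per batch and calls compute() only after all batches have been seen. The sketch below illustrates that accumulate-then-compute pattern in plain Python. RunningMacroRecall is a hypothetical stand-in written for this example; it is not the torchmetrics implementation, and the batches are made-up toy data.

```python
class RunningMacroRecall:
    """Hypothetical stateful metric; not the torchmetrics implementation."""

    def __init__(self, num_classes):
        # Running counts kept across batches.
        self.tp = [0] * num_classes
        self.support = [0] * num_classes

    def update(self, preds, labels):
        # Accumulate counts from one batch without computing a score yet.
        for p, y in zip(preds, labels):
            self.support[y] += 1
            if p == y:
                self.tp[y] += 1

    def compute(self):
        # Per-class recall from the accumulated counts, then the macro mean.
        recalls = [tp / n for tp, n in zip(self.tp, self.support) if n > 0]
        return sum(recalls) / len(recalls)


# Two made-up batches of (preds, labels) for a 3-class problem.
batches = [
    ([0, 1, 1], [0, 1, 2]),
    ([2, 2, 0], [2, 2, 1]),
]

metric = RunningMacroRecall(num_classes=3)
for batch_preds, batch_labels in batches:
    metric.update(batch_preds, batch_labels)

macro_recall = metric.compute()
print(f"Macro recall over all batches: {macro_recall:.3f}")
```

Keeping running counts and computing the score once at the end means the result reflects the whole test set, rather than an average of per-batch scores that could be distorted by small or unbalanced batches.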