Define evaluation metrics
You're developing a real-time language translation service in a video conferencing application. To monitor training, you'll define evaluation metrics: accuracy, which measures overall correctness, and F1 score, which balances precision and recall.
The evaluate and numpy (np) libraries have been pre-imported.
This exercise is part of the course Efficient AI Model Training with PyTorch.
Exercise instructions
- Load the f1 score using the evaluate library; accuracy has been loaded for you.
- Extract logits and labels from the input eval_predictions.
- Convert logits to predictions.
- Compute the f1 score based on the predictions and labels.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def compute_metrics(eval_predictions):
    load_accuracy = evaluate.load("accuracy")
    # Load the F1 score
    load_f1 = ____("____")
    # Extract logits and labels from eval_predictions
    ____, ____ = eval_predictions
    # Convert logits to predictions
    predictions = np.____(____, axis=-1)
    accuracy = load_accuracy.compute(predictions=predictions, references=labels)["accuracy"]
    # Compute the F1 score
    f1 = ____.____(predictions=predictions, references=labels)["f1"]
    return {"accuracy": accuracy, "f1": f1}
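To check your understanding, here is a self-contained sketch of what the completed function computes. It replaces the Hugging Face evaluate calls with direct NumPy arithmetic (assuming binary labels), so it runs without the evaluate library; the argmax step over the class axis is the same conversion the exercise asks for.

```python
import numpy as np

def compute_metrics(eval_predictions):
    # Unpack model outputs: logits of shape (n_samples, n_classes) and gold labels
    logits, labels = eval_predictions
    # Convert logits to class predictions by taking the argmax over the class axis
    predictions = np.argmax(logits, axis=-1)
    # Accuracy: fraction of predictions that match the labels
    accuracy = np.mean(predictions == labels)
    # Binary F1 computed from true positives, false positives, and false negatives
    tp = np.sum((predictions == 1) & (labels == 1))
    fp = np.sum((predictions == 1) & (labels == 0))
    fn = np.sum((predictions == 0) & (labels == 1))
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"accuracy": accuracy, "f1": f1}

# Hypothetical logits and labels for illustration
logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
labels = np.array([1, 0, 0, 1])
metrics = compute_metrics((logits, labels))
```

In the exercise itself, the accuracy and F1 values come from the compute() method of the loaded evaluate metrics rather than from manual NumPy arithmetic, but the argmax conversion and the returned dictionary are identical.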