Define evaluation metrics
You're developing a real-time language translation service in a video conferencing application. To monitor training, you'll define evaluation metrics: accuracy, which measures overall correctness, and F1 score, which balances precision and recall.
The evaluate and numpy (np) libraries have been pre-imported.
This exercise is part of the course Efficient AI Model Training with PyTorch.
Exercise instructions
- Load the f1 score using the evaluate library; accuracy has been loaded for you.
- Extract logits and labels from the input eval_predictions.
- Convert logits to predictions.
- Compute the f1 score based on the predictions and labels.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def compute_metrics(eval_predictions):
    load_accuracy = evaluate.load("accuracy")
    # Load the F1 score
    load_f1 = ____("____")
    # Extract logits and labels from eval_predictions
    ____, ____ = eval_predictions
    # Convert logits to predictions
    predictions = np.____(____, axis=-1)
    accuracy = load_accuracy.compute(predictions=predictions, references=labels)["accuracy"]
    # Compute the F1 score
    f1 = ____.____(predictions=predictions, references=labels)["f1"]
    return {"accuracy": accuracy, "f1": f1}
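To check your understanding, here is a self-contained sketch of what the completed function computes. It replaces the Hugging Face evaluate calls with direct NumPy arithmetic (assuming binary labels), so it runs without the evaluate library; the argmax step over the class axis is the same conversion the exercise asks for.

```python
import numpy as np

def compute_metrics(eval_predictions):
    # Unpack model outputs: logits of shape (n_samples, n_classes) and gold labels
    logits, labels = eval_predictions
    # Convert logits to class predictions by taking the argmax over the class axis
    predictions = np.argmax(logits, axis=-1)
    # Accuracy: fraction of predictions that match the labels
    accuracy = np.mean(predictions == labels)
    # Binary F1 computed from true positives, false positives, and false negatives
    tp = np.sum((predictions == 1) & (labels == 1))
    fp = np.sum((predictions == 1) & (labels == 0))
    fn = np.sum((predictions == 0) & (labels == 1))
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"accuracy": accuracy, "f1": f1}

# Hypothetical logits and labels for illustration
logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
labels = np.array([1, 0, 0, 1])
metrics = compute_metrics((logits, labels))
```

In the exercise itself, the accuracy and F1 values come from the compute() method of the loaded evaluate metrics rather than from manual NumPy arithmetic, but the argmax conversion and the returned dictionary are identical.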