
Setting the model in evaluation mode

You're ready to put your language model in evaluation mode. If the model is not in evaluation mode during inference, layers like batch normalization and dropout keep their training-time behavior: dropout randomly zeroes activations and batch normalization uses per-batch statistics, so the model's predictions, and therefore its evaluation results, become inconsistent. Build the loop to evaluate the model!
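To see the difference concretely, here is a minimal, self-contained PyTorch sketch (toy tensors, nothing from the exercise environment) showing how a dropout layer changes behavior between the two modes:

import torch

torch.manual_seed(0)
dropout = torch.nn.Dropout(p=0.5)
x = torch.ones(6)

dropout.train()    # training mode: randomly zeroes elements
print(dropout(x))  # e.g. tensor([2., 0., 2., 2., 0., 2.]) -- which elements are zeroed varies; survivors are scaled by 1/(1-p)

dropout.eval()     # evaluation mode: dropout becomes a no-op
print(dropout(x))  # tensor([1., 1., 1., 1., 1., 1.]) -- identical on every call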

Some data have been preloaded: model, eval_dataloader, accelerator, and metric.

This exercise is part of the course Distributed AI Model Training in Python.

Exercise instructions

  • Set the model in evaluation mode before looping through batches in the dataset.
  • Aggregate predictions and labels across devices to compute evaluation metrics with Accelerator's .gather_for_metrics() method.
  • Compute the evaluation metric at the end (the metric API is sketched just below).
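
For reference, here is a minimal, self-contained sketch of the metric API from Hugging Face's evaluate library, using toy predictions in place of model outputs; the add_batch()/compute() pattern is the same one the exercise code uses:

import evaluate

# Load the GLUE MRPC metric, which reports accuracy and F1
metric = evaluate.load("glue", "mrpc")

# add_batch() accumulates predictions and references one batch at a time...
metric.add_batch(predictions=[1, 0, 1], references=[1, 0, 0])
metric.add_batch(predictions=[0, 1], references=[0, 1])

# ...and compute() scores everything accumulated so far
print(metric.compute())  # {'accuracy': 0.8, 'f1': 0.8}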

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

metric = evaluate.load("glue", "mrpc")

# Set the model in evaluation mode
____.____()
for step, batch in enumerate(eval_dataloader):
    with torch.no_grad():
        outputs = model(**batch)
    predictions = outputs.logits.argmax(dim=-1)
    # Aggregate values across devices
    predictions, references = ____.____((predictions, batch["labels"]))
    metric.add_batch(predictions=predictions, references=references)
# Compute the evaluation metric
eval_metric = metric.____()
print(eval_metric)
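
If you get stuck, here is one way the completed loop can look; it fills the blanks with the standard PyTorch, Accelerate, and evaluate calls (model.eval(), accelerator.gather_for_metrics(), and metric.compute()) and assumes the preloaded model, eval_dataloader, and accelerator from the exercise environment:

import evaluate
import torch

metric = evaluate.load("glue", "mrpc")

# Set the model in evaluation mode so dropout and batch normalization
# switch to their deterministic inference behavior
model.eval()
for step, batch in enumerate(eval_dataloader):
    with torch.no_grad():  # no gradients needed during evaluation
        outputs = model(**batch)
    predictions = outputs.logits.argmax(dim=-1)
    # Gather predictions and labels from all devices; this also drops the
    # duplicate samples Accelerate adds to pad the final batch, so the
    # metric sees each example exactly once
    predictions, references = accelerator.gather_for_metrics(
        (predictions, batch["labels"])
    )
    metric.add_batch(predictions=predictions, references=references)
# Compute the evaluation metric over everything gathered
eval_metric = metric.compute()
print(eval_metric)

When this runs under a distributed launcher, each process executes the same loop on its own shard of eval_dataloader; gather_for_metrics() is what makes the final metric reflect the whole evaluation set.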