
Evaluating with METEOR

METEOR excels at evaluating more semantic features of text: beyond exact word overlap, it also credits matches on word stems and synonyms. It works similarly to ROUGE by comparing a model-generated output to a reference output. You've been provided these texts as generated and reference; it's up to you to compute the score.

The evaluate library has been loaded for you.

This exercise is part of the course Introduction to LLMs in Python.

Exercise instructions

  • Compute and print the METEOR score.

Hands-on interactive exercise

Have a go at this exercise by completing the sample code below.

meteor = evaluate.load("meteor")

generated = ["The burrow stretched forward like a narrow corridor for a while, then plunged abruptly downward, so quickly that Alice had no chance to stop herself before she was tumbling into an extremely deep shaft."]
reference = ["The rabbit-hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well."]

# Compute and print the METEOR score
results = ____
print("Meteor: ", ____)