Evaluating with METEOR
METEOR excels at evaluating some of the more semantic features in text. It works similarly to ROUGE by comparing a model-generated output to a reference output. You've been provided these texts as generated and reference; it's over to you to evaluate the score.
The evaluate library has been loaded for you.
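To make the comparison concrete, here is a minimal sketch of a METEOR-style score in plain Python. It is a simplification and not the implementation behind evaluate.load("meteor"): it only matches identical lowercase tokens (no stemming or synonym matching), uses a greedy left-to-right alignment, and assumes the commonly used default weights alpha=0.9, beta=3 and gamma=0.5. The function name simple_meteor is hypothetical and chosen for illustration, so its numbers will generally differ from the library's.

```python
def simple_meteor(generated, reference, alpha=0.9, beta=3.0, gamma=0.5):
    """Simplified METEOR-style score using exact token matches only."""
    hyp = generated.lower().split()
    ref = reference.lower().split()

    # Greedy alignment: pair each generated token with the first
    # unused, identical token in the reference.
    used = set()
    alignment = []  # list of (generated_index, reference_index)
    for i, tok in enumerate(hyp):
        for j, ref_tok in enumerate(ref):
            if j not in used and tok == ref_tok:
                used.add(j)
                alignment.append((i, j))
                break

    matches = len(alignment)
    if matches == 0:
        return 0.0

    precision = matches / len(hyp)
    recall = matches / len(ref)
    # Harmonic mean weighted towards recall (alpha close to 1).
    fmean = precision * recall / (alpha * precision + (1 - alpha) * recall)

    # A chunk is a run of matches that is contiguous in both texts;
    # more chunks means the matched words appear in a more scrambled order.
    chunks = 1
    for (i0, j0), (i1, j1) in zip(alignment, alignment[1:]):
        if i1 != i0 + 1 or j1 != j0 + 1:
            chunks += 1

    penalty = gamma * (chunks / matches) ** beta
    return (1 - penalty) * fmean


if __name__ == "__main__":
    print(simple_meteor("Alice fell down a deep shaft",
                        "Alice was falling down a very deep well"))
```

Because this sketch only credits identical tokens, a paraphrase such as "burrow" for "rabbit-hole" earns nothing here. The full metric can also match stems and synonyms, which is where its sensitivity to meaning comes from.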
This exercise is part of the course
Introduction to LLMs in Python
Exercise instructions
- Compute and print the METEOR score.
Hands-on interactive exercise
Try this exercise by completing this sample code.
meteor = evaluate.load("meteor")
generated = ["The burrow stretched forward like a narrow corridor for a while, then plunged abruptly downward, so quickly that Alice had no chance to stop herself before she was tumbling into an extremely deep shaft."]
reference = ["The rabbit-hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well."]
# Compute and print the METEOR score
results = ____
print("Meteor: ", ____)