
Automated caption quality assessment

You have accurately classified the image of the dress, but how good was the original description?

Maa Fab wrap with a Trendy design dress with Vibrant color for an elegant touch of Fabric completely Soft and Comfortable wear with amazing design of Solid Boat ? Neck Flared Dress to make a perfect addition to your wardrobe collection.

You will now use the CLIP model to quantify how well this description matches the image, using the CLIP score. The caption (description), the image (image), the ToTensor class from torchvision, and the clip_score() function from torchmetrics have been loaded.
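
For intuition about what the number means: torchmetrics documents the CLIP score as the cosine similarity between the image embedding and the caption embedding, scaled by 100 and clipped at zero, so a higher value indicates a closer match. The sketch below reproduces only that final arithmetic in plain Python. The short vectors are made-up stand-ins for real CLIP embeddings, and clip_score_from_embeddings is a hypothetical helper, not a torchmetrics function.

```python
import math


def clip_score_from_embeddings(image_emb, text_emb):
    """Return max(100 * cosine_similarity, 0) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm_image = math.sqrt(sum(a * a for a in image_emb))
    norm_text = math.sqrt(sum(b * b for b in text_emb))
    return max(100.0 * dot / (norm_image * norm_text), 0.0)


# Toy embeddings (assumed values, for illustration only)
image_emb = [3.0, 4.0]
aligned_text_emb = [6.0, 8.0]       # points the same way as the image embedding
orthogonal_text_emb = [-4.0, 3.0]   # perpendicular to the image embedding
opposite_text_emb = [-3.0, -4.0]    # points the opposite way

for name, text_emb in [
    ("aligned", aligned_text_emb),
    ("orthogonal", orthogonal_text_emb),
    ("opposite", opposite_text_emb),
]:
    print(name, clip_score_from_embeddings(image_emb, text_emb))
```

In this toy setup the aligned pair reaches the maximum of 100, while the perpendicular and opposite pairs are both reported as 0 because negative similarities are clipped.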

This exercise is part of the course

Multi-Modal Models with Hugging Face


Exercise instructions

  • Convert the image to a PyTorch tensor with intensities ranging from 0 to 255.
  • Use the clip_score() function to assess the quality of the caption by comparing image_tensor and description with the openai/clip-vit-base-patch32 model.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Convert the image to a PyTorch tensor
image_tensor = ____()(image)*____

# Use the clip_score function to assess the quality of the caption
score = ____(____, ____, "____")

print(f"CLIP score: {score}")