Automated caption quality assessment

You have accurately classified the image of the dress, but how good was the original description?

Maa Fab wrap with a Trendy design dress with Vibrant color for an elegant touch of Fabric completely Soft and Comfortable wear with amazing design of Solid Boat ? Neck Flared Dress to make a perfect addition to your wardrobe collection.

You will now use the CLIP model to quantify how well this description matches the image by computing the CLIP score. The caption (description), the image (image), the ToTensor class from torchvision, and the clip_score() function from torchmetrics have already been loaded.
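To make the metric concrete, here is a minimal sketch of the quantity that a CLIP score reports: 100 times the cosine similarity between the image embedding and the text embedding, clipped at 0, so that scores fall between 0 and 100 and higher means a better match. The helper names and the tiny 3-dimensional vectors below are hypothetical stand-ins; a real CLIP model produces much larger embeddings, and in the exercise clip_score() does all of this for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_score_from_embeddings(image_emb, text_emb):
    """Scale the cosine similarity to 0-100 and clip negative values to 0."""
    return max(100.0 * cosine_similarity(image_emb, text_emb), 0.0)


# Hypothetical embeddings, chosen only to illustrate the arithmetic
image_emb = [1.0, 0.0, 0.0]
text_emb = [1.0, 1.0, 0.0]

score = clip_score_from_embeddings(image_emb, text_emb)
print(f"CLIP score: {score:.2f}")
```

Embeddings that point in opposite directions have a negative cosine similarity, which the clipping step maps to a score of 0.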

This exercise is part of the course

Multi-Modal Models with Hugging Face

Exercise instructions

  • Convert the image to a PyTorch tensor with intensities ranging from 0-255.
  • Use the clip_score() function to assess the quality of the caption by comparing image_tensor and description with the openai/clip-vit-base-patch32 model.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Convert the image to a PyTorch tensor
image_tensor = ____()(image)*____

# Use the clip_score function to assess the quality of the caption
score = ____(____, ____, "____")

print(f"CLIP score: {score}")
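One possible completion of the template above is sketched here, wrapped in a function so it can be reused. The function name score_caption is a hypothetical helper, not part of the exercise, and the import paths assume recent torchvision and torchmetrics releases with the transformers package installed. The first call downloads the model weights, so it needs network access.

```python
def score_caption(image, description, model_name="openai/clip-vit-base-patch32"):
    """Return the CLIP score for a PIL image and its caption (hypothetical helper)."""
    # Imported inside the function so that defining the helper stays lightweight
    from torchvision.transforms import ToTensor
    from torchmetrics.functional.multimodal import clip_score

    # ToTensor() yields floats in [0, 1]; multiplying by 255 restores the 0-255 range
    image_tensor = ToTensor()(image) * 255

    # Compare the image tensor with the caption using the chosen CLIP checkpoint
    return clip_score(image_tensor, description, model_name)
```

With the preloaded variables from the exercise, the call would be score_caption(image, description), and the result can be printed exactly as in the template.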