Pipeline caption generation

In this exercise, you'll again use the Flickr dataset, which has 30,000 images and associated captions. This time you'll generate a caption for the following image using a pipeline instead of the auto classes.

[Image: photo of a man standing on a ladder cleaning a window]

The dataset (dataset) has been loaded with the following structure:

Dataset({
    features: ['image', 'caption', 'sentids', 'split', 'img_id', 'filename'],
    num_rows: 10
})

The pipeline function (pipeline) has been loaded.

This exercise is part of the course

Multi-Modal Models with Hugging Face

Exercise instructions

  • Load the image-to-text pipeline with the Salesforce/blip-image-captioning-base pretrained model.
  • Use the pipeline to generate a caption for the image at index 3.

Interactive exercise

Complete the sample code to successfully finish this exercise.

# Load the image-to-text pipeline
pipe = pipeline(task="____", model="____")

# Use the pipeline to generate a caption for the image at index 3
pred = ____(dataset[3]["____"])

print(pred)
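
For reference, one possible way to complete the template is sketched below. It assumes the installed transformers version supports the image-to-text task and that the pipeline returns a list of dictionaries with a generated_text key. The helper names caption_image and extract_caption are hypothetical and not part of the exercise. The task name, model name, and the image feature come from the instructions and dataset structure above.

```python
def caption_image(image, model_name="Salesforce/blip-image-captioning-base"):
    """Return the raw pipeline output for one image (hypothetical helper)."""
    # Imported lazily so extract_caption can be used without transformers installed.
    from transformers import pipeline

    # Task and model names are taken from the exercise instructions.
    pipe = pipeline(task="image-to-text", model=model_name)
    return pipe(image)


def extract_caption(pred):
    """Pull the caption string out of the pipeline output (hypothetical helper).

    Assumes the output looks like [{"generated_text": "..."}].
    """
    return pred[0]["generated_text"].strip()


# Usage inside the exercise environment, where `dataset` is already loaded:
# pred = caption_image(dataset[3]["image"])
# print(extract_caption(pred))
```

Wrapping the call in a function keeps the model download out of import time. The exercise itself only needs the two lines that build the pipeline and call it on dataset[3]["image"].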