Pipeline caption generation
In this exercise, you'll again use the Flickr dataset, which has 30,000 images and associated captions. This time, you'll generate a caption for one of its images using a pipeline instead of the auto classes.
The dataset (dataset) has been loaded with the following structure:
Dataset({
features: ['image', 'caption', 'sentids', 'split', 'img_id', 'filename'],
num_rows: 10
})
The pipeline function (pipeline) has been loaded.
This exercise is part of the course
Multi-Modal Models with Hugging Face
Exercise instructions
- Load the image-to-text pipeline with the Salesforce/blip-image-captioning-base pretrained model.
- Use the pipeline to generate a caption for the image at index 3.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Load the image-to-text pipeline
pipe = pipeline(task="____", model="____")
# Use the pipeline to generate a caption with the image of datapoint 3
pred = ____(dataset[3]["____"])
print(pred)
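
For reference, here is one way the completed steps might look. It is a minimal sketch, not the course's official solution. The helper names build_captioner and caption_image are hypothetical and introduced only for illustration. It assumes a transformers version that provides the image-to-text task and relies on that pipeline returning a list of dicts with a "generated_text" key.

```python
from typing import Any, Callable


def build_captioner(model_name: str = "Salesforce/blip-image-captioning-base"):
    """Create an image-to-text pipeline (hypothetical helper).

    The model weights are downloaded the first time this runs.
    """
    # Imported lazily so caption_image below works without transformers installed.
    from transformers import pipeline

    return pipeline(task="image-to-text", model=model_name)


def caption_image(pipe: Callable[[Any], list], image: Any) -> str:
    """Run the pipeline on a single image and return only the caption text."""
    # The pipeline returns a list of dicts, one per generated caption,
    # each holding the text under the "generated_text" key.
    pred = pipe(image)
    return pred[0]["generated_text"].strip()


# Usage, assuming `dataset` is loaded as described in this exercise:
# pipe = build_captioner()
# print(caption_image(pipe, dataset[3]["image"]))
```

Passing the pipeline in as an argument keeps the model load separate from the call, so the same pipeline can be reused for other rows of the dataset.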