
Pipeline caption generation

In this exercise, you'll again use the flickr dataset, which has 30,000 images and associated captions. This time, you'll generate a caption for the following image using a pipeline instead of the auto classes.

[Image: a man standing on a ladder cleaning a window]

The dataset (dataset) has been loaded with the following structure:

Dataset({
    features: ['image', 'caption', 'sentids', 'split', 'img_id', 'filename'],
    num_rows: 10
})

The pipeline function (pipeline) has been loaded.
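For orientation, here is a minimal sketch of how you might inspect datapoint 3 before captioning it. It assumes the preloaded dataset uses standard Hugging Face Datasets indexing (so the image is a PIL.Image object) and shows where the pipeline function would come from; both are assumptions about the preloaded environment rather than part of the exercise.

# Minimal sketch: assumes `dataset` is already loaded as shown above
from transformers import pipeline  # the preloaded `pipeline` function

# Indexing a Hugging Face Dataset returns a dict of features for that row
example = dataset[3]
print(example["caption"])   # the reference caption(s) stored for this image
print(example["filename"])  # the original image filename
example["image"].show()     # the image is a PIL.Image object; opens it locally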

This exercise is part of the course Multi-Modal Models with Hugging Face.

Exercise instructions

  • Load the image-to-text pipeline with the Salesforce/blip-image-captioning-base pretrained model.
  • Use the pipeline to generate a caption for the image at index 3.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Load the image-to-text pipeline
pipe = pipeline(task="____", model="____")

# Use the pipeline to generate a caption for the image at index 3
pred = ____(dataset[3]["____"])

print(pred)
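One way the blanks could be filled in, shown as a sketch rather than the only valid answer (variable names follow the scaffold above):

# Load the image-to-text pipeline with the BLIP base captioning model
pipe = pipeline(task="image-to-text", model="Salesforce/blip-image-captioning-base")

# Generate a caption for the image at index 3
pred = pipe(dataset[3]["image"])

print(pred)

The image-to-text pipeline returns a list of dictionaries, with the generated caption stored under the "generated_text" key.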