
Prompting Vision Language Models (VLMs)

Over the next two exercises, you'll use a multi-modal model to analyze the sentiment of a news article and its corresponding headline image from the BBC News dataset on Hugging Face:

BBC News dataset card

To start, you will prepare a chat template for the model that includes both the image and the news article. The dataset (dataset) and headline image (image) have been loaded.

This exercise is part of the course

Multi-Modal Models with Hugging Face


Exercise instructions

  • Load the news article content (content) from the datapoint at index 6 in the dataset.
  • Complete the text query so that an f-string inserts content into text_query.
  • Add the image and text_query to the chat template, specifying the content type of text_query as "text".
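The three steps above can be sketched on stand-in data. This is an illustration under assumptions, not the official solution: the article text is assumed to live under a "content" key (suggested by the variable name in the instructions, not confirmed by the dataset card), the text entry is assumed to carry its string under a "text" key, and the dataset and image below are hypothetical placeholders for the objects the exercise preloads.

```python
# Stand-ins for the preloaded objects (hypothetical; the exercise supplies the real ones).
dataset = [{"content": f"Article body {i}"} for i in range(10)]
image = "headline_image_placeholder"  # a real image object in the exercise

# Step 1: read the article text from the datapoint at index 6.
# The "content" field name is an assumption.
content = dataset[6]["content"]

# Step 2: the f-string substitutes the article text into the prompt.
text_query = (
    "Does the news article have a positive, negative, or neutral impact "
    f"on championship winning chances: {content}. Provide reasoning."
)

# Step 3: one user message whose content list holds the image entry
# followed by the text entry (the "text" key is an assumption).
chat_template = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": text_query},
        ],
    }
]
```

Keeping the image and the text in the same user message is what lets the model read them as a single query rather than as two separate turns.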

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Load the news article content from datapoint 6
content = ____

# Complete the text query
text_query = f"Does the news article have a positive, negative, or neutral impact on championship winning chances: {____}. Provide reasoning."

# Add the text query dictionary to the chat template
chat_template = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": ____,
            },
            ____
        ],
    }
]
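Once the blanks are filled in, a quick way to check the shape of the template is to list the entry types in each message. The helper below, summarize_template, is a hypothetical name introduced for illustration and is not part of any library. The example message uses placeholder values, and its "text" key is an assumption about how the text entry is laid out.

```python
def summarize_template(messages):
    """Return a (role, [entry types]) pair for each message."""
    summary = []
    for message in messages:
        types = [part["type"] for part in message["content"]]
        summary.append((message["role"], types))
    return summary


# Placeholder message with the same layout as the exercise template.
example_template = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "image_placeholder"},
            {"type": "text", "text": "query_placeholder"},
        ],
    }
]

print(summarize_template(example_template))
# prints [('user', ['image', 'text'])]
```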