
Prompting Vision Language Models (VLMs)

Over the next two exercises, you'll use a multi-modal model to analyze the sentiment of a news article and its corresponding headline image from the BBC News dataset on Hugging Face:

BBC News dataset card

To start, you will prepare a chat template for the model that includes both the image and the news article. The dataset (dataset) and headline image (image) have been loaded.

This exercise is part of the course

Multi-Modal Models with Hugging Face


Exercise instructions

  • Load the news article content (content) from the datapoint at index 6 in the dataset.
  • Complete the text query to insert content into text_query using f-strings.
  • Add the image and text_query to the chat template, specifying the content type of text_query as "text".
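
The steps above can be sketched as follows. This is only an illustration under stated assumptions, not the exercise's reference solution: the stand-in dataset and image placeholder are hypothetical so the snippet runs on its own, and it assumes the article text is stored under a "content" key in each datapoint.

```python
# Stand-in data so the sketch runs on its own (hypothetical values).
# In the exercise, `dataset` and `image` are already loaded for you.
dataset = [{"content": f"Placeholder article {i}"} for i in range(7)]
image = "headline_image_placeholder"  # the exercise provides a loaded image object instead

# Step 1: read the article text from the datapoint at index 6.
# Assumption: the text lives under the "content" key.
content = dataset[6]["content"]

# Step 2: insert the article text into the query with an f-string.
text_query = (
    "Does the news article have a positive, negative, or neutral impact "
    f"on championship winning chances: {content}. Provide reasoning."
)

# Step 3: a single user turn whose content list holds one image entry
# and one text entry, each tagged with its content type.
chat_template = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": text_query},
        ],
    }
]
```

Each entry in the content list carries its own "type", which is how the image and the text query are told apart when both appear in the same user turn.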

Interactive exercise

Complete the sample code to finish this exercise successfully.

# Load the news article content from datapoint 6
content = ____

# Complete the text query
text_query = f"Does the news article have a positive, negative, or neutral impact on championship winning chances: {____}. Provide reasoning."

# Add the text query dictionary to the chat template
chat_template = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": ____,
            },
            ____
        ],
    }
]