Prompting Vision Language Models (VLMs)
Over the next two exercises, you'll use a multi-modal model to analyze the sentiment of a news article and its corresponding headline image from the BBC News dataset on Hugging Face.

To start, you will prepare a chat template for the model that includes both the image and the news article. The dataset (dataset) and headline image (image) have been loaded.
This exercise is part of the course
Multi-Modal Models with Hugging Face
Exercise instructions
- Load the news article content (content) from the datapoint at index 6 in the dataset.
- Complete the text query to insert content into text_query using f-strings.
- Add the image and text_query to the chat template, specifying the content type of text_query as "text".
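
The three steps above can be sketched with stand-in data. This is a minimal illustration rather than the course's reference solution. It assumes that each datapoint stores the article under a "content" key and that the text entry carries its payload under a "text" key, mirroring the image entry. The placeholder dataset and image are hypothetical and only stand in for the objects the exercise preloads.

```python
# Stand-in data: hypothetical placeholders, not the real BBC News dataset.
dataset = [{"content": f"Placeholder article {i}"} for i in range(7)]
image = "headline-image-placeholder"  # the exercise provides a real image object

# Assumption: the article text is stored under a "content" key.
content = dataset[6]["content"]

# The f-string substitutes the article text into the prompt.
text_query = f"Does the news article have a positive, negative, or neutral impact on championship winning chances: {content}. Provide reasoning."

# One user turn whose content list holds the image entry and the text entry.
chat_template = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            # Assumption: text entries use a "text" key for their payload.
            {"type": "text", "text": text_query},
        ],
    }
]

print(chat_template[0]["content"][1]["text"])
```

Keeping the image and the text as separate entries in one content list lets the model's processor treat them as parts of a single user message.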
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Load the news article content from datapoint 6
content = ____
# Complete the text query
text_query = f"Does the news article have a positive, negative, or neutral impact on championship winning chances: {____}. Provide reasoning."
# Add the text query dictionary to the chat template
chat_template = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": ____,
            },
            ____
        ],
    }
]