1. Learn
  2. /
  3. Courses
  4. /
  5. Multi-Modal Models with Hugging Face

Connected

Exercise

Prompting Vision Language Models (VLMs)

Over the next two exercises, you'll use a multi-modal model to analyze the sentiment of a news article and its corresponding headline image from the BBC News dataset on Hugging Face:

BBC News dataset card

To start, you will prepare a chat template for the model that includes both the image and the news article. The dataset (dataset) and headline image (image) have been loaded.

Instructions

100 XP
  • Load the news article content (content) from the datapoint at index 6 in the dataset.
  • Complete the text query to insert content into text_query using f-strings.
  • Add the image and text_query to the chat template, specifying the content type of text_query as "text".