
Text tokenizing

In this exercise, you will use the Flickr dataset, which has 30,000 images and associated captions, to perform preprocessing operations on text. Text must be preprocessed before models can use it for tasks such as text classification. This is especially useful for multi-modal applications, where Hugging Face models can be used to check whether a caption suits its associated image.

The dataset (dataset) has been loaded and the AutoTokenizer has been imported.
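
To make the idea of text preprocessing concrete, here is a minimal sketch of what a tokenizer does: split a caption into tokens, map each token to an integer id, and pad to a fixed length with an attention mask. This is a toy word-level tokenizer written for illustration only; the captions, the `build_vocab` and `encode` helpers, and the `[PAD]`/`[UNK]` ids are assumptions, not part of the exercise, and real Hugging Face tokenizers rely on learned subword vocabularies instead.

```python
# Toy illustration only: NOT the Hugging Face AutoTokenizer implementation.
import re

TOKEN_PATTERN = r"\w+|[^\w\s]"  # words, or single punctuation marks


def build_vocab(captions):
    """Assign an integer id to each distinct lowercase token.

    Ids 0 and 1 are reserved for padding and unknown tokens (an assumption).
    """
    vocab = {"[PAD]": 0, "[UNK]": 1}
    for caption in captions:
        for token in re.findall(TOKEN_PATTERN, caption.lower()):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(caption, vocab, max_length=8):
    """Convert a caption to padded ids plus an attention mask."""
    tokens = re.findall(TOKEN_PATTERN, caption.lower())[:max_length]
    ids = [vocab.get(token, vocab["[UNK]"]) for token in tokens]
    padding = max_length - len(ids)
    return {
        "input_ids": ids + [vocab["[PAD]"]] * padding,
        "attention_mask": [1] * len(ids) + [0] * padding,
    }


# Hypothetical captions standing in for the dataset's caption field
captions = ["A dog runs on the beach.", "Two children play football."]
vocab = build_vocab(captions)
encoded = encode("A dog plays.", vocab)
print(encoded)
```

The word "plays" is not in the toy vocabulary, so it maps to the unknown-token id, and the attention mask marks which positions hold real tokens rather than padding.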

This exercise is part of the course Multi-Modal Models with Hugging Face.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Load the first caption from the image at index 5
text = dataset[5]["____"][0]
print(text)