
Manipulating datasets

You will often need to manipulate a dataset before using it in an ML task. Two common manipulations are filtering and selecting (or slicing). Because these datasets can be large, Hugging Face stores them in the Apache Arrow format.

This means that manipulating them is slightly different from what you might be used to. Fortunately, there are already methods to help with this!

The dataset is already loaded for you under wikipedia.

This exercise is part of the course Working with Hugging Face.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Filter the documents
____ = wikipedia.____(lambda row: "football" in row["____"])

# Create a sample dataset
example = ____.____(range(1))

print(example[0]["text"])