Exercises

Tokenize a text dataset

You are doing market research for a travel website and would like to use an existing dataset to fine-tune a model that classifies hotel reviews. You decide to use the datasets library.

The AutoTokenizer class has been pre-imported from transformers, and load_dataset() has been pre-imported from datasets.

Instructions

  • Add padding to the tokenizer so that text can be processed in equal-sized batches.
  • Tokenize the text data using the pre-trained GPT tokenizer and the defined function.
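
To make the role of padding concrete, here is a minimal, dependency-free sketch of what padding a batch to equal length looks like. It is a toy illustration only: the whitespace tokenizer, the vocabulary, the PAD_ID value, the sample reviews, and the helper names build_vocab and tokenize_batch are all assumptions made for this example, not the pre-trained GPT tokenizer or the function defined in the exercise. The output keys mirror the input_ids and attention_mask fields that transformers tokenizers typically return.

```python
# Toy illustration only: a whitespace tokenizer with a hypothetical vocabulary,
# not the pre-trained GPT tokenizer used in the exercise.
PAD_ID = 0  # assumed id reserved for the padding token


def build_vocab(texts):
    """Assign an integer id to every distinct lowercase word (ids start at 1)."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab


def tokenize_batch(texts, vocab):
    """Encode texts and right-pad each sequence to the longest one in the batch."""
    encoded = [[vocab[w] for w in text.lower().split()] for text in texts]
    max_len = max(len(ids) for ids in encoded)
    input_ids = [ids + [PAD_ID] * (max_len - len(ids)) for ids in encoded]
    # 1 marks a real token, 0 marks padding that the model should ignore.
    attention_mask = [[1] * len(ids) + [0] * (max_len - len(ids)) for ids in encoded]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Hypothetical hotel reviews of different lengths.
reviews = ["Great hotel", "The room was clean and quiet", "Terrible service"]
vocab = build_vocab(reviews)
batch = tokenize_batch(reviews, vocab)

for ids, mask in zip(batch["input_ids"], batch["attention_mask"]):
    print(ids, mask)
```

Every row of input_ids ends up the same length, which is what allows the reviews to be stacked into one batch. GPT-style tokenizers generally ship without a padding token, which is likely why the exercise asks you to add padding before tokenizing.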