

Exercise

Tokenize a text dataset

You are doing market research for a travel website and would like to use an existing dataset to fine-tune a model that will help you classify hotel reviews. You decide to use the Hugging Face datasets library.

The AutoTokenizer class has been pre-imported from transformers, and load_dataset() has been pre-imported from datasets.

Instructions

  • Add padding to the tokenizer so that text is processed in equal-sized batches.
  • Tokenize the text data using the pre-trained GPT tokenizer and the defined function.
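
The two steps above could be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution. The checkpoint name "openai-gpt", the "[PAD]" token, the "text" column name, and the max_length value are all assumptions, and the helper names load_padded_tokenizer and tokenize_dataset are hypothetical.

```python
from functools import partial


def load_padded_tokenizer(checkpoint="openai-gpt"):
    """Load a pre-trained tokenizer and make sure it can pad batches.

    The default checkpoint name is an assumption, not taken from the exercise.
    """
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # GPT-style tokenizers often ship without a pad token, so add one if missing.
    if tokenizer.pad_token is None:
        tokenizer.add_special_tokens({"pad_token": "[PAD]"})
    return tokenizer


def tokenize_function(examples, tokenizer, text_column="text", max_length=64):
    """Tokenize a batch of rows, padding or truncating each to max_length."""
    return tokenizer(
        examples[text_column],  # assumed column name; check your dataset
        padding="max_length",
        truncation=True,
        max_length=max_length,
    )


def tokenize_dataset(dataset, tokenizer, **kwargs):
    """Apply tokenize_function to a datasets.Dataset in batches."""
    fn = partial(tokenize_function, tokenizer=tokenizer, **kwargs)
    return dataset.map(fn, batched=True)
```

With a dataset obtained from load_dataset(), a call such as tokenize_dataset(dataset, load_padded_tokenizer()) would return the dataset with the tokenizer's output columns added. If a new pad token is added this way, the model's embedding matrix usually has to be resized to match, for example with model.resize_token_embeddings(len(tokenizer)).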