1. Fine-tune models with Trainer
Time to discuss distributed training!
2. Data preparation
In data preparation, we split the data across multiple devices and copied the model to each device.
3. Distributed training
We've laid the groundwork for distributed training, where each device trains on its data in parallel.
4. Trainer and Accelerator
First, we'll explore the Trainer class as an interface for distributed training;
5. Trainer and Accelerator
then, we'll cover Accelerator, which enables custom training loops.
6. Turbocharge training with Trainer
Trainer runs the model on each device in parallel to speed up training, similar to how assembly lines produce cars in parallel. It reads arguments like the dataset, model, and metrics; we'll review these inputs before calling Trainer. We'll use Trainer to build a sentiment analysis model for an e-commerce platform, classifying customer reviews as positive or negative.
7. Product review sentiment dataset
The dataset contains product reviews paired with their sentiment labels; here's an example row.
8. Convert labels to integers
Based on the Hugging Face documentation, our model expects labels to be 0 or 1. We map negative labels to 0 and positive labels to 1, showing the first label.
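This mapping step can be sketched in plain Python. Here, a small list of dictionaries stands in for the Hugging Face dataset, and the hypothetical `encode_label` function plays the role of the function passed to `dataset.map()`:

```python
# A few illustrative rows standing in for the review dataset.
reviews = [
    {"text": "Great product, works perfectly!", "label": "positive"},
    {"text": "Broke after two days.", "label": "negative"},
]

# The model expects integer labels: negative -> 0, positive -> 1.
label2id = {"negative": 0, "positive": 1}

def encode_label(example):
    example["label"] = label2id[example["label"]]
    return example

# Equivalent in spirit to dataset.map(encode_label) on a datasets.Dataset.
reviews = [encode_label(row) for row in reviews]
print(reviews[0]["label"])  # first label -> 1
```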
9. Define the tokenizer and model
Models require tokenized text, so we load a pre-trained model with AutoModelForSequenceClassification and a tokenizer with AutoTokenizer. We define an encode function to tokenize text examples, map it over the dataset, and display sample tokens.
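A sketch of this step is below. The checkpoint name and the toy two-row dataset are illustrative stand-ins for the course's actual model and data:

```python
from datasets import Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative checkpoint; the course's checkpoint may differ.
checkpoint = "distilbert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Toy stand-in for the review dataset with integer labels.
dataset = Dataset.from_dict({
    "text": ["Great product, works perfectly!", "Broke after two days."],
    "label": [1, 0],
})

def encode(examples):
    # Pad/truncate so every example has a uniform length.
    return tokenizer(examples["text"], padding="max_length", truncation=True)

dataset = dataset.map(encode, batched=True)
print(dataset[0]["input_ids"][:10])  # sample tokens
</imports>
</imports>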
10. Define evaluation metrics
We define evaluation metrics for Trainer using the Hugging Face evaluate library. In compute_metrics(), we use evaluate.load() to load functions for accuracy and F1 score. Next, we extract model outputs (called logits) and labels from eval_predictions, convert logits to predictions with argmax(), compute accuracy and F1 score, and return a metrics dictionary. Trainer will display these metrics after each epoch.
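To keep a sketch of `compute_metrics()` self-contained, the version below computes accuracy and binary F1 by hand with NumPy instead of loading them through `evaluate.load()` as the course does; the logits-to-predictions flow is the same:

```python
import numpy as np

def compute_metrics(eval_predictions):
    # Trainer passes a (logits, labels) pair.
    logits, labels = eval_predictions
    # Convert logits to predicted class ids.
    predictions = np.argmax(logits, axis=-1)
    accuracy = float((predictions == labels).mean())
    # Binary F1 for the positive class (label 1), computed by hand.
    tp = int(np.sum((predictions == 1) & (labels == 1)))
    fp = int(np.sum((predictions == 1) & (labels == 0)))
    fn = int(np.sum((predictions == 0) & (labels == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "f1": f1}

# Illustrative logits for four examples (columns: negative, positive).
logits = np.array([[2.0, 1.0], [0.1, 0.9], [0.3, 0.7], [1.5, 0.2]])
labels = np.array([0, 1, 0, 0])
print(compute_metrics((logits, labels)))  # accuracy 0.75
```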
11. Training arguments
Next, we configure the training process. TrainingArguments specifies settings like the output directory and the hyperparameters to tune. Setting save_strategy and evaluation_strategy to "epoch" saves the model and prints metrics at the end of every epoch.
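A minimal configuration sketch is shown below; the hyperparameter values are illustrative, not the course's exact settings:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",            # where checkpoints are written
    num_train_epochs=3,
    per_device_train_batch_size=16,
    save_strategy="epoch",             # save a checkpoint every epoch
    evaluation_strategy="epoch",       # print metrics every epoch
    # Note: newer transformers versions rename this to eval_strategy.
)
```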
12. Setting up Trainer
Trainer provides an interface for training and evaluating models. We provide the model, training arguments, dataset, and evaluation metrics. Calling trainer.train() begins training and prints metrics every epoch. Trainer places the model on available devices and trains on each device in parallel. Here we're printing the selected devices.
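The wiring described above can be sketched as follows, assuming `model`, `training_args`, the tokenized `train_dataset` and `eval_dataset`, and `compute_metrics` are defined as in the previous steps:

```python
from transformers import Trainer

# Assumes model, training_args, train_dataset, eval_dataset, and
# compute_metrics exist from the earlier steps.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)

trainer.train()  # begins training; prints metrics every epoch

# Trainer placed the model on the available device(s) automatically.
print(trainer.args.device)
```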
13. Running sentiment analysis for e-commerce
After training, we run the model on a sample review. First, the tokenizer converts text to tokens.
14. Running sentiment analysis for e-commerce
The model outputs logits, which we convert into a predicted label by applying argmax() over the columns (denoted by dimension one) and extracting the label with item(). The sentiment is negative if the label is zero; otherwise, it's positive. Finally, we print the sentiment.
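The argmax-and-item step can be sketched with hand-written logits standing in for the model's output on a single review:

```python
import torch

# Stand-in for model(**tokens).logits on one review:
# a single row with two columns (negative, positive).
logits = torch.tensor([[2.3, -1.1]])

# argmax over the columns (dimension one), then item() to get a Python int.
predicted_label = logits.argmax(dim=1).item()

sentiment = "negative" if predicted_label == 0 else "positive"
print(sentiment)  # -> negative for these logits
```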
15. Checkpoints with Trainer
Training on large datasets can be time-consuming. If training is interrupted, checkpoints let us resume from a saved state: calling trainer.train() with resume_from_checkpoint=True resumes from the latest checkpoint and prints its metrics. To resume from a specific checkpoint instead, we pass its name to resume_from_checkpoint; checkpoint names can be found in the output directory.
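Both ways of resuming are sketched below, assuming the `trainer` from the earlier setup; the checkpoint folder name is illustrative:

```python
# Resume from the most recent checkpoint in the output directory.
trainer.train(resume_from_checkpoint=True)

# Or resume from a specific checkpoint folder inside output_dir
# (e.g. "./results/checkpoint-500"; the name here is illustrative).
trainer.train(resume_from_checkpoint="./results/checkpoint-500")
```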
16. Let's practice!
Over to you!