Fine-tune models with Trainer

1. Fine-tune models with Trainer

Time to discuss distributed training!

2. Data preparation

In data preparation, we split the data across multiple devices and copy the model onto each device.

3. Distributed training

We've laid the groundwork for distributed training, where each device trains on its data in parallel.

4. Trainer and Accelerator

First, we'll explore the Trainer class as an interface for distributed training;

5. Trainer and Accelerator

then, we'll cover Accelerator, which enables custom training loops.

6. Turbocharge training with Trainer

Trainer runs the model on each device in parallel to speed up training, much as assembly lines produce cars in parallel. It takes inputs such as the dataset, model, and metrics; we'll review these inputs before calling Trainer. We'll use Trainer to build a sentiment analysis model for an e-commerce platform, classifying customer reviews as positive or negative.

7. Product review sentiment dataset

The dataset contains product reviews and their sentiment labels, as shown in an example.

8. Convert labels to integers

Based on the Hugging Face documentation, our model expects labels to be 0 or 1. We map negative labels to 0 and positive labels to 1, showing the first label.
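The label conversion can be sketched in plain Python; the review texts and the field names here are illustrative, not from the course dataset.

```python
# map string sentiment labels to the integer ids the model expects
label_map = {"negative": 0, "positive": 1}

# illustrative reviews; the real dataset is loaded from the course materials
reviews = [
    {"text": "Great product, fast shipping!", "label": "positive"},
    {"text": "Broke after one use.", "label": "negative"},
]

# replace each string label with its integer id
for review in reviews:
    review["label"] = label_map[review["label"]]

print(reviews[0]["label"])  # show the first label
```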

9. Define the tokenizer and model

Models require tokenized text, so we load a pre-trained model with AutoModelForSequenceClassification and a tokenizer with AutoTokenizer. We define an encode function to tokenize text examples, map it over the dataset, and display sample tokens.
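A minimal sketch of this step follows; the checkpoint name "bert-base-uncased" and the "text" column name are assumptions, since the course does not specify them here.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# "bert-base-uncased" is a stand-in checkpoint; num_labels=2 matches the
# positive/negative labels from the previous step
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

def encode(examples):
    # tokenize the review text, truncating and padding to a uniform length
    return tokenizer(examples["text"], truncation=True, padding="max_length")

# assumes `dataset` is a Hugging Face dataset with a "text" column
dataset = dataset.map(encode, batched=True)
```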

10. Define evaluation metrics

We define evaluation metrics for Trainer using the Hugging Face evaluate library. In compute_metrics(), we use evaluate.load() to load functions for accuracy and F1 score. Next, we extract model outputs (called logits) and labels from eval_predictions, convert logits to predictions with argmax(), compute accuracy and F1 score, and return a metrics dictionary. Trainer will display these metrics after each epoch.

11. Training arguments

Next, we configure the training process. TrainingArguments specifies settings such as the output directory and the hyperparameters to tune. Setting save_strategy and evaluation_strategy makes Trainer save the model and print metrics every epoch.
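A minimal configuration sketch; the output directory and hyperparameter values are placeholders, not the course's settings.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",            # where checkpoints are saved
    num_train_epochs=3,                # placeholder hyperparameters
    per_device_train_batch_size=16,
    save_strategy="epoch",             # save the model every epoch
    evaluation_strategy="epoch",       # print metrics every epoch
)
```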

12. Setting up Trainer

Trainer provides an interface for training and evaluating models. We provide the model, training arguments, dataset, and evaluation metrics. Calling trainer.train() begins training and prints metrics every epoch. Trainer places the model on available devices and trains on each device in parallel. Here we're printing the selected devices.
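Putting the pieces together might look like this; it assumes `model`, `training_args`, the tokenized `dataset`, and `compute_metrics` from the previous steps, and the "train"/"test" split names are assumptions.

```python
from transformers import Trainer

trainer = Trainer(
    model=model,                      # the pre-trained model
    args=training_args,               # TrainingArguments from the last step
    train_dataset=dataset["train"],   # tokenized training split (assumed name)
    eval_dataset=dataset["test"],     # tokenized evaluation split (assumed name)
    compute_metrics=compute_metrics,  # metrics printed every epoch
)

trainer.train()             # begins training; prints metrics every epoch
print(trainer.args.device)  # the device Trainer selected
```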

13. Running sentiment analysis for e-commerce

After training, we run the model on a sample review. First, the tokenizer converts text to tokens.

14. Running sentiment analysis for e-commerce

The model outputs logits, which we convert into a predicted label by applying argmax() over the columns (denoted by dimension one) and extracting the label with item(). The sentiment is negative if the label is zero; otherwise, it's positive. Finally, we print the sentiment.
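The logits-to-sentiment step can be sketched with a hard-coded tensor; in a real run the logits would come from passing tokenized text through the model.

```python
import torch

# illustrative logits for one review: one row per review, one column per class
logits = torch.tensor([[2.3, -1.1]])

# argmax over the columns (dimension one), then extract the label with item()
label = logits.argmax(dim=1).item()

# label zero means negative; otherwise positive
sentiment = "negative" if label == 0 else "positive"
print(sentiment)
```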

15. Checkpoints with Trainer

Training on large datasets can be time-consuming. If we need to interrupt training, checkpoints let us resume from a saved state by calling trainer.train() with resume_from_checkpoint=True, which resumes and prints metrics from the latest checkpoint. To resume from a specific checkpoint, we pass its name to resume_from_checkpoint; checkpoint names are found in the output directory.
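Both resume options look like this; it assumes the `trainer` from the setup step, and the checkpoint name "checkpoint-500" is illustrative (actual names appear in the output directory).

```python
# resume training from the most recent checkpoint in the output directory
trainer.train(resume_from_checkpoint=True)

# or resume from a specific checkpoint by passing its name;
# "./results/checkpoint-500" is an illustrative example
trainer.train(resume_from_checkpoint="./results/checkpoint-500")
```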

16. Let's practice!

Over to you!