Fine-tuning approaches
1. Fine-tuning approaches
Great job! Let's examine different fine-tuning and transfer learning approaches.
2. Fine-tuning
We've seen that fine-tuning involves taking a pre-trained model and re-training it on domain-specific data to solve a particular downstream task. Consider a general-purpose summarization model fine-tuned on a dataset of chemistry articles to specialize in summarizing chemistry papers. There are two fine-tuning approaches, which differ in how the model weights are updated.
3. Full fine-tuning
One is full fine-tuning, which entails updating weights across the entire model and is more computationally expensive. This is what we've done so far.
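For reference, here is a minimal sketch of what full fine-tuning can look like with the Hugging Face Trainer; the t5-small checkpoint and the toy one-example chemistry dataset are illustrative placeholders, not part of the course materials.

```python
# A minimal full fine-tuning sketch: every weight in the model is updated.
# The checkpoint and the toy dataset below are illustrative placeholders.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Trainer, TrainingArguments)

checkpoint = "t5-small"  # a small, general-purpose summarization model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# A toy one-example "chemistry papers" dataset; real fine-tuning would
# use a full domain-specific corpus.
raw = Dataset.from_dict({
    "text": ["summarize: The catalyst increased the reaction rate tenfold."],
    "summary": ["Catalyst boosts reaction rate."],
})

def tokenize(batch):
    inputs = tokenizer(batch["text"], truncation=True)
    inputs["labels"] = tokenizer(text_target=batch["summary"],
                                 truncation=True)["input_ids"]
    return inputs

train_ds = raw.map(tokenize, batched=True, remove_columns=["text", "summary"])

# No parameters are frozen, so training updates the entire model.
args = TrainingArguments(output_dir="chem-summarizer", num_train_epochs=1)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```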
4. Partial fine-tuning
The other is partial fine-tuning, where weights in the lower layers of the model body, which capture general language understanding, remain fixed, and only the task-specific layers in the model head are updated. We won't cover this approach in depth as it is out of scope. The choice of approach depends on the specific use case, the task-specific data, and hardware computing capabilities.
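Although the course doesn't demonstrate partial fine-tuning, a minimal sketch of the usual layer-freezing pattern looks like this; the bert-base-uncased checkpoint and two-label setup are illustrative assumptions.

```python
# A minimal partial fine-tuning sketch: freeze the model body and keep
# only the task-specific head trainable. The checkpoint and two-label
# setup are illustrative assumptions.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze the lower layers (the BERT body), which capture general
# language understanding.
for param in model.bert.parameters():
    param.requires_grad = False

# Only the classification head's weights will be updated during training.
print([name for name, p in model.named_parameters() if p.requires_grad])
# ['classifier.weight', 'classifier.bias']
```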
5. Transfer learning
Related to fine-tuning, transfer learning adapts a model previously trained on one task to a different but related task. While fine-tuning typically involves training on a smaller dataset for a specific task, transfer learning leverages knowledge gained in one domain to enhance performance in another, related domain. Several approaches can be adopted for transfer learning, including full and partial fine-tuning. Another popular transfer learning approach is n-shot learning, which includes zero-shot, one-shot, and few-shot learning.
6. N-shot learning
This is where a model generalizes to a new task based on the number of examples it has seen during training. For example, in zero-shot learning, often used when data are scarce, a model is trained to generalize to tasks it has never seen during training. Exposing a model to one or a few task-specific examples is known as one-shot or few-shot learning, respectively.
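As a concrete illustration, zero-shot classification is available as a ready-made pipeline in the transformers library; the example text and candidate labels below are made up for illustration.

```python
# Zero-shot classification: the model assigns labels it was never
# explicitly trained on. The example text and labels are made up.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
result = classifier(
    "The reaction yields a stable organometallic compound.",
    candidate_labels=["chemistry", "sports", "politics"],
)
print(result["labels"][0])  # the highest-scoring label, e.g. "chemistry"
```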
7. One-shot learning
We've seen this before when we include an example within a new input, such as passing a text generation pipeline an input that contains a single sentiment analysis example whose pattern the model should follow.
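A minimal sketch of that one-shot pattern, assuming a generic gpt2 checkpoint and a made-up pair of reviews:

```python
# One-shot learning via prompting: the input contains a single worked
# sentiment example for the model to imitate. The gpt2 checkpoint and
# review texts are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Review: The course was fantastic! Sentiment: positive\n"
    "Review: The instructions were confusing. Sentiment:"
)
output = generator(prompt, max_new_tokens=2)
print(output[0]["generated_text"])
```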
8. Let's practice!
Time for some more practice!