1. Learning techniques
In this video, we will examine learning techniques for handling limited data availability when creating an LLM.
2. Where are we?
Here, we will discuss the learning techniques used in fine-tuning a pre-trained LLM.
3. Getting beyond data constraints
Fine-tuning involves training a pre-trained model on a smaller, task-specific labeled dataset to improve performance.
But what if little to no labeled data is available to learn a specific task or domain?
This is where zero-shot, few-shot, and multi-shot learning come in, collectively known as N-shot learning techniques.
4. Transfer learning
These techniques are all part of transfer learning.
So, what is transfer learning?
It involves training a model on one task and applying the learned knowledge to a different but related task.
For example, the skills acquired during piano lessons, such as reading musical notes, understanding rhythm, and grasping musical concepts, can be quickly transferred when learning to play the guitar.
In the case of LLMs, a pre-trained language model is adapted to a new task using little to no task-specific training data (zero-shot or few-shot) or a larger number of examples (multi-shot).
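To make transfer learning concrete, here is a minimal sketch assuming the Hugging Face transformers and datasets libraries, a DistilBERT base model, and a small slice of the IMDb dataset; none of these specific choices come from the video, they are illustrative only. A pre-trained model is loaded and then fine-tuned on a small, task-specific labeled dataset.

```python
# Minimal transfer learning sketch: fine-tune a pre-trained model on a small
# labeled dataset. Library, model, and dataset choices are assumptions.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)
from datasets import load_dataset

model_name = "distilbert-base-uncased"  # pre-trained model whose knowledge we reuse
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Small task-specific labeled dataset (here: 1,000 movie reviews for sentiment)
dataset = load_dataset("imdb", split="train[:1000]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-sentiment", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()  # language knowledge from pre-training transfers to sentiment analysis
```

Because the model already captured general language patterns during pre-training, even a short fine-tuning run on a small dataset can yield reasonable task performance.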
5. Zero-shot learning
Zero-shot learning allows an LLM to perform a task it has not been explicitly trained on, using its understanding of language and context to transfer its knowledge to the new task.
Suppose a child who has only ever seen pictures of horses is asked to identify a zebra, given the additional information that it looks like a striped horse. They can correctly identify it without ever having seen an example of a zebra.
This demonstrates how LLMs use zero-shot learning to quickly pick up new skills and generalize their knowledge to new situations without being given any examples.
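As a minimal sketch of zero-shot learning in practice, the snippet below uses the Hugging Face transformers zero-shot classification pipeline; the model and candidate labels are assumptions for illustration, not something prescribed in the video. The model assigns a label it was never explicitly trained to predict.

```python
# Zero-shot classification sketch: the model scores candidate labels it was
# never explicitly trained on. Model choice is an assumption for illustration.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The new phone's battery lasts two full days on a single charge.",
    candidate_labels=["electronics", "cooking", "sports"],
)
print(result["labels"][0])  # highest-scoring label, e.g. "electronics"
```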
6. Few-shot learning
Few-shot learning, on the other hand, allows the model to learn a new task from only a handful of examples, building on the knowledge it gained from previous tasks to generalize to the new task.
For example, a student attends lectures and takes notes but doesn't study extra for exams. On exam day, they encounter a new question similar to material covered in class and can answer it correctly by relying on prior knowledge and experience.
When the number of examples used for fine-tuning is only one, it is called one-shot learning.
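One common way to apply few-shot (or one-shot) learning with an LLM is in-context learning: the examples are placed directly in the prompt rather than used for weight updates. Below is a minimal sketch assuming the Hugging Face transformers text-generation pipeline and a small demo model (gpt2), which may follow the pattern only weakly; an instruction-tuned model would typically do better.

```python
# Few-shot prompting sketch: the task is demonstrated with a couple of labeled
# examples in the prompt itself. Model choice (gpt2) is an assumption; small
# models may follow the pattern only weakly.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Review: The plot was dull and predictable. Sentiment: negative\n"
    "Review: A beautiful, moving film. Sentiment: positive\n"
    "Review: I loved every minute of it. Sentiment:"
)
output = generator(prompt, max_new_tokens=3, do_sample=False)
print(output[0]["generated_text"])  # ideally continues with "positive"
```

With a single demonstration in the prompt, the same pattern becomes one-shot learning.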
7. Multi-shot learning
Multi-shot learning is similar to few-shot learning, but the model needs more examples to learn a new task. It combines the knowledge gained from previous tasks with these additional examples to learn and generalize to the new task.
Let's take an example of recognizing different breeds of dogs. If we show the model a few pictures of a Golden Retriever, it can quickly learn to recognize the breed and then generalize this knowledge to similar breeds with just a few more examples,
8. Multi-shot learning
such as the Labrador Retriever.
This approach saves time and effort in collecting and labeling a large dataset for each breed while still achieving good accuracy in recognizing different dog breeds.
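Continuing the dog-breed analogy, a multi-shot prompt simply supplies more labeled examples than a few-shot one. The sketch below reuses the same assumed text-generation setup as before; the breeds, groups, and model are illustrative only.

```python
# Multi-shot prompting sketch: same idea as few-shot, but with a larger set of
# labeled examples in the prompt. All names here are assumptions for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

examples = [
    ("Golden Retriever", "retriever"),
    ("Labrador Retriever", "retriever"),
    ("Flat-Coated Retriever", "retriever"),
    ("Border Collie", "herding"),
    ("Australian Shepherd", "herding"),
    ("Shetland Sheepdog", "herding"),
]

# Build a prompt with many demonstrations, then ask about an unseen breed
prompt = "".join(f"Breed: {breed} -> Group: {group}\n" for breed, group in examples)
prompt += "Breed: Chesapeake Bay Retriever -> Group:"

output = generator(prompt, max_new_tokens=3, do_sample=False)
print(output[0]["generated_text"])  # ideally continues with "retriever"
```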
9. Building blocks so far
So far, we have covered a lot of ground.
We learned how data is prepared for computers to understand language and discussed how fine-tuning is an effective approach for overcoming the challenges of building LLMs.
We also explored N-shot learning techniques for dealing with the lack of data.
In the next chapter, we will dive deeper into how LLMs are pre-trained.
10. Let's practice!
Now that we have covered zero-shot, few-shot, and multi-shot learning techniques, it's time to test your understanding.