Self vs. multi-head attention
You are a data analyst on an AI development team. Your current project involves understanding and implementing self-attention and multi-head attention in a language model. Consider the following phrases from a conversation dataset.
A: "The boy went to the store to buy some groceries."
B: "Oh, he was really excited about getting his favorite cereal."
C: "I noticed that he gestured a lot while talking about it."
Determine whether these phrases would best be analyzed by focusing on relationships between tokens within a single input sequence (self-attention) or by attending to multiple aspects of the input simultaneously through several parallel attention heads (multi-head attention).
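To make the distinction concrete, here is a minimal NumPy sketch of both mechanisms. All weights, dimensions, and token embeddings are hypothetical placeholders, not the course's reference implementation: single-head self-attention computes token-to-token relationships within one sequence, while multi-head attention runs several such heads in parallel and concatenates their outputs so each head can capture a different aspect of the input.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head self-attention: every token attends to every
    # other token in the same sequence (relationships WITHIN the input).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # scaled dot-product scores
    return softmax(scores) @ V

def multi_head_attention(X, heads):
    # Multi-head attention: several self-attention heads run in
    # parallel, each with its own projections, so each head can
    # focus on a different aspect (e.g. syntax vs. topic); their
    # outputs are concatenated.
    outputs = [self_attention(X, Wq, Wk, Wv) for Wq, Wk, Wv in heads]
    return np.concatenate(outputs, axis=-1)

# Hypothetical embeddings for the 10 tokens of phrase A.
rng = np.random.default_rng(0)
d_model, d_head, n_heads = 8, 4, 2
X = rng.normal(size=(10, d_model))
heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
         for _ in range(n_heads)]

print(self_attention(X, *heads[0]).shape)    # (10, 4): one view of the sentence
print(multi_head_attention(X, heads).shape)  # (10, 8): two views, concatenated
```

Note that each head is itself a self-attention computation; "multi-head" refers to running several of them side by side, which is why the question asks which framing best suits analyzing the phrases.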
This exercise is part of the course Large Language Models (LLMs) Concepts.
