Discover what makes the hottest deep learning architecture in AI tick! Learn about the components that make up Transformer models, including the famous self-attention mechanisms described in the renowned paper "Attention is All You Need."

Transformers with PyTorch

Breaking down the Transformer

PyTorch Transformers

Embedding and positional encoding

Creating input embeddings

Creating positional encodings

Multi-head self-attention

Implementing multi-head attention

Starting the MultiHeadAttentionClass

Adding methods to the MultiHeadAttention class

The Building Blocks of Transformer Models

Design transformer encoder and decoder blocks, and combine them with positional encoding, multi-headed attention, and position-wise feed-forward networks to build your very own Transformer architectures. Along the way, you'll develop a deep understanding and appreciation for how transformers work under the hood.

Encoder transformers

Feed-forward sublayers

The encoder transformer layer

The encoder transformer body

Adding the transformer head

Decoder transformers

Designing a mask for self-attention

The decoder layer

Completing the decoder transformer

Encoder-decoder transformers

Adding cross-attention to the decoder layer

Constructing the encoder-decoder transformer

Congratulations!

Building Transformer Architectures

Dive deeper into LLMs and discover how the transformer architecture has revolutionized deep learning and enabled the generative AI boom! In this course, you'll learn to create your own transformer architecture from the ground up, component by component. You'll learn how to encode token positions, perform attention mechanism calculations, and build modular transformer components to give you greater control over your transformer's inner workings. Go from zero to LLM hero today!

<h2>Deep-Dive into the Transformer Architecture</h2>
Transformer models have revolutionized text modeling, kickstarting the generative AI boom by enabling today's large language models (LLMs). In this course, you'll look at the key components in this architecture, including positional encoding, attention mechanisms, and feed-forward sublayers. You'll code these components in a modular way to build your own transformer step-by-step.<br><br><h2>Implement Attention Mechanisms with PyTorch</h2>
The attention mechanism is a key development that helped formalize the transformer architecture. Self-attention allows transformers to better identify relationships between tokens, which improves the quality of generated text. Learn how to create a multi-head attention mechanism class that will form a key building block in your transformer models.<br><br><h2>Build Your Own Transformer Models</h2>
Learn to build encoder-only, decoder-only, and encoder-decoder transformer models. Learn how to choose and code these different transformer architectures for different language tasks, including text classification and sentiment analysis, text generation and completion, and sequence-to-sequence translation.

Deep Learning for Text with PyTorch

What makes LLMs tick? Discover how transformers revolutionized text modeling and kickstarted the generative AI boom.

Transformer Models with PyTorch

Deep Learning in Python

Developing Large Language Models

Adding cross-attention to the decoder layer

Transformer Models with PyTorch

Exercise instructions

Hands-on interactive exercise