The encoder transformer layer
With a FeedForwardSubLayer class defined, you have all of the pieces you need to define an EncoderLayer class. Recall that the encoder layer consists of a multi-head attention mechanism followed by a feed-forward sublayer, with each sublayer wrapped in a residual connection, dropout, and layer normalization.
The classes you have already defined are available for you with the same names, along with torch and torch.nn as nn.
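For context, here are minimal sketches of MultiHeadAttention and FeedForwardSubLayer, consistent with common implementations. The exact details of your earlier definitions may differ, so treat these as illustrative assumptions rather than the course's reference code.

import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    # Sketch: project q, k, v, split into heads, apply scaled dot-product
    # attention with an optional mask, then recombine the heads.
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        self.q_linear = nn.Linear(d_model, d_model)
        self.k_linear = nn.Linear(d_model, d_model)
        self.v_linear = nn.Linear(d_model, d_model)
        self.out_linear = nn.Linear(d_model, d_model)

    def split_heads(self, x):
        batch_size, seq_len, _ = x.size()
        x = x.view(batch_size, seq_len, self.num_heads, self.head_dim)
        return x.transpose(1, 2)  # (batch, heads, seq, head_dim)

    def forward(self, query, key, value, mask=None):
        q = self.split_heads(self.q_linear(query))
        k = self.split_heads(self.k_linear(key))
        v = self.split_heads(self.v_linear(value))
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        if mask is not None:
            # Positions where mask == 0 are excluded from attention
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).contiguous()
        batch_size, seq_len = out.size(0), out.size(1)
        return self.out_linear(out.view(batch_size, seq_len, -1))

class FeedForwardSubLayer(nn.Module):
    # Sketch: two position-wise linear layers with a ReLU in between.
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_ff)
        self.fc2 = nn.Linear(d_ff, d_model)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))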
This exercise is part of the course
Transformer Models with PyTorch
Exercise instructions
- Complete the __init__ method to instantiate MultiHeadAttention, FeedForwardSubLayer, and two layer normalizations.
- Complete the forward() method by filling in the multi-head attention mechanism and the feed-forward sublayer; for the attention mechanism, use the src_mask provided and the input embeddings, x, as the query, key, and value matrices.
Interactive hands-on exercise
Try to solve this exercise by completing the sample code.
# MultiHeadAttention and FeedForwardSubLayer are the classes defined in
# earlier exercises and are assumed to be in scope, along with torch
# and torch.nn as nn.
class EncoderLayer(nn.Module):
    def __init__(self, d_model, num_heads, d_ff, dropout):
        super().__init__()
        # Instantiate the attention, feed-forward, and normalization layers
        self.self_attn = MultiHeadAttention(d_model, num_heads)
        self.ff_sublayer = FeedForwardSubLayer(d_model, d_ff)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, src_mask):
        # Self-attention: x serves as the query, key, and value
        attn_output = self.self_attn(x, x, x, src_mask)
        x = self.norm1(x + self.dropout(attn_output))
        # Position-wise feed-forward sublayer
        ff_output = self.ff_sublayer(x)
        x = self.norm2(x + self.dropout(ff_output))
        return x
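As a quick sanity check, a forward pass with random embeddings and an all-ones mask (no padding positions) should preserve the input shape. The hyperparameter values below are illustrative, and the mask shape assumes the sketch of MultiHeadAttention above.

d_model, num_heads, d_ff, dropout = 512, 8, 2048, 0.1
encoder_layer = EncoderLayer(d_model, num_heads, d_ff, dropout)

batch_size, seq_len = 2, 10
x = torch.randn(batch_size, seq_len, d_model)     # dummy token embeddings
src_mask = torch.ones(batch_size, 1, 1, seq_len)  # 1 = attend, 0 = masked

output = encoder_layer(x, src_mask)
print(output.shape)  # torch.Size([2, 10, 512])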