Constructing the encoder-decoder transformer
Now that you've updated the DecoderLayer class, and the equivalent changes have been made to TransformerDecoder, you're ready to put everything together. Because you've built your classes in a modular and hierarchical way, you only need to instantiate two of them to build the encoder-decoder transformer: TransformerDecoder and TransformerEncoder.
This exercise is part of the course Transformer Models with PyTorch.
Exercise instructions
- Complete the forward() pass to compute the encoder and decoder outputs.
- Instantiate and call the transformer on input_tokens using the src_mask, tgt_mask, and cross_mask provided.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
class Transformer(nn.Module):
    def __init__(self, vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout):
        super().__init__()
        self.encoder = TransformerEncoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)
        self.decoder = TransformerDecoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)

    def forward(self, x, src_mask, tgt_mask, cross_mask):
        # Complete the forward pass
        encoder_output = self.encoder(____, ____)
        decoder_output = self.decoder(____, ____, tgt_mask, cross_mask)
        return decoder_output

# Instantiate and call the transformer
transformer = ____(vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout)
outputs = ____
print(outputs)
print(outputs.shape)
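For reference, one way the completed exercise might look is sketched below. The TransformerEncoder and TransformerDecoder classes here are hypothetical stand-ins built from PyTorch's built-in nn.TransformerEncoder and nn.TransformerDecoder so the sketch runs on its own; the course's own classes differ, and for simplicity these stand-ins accept the masks but do not apply them. Only the Transformer class mirrors the exercise, and all hyperparameter values are placeholders.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the course's TransformerEncoder: token
# embedding followed by a stack of built-in encoder layers. The mask
# argument is accepted but ignored in this simplified sketch.
class TransformerEncoder(nn.Module):
    def __init__(self, vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, num_heads, d_ff, dropout, batch_first=True)
        self.layers = nn.TransformerEncoder(layer, num_layers)

    def forward(self, x, src_mask):
        return self.layers(self.embed(x))

# Hypothetical stand-in for the course's TransformerDecoder: embedding,
# built-in decoder layers (self- and cross-attention), and a projection
# to vocabulary-sized logits. Masks are again ignored here.
class TransformerDecoder(nn.Module):
    def __init__(self, vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, num_heads, d_ff, dropout, batch_first=True)
        self.layers = nn.TransformerDecoder(layer, num_layers)
        self.fc = nn.Linear(d_model, vocab_size)

    def forward(self, x, encoder_output, tgt_mask, cross_mask):
        h = self.layers(self.embed(x), encoder_output)
        return self.fc(h)

class Transformer(nn.Module):
    def __init__(self, vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout):
        super().__init__()
        self.encoder = TransformerEncoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)
        self.decoder = TransformerDecoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)

    def forward(self, x, src_mask, tgt_mask, cross_mask):
        # Encode the source sequence, then decode against the encoder output
        encoder_output = self.encoder(x, src_mask)
        decoder_output = self.decoder(x, encoder_output, tgt_mask, cross_mask)
        return decoder_output

# Placeholder hyperparameters and dummy input tokens
vocab_size, d_model, num_heads, num_layers = 100, 32, 4, 2
d_ff, max_seq_length, dropout = 64, 16, 0.1
input_tokens = torch.randint(0, vocab_size, (2, max_seq_length))

transformer = Transformer(vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout)
outputs = transformer(input_tokens, None, None, None)
print(outputs.shape)  # one vocabulary-sized logit vector per token position
```

The decoder's output has shape (batch, sequence length, vocab_size), since it produces a distribution over the vocabulary for every target position.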