Constructing the encoder-decoder transformer
Now that you've updated the DecoderLayer class, and the equivalent changes have been made to TransformerDecoder, you're ready to put everything together. Because you've built your classes in a modular, hierarchical way, you only need to instantiate two of them to build the encoder-decoder transformer: TransformerEncoder and TransformerDecoder.
This exercise is part of the course Transformer Models with PyTorch.
Exercise instructions
- Complete the forward() pass to compute the encoder and decoder outputs.
- Instantiate and call the transformer on input_tokens using the src_mask, tgt_mask, and cross_mask provided.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
class Transformer(nn.Module):
    def __init__(self, vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout):
        super().__init__()
        self.encoder = TransformerEncoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)
        self.decoder = TransformerDecoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)

    def forward(self, x, src_mask, tgt_mask, cross_mask):
        # Complete the forward pass
        encoder_output = self.encoder(____, ____)
        decoder_output = self.decoder(____, ____, tgt_mask, cross_mask)
        return decoder_output

# Instantiate and call the transformer
transformer = ____(vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout)
outputs = ____
print(outputs)
print(outputs.shape)
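For reference, one possible completion is sketched below. It is not the official solution: it assumes the TransformerEncoder and TransformerDecoder classes built in the previous exercises are available (with the encoder called on the input and src_mask, and the decoder called on the input, the encoder output, tgt_mask, and cross_mask), and that input_tokens, the three masks, and the hyperparameters are already defined in your session.

import torch.nn as nn

class Transformer(nn.Module):
    def __init__(self, vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout):
        super().__init__()
        # Reuse the encoder and decoder stacks from the previous exercises
        self.encoder = TransformerEncoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)
        self.decoder = TransformerDecoder(vocab_size, d_model, num_layers, num_heads, d_ff, dropout, max_seq_length)

    def forward(self, x, src_mask, tgt_mask, cross_mask):
        # Encode the input tokens, masking out padding positions
        encoder_output = self.encoder(x, src_mask)
        # Decode, attending to the encoder output through cross-attention
        decoder_output = self.decoder(x, encoder_output, tgt_mask, cross_mask)
        return decoder_output

# Instantiate the transformer and run the token batch through it
transformer = Transformer(vocab_size, d_model, num_heads, num_layers, d_ff, max_seq_length, dropout)
outputs = transformer(input_tokens, src_mask, tgt_mask, cross_mask)
print(outputs)
print(outputs.shape)

If your TransformerDecoder ends with a linear projection over the vocabulary, outputs.shape should come out as (batch_size, sequence_length, vocab_size).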