
Feed-forward sub-layers

Feed-forward sub-layers apply a position-wise nonlinear transformation to the attention outputs, helping the model capture more complex relationships.
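Concretely, the sub-layer computes FFN(x) = ReLU(x W1 + b1) W2 + b2 independently at each sequence position: the first linear layer (W1) expands each token embedding from d_model to d_ff dimensions, and the second (W2) projects it back down to d_model.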

In this exercise, you'll create a FeedForwardSubLayer for your encoder-only transformer. This layer consists of two linear layers with a ReLU activation function between them. Its constructor takes two parameters, d_model and d_ff, which set the dimensionality of the input embeddings and the hidden dimension between the two linear layers, respectively.

d_model and d_ff are already available for you to use.

This exercise is part of the course Transformer Models with PyTorch.

Exercise instructions

  • Define the first and second linear layers and the ReLU activation for the feed-forward sub-layer class, using the embedding dimension d_model and the hidden dimension d_ff between the layers.
  • Pass the input through the layers and the activation function in the forward() method.
  • Instantiate the FeedForwardSubLayer using the provided d_model and d_ff (set to 512 and 2048, respectively), and apply it to the input embeddings, x.

Hands-on interactive exercise

Try this exercise by completing the sample code.

import torch.nn as nn

class FeedForwardSubLayer(nn.Module):
    def __init__(self, d_model, d_ff):
        super().__init__()
        # Define the layers and activation
        self.fc1 = ____
        self.fc2 = ____
        self.relu = ____

    def forward(self, x):
        # Pass the input through the layers and activation
        return self.____(self.____(self.____(x)))
    
# Instantiate the FeedForwardSubLayer and apply it to x
feed_forward = ____
output = ____
print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")