
Creating a transformer model

At PyBooks, the recommendation engine you're working on needs more refined capabilities to understand the sentiments of user reviews. You believe that transformers, a state-of-the-art architecture, can help achieve this. To kickstart the project, you decide to build a transformer model that encodes the sentiment of the reviews.

The following packages have been imported for you: torch, nn, optim.

The input data contains sentences such as "I love this product", "This is terrible", "Could be better", … and their respective binary sentiment labels: 1, 0, 0, …

The input data has been split and converted to embeddings in the following variables: train_sentences, train_labels, test_sentences, test_labels, token_embeddings.
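The course environment prepares these variables for you. If you are working outside it, a minimal stand-in might look like the sketch below; the sequence length, embedding size, and random values are assumptions for illustration, not the course's actual preprocessing.

import torch

# Hypothetical stand-ins for the prepared variables (shapes assumed)
train_sentences = ["I love this product", "This is terrible", "Could be better"]
train_labels = torch.tensor([1, 0, 0])
test_sentences = ["Great value", "Not worth it"]
test_labels = torch.tensor([1, 0])

# Assumed: one vector per token, padded to a fixed sequence length,
# giving a (num_sentences, seq_len, embed_size) tensor
token_embeddings = torch.rand(len(train_sentences), 4, 512)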

This exercise is part of the course Deep Learning for Text with PyTorch.

Exercise instructions

  • Initialize the transformer encoder.
  • Define the fully connected layer based on the number of sentiment classes.
  • In the forward method, pass the input through the transformer encoder followed by the linear layer.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

class TransformerEncoder(nn.Module):
    def __init__(self, embed_size, heads, num_layers, dropout):
        super(TransformerEncoder, self).__init__()
        # Initialize the encoder 
        self.encoder = nn.____(
            nn.____(d_model=embed_size, nhead=heads,
                    dropout=dropout, batch_first=True),
            num_layers=num_layers)
        # Define the fully connected layer
        self.fc = nn.Linear(embed_size, ____)

    def forward(self, x):
        # Pass the input through the transformer encoder 
        x = self.____(x)
        x = x.mean(dim=1)  # Mean-pool over the sequence (token) dimension
        return self.fc(x)

model = TransformerEncoder(embed_size=512, heads=8, num_layers=3, dropout=0.5)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
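
If you want to check your blanks, one completed version is sketched below. nn.TransformerEncoder and nn.TransformerEncoderLayer are the standard PyTorch modules the blanks point at, and the output size of 2 matches the two sentiment classes; batch_first=True is an assumption so that mean(dim=1) pools over tokens rather than over the batch. The single training step at the end is an illustrative addition, not part of the exercise.

class TransformerEncoder(nn.Module):
    def __init__(self, embed_size, heads, num_layers, dropout):
        super(TransformerEncoder, self).__init__()
        # Stack num_layers encoder layers, each with multi-head self-attention
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=embed_size, nhead=heads,
                                       dropout=dropout, batch_first=True),
            num_layers=num_layers)
        # Two output units, one per sentiment class (negative, positive)
        self.fc = nn.Linear(embed_size, 2)

    def forward(self, x):
        x = self.encoder(x)        # (batch, seq_len, embed_size)
        x = x.mean(dim=1)          # mean-pool token vectors per sentence
        return self.fc(x)          # (batch, 2) class logits

model = TransformerEncoder(embed_size=512, heads=8, num_layers=3, dropout=0.5)
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Illustrative single training step on the prepared embeddings and labels
model.train()
optimizer.zero_grad()
logits = model(token_embeddings)        # assumes (batch, seq_len, embed_size)
loss = criterion(logits, train_labels)  # labels are class indices (0 or 1)
loss.backward()
optimizer.step()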