Creating an RNN model with attention
At PyBooks, the team has been exploring various deep learning architectures. After some research, you decide to implement an RNN with an attention mechanism to predict the next word in a sentence. You're given a dataset of sentences and a vocabulary created from them.
The following packages have been imported for you: torch and nn.
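In a standalone script, that setup corresponds to the usual PyTorch imports (one common form; the exercise environment may import nn slightly differently):

import torch
from torch import nn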
The following has been preloaded for you:
- vocab and vocab_size: the vocabulary set and its size
- word_to_ix and ix_to_word: dictionaries for word-to-index and index-to-word mappings
- input_data and target_data: the dataset converted to input-output pairs
- embedding_dim and hidden_dim: dimensions for the embedding and the RNN hidden state
You can inspect the data variable in the console to see the example sentences.
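These objects are already available in the exercise environment; the sketch below only illustrates how they might be built. The example sentences, the all-but-last-word/last-word pairing, and the dimension values are illustrative assumptions, not the course's actual data.

# Illustrative sketch only -- the exercise preloads these for you.
data = ["pybooks recommends great novels", "readers love classic fiction"]  # hypothetical sentences

# Build the vocabulary and the index mappings
words = [w for sentence in data for w in sentence.split()]
vocab = set(words)
vocab_size = len(vocab)
word_to_ix = {word: ix for ix, word in enumerate(vocab)}
ix_to_word = {ix: word for word, ix in word_to_ix.items()}

# One possible input-output pairing: all but the last word predicts the last word
input_data = [torch.tensor([word_to_ix[w] for w in sentence.split()[:-1]]) for sentence in data]
target_data = [torch.tensor(word_to_ix[sentence.split()[-1]]) for sentence in data]

embedding_dim = 10   # assumed sizes; the preloaded values may differ
hidden_dim = 16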
This exercise is part of the course “Deep Learning for Text with PyTorch”.
Exercise instructions
- Create an embedding layer for the vocabulary with the given embedding_dim.
- Apply a linear transformation to the RNN sequence output to get the attention scores.
- Get the attention weights from the scores.
- Compute the context vector as the weighted sum of the RNN outputs and attention weights.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
class RNNWithAttentionModel(nn.Module):
    def __init__(self):
        super(RNNWithAttentionModel, self).__init__()
        # Create an embedding layer for the vocabulary
        self.embeddings = nn.____(vocab_size, embedding_dim)
        self.rnn = nn.RNN(embedding_dim, hidden_dim, batch_first=True)
        # Apply a linear transformation to get the attention scores
        self.attention = nn.____(____, 1)
        self.fc = nn.____(hidden_dim, vocab_size)

    def forward(self, x):
        x = self.embeddings(x)
        out, _ = self.rnn(x)
        # Get the attention weights
        attn_weights = torch.nn.functional.____(self.____(out).____(2), dim=1)
        # Compute the context vector
        context = torch.sum(____.____(2) * out, dim=1)
        out = self.fc(context)
        return out

attention_model = RNNWithAttentionModel()
optimizer = torch.optim.Adam(attention_model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
print("Model Instantiated")