
Implementing multi-head attention

Before you dive in and build your own MultiHeadAttention class, you'll try using the class to see how it transforms the query, key, and value matrices. Recall that these matrices are generated by projecting the input embeddings through linear transformations with learned weights.

The query, key, and value matrices have already been created for you, and the MultiHeadAttention class has been defined for you.
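
The exercise environment supplies these objects, but as a rough illustration of the projection step described above, the following sketch shows how query, key, and value could be derived from input embeddings. The tensor names and shapes here are assumptions for illustration, not the exercise's actual setup.

import torch
import torch.nn as nn

# Assumed setup: a batch of input embeddings with model dimension 512
batch_size, seq_len, d_model = 2, 10, 512
embeddings = torch.randn(batch_size, seq_len, d_model)

# Learned linear projections produce the query, key, and value matrices
w_q = nn.Linear(d_model, d_model)
w_k = nn.Linear(d_model, d_model)
w_v = nn.Linear(d_model, d_model)

query = w_q(embeddings)
key = w_k(embeddings)
value = w_v(embeddings)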

This exercise is part of the course

Transformer Models with PyTorch


Exercise instructions

  • Define the attention parameters for eight attention heads and input embeddings with a dimensionality of 512.
  • Create an instance of the MultiHeadAttention class using the defined parameters.
  • Pass the query, key, and value matrices through the multihead_attn mechanism.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Define attention parameters
d_model = ____
num_heads = ____

# Create a MultiHeadAttention instance
multihead_attn = ____

# Pass the query, key, and value matrices through the mechanism
output = ____
print(output.shape)
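
For reference, here is one way the completed exercise might look. Because the exercise supplies its own MultiHeadAttention class, the minimal implementation below is only an assumption about its interface: a constructor taking d_model and num_heads, and a forward pass taking query, key, and value. The stand-in tensors are likewise placeholders for the ones the exercise creates for you.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of a MultiHeadAttention class with the assumed interface;
# the exercise provides its own implementation, which may differ
class MultiHeadAttention(nn.Module):
    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def split_heads(self, x):
        # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, head_dim)
        batch, seq_len, _ = x.size()
        return x.view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)

    def forward(self, query, key, value):
        q = self.split_heads(self.q_proj(query))
        k = self.split_heads(self.k_proj(key))
        v = self.split_heads(self.v_proj(value))
        # Scaled dot-product attention, computed for all heads at once
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        attn = F.softmax(scores, dim=-1) @ v
        # Recombine heads: (batch, num_heads, seq_len, head_dim) -> (batch, seq_len, d_model)
        batch, _, seq_len, _ = attn.size()
        attn = attn.transpose(1, 2).contiguous().view(batch, seq_len, -1)
        return self.out_proj(attn)

# Define attention parameters
d_model = 512
num_heads = 8

# Create a MultiHeadAttention instance (assumed constructor signature)
multihead_attn = MultiHeadAttention(d_model, num_heads)

# Stand-in query, key, and value tensors; the exercise supplies real ones
query = key = value = torch.randn(2, 10, d_model)

# Pass the query, key, and value matrices through the mechanism
output = multihead_attn(query, key, value)
print(output.shape)  # torch.Size([2, 10, 512])

Note that the output keeps the same shape as the input, (batch, seq_len, d_model): with d_model = 512 and num_heads = 8, each head attends over a 64-dimensional slice, and the heads are concatenated and projected back to d_model.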