Defining the encoder
Here you'll take your first step towards creating a machine translation model: implementing the encoder. The encoder you will implement is a very simple model compared to the complex models used in real-world applications such as the Google machine translation service. But don't worry: though the model is simple, the concepts are the same as those of the complex models. We will use the prefix en (e.g. en_gru) to indicate anything encoder related and de (e.g. de_gru) to indicate anything decoder related.
You will see that we are choosing en_vocab to be smaller (150) than the actual value (228) that we found. Making the vocabulary smaller reduces the memory footprint of the model, and trimming it slightly is fine because doing so removes only the rarest words. For machine translation tasks, rare words usually carry less value than common words.
This exercise is part of the course
Machine Translation with Keras
Exercise instructions
- Define an Input layer for an input which has a vocabulary size en_vocab and a sequence length en_len, using the shape argument.
- Define a keras.layers.GRU layer that has hsize hidden units and returns its state.
- Get the outputs from the GRU layer by feeding in en_inputs, assigning the output to en_out and the GRU state to en_state.
- Define a keras.models.Model whose input is en_inputs and whose output is en_state, and print the model summary.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import tensorflow.keras as keras
en_len = 15
en_vocab = 150
hsize = 48
# Define an input layer
en_inputs = keras.layers.____(____=____)
# Define a GRU layer which returns the state
en_gru = ____(____, ____=____)
# Get the output and state from the GRU
____, ____ = ____(____)
# Define and print the model summary
encoder = ____(inputs=____, ____=____)
print(encoder.____)
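If you get stuck, the following is one way the blanks can be filled in, consistent with the instructions above (a sketch of a solution; the Input shape assumes the sentences are fed as en_len onehot vectors of size en_vocab):

```python
import tensorflow.keras as keras

en_len = 15     # length of the (padded) English sequences
en_vocab = 150  # truncated English vocabulary size
hsize = 48      # number of GRU hidden units

# Define an input layer: each sample is a sequence of en_len
# onehot-encoded words, each of dimension en_vocab
en_inputs = keras.layers.Input(shape=(en_len, en_vocab))

# Define a GRU layer which returns its final state as well as its output
en_gru = keras.layers.GRU(hsize, return_state=True)

# Get the output and state from the GRU; with return_state=True the layer
# returns both the output and the final hidden state
en_out, en_state = en_gru(en_inputs)

# Define the encoder, which maps the input sequence to the GRU's final
# state, and print the model summary
encoder = keras.models.Model(inputs=en_inputs, outputs=en_state)
print(encoder.summary())
```

The final state en_state is a (batch, hsize) tensor; it is this fixed-size vector that will later be handed to the decoder as a summary of the source sentence.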