Defining the Teacher Forcing model layers
You will now define a new-and-improved version of the machine translation model that you defined earlier. Did you know that systems like Google's Neural Machine Translation model were trained using this Teacher Forcing technique?
As you have already seen, your previous model needs to change slightly to adopt Teacher Forcing. In this exercise, you will make the necessary changes to that model. You have been provided with the language parameters en_len and fr_len (the length of a padded English/French sentence), en_vocab and fr_vocab (the vocabulary size of the English/French dataset), and hsize (the hidden layer size of the GRU layers). Remember that the decoder accepts a French sequence with one item fewer than fr_len. Also remember that we use the prefix en to refer to encoder-related objects and de for decoder-related ones.
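To see why the decoder's input is one item shorter than fr_len, consider how Teacher Forcing splits a target sentence: the decoder is fed the target sequence minus its last token, and is trained to predict the same sequence minus its first token. A small illustration in plain Python (the example sentence and the "sos"/"eos"/"pad" markers are made up for demonstration):

```python
# A hypothetical padded French sentence of length fr_len = 6,
# with illustrative start/end/padding markers.
fr_len = 6
fr_sentence = ["sos", "je", "suis", "etudiant", "eos", "pad"]

# Teacher Forcing: the decoder INPUT is the target sequence without
# its last token; the decoder TARGET is the same sequence shifted
# left by one position. Both have length fr_len - 1.
decoder_input = fr_sentence[:-1]
decoder_target = fr_sentence[1:]

assert len(decoder_input) == fr_len - 1
print(decoder_input)   # the ground-truth words fed to the decoder
print(decoder_target)  # the words the decoder must predict
```

At every timestep the decoder sees the true previous word rather than its own (possibly wrong) prediction, which stabilizes training.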
This exercise is part of the course Machine Translation with Keras.
Exercise instructions
- Import the layers submodule from tensorflow.keras.
- Get the encoder output and state values and assign them to en_out and en_state respectively.
- Define a decoder Input layer which accepts a fr_len-1 long sequence of one-hot encoded French words.
- Define a TimeDistributed Dense softmax layer with fr_vocab nodes.
Hands-on interactive exercise
Have a go at this exercise with the sample code below.
# Import the layers submodule from tensorflow.keras
import tensorflow.keras.layers as layers
en_inputs = layers.Input(shape=(en_len, en_vocab))
en_gru = layers.GRU(hsize, return_state=True)
# Get the encoder output and state
en_out, en_state = en_gru(en_inputs)
# Define the decoder input layer
de_inputs = layers.Input(shape=(fr_len - 1, fr_vocab))
de_gru = layers.GRU(hsize, return_sequences=True)
de_out = de_gru(de_inputs, initial_state=en_state)
# Define a TimeDistributed Dense softmax layer with fr_vocab nodes
de_dense = layers.TimeDistributed(layers.Dense(fr_vocab, activation='softmax'))
de_pred = de_dense(de_out)
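The exercise stops at defining the layers; to actually train with Teacher Forcing, you would wrap the encoder and decoder into a single two-input Model. A minimal self-contained sketch with small illustrative sizes (the nmt name, the sizes, and the compile settings are assumptions, not part of the exercise):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative sizes; the exercise provides the real values.
en_len, fr_len = 5, 6
en_vocab, fr_vocab, hsize = 20, 25, 16

# Encoder: consumes one-hot English words, returns its final state.
en_inputs = layers.Input(shape=(en_len, en_vocab))
en_gru = layers.GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

# Decoder: consumes fr_len-1 one-hot French words (Teacher Forcing),
# with its GRU initialized from the encoder's final state.
de_inputs = layers.Input(shape=(fr_len - 1, fr_vocab))
de_gru = layers.GRU(hsize, return_sequences=True)
de_out = de_gru(de_inputs, initial_state=en_state)

# Per-timestep softmax over the French vocabulary.
de_dense = layers.TimeDistributed(layers.Dense(fr_vocab, activation='softmax'))
de_pred = de_dense(de_out)

# Two inputs (English sentence, shifted French sentence), one output.
nmt = tf.keras.Model(inputs=[en_inputs, de_inputs], outputs=de_pred)
nmt.compile(optimizer='adam', loss='categorical_crossentropy')
```

Training would then call nmt.fit with the one-hot English sentences, the French sentences minus their last word as decoder inputs, and the French sentences minus their first word as targets.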