Implementing the decoder
1. Defining the decoder
Now it's time to learn how to implement the decoder.
2. Encoder-decoder model
You have already implemented the encoder, which consumes the source language (English) words one by one and finally produces the context vector. Now you need to implement the decoder, which consumes the context vector and produces the target language (French) words one by one.
3. Input of the decoder
The decoder will be implemented similarly to the encoder, using a Keras GRU layer. But this poses a question. The Keras GRU layer requires an input to process in addition to the hidden state. What would that input be?
4. Input of the decoder
One solution would be to repeat the context vector to fit the length of the French sentence you want to produce. For example, if you need a French sentence of length 10, you repeat the context vector 10 times. The context vector qualifies as a good input to the decoder, as it is a representation of the English sentence the encoder has seen. For this purpose, you can use the RepeatVector layer provided in Keras, which repeats its input a given number of times. Note that this is not the only solution, and you will see a better one in the final chapter.
5. Understanding the RepeatVector layer
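To make the repetition idea concrete, here is a minimal NumPy sketch of the same operation. It is only an illustration and not the Keras layer itself; the context vector values, the hidden size of 4, and the target length of 10 are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical context vector: a batch of 1 sentence, hidden size 4.
context = np.array([[0.1, 0.2, 0.3, 0.4]])  # shape: (1, 4)
fr_len = 10  # assumed length of the French sentence to produce

# Add a time axis, then copy the vector along it:
# (1, 4) -> (1, 1, 4) -> (1, 10, 4)
decoder_input = np.repeat(context[:, np.newaxis, :], fr_len, axis=1)

print(decoder_input.shape)  # (1, 10, 4)
```

Every time step of `decoder_input` holds the same context vector, which is the kind of input the decoder receives in this approach.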
To use the RepeatVector layer, you need to provide one argument, which specifies the sequence length, that is, the number of times you want to repeat the data. It takes an input of size batch size by input size. After the transformation, it produces an output of size batch size by sequence length by input size, which is the type of input accepted by a GRU layer.
6. Defining a RepeatVector layer
You can define a RepeatVector layer as shown here, which will repeat r_inp 5 times in this example. Then you can define a simple Keras model that repeats a given input of size 3, 5 times. Note that the two code snippets shown are equivalent, and either can be used to obtain the output of a RepeatVector layer.
7. Predicting with the model
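The slide code is not part of this transcript, so the following is a minimal sketch of how the RepeatVector model and the prediction step might look, assuming the `tensorflow.keras` API. Only `r_inp` is named in the narration; the names `rep`, `r_out`, `model`, `x` and `y`, and the sample values, are assumptions.

```python
import numpy as np
from tensorflow.keras.layers import Input, RepeatVector
from tensorflow.keras.models import Model

# Input of size 3 (the batch dimension is implicit).
r_inp = Input(shape=(3,))

# One way: create the layer first, then call it on the input.
rep = RepeatVector(5)
r_out = rep(r_inp)
# An equivalent one-line form would be:
# r_out = RepeatVector(5)(r_inp)

# A simple model that only repeats its input 5 times.
model = Model(inputs=r_inp, outputs=r_out)

# A 2 by 3 array: a batch of 2 samples, each of size 3.
x = np.array([[0, 1, 2], [3, 4, 5]], dtype="float32")
y = model.predict(x)

print(y.shape)  # (2, 5, 3)
```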
You can now use this model to see what happens in the RepeatVector layer. For example, if you provide an array of shape 2 by 3, the RepeatVector layer will repeat each of the 2 rows 5 times and output an array of shape 2 by 5 by 3.
8. Implementing the decoder
Let's now use the RepeatVector layer along with other layers to implement the decoder. First, you can define the decoder input by using the RepeatVector layer to repeat the encoder state fr_len times. fr_len is the average length of a French sentence. Then, let's create the decoder GRU layer. Remember that, unlike in the encoder, you need all the outputs of the decoder GRU layer, not just the last output. This is because each of those GRU outputs is later used to predict the correct French word for the corresponding position of the decoder. Therefore, don't forget to set the return_sequences argument to True. You also need to provide the context vector produced by the encoder as the initial state of the decoder, as we don't want the decoder to start without any memory from the encoder. To do that, when you pass the input de_inputs to the decoder GRU layer, you need to set the initial_state argument to en_state, which is the context vector.
9. Defining the model
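The decoder steps above, together with the model definition, can be sketched as follows, assuming the `tensorflow.keras` API. The names `en_inputs`, `en_state`, `de_inputs` and `fr_len` come from the narration; the other names and all sizes (`en_len`, `en_vocab`, `hsize`, and the value of `fr_len`) are assumptions made for illustration, and the encoder is re-created here only so the example is self-contained.

```python
import numpy as np
from tensorflow.keras.layers import Input, GRU, RepeatVector
from tensorflow.keras.models import Model

# Assumed sizes, chosen only for this sketch.
en_len, en_vocab = 15, 150  # English sentence length and vocabulary size
fr_len = 20                 # French sentence length
hsize = 48                  # GRU hidden size

# Encoder: keep only the final state, which is the context vector.
en_inputs = Input(shape=(en_len, en_vocab))
en_gru = GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

# Decoder input: the context vector repeated fr_len times.
de_inputs = RepeatVector(fr_len)(en_state)

# Decoder GRU: return an output for every position and start
# from the encoder's context vector.
de_gru = GRU(hsize, return_sequences=True)
de_out = de_gru(de_inputs, initial_state=en_state)

# Model from the English inputs to all decoder GRU outputs.
enc_dec = Model(inputs=en_inputs, outputs=de_out)

# Check the output shape on a dummy batch of 2 sentences.
dummy = np.zeros((2, en_len, en_vocab), dtype="float32")
print(enc_dec.predict(dummy).shape)  # (2, 20, 48)
```

Both GRU layers use the same hidden size here so that `en_state` has the shape the decoder expects for its initial state.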
Finally, you define a Keras model which contains both the encoder and the decoder. Its input is en_inputs, the English words, and its output is the set of GRU outputs for all the sequence positions of the decoder.
10. Let's practice!
Step by step, you are getting to the final solution. Let's now implement the decoder of the encoder-decoder model.