
Implementing the full encoder-decoder model

1. Implementing the full encoder-decoder model

In this video, you will implement the full encoder-decoder model.

2. What you implemented so far

So far, you have implemented an encoder, which consumes the source (English) input words and produces a context vector. Then you implemented a decoder, which consumes the context vector as its input using the RepeatVector layer and produces a sequence of outputs from the GRU layer.
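As a recap, the two pieces built so far might look roughly like the sketch below. The sizes (en_len, en_vocab, fr_len, hsize), the one-hot encoded input, and the use of en_state as the decoder GRU's initial state are assumptions made for illustration; the transcript does not state them.

```python
from tensorflow.keras.layers import Input, GRU, RepeatVector

# Illustrative sizes (assumptions, not values from the video)
en_len, en_vocab = 15, 150   # English sentence length and vocabulary size
fr_len = 20                  # French sentence length
hsize = 48                   # number of GRU units

# Encoder: reads one-hot encoded English words and returns its final state,
# which serves as the context vector
en_inputs = Input(shape=(en_len, en_vocab))
en_gru = GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

# Decoder: repeat the context vector fr_len times and run a GRU over it,
# returning one output per French position
de_inputs = RepeatVector(fr_len)(en_state)
de_gru = GRU(hsize, return_sequences=True)
de_out = de_gru(de_inputs, initial_state=en_state)  # initial_state is an assumption
```

At this point de_out has one hsize-dimensional vector per decoder position, but nothing yet maps those vectors to French words.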

3. Top part of the decoder

However, you still need a layer that produces the French words for each position of the decoder. The word at each position is obtained from a probabilistic output over the French vocabulary. A Keras Dense layer with a softmax activation can be used to get this probabilistic output. The Dense layer consumes an input (here, a single GRU output) and produces a probability distribution describing how likely each French word is to be the correct translation for that position. Because the decoder produces a sequence of GRU outputs, you wrap the Dense layer in a TimeDistributed layer, which applies the same Dense layer to every time step of a time-series input.
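To make the idea concrete, here is a minimal sketch that applies a TimeDistributed Dense layer to a stand-in for the GRU outputs and checks that each position yields a probability distribution. The sizes and the random input are assumptions for illustration only.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, TimeDistributed
from tensorflow.keras.models import Model

fr_len, hsize, fr_vocab = 20, 48, 200  # illustrative sizes (assumptions)

# Stand-in for the sequence of decoder GRU outputs
gru_outputs = Input(shape=(fr_len, hsize))

# One Dense layer, applied at every one of the fr_len positions
de_dense = Dense(fr_vocab, activation='softmax')
de_pred = TimeDistributed(de_dense)(gru_outputs)
head = Model(inputs=gru_outputs, outputs=de_pred)

# Two random "sentences" of GRU outputs
x = np.random.rand(2, fr_len, hsize).astype('float32')
probs = head.predict(x, verbose=0)

print(probs.shape)                # (2, 20, 200)
print(probs.sum(axis=-1)[0, :3])  # each position sums to roughly 1
```

The most likely word at each position can then be read off with probs.argmax(axis=-1), which gives one vocabulary index per position.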

4. Implementing the full model

You can now implement the full model. You have already implemented the encoder, which takes en_inputs and produces en_state as the output, and the decoder, whose input de_inputs is the context vector repeated fr_len times and whose output is de_out.

5. Implementing the full model

Then you can add the French word prediction functionality using a Dense layer and a TimeDistributed layer as shown here, where fr_vocab is the size of the French vocabulary. Finally, de_pred will hold the probability predictions over all French words for each position of the decoder.
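Since the slide code is not part of this transcript, the following is a sketch of how the pieces described above could be wired together up to de_pred. The sizes, the one-hot input, and the initial_state argument are assumptions; the variable names follow the narration.

```python
from tensorflow.keras.layers import Input, GRU, RepeatVector, Dense, TimeDistributed

# Illustrative sizes (assumptions)
en_len, en_vocab = 15, 150
fr_len, fr_vocab = 20, 200
hsize = 48

# Encoder: en_inputs -> en_state (the context vector)
en_inputs = Input(shape=(en_len, en_vocab))
en_out, en_state = GRU(hsize, return_state=True)(en_inputs)

# Decoder: context vector repeated fr_len times -> de_out
de_inputs = RepeatVector(fr_len)(en_state)
de_out = GRU(hsize, return_sequences=True)(de_inputs, initial_state=en_state)

# Prediction layer: a softmax Dense layer applied at every decoder position
de_dense = Dense(fr_vocab, activation='softmax')
de_dense_time = TimeDistributed(de_dense)
de_pred = de_dense_time(de_out)

print(de_pred.shape)  # (None, 20, 200): batch, decoder position, French vocabulary
```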

6. Compiling the model

Then, you can define a Keras model with en_inputs as the input and de_pred as the output. de_pred contains the probabilistic predictions of the French words for all decoder positions. Finally, you have to compile the full model with an optimizer and a loss function, both of which are required to train a model. Additionally, you can define metrics such as accuracy, which allow you to monitor the model as it is trained. When you define a list of metrics, you actually get one more value than the metrics you listed: the value of the loss function you passed as the loss argument. In our example, you get two values for a given set of data, the categorical crossentropy loss and the accuracy.
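A hedged sketch of this step follows. The model name nmt, the 'adam' optimizer, the sizes, and the random one-hot data are all assumptions chosen for illustration; only the choice of categorical crossentropy loss and an accuracy metric comes from the narration.

```python
import numpy as np
from tensorflow.keras.layers import Input, GRU, RepeatVector, Dense, TimeDistributed
from tensorflow.keras.models import Model

# Illustrative sizes (assumptions), kept small so the example runs quickly
en_len, en_vocab = 6, 30
fr_len, fr_vocab = 8, 40
hsize = 16

# Encoder, decoder and prediction layer, as described earlier
en_inputs = Input(shape=(en_len, en_vocab))
en_out, en_state = GRU(hsize, return_state=True)(en_inputs)
de_inputs = RepeatVector(fr_len)(en_state)
de_out = GRU(hsize, return_sequences=True)(de_inputs, initial_state=en_state)
de_pred = TimeDistributed(Dense(fr_vocab, activation='softmax'))(de_out)

# Full model: English input in, French word probabilities out
nmt = Model(inputs=en_inputs, outputs=de_pred)
nmt.compile(optimizer='adam',                  # optimizer choice is an assumption
            loss='categorical_crossentropy',
            metrics=['accuracy'])

# Random one-hot sentences standing in for real data
x = np.eye(en_vocab)[np.random.randint(en_vocab, size=(4, en_len))]
y = np.eye(fr_vocab)[np.random.randint(fr_vocab, size=(4, fr_len))]

# One metric was listed, but two values come back: the loss, then the accuracy
results = nmt.evaluate(x, y, verbose=0)
print(results)
```

With an untrained model the two numbers are not meaningful in themselves; the point is only that the loss is reported alongside the metrics you listed.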

7. Let's practice!

Great! Now that you have learned all about the encoder-decoder model, let's practice!