1. Sequence to Sequence Models
This is the last chapter of this course, and also the most challenging. You will learn about sequence to sequence models and two of their applications: text generation, by creating a model that generates sentences in the style of The Big Bang Theory character Sheldon, and neural machine translation, by creating a model that translates short sentences from Portuguese into English.
2. Sequence to sequence
Sequence to sequence models can be divided into two groups: those with one output, which we saw when doing sentiment analysis and multi-class classification, and those with many outputs, which are the object of study in this chapter.
This second group includes text generation and neural machine translation.
3. Text generation: example
Text generation is the process of automatically creating textual content. State-of-the-art models perform so well that a person cannot tell that the text was created by a machine.
In the example above, however, some words are not correct, so we can easily tell that the text was generated by a machine.
4. Text generation: modeling
A text generation model should go through the same process as before.
First, though, you need to choose whether your tokens will be characters or words.
If you choose words, you will need a very large dataset, because the model has to predict over a big vocabulary.
If you choose characters, the vocabulary will be much smaller, since there are only 26 letters in the alphabet plus some punctuation marks.
Then you need to prepare the data for training, which means creating vectors of past tokens paired with the next token, as in the sketch below.
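As a rough illustration only (the `text` variable, the window length, and all names below are assumptions for this sketch, not the course's actual code), character-level data preparation could look like this:

```python
import numpy as np

# `text` stands in for the real training corpus (e.g. Sheldon's lines)
text = "some training text would go here ..."

# Build the (small) character vocabulary and an index for each character
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_len = 40  # how many past characters are used to predict the next one

# Slide a window over the text: each sample pairs `seq_len` past
# characters (X) with the single character that follows them (y)
X, y = [], []
for i in range(len(text) - seq_len):
    X.append([char_to_idx[c] for c in text[i:i + seq_len]])
    y.append(char_to_idx[text[i + seq_len]])

X = np.array(X)
y = np.array(y)
```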
The next step is to design the architecture of the model: choosing whether to use an embedding layer, how many RNN layers to stack, and so on.
Finally, you train the model, inspect the outcome, and make adjustments.
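To make the architecture step concrete, here is a compact Keras sketch (the layer sizes and hyperparameters are assumptions for illustration, not the course's actual model), reusing X, y, chars, and seq_len from the data preparation sketch:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

vocab_size = len(chars)  # character vocabulary from the previous sketch

# One possible design: an embedding layer, a single LSTM layer,
# and a softmax over the vocabulary to predict the next character
model = Sequential([
    Embedding(vocab_size, 32),
    LSTM(64),
    Dense(vocab_size, activation="softmax"),
])

# Integer targets pair with sparse categorical cross-entropy
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=10, batch_size=64)
```

During training, the model learns to assign high probability to the true next character given the previous seq_len characters; generation then repeatedly samples from this distribution.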
5. NMT: example
Neural Machine Translation is the process of automatically translating text from one language to another. It is used, for example, in Google Translate.
The model above was trained to translate short phrases, such as this three-word example. It is much more complex to train models that can translate a whole paragraph, an entire page, or even a full document: since the text is longer, we need more units in the RNN cell, meaning more memory cells to keep track of longer dependencies. The model thus becomes much bigger and needs more data and time to train.
6. NMT: modeling
The NMT model is similar to the text generation model, but it has to deal with two languages at the same time.
To acquire training data, we can look for open source projects such as Anki.
In the data preparation phase, we need to tokenize the two languages separately. Here, too, either characters or words can be used as tokens, but words are used more often, as in the sketch below.
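As a hedged sketch using the Keras Tokenizer (the sentence pairs below are made up for illustration; a real dataset such as the Anki one would contain many thousands of pairs), tokenizing the two languages separately might look like this:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Tiny illustrative sentence pairs
pt_sentences = ["eu gosto de gatos", "ela lê um livro"]
en_sentences = ["i like cats", "she reads a book"]

# Each language gets its own word-level tokenizer and vocabulary
pt_tokenizer = Tokenizer()
pt_tokenizer.fit_on_texts(pt_sentences)
en_tokenizer = Tokenizer()
en_tokenizer.fit_on_texts(en_sentences)

# Turn sentences into integer sequences and pad them to a fixed length
pt_seqs = pad_sequences(pt_tokenizer.texts_to_sequences(pt_sentences), padding="post")
en_seqs = pad_sequences(en_tokenizer.texts_to_sequences(en_sentences), padding="post")
```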
Then we have to design the model, which is split into two parts: the encoder and the decoder. More details will be given later in the course, but the sketch below shows the overall shape.
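As a rough functional-API sketch of this encoder-decoder split (the vocabulary sizes and dimensions are assumptions, and training details such as teacher forcing are left for later in the course), the model could look like this:

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense

pt_vocab, en_vocab = 5000, 5000  # illustrative vocabulary sizes
latent_dim = 128                 # illustrative number of LSTM units

# Encoder: reads the Portuguese sentence and summarizes it
# into its final hidden and cell states
enc_inputs = Input(shape=(None,))
enc_emb = Embedding(pt_vocab, 64)(enc_inputs)
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: generates the English sentence word by word,
# starting from the encoder's final states
dec_inputs = Input(shape=(None,))
dec_emb = Embedding(en_vocab, 64)(dec_inputs)
dec_outputs = LSTM(latent_dim, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
outputs = Dense(en_vocab, activation="softmax")(dec_outputs)

model = Model([enc_inputs, dec_inputs], outputs)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
```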
Again, as in all machine learning projects, you need to experiment and evaluate the results, then make adjustments as needed.
7. Chapter outline
In this chapter, you will learn in more detail about text generation and neural machine translation models.
First, you will learn how to use a pre-trained text generation model to generate sentences.
Then you will learn how to prepare the data and build the Keras model for this task.
Finally, you will do both steps for neural machine translation: prepare the data, build the model, and use it to translate Portuguese into English.
8. Let's practice!
Before going into details on the implementation of the models, let's see them in practice!