
Text Generation Models

1. Text Generation Models

You saw how cool it can be to generate text using RNN models. But you haven't actually seen the models! Let's fix that.

2. Similar to a classification model

A text generation model can be seen as a multi-class classification model where the classes are the tokens in the vocabulary. As in a classification model, it uses a softmax activation function on the last layer and categorical cross-entropy as the loss function.

3. Example model using keras

As an example, we create a sequential model with two LSTM layers. We initialize the model class. Then we add the first LSTM layer, defining input_shape as the characters window length by the vocabulary size. We also add dropout values to avoid overfitting and set the layer to return sequences. Next, we add the second LSTM layer, again with dropout values, this time not returning sequences. Then we add the output layer with a softmax activation function. Finally, we compile the model with the categorical cross-entropy loss function.
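Below is a minimal sketch of that architecture in Keras. The number of memory cells, the dropout rates, and the chars_window and vocab_size values are illustrative assumptions, not values from the course.

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense

chars_window = 40   # length of the input character window (assumption)
vocab_size = 50     # number of distinct characters in the vocabulary (assumption)

model = Sequential()
# First LSTM layer: returns the full sequence so the next LSTM can consume it
model.add(LSTM(64, input_shape=(chars_window, vocab_size),
               dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
# Second LSTM layer: returns only its last output
model.add(LSTM(64, dropout=0.2, recurrent_dropout=0.2, return_sequences=False))
# Output layer: one probability per token in the vocabulary
model.add(Dense(vocab_size, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
```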

4. But not really classification model

The model architecture is the same as in a classification model, but the way we use the model is different. During training and testing, we don't compute performance metrics such as accuracy, because we are not interested in predicting one exact sentence; instead, we want the model to be flexible and able to generate sentences that make sense. To measure performance, humans need to read the generated texts and judge whether they are good. If they are not, we can train the model for more epochs, or add complexity to the model by increasing the number of memory cells, adding more layers, and so on. Also, depending on the task you are working on, you may want to generate a single character, word, sentence, or paragraph. So the output of the model is not the final result, and it needs further processing.
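As an illustration of that further processing, one common approach is to repeatedly sample the next character from the softmax output and append it to the text. This is a sketch, assuming the model defined above and character index mappings like those built in the data prep step later in this lesson:

```python
import numpy as np

def generate_text(model, seed, n_chars, chars_window, vocab_size,
                  char_to_index, index_to_char):
    """Sample n_chars new characters, one at a time, from the softmax output."""
    text = seed
    for _ in range(n_chars):
        # One-hot encode the last chars_window characters as the model input
        x = np.zeros((1, chars_window, vocab_size))
        for t, ch in enumerate(text[-chars_window:]):
            x[0, t, char_to_index[ch]] = 1.0
        probs = model.predict(x, verbose=0)[0]
        # Renormalize (float32 softmax output can drift from summing to 1), then sample
        probs = probs.astype('float64')
        probs /= probs.sum()
        next_index = np.random.choice(vocab_size, p=probs)
        text += index_to_char[next_index]
    return text
```

Sampling from the distribution, rather than always taking the most likely character, keeps the output varied; a greedy argmax tends to loop on the most frequent phrases.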

5. Other applications

Apart from generating dialogue for TV show characters or poetry, there are other applications for text generation. For example, it can create names, such as baby names or names for stars. It is also possible to generate marked-up text such as LaTeX, Markdown, or XML. Finally, it can create news articles or power chatbots.

6. Data prep

To prepare the data for text generation tasks, we transform the text into a sequence of indexes as usual. We will use characters as tokens for text generation. Then we create the data for supervised learning, meaning that we keep a sequence of characters in X and the next character in y. We pad the sequences in X so they all have the same length; this is the number of characters used to predict the next one, and the variable is called chars window. y's dimension is the vocabulary size, that is, the number of distinct characters in the training data, and it has to be one-hot encoded.
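Here is a minimal sketch of this preparation, assuming the corpus is already loaded as a string. Because a fixed-length sliding window is used, all sequences already share the same length, so no explicit padding step is needed here; the window length of 40 is an illustrative assumption.

```python
import numpy as np

# Assumption: the training corpus is already available as one string
text = open('corpus.txt').read()

chars = sorted(set(text))                      # distinct characters = vocabulary
vocab_size = len(chars)
char_to_index = {c: i for i, c in enumerate(chars)}
index_to_char = {i: c for c, i in char_to_index.items()}

chars_window = 40  # how many characters are used to predict the next one

# Sliding window: each input sequence is followed by the character to predict
sentences, next_chars = [], []
for i in range(len(text) - chars_window):
    sentences.append(text[i:i + chars_window])
    next_chars.append(text[i + chars_window])

# One-hot encode inputs and targets
X = np.zeros((len(sentences), chars_window, vocab_size), dtype=np.float32)
y = np.zeros((len(sentences), vocab_size), dtype=np.float32)
for i, sentence in enumerate(sentences):
    for t, ch in enumerate(sentence):
        X[i, t, char_to_index[ch]] = 1.0
    y[i, char_to_index[next_chars[i]]] = 1.0
```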

7. Let's practice!

Text generation can be very fun to experiment with on your favorite text data. Now, let's see how to put it all together to create cool models. In the following exercises, you will first learn how to use a pre-trained model, and then how to build one yourself.
