Introduction to the course

1. Introduction to the course

Hi, my name is David. I'm a Data Scientist who focuses on text data for real-world applications, and I am proud to be your instructor in this course, where you will be introduced to four different applications of language models using Recurrent Neural Networks with Python.

2. Text data is available online

So, why learn to model language (or text) data? Well, we know that Data Science models require data to be effective, and one kind of data that is available on the Internet is text. From news articles to tweets, the volume of text data is increasing fast and is freely accessible to anyone with an Internet connection.

3. Applications of machine learning to text data

So, what can Data Scientists do with all this data? In this course we will introduce four applications: sentiment analysis, multi-class classification, text generation, and neural machine translation. In the next slides, we will define each of these applications.

4. Sentiment analysis

If you interact with customers online, you may be interested in knowing how they feel about your brand or product. To do that, you can use sentiment analysis models to classify their messages as positive or negative.

5. Multi-class classification

Or perhaps you want to build a recommender system and need to categorize news articles into a set of pre-defined categories.

6. Text generation

Also, it is possible to generate text automatically using a specific writing style, or automatically reply to messages.

7. Neural machine translation

Lastly, it is also possible to create models that translate from one language to another.

8. Recurrent Neural Networks

All these applications are possible with a type of Deep Learning architecture called Recurrent Neural Networks. So what is different about RNN architectures, and why do we use them? The main advantages of using RNNs for text data are that they reduce the number of parameters of the model (by avoiding one-hot encoding) and that they share weights between different positions of the text. In the example, the model uses information from all the words to predict whether the movie review was good or not.
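To make these two advantages concrete, here is a minimal NumPy sketch. All sizes (vocabulary of 10,000 words, 64-dimensional embeddings, 32 hidden units, sequences of 100 words) are hypothetical choices for illustration, not values from the course; the comparison contrasts a dense layer on one-hot encoded sequences with an RNN that reuses the same weights at every position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration:
vocab_size, embed_dim, hidden_dim, seq_len = 10_000, 64, 32, 100

# A dense layer on a sequence of one-hot words needs a separate weight
# for every (position, vocabulary word, hidden unit) combination.
dense_one_hot = seq_len * vocab_size * hidden_dim        # 32,000,000

# An RNN with an embedding lookup reuses one set of weights at every
# position: embedding table + input-to-hidden + hidden-to-hidden.
rnn_shared = (vocab_size * embed_dim        # embedding table
              + embed_dim * hidden_dim      # input -> hidden (shared)
              + hidden_dim * hidden_dim)    # hidden -> hidden (shared)

# Weight sharing in action: the SAME matrices Wx, Wh at every step.
Wx = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1

h = np.zeros(hidden_dim)
sentence = rng.normal(size=(5, embed_dim))  # five word embeddings
for x in sentence:                          # same Wx and Wh reused each step
    h = np.tanh(x @ Wx + h @ Wh)
```

With these toy numbers, the shared-weight RNN uses roughly 50 times fewer parameters than the one-hot dense layer, which is the parameter reduction mentioned above.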

9. Sequence to sequence models

RNNs model sequence data and can have different lengths of inputs and outputs. Many inputs to one output is commonly used for classification tasks, where the final output is a probability distribution. This is used in the sentiment analysis and multi-class classification applications.
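The many-to-one pattern can be sketched in a few lines of NumPy: the model reads every word, but produces a single probability distribution from the final hidden state. The sizes and random weights below are hypothetical, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
embed_dim, hidden_dim, n_classes = 8, 16, 4   # hypothetical sizes

Wx = rng.normal(size=(embed_dim, hidden_dim)) * 0.1   # input -> hidden
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden
Wy = rng.normal(size=(hidden_dim, n_classes)) * 0.1   # hidden -> classes

def classify(sentence):
    """Many inputs -> one output: read the whole sequence, predict once."""
    h = np.zeros(hidden_dim)
    for x in sentence:                 # consume every word embedding
        h = np.tanh(x @ Wx + h @ Wh)
    logits = h @ Wy                    # one output, from the final state only
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()             # softmax: a probability distribution

probs = classify(rng.normal(size=(6, embed_dim)))  # a six-word "sentence"
```

A sentiment model would use `n_classes = 2` (positive/negative); multi-class classification just uses more categories.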

10. Sequence to sequence models

Many inputs to many outputs for text generation starts the same as in the classification case, but for the outputs, the model uses the previous prediction as the input to the next prediction.
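That feedback loop can be sketched as follows. This is a toy greedy-decoding sketch with made-up sizes and untrained random weights, not the course's actual model; the point is only the structure: after the seed text, each predicted word becomes the next input.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab_size, hidden_dim = 20, 16   # toy sizes, purely illustrative

E  = rng.normal(size=(vocab_size, hidden_dim)) * 0.1  # embedding lookup
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden
Wy = rng.normal(size=(hidden_dim, vocab_size)) * 0.1  # hidden -> vocabulary

def generate(seed_ids, n_new):
    """Feed each prediction back in as the next input (greedy decoding)."""
    h = np.zeros(hidden_dim)
    for i in seed_ids:                    # warm up on the seed words
        h = np.tanh(E[i] + h @ Wh)
    out = list(seed_ids)
    for _ in range(n_new):
        next_id = int(np.argmax(h @ Wy))  # pick the most likely next word
        out.append(next_id)
        h = np.tanh(E[next_id] + h @ Wh)  # the prediction becomes the input
    return out

tokens = generate([3, 7, 1], n_new=5)     # three seed ids, five new words
```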

11. Sequence to sequence models

Many inputs to many outputs for neural machine translation is split into two blocks: an encoder and a decoder. The encoder learns the characteristics of the input language, while the decoder learns those of the output language. The encoder makes no predictions (no arrows going up), and the decoder receives no inputs (no arrows from below).
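A minimal sketch of that two-block structure, again with hypothetical sizes and untrained random weights: the encoder loop only updates its state (no outputs), and the decoder loop only emits predictions (no external inputs beyond the encoder's final state).

```python
import numpy as np

rng = np.random.default_rng(3)
embed_dim, hidden_dim, tgt_vocab = 8, 16, 12   # toy, illustrative sizes

# Encoder weights: read the source sentence; no predictions along the way.
We_x = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
We_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
# Decoder weights: no source inputs; only its state and its own predictions.
Wd_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
Wd_y = rng.normal(size=(hidden_dim, tgt_vocab)) * 0.1
Ed   = rng.normal(size=(tgt_vocab, hidden_dim)) * 0.1  # target embeddings

def translate(src_embeddings, n_out):
    # Encoder block: compress the source sentence into one state vector.
    h = np.zeros(hidden_dim)
    for x in src_embeddings:
        h = np.tanh(x @ We_x + h @ We_h)      # state updates, no output
    # Decoder block: start from the encoder state and emit target words.
    out = []
    for _ in range(n_out):
        word = int(np.argmax(h @ Wd_y))       # prediction at every step
        out.append(word)
        h = np.tanh(Ed[word] + h @ Wd_h)      # own prediction feeds next step
    return out

translation = translate(rng.normal(size=(4, embed_dim)), n_out=3)
```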

12. Sequence to sequence models

Many inputs to many outputs for language models starts with an artificial zero input, and then for every input word i the model tries to predict the next word, i plus one.
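The input/target alignment for a language model can be shown directly. The `"<zero>"` token name here is a hypothetical placeholder for the artificial zero input; the key point is that the targets are the same sentence shifted one position left.

```python
# Language modeling: predict word i+1 from the words up to word i.
sentence = ["the", "cat", "sat", "down"]

inputs  = ["<zero>"] + sentence[:-1]   # artificial zero input, then the words
targets = sentence                     # the same sentence, shifted left

# Each training pair is (word i, word i+1):
pairs = list(zip(inputs, targets))
```

So at the first step the model sees only the zero input and must predict "the"; at the last step it sees "sat" and must predict "down".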

13. Let's practice!

But before you dig into the RNN models, let's first review some of the prerequisites.