Introduction to the course

1. Introduction to the course

Hi, my name is David. I'm a Data Scientist who focuses on text data for real-world applications, and I am proud to be your instructor in this course, where you will be introduced to four different applications of language models using Recurrent Neural Networks with Python.

2. Text data is available online

So, why learn to model language (or text) data? Well, we know that Data Science models require data to be effective, and one kind of data that is available on the Internet is text. From news articles to tweets, the volume of text data is increasing fast and is freely accessible to anyone with an Internet connection.

3. Applications of machine learning to text data

So, what can Data Scientists do with all this data? In this course we will introduce four applications: sentiment analysis, multi-class classification, text generation, and neural machine translation. In the next slides, we will define each of these applications.

4. Sentiment analysis

If you interact with customers online, you may be interested in knowing how they feel about your brand or product. To do that, you can use sentiment analysis models to classify their messages as positive or negative.

5. Multi-class classification

Or perhaps you want to build a recommender system and need to categorize news articles into a set of pre-defined categories.

6. Text generation

Also, it is possible to generate text automatically using a specific writing style, or automatically reply to messages.

7. Neural machine translation

Lastly, it is also possible to create models that translate from one language to another.

8. Recurrent Neural Networks

All these applications are possible with a type of Deep Learning architecture called Recurrent Neural Networks. So what is different about RNN architectures, and why do we use them? The main advantages of using RNNs for text data are that they reduce the number of parameters of the model (by avoiding one-hot encoding) and that they share weights between different positions of the text. In the example, the model uses information from all the words to predict whether the movie review was good or not.
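To make these two advantages concrete, here is a minimal NumPy sketch. All sizes (vocabulary of 10,000 words, 64-dimensional embeddings, 32 hidden units, sequences of 100 words) are hypothetical choices for illustration, not values from the course; the comparison contrasts a dense layer on one-hot encoded sequences with an RNN that reuses the same weights at every position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration:
vocab_size, embed_dim, hidden_dim, seq_len = 10_000, 64, 32, 100

# A dense layer on a sequence of one-hot words needs a separate weight
# for every (position, vocabulary word, hidden unit) combination.
dense_one_hot = seq_len * vocab_size * hidden_dim        # 32,000,000

# An RNN with an embedding lookup reuses one set of weights at every
# position: embedding table + input-to-hidden + hidden-to-hidden.
rnn_shared = (vocab_size * embed_dim        # embedding table
              + embed_dim * hidden_dim      # input -> hidden (shared)
              + hidden_dim * hidden_dim)    # hidden -> hidden (shared)

# Weight sharing in action: the SAME matrices Wx, Wh at every step.
Wx = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1

h = np.zeros(hidden_dim)
sentence = rng.normal(size=(5, embed_dim))  # five word embeddings
for x in sentence:                          # same Wx and Wh reused each step
    h = np.tanh(x @ Wx + h @ Wh)
```

With these toy numbers, the shared-weight RNN uses roughly 50 times fewer parameters than the one-hot dense layer, which is the parameter reduction mentioned above.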

9. Sequence to sequence models

RNNs model sequence data and can have different lengths of inputs and outputs. Many inputs to one output is commonly used for classification tasks, where the final output is a probability distribution. This is used in the sentiment analysis and multi-class classification applications.
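The many-to-one pattern can be sketched in a few lines of NumPy: the model reads every word, but produces a single probability distribution from the final hidden state. The sizes and random weights below are hypothetical, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
embed_dim, hidden_dim, n_classes = 8, 16, 4   # hypothetical sizes

Wx = rng.normal(size=(embed_dim, hidden_dim)) * 0.1   # input -> hidden
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden
Wy = rng.normal(size=(hidden_dim, n_classes)) * 0.1   # hidden -> classes

def classify(sentence):
    """Many inputs -> one output: read the whole sequence, predict once."""
    h = np.zeros(hidden_dim)
    for x in sentence:                 # consume every word embedding
        h = np.tanh(x @ Wx + h @ Wh)
    logits = h @ Wy                    # one output, from the final state only
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()             # softmax: a probability distribution

probs = classify(rng.normal(size=(6, embed_dim)))  # a six-word "sentence"
```

A sentiment model would use `n_classes = 2` (positive/negative); multi-class classification just uses more categories.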

10. Sequence to sequence models

Many inputs to many outputs for text generation starts the same as in the classification case, but for the outputs, the model uses the previous prediction as the input to the next prediction.
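That feedback loop can be sketched as follows. This is a toy greedy-decoding sketch with made-up sizes and untrained random weights, not the course's actual model; the point is only the structure: after the seed text, each predicted word becomes the next input.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab_size, hidden_dim = 20, 16   # toy sizes, purely illustrative

E  = rng.normal(size=(vocab_size, hidden_dim)) * 0.1  # embedding lookup
Wh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden
Wy = rng.normal(size=(hidden_dim, vocab_size)) * 0.1  # hidden -> vocabulary

def generate(seed_ids, n_new):
    """Feed each prediction back in as the next input (greedy decoding)."""
    h = np.zeros(hidden_dim)
    for i in seed_ids:                    # warm up on the seed words
        h = np.tanh(E[i] + h @ Wh)
    out = list(seed_ids)
    for _ in range(n_new):
        next_id = int(np.argmax(h @ Wy))  # pick the most likely next word
        out.append(next_id)
        h = np.tanh(E[next_id] + h @ Wh)  # the prediction becomes the input
    return out

tokens = generate([3, 7, 1], n_new=5)     # three seed ids, five new words
```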

11. Sequence to sequence models

Many inputs to many outputs for neural machine translation is split into two blocks: an encoder and a decoder. The encoder learns the characteristics of the input language, while the decoder learns those of the output language. The encoder makes no predictions (no arrows going up), and the decoder receives no inputs (no arrows from below).
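A minimal sketch of that two-block structure, again with hypothetical sizes and untrained random weights: the encoder loop only updates its state (no outputs), and the decoder loop only emits predictions (no external inputs beyond the encoder's final state).

```python
import numpy as np

rng = np.random.default_rng(3)
embed_dim, hidden_dim, tgt_vocab = 8, 16, 12   # toy, illustrative sizes

# Encoder weights: read the source sentence; no predictions along the way.
We_x = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
We_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
# Decoder weights: no source inputs; only its state and its own predictions.
Wd_h = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
Wd_y = rng.normal(size=(hidden_dim, tgt_vocab)) * 0.1
Ed   = rng.normal(size=(tgt_vocab, hidden_dim)) * 0.1  # target embeddings

def translate(src_embeddings, n_out):
    # Encoder block: compress the source sentence into one state vector.
    h = np.zeros(hidden_dim)
    for x in src_embeddings:
        h = np.tanh(x @ We_x + h @ We_h)      # state updates, no output
    # Decoder block: start from the encoder state and emit target words.
    out = []
    for _ in range(n_out):
        word = int(np.argmax(h @ Wd_y))       # prediction at every step
        out.append(word)
        h = np.tanh(Ed[word] + h @ Wd_h)      # own prediction feeds next step
    return out

translation = translate(rng.normal(size=(4, embed_dim)), n_out=3)
```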

12. Sequence to sequence models

Many inputs to many outputs for language models starts with an artificial zero input, and then for every input word i the model tries to predict the next word, i plus one.
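The input/target alignment for a language model can be shown directly. The `"<zero>"` token name here is a hypothetical placeholder for the artificial zero input; the key point is that the targets are the same sentence shifted one position left.

```python
# Language modeling: predict word i+1 from the words up to word i.
sentence = ["the", "cat", "sat", "down"]

inputs  = ["<zero>"] + sentence[:-1]   # artificial zero input, then the words
targets = sentence                     # the same sentence, shifted left

# Each training pair is (word i, word i+1):
pairs = list(zip(inputs, targets))
```

So at the first step the model sees only the zero input and must predict "the"; at the last step it sees "sat" and must predict "down".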

13. Let's practice!

But before you dig into the RNN models, let's first review some of the prerequisites.