Introduction to machine translation

1. Introduction to machine translation

Bonjour! Hallo! I am Thushan Ganegedara, and in this course you will learn to implement machine translation models using the popular deep learning library Keras.

2. Machine translation

The ability to communicate in foreign languages helps us in many situations, such as when traveling overseas.

3. Machine translation

Machine translation services, such as Google's translation service, can help you understand hundreds of languages at the press of a button. In this course, you will learn the inner workings of the models that power these services.

4. Course outline

In chapter 1, you will be introduced to machine translation and the encoder-decoder architecture, which is a common deep learning architecture for machine translation models. Next, in chapter 2, you will implement an encoder-decoder model using the Keras functional API. In chapter 3, you will learn how to train a model and generate translations using the trained model. Finally, you will learn about and implement several techniques that improve the performance of machine translation models, such as teacher forcing.

5. Dataset (English-French sentence corpus)

The dataset that you'll be using in this course consists of two text files. One file contains a set of English sentences, with a single sentence on each line. The other file contains the corresponding French translations of the English sentences.
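
To make the two-file layout concrete, here is a minimal sketch of how such a corpus could be read into sentence pairs. It assumes the two files are line-aligned, meaning line n of the French file translates line n of the English file. The function name load_parallel_corpus, the file names, and the two sample sentences are all placeholders for illustration, not the actual course files.

```python
import os
import tempfile


def load_parallel_corpus(en_path, fr_path):
    """Read two line-aligned text files and return (English, French) pairs."""
    with open(en_path, encoding="utf-8") as en_file:
        en_sentences = [line.strip() for line in en_file]
    with open(fr_path, encoding="utf-8") as fr_file:
        fr_sentences = [line.strip() for line in fr_file]
    # Assumption: both files have one sentence per line, in the same order.
    if len(en_sentences) != len(fr_sentences):
        raise ValueError("The two files must have the same number of lines.")
    return list(zip(en_sentences, fr_sentences))


# Write two tiny stand-in files so that the sketch runs on its own.
# The file names and sentences are hypothetical placeholders.
with tempfile.TemporaryDirectory() as tmp_dir:
    en_path = os.path.join(tmp_dir, "sample_en.txt")
    fr_path = os.path.join(tmp_dir, "sample_fr.txt")
    with open(en_path, "w", encoding="utf-8") as f:
        f.write("I like cats\nI like dogs\n")
    with open(fr_path, "w", encoding="utf-8") as f:
        f.write("J'aime les chats\nJ'aime les chiens\n")
    pairs = load_parallel_corpus(en_path, fr_path)

print(pairs[0])
```

Keeping the sentences as pairs makes it harder to accidentally shuffle one language without the other later on.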

6. Machine translation - Overview

Here, you can see an example of a machine translation task. We want to translate the English sentence "I like cats" to French.

7. Machine translation - Overview

In machine translation terminology, the English language, the language of the sentence to be translated, is called the source language. The French language, the language of the translated sentence, is called the target language.

8. Machine translation - Overview

Let's now see how a machine translation model can be used to translate a sentence. First, the words of the source sentence are fed to the model one-by-one, sequentially.

9. Machine translation - Overview

Then the model outputs the predicted translation word-by-word in a sequential manner.

10. One-hot encoded vectors

When feeding words to a machine translation model, the words need to be converted to a numerical representation. One-hot encoding is one of the commonly used transformations. In one-hot encoding, a word is represented as a vector of zeros with a single one. For example, in the sentence "I like cats", the word "I" can be represented with a vector of length five whose first element is one and whose remaining elements are zero. The length of the vector is determined by the size of the vocabulary. The vocabulary is the collection of unique words used in the dataset for a specific language.
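
Before turning to Keras, it may help to see a one-hot vector built by hand. The sketch below uses NumPy and a hypothetical five-word vocabulary; the words "love" and "dogs" and the ordering of the list are assumptions made only so that the vector has length five, and one_hot is an illustrative helper, not a library function.

```python
import numpy as np

# Hypothetical five-word vocabulary; the order is an arbitrary assumption.
vocabulary = ["I", "like", "cats", "love", "dogs"]


def one_hot(word, vocabulary):
    """Return a vector of zeros with a single one at the word's position."""
    vector = np.zeros(len(vocabulary), dtype=np.float32)
    vector[vocabulary.index(word)] = 1.0
    return vector


onehot_i = one_hot("I", vocabulary)
onehot_cats = one_hot("cats", vocabulary)
print(onehot_i)
print(onehot_cats)
```

Since "I" sits at position zero of this vocabulary, its vector has the one in the first element; "cats" sits at position two, so its one is in the third element.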

11. One-hot encoded vectors

In Keras, you can use the convenient to_categorical function to convert words to one-hot encoded vectors. However, in order to use this function, you will first need to convert individual words to integers, or IDs. To do that, you define a Python dictionary that maps words to integers. Then you create a list called word_ids by looking up the ID of each word in that dictionary.
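
The word-to-ID step just described could look like the sketch below. The dictionary name word2index and the particular ID assigned to each word are assumptions for illustration; any consistent assignment of unique integers would work.

```python
# Hypothetical dictionary mapping each word to an integer ID.
word2index = {"I": 0, "like": 1, "cats": 2}

# The words of the example sentence, in order.
words = ["I", "like", "cats"]

# Look up the ID of each word to build the list of word IDs.
word_ids = [word2index[word] for word in words]
print(word_ids)
```

A word that is missing from the dictionary would raise a KeyError here, so in practice the dictionary needs to cover every word you intend to encode.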

12. One-hot encoded vectors

By passing these word IDs to the to_categorical function, you can obtain the one-hot vectors. If you don't pass in the number of classes, that is, the length of the vector, Keras will infer it from the data that you pass, as the largest ID plus one. But you can fix the length of the vector by passing the num_classes argument to the function. It is usually good practice to fix the length. In situations where your training and testing data contain different words, leaving the length unfixed can produce vectors of different lengths, which leads to errors and unexpected behavior.
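
Here is a small sketch of the difference between inferring and fixing the vector length. It assumes TensorFlow is installed and imports to_categorical from tensorflow.keras.utils; the word2index dictionary, the choice of num_classes=5, and the variable names are illustrative assumptions.

```python
from tensorflow.keras.utils import to_categorical

# Hypothetical dictionary mapping each word to an integer ID.
word2index = {"I": 0, "like": 1, "cats": 2}

train_ids = [word2index[w] for w in ["I", "like", "cats"]]
test_ids = [word2index[w] for w in ["I", "like"]]

# Length inferred from the data: largest ID plus one.
train_inferred = to_categorical(train_ids)  # 3 columns
test_inferred = to_categorical(test_ids)    # 2 columns, a mismatch

# Length fixed explicitly, so both arrays have the same number of columns.
train_fixed = to_categorical(train_ids, num_classes=5)
test_fixed = to_categorical(test_ids, num_classes=5)

print(train_inferred.shape, test_inferred.shape)
print(train_fixed.shape, test_fixed.shape)
```

With the length left unfixed, the second array ends up narrower than the first simply because the word "cats" never appears in it, which is the kind of mismatch that fixing num_classes avoids.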

13. Let's practice!

Great! Now let's have some fun with one-hot vectors.