Tokenizing sentences with Keras

In this exercise you will get hands-on practice with the Keras Tokenizer, a utility that handles essential text preprocessing in a few lines of code. For example, once it has been fitted, the Keras Tokenizer maps every word in your vocabulary to an integer ID with a single function call. Here, you will learn about this in more detail.

You will create a Keras Tokenizer object and fit it on some text, which lets the Tokenizer build a dictionary that maps each word to its corresponding ID. The text used to fit the Tokenizer is obtained from the Udacity GitHub repo.

Define a Keras Tokenizer object.
Fit the tokenizer on en_text.
Get the word ID for each word w in the given list ["january", "apples", "summer"].
Print the word and its corresponding ID.
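
To make the word-to-ID dictionary concrete, the steps above can be imitated in plain Python. This is only a sketch and not the Keras implementation: SimpleTokenizer is a hypothetical stand-in, the two sentences assigned to en_text are made-up samples rather than the exercise data, and the rules it follows (lowercase the text, strip punctuation, give lower IDs to more frequent words, start IDs at 1) are assumptions modeled on the Keras Tokenizer defaults.

```python
from collections import Counter

# Punctuation the sketch strips before splitting on whitespace
# (an assumption modeled on the Keras default filters; apostrophes are kept).
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


class SimpleTokenizer:
    """Hypothetical stand-in imitating Tokenizer.fit_on_texts and word_index."""

    def __init__(self):
        self.word_counts = Counter()
        self.word_index = {}

    def fit_on_texts(self, texts):
        # Replace filtered characters with spaces, lowercase, and count words.
        table = str.maketrans({ch: " " for ch in FILTERS})
        for text in texts:
            self.word_counts.update(text.lower().translate(table).split())
        # Rebuild the dictionary: most frequent word gets ID 1.
        # most_common() keeps first-seen order among words with equal counts.
        self.word_index = {
            word: rank
            for rank, (word, _) in enumerate(self.word_counts.most_common(), start=1)
        }


# Hypothetical sample sentences standing in for en_text.
en_text = [
    "paris is pleasant during summer, and it is usually wet in january.",
    "apples are her favorite fruit, but summer is his favorite season.",
]

# Define the tokenizer and fit it on the text.
en_tok = SimpleTokenizer()
en_tok.fit_on_texts(en_text)

# Look up and print the ID of each word.
for w in ["january", "apples", "summer"]:
    print(w, "=>", en_tok.word_index[w])
```

With the real Keras Tokenizer the same four steps correspond to calling Tokenizer(), fit_on_texts(en_text), and then indexing the word_index dictionary with each word; the Tokenizer class is typically imported from tensorflow.keras.preprocessing.text, though the exact import path depends on the installed version.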
