Exercise

Tokenizing sentences with Keras

In this exercise you will get hands-on practice with the Keras Tokenizer, a utility that handles essential text preprocessing in a few lines of code. For example, once it has been fitted on some text, the Tokenizer maps every word in your vocabulary to an integer ID with a single function call. This exercise covers that mapping in more detail.

You will create a Keras Tokenizer object and fit it on some text, which lets the Tokenizer build a dictionary of words and their corresponding IDs. The text used to fit the Tokenizer comes from the Udacity GitHub repo.
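
To make the idea of "a dictionary of words and their corresponding IDs" concrete, here is a minimal pure-Python sketch of that kind of mapping. It is a simplified illustration, not the Keras implementation: the helper `build_word_index` and the toy sentences are hypothetical, and the sketch only lowercases and splits on whitespace, whereas the real Tokenizer also filters punctuation by default. It follows the same general convention of giving more frequent words smaller IDs and starting the IDs at 1.

```python
from collections import Counter


def build_word_index(sentences):
    """Map each word to an integer ID, most frequent word first (IDs start at 1)."""
    counts = Counter()
    for sentence in sentences:
        # Lowercase and split on whitespace only (no punctuation handling here).
        counts.update(sentence.lower().split())
    # most_common() sorts by descending frequency; ties keep first-seen order.
    ranked = counts.most_common()
    return {word: idx for idx, (word, _) in enumerate(ranked, start=1)}


# Hypothetical toy corpus, not the en_text used in the exercise.
toy_text = [
    "new jersey is cold during january",
    "apples are her favorite fruit",
    "paris is warm during summer",
]

word_index = build_word_index(toy_text)
for w in ["january", "apples", "summer"]:
    print(w, "=>", word_index[w])
```

Words that appear more often ("is" and "during" in this toy corpus) receive the smallest IDs, so the ID of a word depends on the text the index was built from.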

Instructions

  • Define a Keras Tokenizer object.
  • Fit the tokenizer on en_text.
  • Get the word ID for each word w in the given list ["january", "apples", "summer"].
  • Print the word and its corresponding ID.
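
The steps above might look roughly like the following sketch. It assumes a TensorFlow installation that still exposes the legacy `Tokenizer` class under `tensorflow.keras.preprocessing.text` (newer Keras releases treat it as deprecated), and it uses a small hypothetical `en_text` list as a stand-in, since the exercise environment preloads the real data. The variable name `en_tok` is also an assumption.

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Hypothetical stand-in for the preloaded en_text (a list of English sentences).
en_text = [
    "new jersey is cold during january",
    "apples are her favorite fruit",
    "paris is warm during summer",
]

# Define a Keras Tokenizer object with default settings.
en_tok = Tokenizer()

# Fit the tokenizer on the text; this builds the word-to-ID dictionary.
en_tok.fit_on_texts(en_text)

# Look up and print the ID of each word via the word_index attribute.
for w in ["january", "apples", "summer"]:
    word_id = en_tok.word_index[w]
    print(w, "has id:", word_id)
```

The IDs printed here depend entirely on the stand-in sentences; with the real `en_text` the same words will generally map to different IDs. Looking up a word that never occurred in the fitted text raises a `KeyError`, because `word_index` is a plain dictionary.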