
Tokenizing sentences with Keras

In this exercise you will get hands-on practice with the Keras Tokenizer, a utility that handles several essential text-processing steps in a few lines of code. For example, once it has been fitted on some text, the Tokenizer maps every word in its vocabulary to an integer ID with a single function call. You will look at this in more detail here.

You will create a Keras Tokenizer object and fit it on some text, which lets the Tokenizer build a dictionary that maps each word to its ID. The text used to fit the Tokenizer comes from the Udacity GitHub repo.
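To make the idea of "building a dictionary of words and their IDs" concrete, here is a minimal pure-Python sketch of the kind of mapping that fitting produces. It is a simplified illustration, not the Keras implementation: the helper `build_word_index` and the sentences in `toy_text` are hypothetical, and the sketch only lowercases and splits on whitespace, whereas the real Tokenizer also filters punctuation by default. What it does share with Keras is the convention that IDs start at 1 and that more frequent words get lower IDs.

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to an integer ID (hypothetical helper, not a Keras API).

    IDs start at 1 and are assigned in order of descending word frequency.
    """
    counts = Counter()
    for sentence in texts:
        # Lowercase and split on whitespace; punctuation handling is omitted
        counts.update(sentence.lower().split())
    # most_common() sorts by descending count; ties keep first-seen order
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


# Made-up sentences standing in for the real training text
toy_text = [
    "new jersey is cold during january",
    "apples are ripe during summer",
    "january is cold",
]

word_index = build_word_index(toy_text)

for w in ["january", "apples", "summer"]:
    print(w, "has id:", word_index[w])
```

Words that appear more often in `toy_text`, such as "january", end up with smaller IDs than words that appear only once, such as "apples" and "summer".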

This exercise is part of the course Machine Translation with Keras.

Exercise instructions

  • Define a Keras Tokenizer object.
  • Fit the tokenizer on en_text.
  • Get the word ID for each word w in the given list ["january", "apples", "summer"].
  • Print the word and its corresponding ID.

Hands-on interactive exercise

Finish this exercise by completing the sample code below.

from tensorflow.keras.preprocessing.text import Tokenizer

# Define a Keras Tokenizer
en_tok = ____

# Fit the tokenizer on some text
en_tok.____(____)

for w in ["january", "apples", "summer"]:
  # Get the word ID of word w
  id = en_tok.____[____]
  # Print the word and the word ID
  print(____, " has id: ", _____)