Tokenizing sentences with Keras
Here you will get your hands dirty with the Keras Tokenizer. The Keras Tokenizer is a utility that handles some crucial text-processing steps in a few lines of code. For example, it will map the words in your vocabulary to IDs with a single function call. In this exercise, you will see how that works in more detail.
You will create a Keras Tokenizer object and fit it on some text, which allows the Tokenizer to build a dictionary of words and their corresponding IDs. The text used to fit the Tokenizer is obtained from the Udacity GitHub repo.
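To make the idea of a word-to-ID dictionary concrete, here is a minimal pure-Python sketch of a frequency-ranked vocabulary. It is a simplified stand-in and not the Keras implementation: the helper `build_word_index` and the sentences in `sample_text` are hypothetical, and the sketch only lowercases and splits on whitespace, whereas the real Tokenizer also filters punctuation. It assumes the same convention the Keras Tokenizer uses, where IDs start at 1 and more frequent words get lower IDs.

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to an integer ID; the most frequent word gets ID 1."""
    counts = Counter()
    for text in texts:
        # Simplification: lowercase and split on whitespace only
        counts.update(text.lower().split())
    # most_common() sorts by descending count; ties keep first-seen order
    ranked = counts.most_common()
    return {word: i for i, (word, _) in enumerate(ranked, start=1)}


# Hypothetical sample sentences, not the exercise's en_text
sample_text = [
    "india is hot in january",
    "apples are sweet in summer",
    "january is cold",
]

word_index = build_word_index(sample_text)
for w in ["january", "apples", "summer"]:
    print(w, "has id:", word_index[w])
```

Looking up a word in the resulting dictionary returns its ID, which is the same kind of lookup the exercise asks you to perform on the fitted Tokenizer.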
This exercise is part of the course
Machine Translation with Keras
Exercise instructions
- Define a Keras Tokenizer object.
- Fit the tokenizer on en_text.
- Get the word ID for each word w in the given list ["january", "apples", "summer"].
- Print the word and its corresponding ID.
Hands-on interactive exercise
Complete this exercise by finishing the sample code.
from tensorflow.keras.preprocessing.text import Tokenizer
# Define a Keras Tokenizer
en_tok = ____
# Fit the tokenizer on some text
en_tok.____(____)
for w in ["january", "apples", "summer"]:
    # Get the word ID of word w
    id = en_tok.____[____]
    # Print the word and the word ID
    print(____, " has id: ", _____)