Pre-process data
You have learned how pre-processing differs for multi-class classification. Let's put that into practice by pre-processing the data in preparation for a simple multi-class classification model.
The dataset is loaded in the variable news_dataset, and has the following attributes:
- news_dataset.data: array with texts
- news_dataset.target: array with target categories as numerical indexes
The sample data contains 5,000 observations.
This exercise is part of the course
Recurrent Neural Networks (RNNs) for Language Modeling with Keras
Exercise instructions
- Instantiate the Tokenizer class on the tokenizer variable.
- Fit the tokenizer variable on the text data.
- Use the .texts_to_sequences() method on the text data.
- Use the to_categorical() function to prepare the target indexes.
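To make the last instruction concrete, here is a minimal NumPy sketch of one-hot encoding, the transformation that to_categorical() applies to integer class indexes. The helper name one_hot and the toy targets array are assumptions made for illustration; this is a stand-in for the idea, not the Keras implementation.

```python
import numpy as np

# Hypothetical target indexes standing in for news_dataset.target
targets = np.array([0, 2, 1, 2])

def one_hot(indexes, num_classes=None):
    """Turn integer class indexes into rows with a single 1 per row."""
    indexes = np.asarray(indexes, dtype="int64")
    if num_classes is None:
        # Infer the number of classes from the largest index seen
        num_classes = int(indexes.max()) + 1
    encoded = np.zeros((indexes.size, num_classes), dtype="float32")
    # Set the column matching each class index to 1
    encoded[np.arange(indexes.size), indexes] = 1.0
    return encoded

labels = one_hot(targets)
print(labels.shape)
```

Each row has one column per class, which is the label format a softmax output layer is typically trained against with categorical cross-entropy.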
Hands-on interactive exercise
Complete this sample code to finish the exercise.
# Create and fit tokenizer
tokenizer = ____
tokenizer.fit_on_texts(____)
# Prepare the data
prep_data = tokenizer.____(news_dataset.data)
prep_data = pad_sequences(prep_data, maxlen=200)
# Prepare the labels
target_labels = to_categorical(____)
# Print the shapes
print(prep_data.shape)
print(target_labels.shape)
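To see what the tokenizing and padding steps do to the text, the sketch below reproduces them in plain Python and NumPy on a toy corpus. The helpers fit_word_index, to_sequences and pad_left and the texts list are hypothetical names introduced here. They only approximate the Keras behaviour (lowercasing, splitting on whitespace, indexing words by frequency, left-padding with zeros) and skip details such as punctuation filtering.

```python
from collections import Counter
import numpy as np

# Hypothetical toy corpus standing in for news_dataset.data
texts = ["the market fell today", "the team won the match", "markets rallied"]

def fit_word_index(texts):
    """Map each word to an integer, most frequent first; 0 is kept for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def to_sequences(texts, word_index):
    """Replace each known word by its index; unknown words are dropped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

def pad_left(sequences, maxlen):
    """Left-pad with zeros and keep only the last maxlen tokens of longer texts."""
    padded = np.zeros((len(sequences), maxlen), dtype="int32")
    for row, seq in enumerate(sequences):
        trimmed = seq[-maxlen:]
        if trimmed:
            padded[row, -len(trimmed):] = trimmed
    return padded

word_index = fit_word_index(texts)
prep_data = pad_left(to_sequences(texts, word_index), maxlen=4)
print(prep_data.shape)
print(prep_data)
```

The result always has one row per text and maxlen columns, so with the 5,000 observations and maxlen=200 used in the exercise, prep_data.shape should be (5000, 200). The second dimension of target_labels.shape depends on the number of categories, which is not stated here.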