Prepare label vectors
In the video exercise, you learned the differences between binary and multi-class classification, and that the data preparation process needs some modifications before the models can be trained.
In this exercise, you will prepare a raw dataset whose labels are given as text. The data is given as a pandas.DataFrame called df, with two columns: text, containing the text data, and label, containing the label names. Your task is to make all the necessary transformations to the labels: convert the strings to numbers, then one-hot encode them.
The module pandas (as pd) and the function to_categorical() from keras.utils.np_utils are already loaded in the environment, and the first lines of the dataset are printed on the console for you to see.
This exercise is part of the course Recurrent Neural Networks (RNNs) for Language Modeling with Keras.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Get the numerical ids of column label
numerical_ids = df.label.____
# Print initial shape
print(numerical_ids.____)
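
The two label transformations described above (string to number, then one-hot) can be sketched on a small stand-in dataset. This is only one possible approach, not necessarily the exercise's intended solution: the toy DataFrame and its contents are hypothetical, and the one-hot step uses NumPy's np.eye as a substitute for to_categorical(), so the snippet runs without Keras. Note that in recent Keras releases, to_categorical() is imported from keras.utils rather than keras.utils.np_utils.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the exercise's df
df = pd.DataFrame({
    "text": ["great movie", "terrible plot", "it was fine", "loved it"],
    "label": ["positive", "negative", "neutral", "positive"],
})

# Cast the string labels to a categorical dtype, then read the integer codes.
# Categories are sorted alphabetically, so the mapping is deterministic.
categories = df.label.astype("category")
numerical_ids = categories.cat.codes

# Shape before one-hot encoding: one integer id per row
print(numerical_ids.shape)

# One-hot encode by selecting rows of an identity matrix (assumption:
# used here in place of to_categorical(), which yields the same layout)
num_classes = len(categories.cat.categories)
Y = np.eye(num_classes, dtype="float32")[numerical_ids.to_numpy()]

# Shape after one-hot encoding: one row per sample, one column per class
print(Y.shape)
```

After encoding, each row of Y contains a single 1 in the column of its class, which is the label format a softmax output layer with categorical cross-entropy expects.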