Vanishing gradient problem
The other possible gradient problem is when the gradients vanish, or go to zero. This is a much harder problem to solve because it is not as easy to detect. If the loss function is not improving on every step, is it because the gradients went to zero and thus didn't update the weights? Or is it because the model is simply not capable of learning the task?
This problem occurs more often in RNN models when long memory is required, that is, when the input sentences are long.
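To get a feel for why the gradients shrink, here is a rough numeric sketch (not part of the exercise): backpropagation through time multiplies the gradient by a factor at every timestep, and with a tanh activation and a small recurrent weight that factor is usually below 1, so over long sequences the product collapses toward zero.

import numpy as np

np.random.seed(0)
recurrent_weight = 0.5  # assumed small recurrent weight, purely for illustration

for T in [5, 20, 100, 600]:
    grad = 1.0
    for _ in range(T):
        # the derivative of tanh is 1 - tanh(x)^2, which is at most 1,
        # so every timestep shrinks the accumulated gradient further
        grad *= recurrent_weight * (1 - np.tanh(np.random.randn()) ** 2)
    print(f"sequence length {T:4d} -> gradient magnitude {grad:.3e}")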
In this exercise you will observe the problem on the IMDB data, with longer sentences selected. The data is loaded in the variables X and y, and the classes Sequential, SimpleRNN and Dense, as well as matplotlib.pyplot as plt, are available. The model was pre-trained for 100 epochs; its weights are stored in the file model_weights.h5 and its training history in the variable history.
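As a rough idea of how such data could be prepared (an assumption for illustration, not the course's exact preprocessing), one could load the IMDB reviews from keras.datasets.imdb and keep only the longer ones:

import numpy as np
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

(x_train, y_train), _ = imdb.load_data(num_words=10000)

# Keep only reviews longer than 500 tokens, so long-range memory is required
long_idx = [i for i, seq in enumerate(x_train) if len(seq) > 500]
X = pad_sequences([x_train[i] for i in long_idx], maxlen=600)
y = np.array([y_train[i] for i in long_idx])

# The exercise model reads one value per timestep, hence input_shape=(None, 1)
X = X.reshape((X.shape[0], X.shape[1], 1))
print(X.shape, y.shape)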
This exercise is part of the course
Recurrent Neural Networks (RNNs) for Language Modeling with Keras
Exercise instructions
- Add a SimpleRNN layer to the model.
- Load the pre-trained weights on the model using the method .load_weights().
- Add the accuracy of the training data, available on the attribute 'acc', to the plot.
- Display the plot using the method .show().
Hands-on interactive exercise
Try this exercise by completing the sample code below.
# Create the model
model = Sequential()
model.add(____(units=600, input_shape=(None, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
# Load pre-trained weights
model.____('model_weights.h5')
# Plot the accuracy versus epoch graph
plt.plot(history.history[____])
plt.plot(history.history['val_acc'])
plt.legend(['train', 'val'], loc='upper left')
plt.____()
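For reference, one way the blanks could be filled in is sketched below. It assumes the pre-loaded environment described above (X, y, history and model_weights.h5); the optimizer and the 'acc'/'val_acc' history keys are taken directly from the exercise.

# Possible solution sketch; in the exercise environment Sequential, SimpleRNN,
# Dense, plt and history are already loaded, so the imports are only needed
# when running this outside the course.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense
import matplotlib.pyplot as plt

# Create the model
model = Sequential()
model.add(SimpleRNN(units=600, input_shape=(None, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Load pre-trained weights
model.load_weights('model_weights.h5')

# Plot accuracy per epoch for training ('acc') and validation ('val_acc')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['train', 'val'], loc='upper left')
plt.show()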