
Vanishing gradient problem

The other possible gradient problem is when the gradients vanish, or go to zero. This is a much harder problem to solve because it is not as easy to detect. If the loss function does not improve on every step, is it because the gradients went to zero and thus didn't update the weights? Or is it because the model is not able to learn?
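One practical way to tell the two cases apart is to look at the gradients themselves. Below is a minimal sketch, assuming TensorFlow 2.x with Keras, that computes the gradient of the loss on a single batch and prints the norm of each trainable variable; norms close to zero in the recurrent layer point to vanishing gradients rather than a model that simply cannot learn. The function gradient_norms and the use of tf.GradientTape are illustrative and not part of this exercise.

import tensorflow as tf

def gradient_norms(model, x_batch, y_batch):
    """Print the gradient norm of every trainable variable for one batch."""
    loss_fn = tf.keras.losses.BinaryCrossentropy()
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    grads = tape.gradient(loss, model.trainable_variables)
    for var, grad in zip(model.trainable_variables, grads):
        print(var.name, float(tf.norm(grad)))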

This problem occurs more often in RNN models when long memory is required, for example when processing long sentences.

In this exercise you will observe the problem on the IMDB data, with longer sentences selected. The data is loaded in the variables X and y, and the classes Sequential, SimpleRNN and Dense are available, as well as matplotlib.pyplot as plt. The model was pre-trained for 100 epochs; its weights are stored in the file model_weights.h5 and its training history in the variable history.
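The exercise environment already provides all of this. Purely as an illustration of how such a setup could be built outside the course, here is a sketch that loads the IMDB reviews from Keras and keeps only the longer ones; the vocabulary size, length threshold, maxlen and reshape below are assumptions, not the course's actual preprocessing.

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
import numpy as np

# Load the IMDB reviews and keep only the longer ones (hypothetical threshold)
(x_train, y_train), _ = imdb.load_data(num_words=10000)
long_reviews = [seq for seq in x_train if len(seq) > 500]
labels = [lab for seq, lab in zip(x_train, y_train) if len(seq) > 500]

# Pad to a fixed length and add a trailing feature dimension to match
# the input_shape=(None, 1) used by the model in this exercise
X = pad_sequences(long_reviews, maxlen=600).reshape(-1, 600, 1)
y = np.array(labels)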

This exercise is part of the course Recurrent Neural Networks (RNNs) for Language Modeling with Keras.

Exercise instructions

  • Add a SimpleRNN layer to the model.
  • Load the pre-trained weights into the model using the method .load_weights().
  • Add the training accuracy, available in the history under the key 'acc', to the plot.
  • Display the plot using the method .show().

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create the model
model = Sequential()
model.add(____(units=600, input_shape=(None, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Load pre-trained weights
model.____('model_weights.h5')

# Plot the accuracy x epoch graph
plt.plot(history.history[____])
plt.plot(history.history['val_acc'])
plt.legend(['train', 'val'], loc='upper left')
plt.____()
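For reference, here is one completed version of the scaffold that follows the instructions above. Note that older Keras versions store the training accuracy in the history under the key 'acc' (as used in this exercise), while newer versions use 'accuracy'.

# Create the model
model = Sequential()
model.add(SimpleRNN(units=600, input_shape=(None, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Load the pre-trained weights
model.load_weights('model_weights.h5')

# Plot the accuracy x epoch graph
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['train', 'val'], loc='upper left')
plt.show()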