Exploding gradient problem
In the video exercise, you learned about two problems that may arise when working with RNN models: the vanishing and exploding gradient problems.
This exercise explores the exploding gradient problem, showing how the derivative of a function can grow exponentially, and how to solve the problem with a simple technique: gradient clipping.
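As a quick intuition (a minimal sketch, not part of the exercise environment): during backpropagation, the gradient is roughly a product of per-step factors, so factors larger than 1 compound exponentially with the number of steps.

import numpy as np

# Toy illustration: a per-step derivative magnitude above 1 makes the
# product of factors, and hence the gradient, grow exponentially
factor = 1.5                      # hypothetical per-step factor
steps = np.arange(1, 51)
gradient_magnitude = factor ** steps
print(gradient_magnitude[[0, 9, 49]])  # ~1.5, ~57.7, ~6.4e8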
The data is already loaded in the environment as X_train, X_test, y_train, and y_test.
You will use a Stochastic Gradient Descent (SGD) optimizer and Mean Squared Error (MSE) as the loss function.
In the first step, you will observe the gradient exploding by computing the MSE on the train and test sets. In step 2, you will change the optimizer, using the clipvalue parameter to solve the problem.
The Stochastic Gradient Descent optimizer in Keras is already loaded as SGD.
This exercise is part of the course
Recurrent Neural Networks (RNNs) for Language Modeling with Keras
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create a Keras model with one hidden Dense layer
model = Sequential()
model.add(Dense(25, input_dim=20, activation='relu', kernel_initializer=he_uniform(seed=42)))
model.add(Dense(1, activation='linear'))
# Compile and fit the model
model.compile(loss='mean_squared_error', optimizer=____(learning_rate=0.01, momentum=0.9))
history = model.fit(X_train, y_train, validation_data=(____, ____), epochs=100, verbose=0)
# Evaluate the Mean Squared Error on train and test data
train_mse = model.____(X_train, y_train, verbose=0)
test_mse = model.evaluate(X_test, y_test, verbose=0)
# Print the values of MSE
print('Train: %.3f, Test: %.3f' % (____, ____))
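For reference, a possible completed version of step 2 could look like the sketch below. It assumes a TensorFlow Keras environment and reuses the preloaded data; the clip threshold of 3.0 is an illustrative choice, not a value prescribed by the exercise.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.initializers import he_uniform
from tensorflow.keras.optimizers import SGD

# Same model as in step 1
model = Sequential()
model.add(Dense(25, input_dim=20, activation='relu',
                kernel_initializer=he_uniform(seed=42)))
model.add(Dense(1, activation='linear'))

# clipvalue=3.0 caps every gradient component to the range [-3.0, 3.0],
# which keeps the SGD updates from exploding (3.0 is an assumed threshold)
model.compile(loss='mean_squared_error',
              optimizer=SGD(learning_rate=0.01, momentum=0.9, clipvalue=3.0))
history = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                    epochs=100, verbose=0)

# With clipping, the train and test MSE should now reach finite, stable values
train_mse = model.evaluate(X_train, y_train, verbose=0)
test_mse = model.evaluate(X_test, y_test, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_mse, test_mse))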