Avoiding local minima
The previous problem showed how easy it is to get stuck in local minima. We had a simple optimization problem in one variable, and gradient descent still failed to deliver the global minimum when it had to travel through local minima first. One way to avoid this problem is to use momentum, which allows the optimizer to break through local minima. We will again use the loss function from the previous problem, which has been defined and is available to you as loss_function().
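To make the role of momentum concrete, here is a minimal plain-Python sketch of a momentum update. It is not TensorFlow's exact RMSprop rule, and the toy gradient, starting point, and hyperparameter values below are illustrative assumptions; the point is that the velocity term accumulates past gradients, so the iterate can keep moving through a shallow local minimum where a plain gradient step would stall.

import math

def toy_grad(x):
    # Gradient of an illustrative loss, cos(5x) + 0.5 * x**2, which has several local minima
    return -5.0 * math.sin(5.0 * x) + x

x_plain = x_mom = 2.5                 # same starting point for both runs
velocity, learning_rate, momentum = 0.0, 0.01, 0.9

for _ in range(500):
    # Plain gradient descent: each step depends only on the current gradient
    x_plain -= learning_rate * toy_grad(x_plain)
    # Momentum: each step also carries a decaying sum of past gradients
    velocity = momentum * velocity + learning_rate * toy_grad(x_mom)
    x_mom -= velocity

print(x_plain, x_mom)                 # the two runs may settle in different minima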
Several optimizers in tensorflow have a momentum parameter, including SGD and RMSprop. You will make use of RMSprop in this exercise. Note that x_1 and x_2 have been initialized to the same value this time. Furthermore, keras.optimizers.RMSprop() has also been imported for you from tensorflow.
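For reference, the momentum parameter is passed directly to the optimizer's constructor. A short sketch, assuming keras has been imported from tensorflow as stated above and using an arbitrary momentum value of 0.9 purely for illustration:

# Both SGD and RMSprop accept a momentum argument in their constructors
opt_sgd = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
opt_rms = keras.optimizers.RMSprop(learning_rate=0.01, momentum=0.9)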
Exercise instructions
- Set the opt_1 operation to use a learning rate of 0.01 and a momentum of 0.99.
- Set opt_2 to use the root mean square propagation (RMS) optimizer with a learning rate of 0.01 and a momentum of 0.00.
- Define the minimization operation for opt_2.
- Print x_1 and x_2 as numpy arrays.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Initialize x_1 and x_2
x_1 = Variable(0.05, dtype=float32)
x_2 = Variable(0.05, dtype=float32)

# Define the optimization operation for opt_1 and opt_2
opt_1 = keras.optimizers.RMSprop(learning_rate=____, momentum=____)
opt_2 = ____

for j in range(100):
    opt_1.minimize(lambda: loss_function(x_1), var_list=[x_1])
    # Define the minimization operation for opt_2
    ____

# Print x_1 and x_2 as numpy arrays
print(____, ____)
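For comparison after you have attempted it yourself, here is one way the blanks might be completed, following the instructions above (learning rate 0.01 for both optimizers, momentum 0.99 for opt_1 and 0.00 for opt_2). This assumes loss_function(), Variable, float32, and keras are already defined or imported by the exercise environment, as stated.

# Initialize x_1 and x_2
x_1 = Variable(0.05, dtype=float32)
x_2 = Variable(0.05, dtype=float32)

# Define the optimization operation for opt_1 (with momentum) and opt_2 (without)
opt_1 = keras.optimizers.RMSprop(learning_rate=0.01, momentum=0.99)
opt_2 = keras.optimizers.RMSprop(learning_rate=0.01, momentum=0.00)

for j in range(100):
    opt_1.minimize(lambda: loss_function(x_1), var_list=[x_1])
    # Minimization operation for opt_2
    opt_2.minimize(lambda: loss_function(x_2), var_list=[x_2])

# Print x_1 and x_2 as numpy arrays
print(x_1.numpy(), x_2.numpy())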