DQN with experience replay

You will now add Experience Replay to the training of an agent that uses a Deep Q Network. You will work in the same Lunar Lander environment you used to build your Barebone DQN.

At every step, instead of updating the network from only the most recent transition, the agent draws a random batch of stored past experiences from the Experience Replay buffer and learns from that batch. Random sampling breaks the correlation between consecutive transitions and lets each experience be reused several times, which considerably improves the agent's ability to learn about the environment.
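To make the idea concrete, here is a minimal sketch of what such a buffer could look like, using only the Python standard library. It is an illustration, not the course's actual ReplayBuffer: the attribute name memory, the argument order of push(), and the tuple-of-tuples return value of sample() are all assumptions, and the provided class may instead return tensors.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item once it is full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition as a single tuple (assumed field order).
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw distinct transitions uniformly at random, then regroup them
        # field by field: (states, actions, rewards, next_states, dones).
        batch = random.sample(list(self.memory), batch_size)
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.memory)
```

Because sample() cannot draw more items than are stored, a training loop would typically wait until len(replay_buffer) reaches the batch size before sampling.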

The QNetwork and ReplayBuffer classes from previous exercises are available to you and have been instantiated as follows:

q_network = QNetwork(8, 4)
replay_buffer = ReplayBuffer(10000)

The describe_episode() function is again available to report metrics at the end of each episode.

Push the latest experience into the Replay Buffer.
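The push step could be sketched as follows. Everything here other than the name replay_buffer is a stand-in chosen for illustration: ToyEnv is a hypothetical replacement for Lunar Lander that only mimics the shape of its states and actions, TinyBuffer is a placeholder for the provided ReplayBuffer, the random action choice replaces the Q-network, and the argument order of push() is an assumption.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: 8-value states, 4 actions."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * 8, {}

    def step(self, action):
        self.t += 1
        next_state = [random.random() for _ in range(8)]
        reward = 1.0 if action == 0 else 0.0
        truncated = self.t >= self.max_steps  # end after max_steps
        return next_state, reward, False, truncated, {}


class TinyBuffer:
    """Placeholder buffer exposing only push() and len()."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def __len__(self):
        return len(self.memory)


def run_episode(env, select_action, replay_buffer):
    state, info = env.reset()
    done = False
    while not done:
        action = select_action(state)
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Push the latest experience before moving to the next state.
        replay_buffer.push(state, action, reward, next_state, done)
        state = next_state


env = ToyEnv(max_steps=5)
replay_buffer = TinyBuffer(10000)
run_episode(env, lambda state: random.randrange(4), replay_buffer)
```

The key point is the ordering: the transition is stored using the state the action was chosen in, and only afterwards is state advanced to next_state.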
