Exercise

DRL training loop

To allow the agent to experience the environment repeatedly, you need to set up a training loop.

Many DRL algorithms share this core structure:

  1. Loop through episodes
  2. Loop through steps within each episode
  3. At each step, choose an action, calculate the loss, and update the network
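The three levels above can be sketched in plain Python. This is only an illustration under stated assumptions: ToyEnv, the single scalar weight standing in for a network, the squared-error loss, and the train() helper are all hypothetical and are not the exercise's Network, optimizer, or placeholder functions.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment: every episode lasts max_steps steps."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial state

    def step(self, action):
        self.t += 1
        next_state = float(self.t)
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.max_steps
        return next_state, reward, done


def select_action(state):
    # Placeholder policy (assumption): pick one of two actions at random.
    return random.choice([0, 1])


def calculate_loss(weight, reward):
    # Placeholder loss (assumption): squared error of a scalar prediction.
    return (weight - reward) ** 2


def train(env, n_episodes=3, lr=0.1):
    weight = 0.0  # a single scalar parameter stands in for the network
    n_updates = 0
    total_loss = 0.0
    for episode in range(n_episodes):            # 1. loop through episodes
        state = env.reset()
        done = False
        while not done:                          # 2. loop through steps
            action = select_action(state)        # 3a. choose an action
            next_state, reward, done = env.step(action)
            loss = calculate_loss(weight, reward)  # 3b. calculate the loss
            total_loss += loss
            grad = 2 * (weight - reward)         # analytic gradient of the loss
            weight -= lr * grad                  # 3c. update the "network"
            n_updates += 1
            state = next_state                   # carry the state forward
    return weight, n_updates, total_loss


weight, n_updates, total_loss = train(ToyEnv(max_steps=5), n_episodes=3)
print(f"updates: {n_updates}, final weight: {weight:.3f}")
```

With three episodes of five steps each, the update runs once per step, so the parameter is updated fifteen times in total.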

You are provided with placeholder select_action() and calculate_loss() functions that allow the code to run. The Network and optimizer defined in the previous exercise are also available to you.

Instructions

  • Ensure that the outer loop (over episodes) runs for ten episodes.
  • Ensure that the inner loop (over steps) runs until the episode is complete.
  • Take the action selected by select_action() in the env environment.
  • At the end of each inner-loop iteration, update the state before starting the next step.
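
One way the four instructions might fit together is sketched below. It assumes env follows a Gymnasium-style interface (reset() returning an observation and an info dict, step() returning five values) and that the optimizer exposes zero_grad() and step() as PyTorch optimizers do. StubEnv, StubOptimizer, and StubLoss are hypothetical stand-ins so the sketch runs on its own, and the argument lists of select_action() and calculate_loss() are assumptions, since the exercise does not show them here.

```python
import random


class StubEnv:
    """Hypothetical environment with a Gymnasium-style reset()/step() interface."""

    def __init__(self, horizon=4):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0], {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = action == 1 and self.t >= 2  # task ended on its own
        truncated = self.t >= self.horizon        # step limit reached
        return [float(self.t)], 1.0, terminated, truncated, {}


class StubLoss:
    """Hypothetical loss object exposing backward(), like a PyTorch tensor."""

    def backward(self):
        pass


class StubOptimizer:
    """Hypothetical optimizer exposing zero_grad() and step()."""

    def __init__(self):
        self.steps = 0

    def zero_grad(self):
        pass

    def step(self):
        self.steps += 1


def select_action(state):
    # Placeholder (assumed signature): random choice between two actions.
    return random.choice([0, 1])


def calculate_loss(state, action, next_state, reward, done):
    # Placeholder (assumed signature): returns an object with backward().
    return StubLoss()


env = StubEnv()
optimizer = StubOptimizer()
episode_lengths = []

for episode in range(10):                 # outer loop: ten episodes
    state, info = env.reset()
    done = False
    steps = 0
    while not done:                       # inner loop: until the episode is complete
        action = select_action(state)
        # Take the selected action in the environment.
        next_state, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated
        loss = calculate_loss(state, action, next_state, reward, done)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        state = next_state                # update the state before the next step
        steps += 1
    episode_lengths.append(steps)

print(episode_lengths)
```

Combining terminated and truncated into a single done flag lets the inner loop stop both when the task ends naturally and when a step limit cuts the episode short.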