Exercise

Action selection in REINFORCE

Write the select_action function that your REINFORCE agent will use to choose an action at every step.

In DQN, the forward pass of the network returned Q-values; in REINFORCE, it returns action probabilities, from which an action can be sampled directly.
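To make the contrast concrete, here is a minimal sketch of how an action might be picked from each kind of output. The tensors q_values and action_probs are made-up values for a single state with three actions, not outputs of the course's networks, and the greedy argmax stands in for whatever exploration scheme a DQN agent actually uses.

```python
import torch
from torch.distributions import Categorical

# Hypothetical network outputs for one state with three possible actions.
q_values = torch.tensor([1.2, 0.4, -0.3])     # DQN-style output: one value estimate per action
action_probs = torch.tensor([0.7, 0.2, 0.1])  # REINFORCE-style output: probabilities summing to 1

# DQN: pick the action with the highest Q-value (exploration omitted here).
greedy_action = torch.argmax(q_values).item()

# REINFORCE: draw an action at random, weighted by the policy's probabilities.
sampled_action = Categorical(probs=action_probs).sample().item()

print(greedy_action, sampled_action)
```

The greedy choice is deterministic for a given state, while the sampled action varies from call to call, so exploration comes from the policy itself.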

A policy network and a state have been loaded in your environment.

torch.distributions.Categorical has been imported as Categorical.

Instructions

  • Obtain the action probabilities as a torch tensor.
  • Obtain the torch Distribution corresponding to the action probabilities.
  • Sample an action from the distribution.
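The three steps above could be sketched as follows. This is one possible implementation under stated assumptions, not necessarily the course's reference solution: the small policy_network defined here is a hypothetical stand-in for the preloaded one (four state features, two actions, softmax output), the random state is a placeholder, and returning the log-probability alongside the action is an assumption made because the REINFORCE loss typically needs it.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

# Hypothetical stand-in for the preloaded policy network:
# 4 state features in, probabilities over 2 actions out.
policy_network = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
    nn.Softmax(dim=-1),
)

def select_action(policy_network, state):
    # Make sure the state is a float tensor before the forward pass.
    state = torch.as_tensor(state, dtype=torch.float32)
    # Step 1: obtain the action probabilities as a torch tensor.
    action_probs = policy_network(state)
    # Step 2: build the categorical distribution over actions.
    action_dist = Categorical(probs=action_probs)
    # Step 3: sample an action from the distribution.
    action = action_dist.sample()
    # Assumption: also return the log-probability for use in the loss.
    log_prob = action_dist.log_prob(action)
    return action.item(), log_prob.reshape(1)

# Placeholder state; in the exercise, a state is already loaded.
state = torch.rand(4)
action, log_prob = select_action(policy_network, state)
print("Sampled action:", action)
print("Log probability:", log_prob)
```

Because the action is sampled rather than chosen greedily, repeated calls with the same state can return different actions.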