Exercise

Barebone DQN action selection

The select_action() function lets the agent select the action with the highest Q-value at every step.

The function takes the Q-network and the current state as arguments, and returns the index of the action with the highest Q-value.

The Q-network is instantiated as q_network, and a random state has been loaded into your environment with state = torch.rand(8) to give you example data to work with.

Instructions

Calculate the Q-values for each action by passing the state provided as argument through the Q-network.
Obtain the index of the action with the highest Q-value.
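
The two steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the q_network defined here is a hypothetical stand-in (8 state features in, 4 action values out, with an assumed hidden layer of 64 units), since the actual network architecture is not shown in the exercise. The use of torch.no_grad() is also an optional choice made here, on the assumption that gradients are not needed when only selecting an action.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the exercise's q_network.
# Assumed sizes: 8 state features in, 4 action values out.
q_network = nn.Sequential(
    nn.Linear(8, 64),
    nn.ReLU(),
    nn.Linear(64, 4),
)


def select_action(q_network, state):
    # Step 1: compute one Q-value per action for the given state.
    # Gradients are disabled because we only read the values here.
    with torch.no_grad():
        q_values = q_network(state)
    # Step 2: take the index of the largest Q-value as a Python int.
    action = torch.argmax(q_values).item()
    return action


# Example data, as described in the exercise text
state = torch.rand(8)
action = select_action(q_network, state)
print(f"Selected action index: {action}")
```

Because the network weights and the state are random, the selected index will vary from run to run. Calling .item() converts the single-element tensor returned by torch.argmax() into a plain integer, which is the form most environments expect for a discrete action.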
