CommencerCommencer gratuitement

RL interaction loop

As you know by now, RL involves an agent making decisions in an environment to maximize some notion of cumulative reward. The agent must discover which actions yield the most reward through interaction.

Cet exercice fait partie du cours

Reinforcement Learning with Gymnasium in Python

Afficher le cours

Exercice interactif pratique

Passez de la théorie à la pratique avec l’un de nos exercices interactifs

Commencer l’exercice