RL interaction loop
As you know by now, RL involves an agent making decisions in an environment to maximize some notion of cumulative reward. The agent must discover which actions yield the most reward through interaction.
Este exercício faz parte do curso
Reinforcement Learning with Gymnasium in Python
Exercício interativo prático
Transforme a teoria em ação com um de nossos exercícios interativos
Começar o exercício