Episode generation for Monte Carlo methods

Monte Carlo methods estimate the value function from complete episodes, so you first need a way to generate them. You'll now implement a function that generates an episode by selecting actions at random until the episode terminates. In later exercises, you will call this function to apply Monte Carlo methods to the custom environment env, which is pre-loaded for you.

The render() function is pre-loaded for you.
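
To illustrate why episodes are needed, the following minimal sketch computes the return that follows each step of an episode stored as (state, action, reward) tuples. The discount factor gamma, the helper name compute_returns, and the sample episode are assumptions made for illustration; the Monte Carlo implementation used later in the course may differ.

```python
def compute_returns(episode, gamma=1.0):
    """Return a list of (state, action, G) where G is the discounted
    return obtained from that step until the end of the episode."""
    returns = []
    G = 0.0
    # Walk backwards so each return reuses the one computed after it
    for state, action, reward in reversed(episode):
        G = reward + gamma * G
        returns.append((state, action, G))
    returns.reverse()  # restore chronological order
    return returns


# Hypothetical three-step episode: (state, action, reward)
sample_episode = [(0, 1, 0.0), (1, 1, 0.0), (2, 1, 1.0)]
print(compute_returns(sample_episode, gamma=0.5))
# → [(0, 1, 0.25), (1, 1, 0.5), (2, 1, 1.0)]
```

Averaging these returns per state (or per state-action pair) over many episodes is what yields a Monte Carlo value estimate.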

This exercise is part of the course

Reinforcement Learning with Gymnasium in Python

Exercise instructions

  • Reset the environment using a seed of 42.
  • In the episode loop, select a random action at each iteration.
  • At the end of each iteration, update the episode data by appending the tuple (state, action, reward).

Hands-on interactive exercise

Try to solve this exercise by completing the sample code.

def generate_episode():
    episode = []
    # Reset the environment
    state, info = ____
    terminated = False
    truncated = False
    # Stop when the episode terminates or is cut short (truncated)
    while not (terminated or truncated):
        # Select a random action
        action = ____
        next_state, reward, terminated, truncated, info = env.step(action)
        render()
        # Update episode data
        episode.____(____)
        state = next_state
    return episode
print(generate_episode())
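
For reference, the steps in the instructions can be sketched as a self-contained example. The pre-loaded env and render() are not available outside the exercise, so ToyCorridorEnv and _ToyDiscreteSpace below are hypothetical stand-ins that only mimic the Gymnasium reset/step signatures and action_space.sample(); render() is omitted. Unlike Gymnasium, this toy environment also reseeds its action sampler in reset(seed=...), a simplification that makes the run reproducible.

```python
import random


class _ToyDiscreteSpace:
    """Hypothetical stand-in for a Gymnasium Discrete action space."""

    def __init__(self, n, rng):
        self.n = n
        self._rng = rng

    def sample(self):
        # Pick one of the n actions uniformly at random
        return self._rng.randrange(self.n)


class ToyCorridorEnv:
    """Hypothetical 4-state corridor: start at state 0, goal at state 3.
    Action 0 moves left, action 1 moves right."""

    def __init__(self, max_steps=100):
        self._rng = random.Random()
        self.action_space = _ToyDiscreteSpace(2, self._rng)
        self._max_steps = max_steps
        self._state = 0
        self._steps = 0

    def reset(self, seed=None):
        if seed is not None:
            self._rng.seed(seed)  # simplification: also seeds action sampling
        self._state = 0
        self._steps = 0
        return self._state, {}

    def step(self, action):
        move = 1 if action == 1 else -1
        self._state = min(max(self._state + move, 0), 3)
        self._steps += 1
        terminated = self._state == 3
        truncated = (not terminated) and self._steps >= self._max_steps
        reward = 1.0 if terminated else 0.0
        return self._state, reward, terminated, truncated, {}


def generate_episode(env, seed=42):
    episode = []
    # Reset the environment with a fixed seed
    state, info = env.reset(seed=seed)
    terminated = truncated = False
    while not (terminated or truncated):
        # Select a random action
        action = env.action_space.sample()
        next_state, reward, terminated, truncated, info = env.step(action)
        # Record the state the action was taken in, the action, and the reward
        episode.append((state, action, reward))
        state = next_state
    return episode


print(generate_episode(ToyCorridorEnv()))
```

Each tuple stores the state in which the action was taken rather than next_state, which is why state is updated only after the append.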