Episode generation for Monte Carlo methods

Monte Carlo methods estimate the value function from complete episodes, so episodes must be generated first. You'll now implement a function that generates an episode by selecting actions at random until the episode terminates. In later exercises, you will call this function to apply Monte Carlo methods to the custom environment env, which is pre-loaded for you.
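
To make concrete how generated episodes feed into value estimation, here is a minimal sketch of the return calculation that Monte Carlo methods typically rely on. The function name discounted_returns, the discount factor gamma, and the sample episode are illustrative assumptions, not part of the course code.

```python
def discounted_returns(episode, gamma=1.0):
    """Compute the return G_t for every step of one episode.

    episode: list of (state, action, reward) tuples.
    gamma: discount factor (assumed here; the exercise does not specify one).
    """
    returns = []
    g = 0.0
    # Walk backwards so each return reuses the next one:
    # G_t = r_t + gamma * G_{t+1}
    for _state, _action, reward in reversed(episode):
        g = reward + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


# Made-up episode, for illustration only
example_episode = [(0, 1, 0.0), (1, 2, 0.0), (2, 1, 1.0)]
print(discounted_returns(example_episode, gamma=0.9))
```

Averaging these returns per state across many episodes is one common way to arrive at a Monte Carlo value estimate.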

The render() function is pre-loaded for you.

This exercise is part of the course

Reinforcement Learning with Gymnasium in Python

Exercise instructions

  • Reset the environment using a seed of 42.
  • In the episode loop, select a random action at each iteration.
  • At the end of each iteration, update the episode data by appending the tuple (state, action, reward).

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def generate_episode():
    episode = []
    # Reset the environment
    state, info = ____
    terminated = False
    while not terminated:
        # Select a random action
        action = ____
        next_state, reward, terminated, truncated, info = env.step(action)
        render()
        # Update episode data
        episode.____(____)
        state = next_state
    return episode

print(generate_episode())
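
One possible way to fill in the blanks is sketched below. Because the course's env and render() are not available outside the exercise, this version runs against a hypothetical stand-in: CorridorEnv and _ActionSpace are invented here and only mimic the Gymnasium reset/step return shapes. Passing env as an argument, dropping the render() call, and also stopping on truncated are choices made for this sketch rather than part of the exercise template.

```python
import random


class _ActionSpace:
    """Tiny stand-in for a Gymnasium Discrete space (hypothetical)."""

    def __init__(self, n, rng):
        self.n = n
        self._rng = rng

    def sample(self):
        # Pick one of the n actions uniformly at random
        return self._rng.randrange(self.n)


class CorridorEnv:
    """Hypothetical 5-cell corridor; NOT the course's pre-loaded env."""

    def __init__(self, size=5, max_steps=200):
        self.size = size
        self.max_steps = max_steps
        self._rng = random.Random()
        # Assumption: the action space shares the env RNG so one seed fixes everything
        self.action_space = _ActionSpace(2, self._rng)  # 0 = left, 1 = right
        self._state = 0
        self._steps = 0

    def reset(self, seed=None):
        if seed is not None:
            self._rng.seed(seed)
        self._state, self._steps = 0, 0
        return self._state, {}

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position to the corridor bounds
        self._state = min(max(self._state + move, 0), self.size - 1)
        self._steps += 1
        terminated = self._state == self.size - 1
        truncated = not terminated and self._steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self._state, reward, terminated, truncated, {}


def generate_episode(env):
    episode = []
    # Reset the environment with a seed of 42
    state, info = env.reset(seed=42)
    terminated = truncated = False
    while not (terminated or truncated):
        # Select a random action
        action = env.action_space.sample()
        next_state, reward, terminated, truncated, info = env.step(action)
        # Record the state the action was taken in, the action, and the reward
        episode.append((state, action, reward))
        state = next_state
    return episode


episode = generate_episode(CorridorEnv())
print(len(episode), episode[-1])
```

In the actual exercise, the same three lines (the reset call, the action sampling, and the append) would go into the blanks, with env and render() supplied by the course.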