Exercise

Episode generation for Monte Carlo methods

Monte Carlo methods estimate the value function from complete episodes, so episodes must be generated first. You'll now implement a function that generates an episode by selecting actions randomly until the episode terminates. In later exercises, you will call this function to apply Monte Carlo methods to the custom environment env pre-loaded for you.

The render() function is pre-loaded for you.

Instructions

100 XP

Reset the environment using a seed of 42.
In the episode loop, select a random action at each iteration.
At the end of each iteration, update the episode data by appending the tuple (state, action, reward).
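
The three steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the pre-loaded env is not shown here, so the sketch assumes it follows a Gymnasium-style interface (reset(seed=...) returning (state, info), step(action) returning a 5-tuple, and action_space.sample()). ToyCorridorEnv and ToyActionSpace are hypothetical stand-ins written only so the example runs on its own.

```python
import random


class ToyActionSpace:
    """Hypothetical stand-in for a Gymnasium-style discrete action space."""

    def __init__(self, n, rng):
        self.n = n
        self._rng = rng

    def sample(self):
        # Pick one of the n actions uniformly at random.
        return self._rng.randrange(self.n)


class ToyCorridorEnv:
    """Hypothetical 1-D corridor with states 0..4; state 4 is terminal.

    Action 0 moves left, action 1 moves right. Every step gives a
    reward of -1, except the step that reaches the goal, which gives +10.
    """

    def __init__(self):
        self._rng = random.Random()
        self.action_space = ToyActionSpace(2, self._rng)
        self.state = 0

    def reset(self, seed=None):
        # Seeding here makes the random action sequence reproducible.
        if seed is not None:
            self._rng.seed(seed)
        self.state = 0
        return self.state, {}

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(4, max(0, self.state + move))
        terminated = self.state == 4
        reward = 10 if terminated else -1
        # Gymnasium-style return: state, reward, terminated, truncated, info
        return self.state, reward, terminated, False, {}


def generate_episode(env, seed=42):
    """Run one episode with uniformly random actions.

    Returns a list of (state, action, reward) tuples, where state is the
    state in which the action was taken.
    """
    episode = []
    state, info = env.reset(seed=seed)
    terminated = truncated = False
    while not (terminated or truncated):
        action = env.action_space.sample()
        next_state, reward, terminated, truncated, info = env.step(action)
        episode.append((state, action, reward))
        state = next_state
    return episode


env = ToyCorridorEnv()
episode = generate_episode(env)
print(len(episode), episode[-1])
```

Storing the state before the step, rather than next_state, keeps each tuple aligned as "in this state, this action earned this reward", which is the form Monte Carlo return calculations typically expect.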
