

Exercise

Comparing policies

You are given two state value functions (value_function_1 and value_function_2) corresponding to two different policies in the MyGridWorld environment. Your task is to compare these state value functions on a state-by-state basis to determine which policy is more effective.

The variable num_states is available for you to use.

Instructions

  • Create a list one_is_better of boolean values, where each element checks whether the state's value in value_function_1 is greater than or equal to the state's value in value_function_2.
  • Create a list two_is_better of boolean values, where each element checks whether the state's value in value_function_2 is greater than or equal to the state's value in value_function_1.
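
The two instructions above can be sketched as a pair of list comprehensions. This is only an illustration under assumptions: it supposes each value function is a dictionary keyed by integer state indices from 0 to num_states - 1, and the numbers below are made-up stand-ins for the variables the exercise supplies, not values from MyGridWorld.

```python
# Hypothetical stand-ins for the variables the exercise provides.
# Assumption: each value function maps a state index to its value.
num_states = 4
value_function_1 = {0: 1.0, 1: 2.0, 2: 3.0, 3: 0.0}
value_function_2 = {0: 0.5, 1: 2.0, 2: 4.0, 3: 0.0}

# True wherever the first value function is at least as high as the second.
one_is_better = [
    value_function_1[state] >= value_function_2[state]
    for state in range(num_states)
]

# True wherever the second value function is at least as high as the first.
two_is_better = [
    value_function_2[state] >= value_function_1[state]
    for state in range(num_states)
]

print(one_is_better)  # [True, True, False, True]
print(two_is_better)  # [False, True, True, True]

# One policy is at least as good as the other overall only if the
# comparison holds in every state.
print(all(one_is_better), all(two_is_better))  # False False
```

Because both comparisons use "greater than or equal to", states where the two values tie (states 1 and 3 in this made-up data) are True in both lists. A policy counts as at least as good as the other only when its list is True for every state, so in this example neither policy dominates.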