Prioritized experience replay buffer

You will introduce the PrioritizedExperienceReplay class, a data structure that you will later use to implement DQN with Prioritized Experience Replay.

PrioritizedExperienceReplay is a refinement of the ExperienceReplay class that you have been using so far to train your DQN agents. Instead of sampling uniformly, a prioritized experience replay buffer favors transitions with higher priority, so the transitions it returns tend to be more valuable for the agent to learn from.

For now, implement the methods .__init__(), .push(), .update_priorities(), .increase_beta() and .__len__(). The final method, .sample(), will be the focus of the next exercise.

In .push(), initialize the transition's priority to the maximum priority in the buffer (or 1 if the buffer is empty).
In .update_priorities(), set each priority to the absolute value of the corresponding TD error, plus self.epsilon so that a transition with zero TD error still keeps a non-zero priority.
In .increase_beta(), increment beta by self.beta_increment; ensure beta never exceeds 1.
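
The three rules above can be sketched as follows. This is a minimal illustration rather than the exercise's reference solution: the constructor signature, the default values for alpha, beta, beta_increment and epsilon, and the use of two deques (self.memory and self.priorities) are all assumptions made for the example, and TD errors are taken as plain numbers instead of tensors. The .sample() method is left out on purpose.

```python
from collections import deque


class PrioritizedExperienceReplay:
    # All default values below are illustrative assumptions.
    def __init__(self, capacity, alpha=0.6, beta=0.4,
                 beta_increment=0.001, epsilon=0.01):
        # Two bounded deques kept in step: transitions and their priorities.
        self.memory = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.alpha = alpha
        self.beta = beta
        self.beta_increment = beta_increment
        self.epsilon = epsilon

    def push(self, state, action, reward, next_state, done):
        # New transitions get the current maximum priority (1 if empty),
        # so each one has a good chance of being sampled at least once.
        max_priority = max(self.priorities) if self.memory else 1.0
        self.memory.append((state, action, reward, next_state, done))
        self.priorities.append(max_priority)

    def update_priorities(self, indices, td_errors):
        # Priority = |TD error| + epsilon, so it is never exactly zero.
        for idx, td_error in zip(indices, td_errors):
            self.priorities[idx] = abs(float(td_error)) + self.epsilon

    def increase_beta(self):
        # Move beta towards 1 without ever exceeding it.
        self.beta = min(1.0, self.beta + self.beta_increment)

    def __len__(self):
        return len(self.memory)


# Usage: fill the buffer, then refresh one priority from a TD error.
buffer = PrioritizedExperienceReplay(capacity=3)
buffer.push(0, 1, 1.0, 1, False)
buffer.update_priorities([0], [-2.5])
buffer.push(1, 0, 0.0, 2, True)  # inherits the current maximum priority
buffer.increase_beta()
```

Because both deques share the same maxlen, pushing into a full buffer drops the oldest transition and its priority together, which keeps the indices passed to .update_priorities() aligned with the stored transitions.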
