Congratulations!

1. Congratulations!

Congratulations on completing this exciting but demanding journey through the field of Deep Reinforcement Learning!

2. Chapter 1

Let's look back at the steps we've taken together. In Chapter 1, we discovered how neural networks can approximate a value function and built our first value-based DRL algorithm: a barebones version of DQN.
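
As a refresher, the core idea fits in a few lines. Below is a minimal sketch of a Q-network in PyTorch; the layer sizes and names are illustrative, not the exact architecture from the chapter.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a state vector to one estimated Q-value per action."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Greedy action selection: pick the action with the highest estimated Q-value.
q_net = QNetwork(state_dim=4, n_actions=2)
state = torch.rand(1, 4)
action = q_net(state).argmax(dim=1).item()
```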

3. Chapter 2

Throughout Chapter 2, we introduced layer upon layer of sophistication, implementing DQN and Double DQN; our work on value-based DRL culminated in the implementation of prioritized experience replay.
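
The key Double DQN trick is small but important: the online network selects the next action, while the target network evaluates it, which reduces overestimation of Q-values. Here is a hedged sketch in PyTorch; the function and tensor names are illustrative, not the course's exact code.

```python
import torch

def double_dqn_target(rewards, next_states, dones, online_net, target_net, gamma=0.99):
    """Compute Double DQN targets (all names here are illustrative)."""
    with torch.no_grad():
        # Action selection with the online network.
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # Action evaluation with the target network.
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        # dones: 1.0 where the episode ended, else 0.0, so terminal states bootstrap nothing.
        return rewards + gamma * (1.0 - dones) * next_q
```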

4. Chapter 3

In Chapter 3, we discovered policy gradient methods with REINFORCE and A2C. These techniques are conceptually complex but very powerful, as they enable DRL in continuous action spaces.
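
To recall the gist: the REINFORCE loss increases the log-probability of each sampled action in proportion to the return that followed it. A minimal sketch, with illustrative names:

```python
import torch

def reinforce_loss(log_probs: torch.Tensor, returns: torch.Tensor) -> torch.Tensor:
    """log_probs: log pi(a_t | s_t) for the actions taken in one episode;
    returns: the discounted return G_t observed from each step onward."""
    # Gradient ascent on expected return == gradient descent on this negative sum.
    return -(log_probs * returns).sum()
```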

5. Chapter 4

Finally, in Chapter 4, we introduced PPO, a popular, state-of-the-art algorithm that takes policy gradient methods to new heights. We also explored different ways to structure the training loop for sophisticated policy gradient techniques, and to conclude, we discovered how to automate hyperparameter tuning with Optuna.
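
PPO's signature ingredient is the clipped surrogate objective, which keeps each policy update close to the policy that collected the data. A minimal PyTorch sketch, with illustrative names and the commonly used default epsilon of 0.2:

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    # Probability ratio between the current policy and the one that collected the data.
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Clipping removes the incentive to push the ratio outside [1 - eps, 1 + eps].
    clipped = torch.clamp(ratio, 1.0 - epsilon, 1.0 + epsilon)
    # Pessimistic (min) bound, negated because optimizers minimize.
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```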

6. What next?

Deep Reinforcement Learning is a vast and exciting field, and there is a lot we haven't discussed: advanced algorithms such as DDPG or Soft Actor-Critic; topics such as multi-agent learning and model-based reinforcement learning; and specific applications such as fine-tuning large language models with RLHF. These are all fascinating topics for another day!

7. Well done!

Once again, congratulations! And keep learning.
