Building a Powerful DQN in TensorFlow 2.0 (explanation & tutorial)

And scoring 350+ by implementing extensions such as double dueling DQN and prioritized experience replay

Sebastian Theiler
Analytics Vidhya


A while ago, DeepMind released Agent57, a new benchmark for Atari AI. It is an incredible achievement — one I would like to talk about in depth in the future — but neither it nor most of the other great advances we have seen in RL would have been possible without two key papers, Mnih et al. 2013 and Mnih et al. 2015. These two papers laid the foundation for the DQN algorithm that is so widely known today.

In this article, we will implement the DQN algorithm and some of its most common extensions (a double dueling DQN with prioritized experience replay) in TensorFlow 2 and OpenAI Gym. By the end, our agent will be able to score 350+ in the Atari Breakout environment.
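Before diving in, here is a minimal sketch of the OpenAI Gym loop everything else is built around. The environment ID and the random-action policy below are just placeholders to confirm the Atari dependencies are installed; they are not the final training code:

```python
import gym

# Create the Atari Breakout environment.
# (A common env ID; the exact one may differ depending on your Gym/ALE version.)
env = gym.make("BreakoutDeterministic-v4")

state = env.reset()
done = False
total_reward = 0

# Play one episode with random actions, just to verify the environment works.
while not done:
    action = env.action_space.sample()            # random action for now
    state, reward, done, info = env.step(action)  # classic 4-tuple Gym API
    total_reward += reward

print("Episode finished with reward:", total_reward)
```

Later, the random `env.action_space.sample()` call is replaced by the greedy (or epsilon-greedy) action chosen by the DQN.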

All of the source code that we will build is available on my GitHub, here.

I would like to give a shout-out to both this notebook by Fabio M. Graetz and this video by Machine Learning with Phil for inspiring this project and getting me started with the code. Both sources have been useful countless times, and it is safe to say this article would not exist without them.

The Basics
