What Does a DQN Think?

A brief visualization of how a DQN chooses to play Breakout

Published in Analytics Vidhya · 6 min read · Apr 30, 2020

In the last article, we explored how to build a DQN that scores 350+ in Breakout. As a quick recap, here is a short clip of that DQN playing:

The DQN is certainly playing well, but in a given situation, is it possible to tell why the DQN took the move it did?

For a game as simple as Breakout, it isn’t too tricky. If the ball is headed down and to the left, the agent had better move left.

But what about an environment like Montezuma’s Revenge? Or StarCraft, DOTA, maybe even an RL agent in the real world? Can we understand how agents in those environments make their decisions? And if we can’t, can we trust these agents to make reliable decisions in the real world?

To try to answer some of these questions, I set out to build a visualization of what our DQN has learned, and what’s going on inside its head as it makes decisions.

All the code for the visualizations in this article will be available on my GitHub, here.
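Before any fancier visualization, the most direct window into a DQN’s “thinking” is the set of Q-values it assigns to each action in a given frame. The snippet below is a minimal sketch of that idea, not code from the repository: the Q-values are made up for illustration, `explain_choice` is a hypothetical helper, and the action order (NOOP, FIRE, RIGHT, LEFT) is an assumption about how the Breakout environment is configured.

```python
# Assumed action order for Breakout; check your own environment's action meanings.
ACTIONS = ["NOOP", "FIRE", "RIGHT", "LEFT"]


def explain_choice(q_values, actions=ACTIONS):
    """Rank actions by Q-value and report how decisive the greedy choice is.

    Returns (best_action, margin, ranked), where margin is the gap between
    the best and second-best Q-values and ranked is a list of
    (action, q_value) pairs sorted from highest to lowest.
    """
    ranked = sorted(zip(actions, q_values), key=lambda pair: pair[1], reverse=True)
    best_action, best_q = ranked[0]
    _, runner_up_q = ranked[1]
    return best_action, best_q - runner_up_q, ranked


# Made-up Q-values for a frame where the ball is heading down and to the left.
# In practice these would come from a forward pass of the trained network.
q_values = [1.2, 1.1, 0.4, 2.3]

best, margin, ranked = explain_choice(q_values)
print(f"Greedy action: {best} (margin over runner-up: {margin:.2f})")
for action, q in ranked:
    print(f"  {action:<5} Q = {q:.2f}")
```

A large margin suggests the network sees one action as clearly better in that frame, while a small margin suggests it is close to indifferent. That alone doesn’t say which pixels drove the decision, but it is a cheap first check.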

Base Screen

Written by Sebastian Theiler