GoAi #1: Asynchronous Methods for Deep Reinforcement Learning
I keep notes on the papers I read by writing stories on Medium. I recently started doing reinforcement learning research, so I expect to read AI-related papers frequently, and I would especially like to share my notes with everyone.
Reference: Asynchronous Methods for Deep Reinforcement Learning Paper
Introduction
First, if you don't have a background in deep reinforcement learning, you can think of it as the kind of algorithm behind AlphaGo. This paper introduces a new framework for deep reinforcement learning because the traditional algorithm (DQN) has several drawbacks:
- In the online-agent setting, the observed data is non-stationary and strongly correlated between consecutive steps.
- Experience replay can avoid some of the problems above, but it uses more memory and computation per real interaction and requires off-policy learning from data generated by an older policy.
Therefore, the authors propose asynchronous methods for deep reinforcement learning to overcome these drawbacks.
Using a multi-core CPU instead of a GPU, we can launch multiple threads that each run their own instance of the environment while sharing the same model weights.
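As a minimal sketch of this setup (all names and numbers below are illustrative; the paper's implementation applies updates lock-free, Hogwild!-style, while a lock here just keeps the toy example simple, and a real Python version would need processes or a GIL-releasing framework to get true parallelism):

```python
import threading
import numpy as np

# Illustrative sketch, not the paper's code: each worker thread produces its
# own gradients (here a fake per-thread random gradient stands in for a real
# backward pass on its environment copy) and applies them to one shared
# weight vector living in CPU memory.

theta = np.zeros(10)          # shared model weights, one copy for all threads
lock = threading.Lock()

def worker(shared_theta, worker_id, n_steps):
    rng = np.random.default_rng(worker_id)         # per-thread randomness
    for _ in range(n_steps):
        fake_gradient = 0.01 * rng.standard_normal(shared_theta.size)
        with lock:
            shared_theta -= 0.001 * fake_gradient  # in-place on shared array

threads = [threading.Thread(target=worker, args=(theta, i, 100))
           for i in range(4)]  # e.g. one actor-learner per CPU core
for th in threads:
    th.start()
for th in threads:
    th.join()
```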

After reading the pseudocode, we find that it differs little from the original DQN algorithm. The special point is the line "t mod I_AsyncUpdate", which controls how often each thread applies its accumulated gradients to the shared network.
Different threads will have different gradients because each one explores a different part of the environment, which decorrelates the updates to the shared weights. Moreover, the target network weights may be refreshed while a thread is still accumulating gradients (before they are applied and reset to zero), which makes the gradients even more diverse.
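To make this concrete, here is a rough, self-contained sketch in the spirit of the paper's one-step Q-learning pseudocode. The tabular Q function, the toy chain environment, and all hyperparameter values are my own illustrative choices, not the paper's:

```python
import numpy as np

GAMMA, LR, EPS = 0.99, 0.1, 0.1
I_ASYNC_UPDATE, I_TARGET = 5, 50     # illustrative update/sync intervals

class ChainEnv:
    """Tiny chain MDP: actions move left/right, reward 1 at the right end."""
    def __init__(self, n=5):
        self.n, self.s = n, 0
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = max(0, self.s - 1) if a == 0 else self.s + 1
        done = self.s == self.n - 1
        return self.s, float(done), done

def actor_learner(theta, theta_target, env, T, n_steps, rng):
    """One thread's loop; theta and theta_target are shared across threads."""
    d_theta = np.zeros_like(theta)           # thread-local accumulated gradient
    s = env.reset()
    for t in range(1, n_steps + 1):
        # epsilon-greedy action from the shared weights (Q(s, a) = theta[s, a])
        a = rng.integers(2) if rng.random() < EPS else int(np.argmax(theta[s]))
        s2, r, done = env.step(a)
        # one-step target y uses the target network theta_target
        y = r if done else r + GAMMA * np.max(theta_target[s2])
        d_theta[s, a] += y - theta[s, a]     # accumulate TD-error gradient
        s = env.reset() if done else s2
        T[0] += 1                            # global shared step counter T
        if T[0] % I_TARGET == 0:
            theta_target[:] = theta          # periodic target-network sync
        if t % I_ASYNC_UPDATE == 0 or done:  # the "t mod I_AsyncUpdate" line
            theta += LR * d_theta            # apply accumulated update
            d_theta[:] = 0.0

theta = np.zeros((5, 2))                     # shared tabular Q "network"
theta_target = theta.copy()
actor_learner(theta, theta_target, ChainEnv(), [0], 2000,
              np.random.default_rng(0))
```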
Instead of a plain gradient step, the paper also uses a variant of RMSProp whose statistics are shared across threads.
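A minimal sketch of that update rule (the hyperparameter values below are illustrative, not from the paper):

```python
import numpy as np

# Shared RMSProp: g is the elementwise moving average of squared gradients
# and is shared by all threads; d_theta is the gradient one thread applies.

ALPHA, ETA, EPSILON = 0.99, 7e-4, 1e-8

def shared_rmsprop_step(theta, g, d_theta):
    g[:] = ALPHA * g + (1.0 - ALPHA) * d_theta ** 2   # shared statistics
    theta[:] = theta - ETA * d_theta / np.sqrt(g + EPSILON)
```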
In addition to one-step Q-learning, the idea also applies to other algorithms such as one-step Sarsa, n-step Q-learning, and advantage actor-critic (A3C). The asynchronous methods not only provide better performance but also reduce training time and make efficient use of resources. You can see more comparison plots in the original paper.
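As a small illustration of how the n-step variants form their targets, here is a sketch that walks a rollout backwards and bootstraps from the last state's value estimate; the function name and inputs are mine, not the paper's:

```python
# Walk the rollout backwards, accumulating R = r + gamma * R at each step.
# When the episode did not terminate, start from a value estimate of the
# final state (V(s_{t+n}) for actor-critic, max_a Q(s_{t+n}, a) for Q-learning).

def n_step_targets(rewards, bootstrap_value, gamma=0.99, terminal=False):
    R = 0.0 if terminal else bootstrap_value
    targets = []
    for r in reversed(rewards):
        R = r + gamma * R
        targets.append(R)
    return list(reversed(targets))       # targets[i] pairs with rollout step i

# A 3-step rollout with rewards [0, 0, 1], bootstrapped from value 0.5:
print(n_step_targets([0.0, 0.0, 1.0], bootstrap_value=0.5))
# -> approximately [1.4652, 1.4800, 1.495]
```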
Results
Thanks to DeepMind for providing these awesome results on several games and uploading the videos to YouTube.
TORCS Car Racing Simulator
Continuous Action Control Using the MuJoCo Physics Simulator
Labyrinth
The craziest thing is that DeepMind's agent can successfully play such a complex game, one where the agent has to control the camera view. I think an AI playing Counter-Strike is no longer just a dream.
It is very exciting to read this paper, but it is also very hard to figure out the algorithm and implement it. I think writing notes helps me learn the concepts in a paper, and it motivates me to read more and more papers. If anything in my notes is ambiguous or mistaken, please let me know, and we can discuss it :))
#Reinforcement Learning
#AI