aureliantactics
Blogging about reinforcement learning and machine learning. GitHub repo: https://github.com/AurelianTactics

Understanding PPO Plots in TensorBoard

OpenAI Baselines and Unity Machine Learning have TensorBoard integration for their Proximal…


Using Joint PPO with Ray

Joint PPO is a modification of Proximal Policy Optimization (PPO). Joint PPO was used by the winner of OpenAI’s Retro Contest. Joint PPO in a few lines:

During meta-training, we train a single policy to play every level in the training set. Specifically, we…