Retro Gym

aureliantactics
Blogging about Reinforcement Learning and Machine Learning. Github repo at: https://github.com/AurelianTactics

Using Joint PPO with Ray

Joint PPO is a modification of Proximal Policy Optimization (PPO) that was used by the winner of OpenAI’s Retro Contest. The idea in a few lines:

During meta-training, we train a single policy to play every level in the training set. Specifically, we…
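Below is a minimal sketch of that idea with Ray's RLlib: several rollout workers, each assigned a different training level, all feeding experience into one shared PPO policy. The level list, worker count, and the tune.run-style API are illustrative assumptions, not the contest winner's exact setup.

```python
# Sketch: one shared PPO policy trained jointly across several levels.
# Assumes gym-retro with the Sonic ROMs installed and an older ray[rllib]
# release exposing tune.run("PPO", ...). Levels and settings are illustrative.
import retro
import ray
from ray import tune
from ray.tune.registry import register_env

# Stand-ins for the full training set of levels.
TRAIN_LEVELS = [
    ("SonicTheHedgehog-Genesis", "GreenHillZone.Act1"),
    ("SonicTheHedgehog-Genesis", "MarbleZone.Act1"),
    ("SonicTheHedgehog2-Genesis", "EmeraldHillZone.Act1"),
]

def env_creator(env_config):
    # RLlib gives each rollout worker a distinct worker_index, so each
    # worker plays one fixed level while all of them share the same policy.
    game, state = TRAIN_LEVELS[env_config.worker_index % len(TRAIN_LEVELS)]
    return retro.make(game=game, state=state)

register_env("joint_sonic", env_creator)

ray.init()
tune.run(
    "PPO",
    stop={"timesteps_total": 2_000_000},  # arbitrary budget for the sketch
    config={
        "env": "joint_sonic",
        "num_workers": len(TRAIN_LEVELS),  # one rollout worker per level
    },
)
```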

Integrating New Games into Retro Gym

OpenAI’s retro gym is a great tool for using Reinforcement Learning (RL) algorithms on classic…
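As a taste of what integration involves, here is a hedged sketch of loading a custom game, assuming you have already built the integration folder (the ROM, a data.json mapping RAM addresses to named variables, a scenario.json defining reward and done conditions, and a .state save file). The path and game name are hypothetical placeholders.

```python
# Sketch: pointing retro gym at a custom game integration.
# The directory and game name below are hypothetical placeholders.
import retro

# Tell retro to also search a directory of custom integrations.
retro.data.Integrations.add_custom_path("/path/to/my/integrations")

env = retro.make(
    game="MyCustomGame-Genesis",          # folder name under the custom path
    inttype=retro.data.Integrations.ALL,  # search custom and built-in games
)
obs = env.reset()
```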


Deep Q-learning from Demonstrations (DQfD) in Keras

In an earlier post, I wrote about a naive way to use human demonstrations to help train a Deep Q-Network (DQN) for Sonic the Hedgehog. After that mostly unsuccessful attempt, I read an interesting paper called Deep Q-learning from…
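The distinctive piece of DQfD is a large-margin supervised loss on the demonstration data, added on top of the usual TD losses: J_E(Q) = max_a [Q(s, a) + l(a_E, a)] - Q(s, a_E), where the margin l is zero for the expert's action and a positive constant otherwise. A sketch with TensorFlow ops (the 0.8 margin follows the DQfD paper; the function name is mine):

```python
# Sketch of DQfD's large-margin classification loss (Hester et al.).
# q_values: [batch, num_actions] predicted Q-values.
# expert_actions: [batch] integer actions taken in the demonstrations.
import tensorflow as tf

MARGIN = 0.8  # l(a_E, a) for a != a_E; zero for the expert action itself

def large_margin_loss(q_values, expert_actions, num_actions):
    expert_one_hot = tf.one_hot(expert_actions, num_actions)
    margins = MARGIN * (1.0 - expert_one_hot)  # 0 at a_E, MARGIN elsewhere
    q_expert = tf.reduce_sum(q_values * expert_one_hot, axis=1)
    # Pushes Q(s, a_E) above every other action's Q-value by at least MARGIN.
    return tf.reduce_mean(tf.reduce_max(q_values + margins, axis=1) - q_expert)
```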


Attempting to Beat Sonic the Hedgehog with Reinforcement Learning

OpenAI held a Retro Contest where competitors trained Reinforcement Learning (RL) agents on Sonic the Hedgehog. The goal of the competition was to train an agent on levels of Sonic from the first three games and see…


Creating Human Transitions for Deep Q-Learning in Retro Gym

The retro-movies repo makes it easy to create human demonstrations for retro gym. I made a script to turn a human demonstration into frame-by-frame transitions. Some slight modifications to the Rainbow DQN baseline provided…
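The core of such a script looks roughly like the sketch below, which replays a recorded .bk2 demonstration through gym-retro's Movie API and collects (obs, action, reward, next_obs, done) tuples. The movie filename is a placeholder, and the single-player assumption is mine.

```python
# Sketch: replay a recorded .bk2 human demonstration and collect
# frame-by-frame transitions. Filename is a placeholder; assumes one player.
import retro

movie = retro.Movie("SonicTheHedgehog-Genesis-GreenHillZone.Act1-0000.bk2")
movie.step()  # advance past the movie's initial frame

env = retro.make(
    game=movie.get_game(),
    state=None,  # start from the save state stored in the movie
    use_restricted_actions=retro.Actions.ALL,  # bk2s may press any buttons
    players=movie.players,
)
env.initial_state = movie.get_state()
obs = env.reset()

transitions = []
while movie.step():
    # Reconstruct the button presses recorded for this frame.
    keys = [movie.get_key(i, 0) for i in range(env.num_buttons)]
    next_obs, reward, done, info = env.step(keys)
    transitions.append((obs, keys, reward, next_obs, done))
    obs = next_obs
```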


Creating a Custom Reward Function in Retro Gym and Other Utilities

See my prior blog post for an intro to this topic and this repo for the files I’ll be discussing.

The University of California, Berkeley Deep Reinforcement Learning (RL) course has a lecture by John…
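One common way to customize the reward, beyond editing scenario.json, is a gym wrapper that recomputes it from the RAM variables retro exposes in the info dict. A minimal sketch, assuming the game's data.json exposes an "x" position variable (the variable name and shaping are illustrative):

```python
# Sketch: recompute the reward from retro's info dict via a gym wrapper.
# Assumes the integration's data.json exposes an "x" position variable.
import gym
import retro

class ProgressReward(gym.Wrapper):
    """Rewards horizontal progress instead of the default scenario reward."""

    def reset(self, **kwargs):
        self._prev_x = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, _, done, info = self.env.step(action)
        x = info.get("x", 0)
        reward = x - self._prev_x  # positive when the agent moves right
        self._prev_x = x
        return obs, reward, done, info

env = ProgressReward(
    retro.make(game="SonicTheHedgehog-Genesis", state="GreenHillZone.Act1")
)
```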