Reinforcement Learning Competitions — 12/1/2022 Update

Here is a list of upcoming, ongoing, and completed Reinforcement Learning competitions I have compiled:

https://docs.google.com/spreadsheets/d/1Rw_rWzI-8lfGKtHHFlSWqswyZrCRPuwmQbad1ZlYxIk/edit?usp=sharing

AurelianTactics

Jan 26, 2022

Paper Implementation of ‘Using Unity to Help Solve Intelligence:’ Code Walkthrough

Paper Implementation: Using Unity to Help Solve Intelligence

AurelianTactics

Dec 4, 2021

Custom Reinforcement Learning Environment Usage for Ray, Stable Baselines 3, and Acme

Let’s say you want to apply a Reinforcement Learning (RL) algorithm to your problem. There are dozens of open-source RL frameworks to choose from, such as Stable Baselines 3 (SB3), Ray…

AurelianTactics

Mar 2, 2021

Implementing DQNClipped and DQNReg with Stable Baselines

AurelianTactics

Jan 12, 2021

BCQ with a GAN

There’s been a lot of interesting stuff in the field of batch Reinforcement Learning (aka offline RL) since I wrote about…

AurelianTactics

Oct 25, 2020

Implementing Unity Machine Learning into an Existing Game: Gridworld in Aurelian Tactics

AurelianTactics

Sep 13, 2020

Custom Models with Baselines: IMPALA CNN, CNNs with Features, and Contra 3 Hard Mode

Beating Contra III on Easy Mode with Reinforcement Learning — Part 3 Discussion

AurelianTactics

Jan 11, 2019

Beating Contra III on Easy Mode with Reinforcement Learning — Part 2: Experiment

Beating Easy Mode on Contra III with Reinforcement Learning — Part 1: Introduction

Basic TRFL Usage: Target Network Updating

Continuing my exploration of TRFL — a Reinforcement…

AurelianTactics

Dec 30, 2018

Basic TRFL Usage: Q-Learning and Double Q-Learning

AurelianTactics

Dec 13, 2018

Understanding PPO Plots in TensorBoard

OpenAI Baselines and Unity Machine Learning have TensorBoard integration for their Proximal…

Retro Gym with Baselines: 4 Basic Usage Tips

A short summary and code example followed by explanations.

AurelianTactics

Oct 26, 2018

Tensorflow Implementation of TD3 in OpenAI Baselines

When I’m looking for new research papers to read, it’s often hard to tell what is worth reading. How reproducible are the results? Will this paper actually have a lasting impact in the field of Reinforcement Learning (RL)? With those…

AurelianTactics

Aug 23, 2018

Location CNN and Pygame Learning Environment in Ray

Two recent papers discussed the same issue: how to utilize location information in Convolutional Neural Networks. Basic CNNs are useful for their ability to pick out similar objects at different locations in the image…

AurelianTactics

Aug 5, 2018

Using Joint PPO with Ray

Joint PPO is a modification of Proximal Policy Optimization (PPO). Joint PPO was used by the winner of OpenAI’s Retro Contest. Joint PPO in a few lines:

During meta-training, we train a single policy to play every level in the training set. Specifically, we…

Using Ray for Reinforcement Learning

I’ve been exploring Ray for Reinforcement Learning (RL) for the past couple of weeks. Ray provides…

PPO Hyperparameters and Ranges

Proximal Policy Optimization (PPO) is one of the leading Reinforcement Learning (RL) algorithms. PPO is…


Training Tic-tac-toe AI in Unity ML

Setting Up Unity ML Agents with Ray and Stable Baselines

Batch-Constrained Deep Q Learning in TensorFlow

Learn Reinforcement Learning with TensorFlow and TRFL

Trust Region-Guided Proximal Policy Optimization
