RL 0 to 1: How to Learn RL

Prerequisites

Parnian Barekatain
3 min read · Jun 2, 2018

I recommend reviewing my previous post, which covers resources for the following topics:

  1. Math
  2. Python
  3. Neural network basics
  4. Frameworks

Math review

  1. Linear Algebra Review and Reference
  2. Probability Theory Review
  3. Convex Optimization Overview, Part I
  4. Convex Optimization Overview, Part II
  5. Hidden Markov Models
  6. The Multivariate Gaussian Distribution
  7. More on Gaussian Distribution
  8. Gaussian Processes

RL Vocabulary

  1. MDP
  2. Markov chain Monte Carlo
  3. Bellman equations
  4. Reward, state, policy, discount factor, trajectory, state space, transition function
  5. Dynamic Programming
  6. Value Function
  7. Q-Learning (see the sketch after this list)
  8. Policy Gradient
  9. Model-based / Model-free / Partially observable
  10. Exploration vs Exploitation
  11. Inverse RL / Imitation Learning / Apprenticeship Learning / Meta-Learning / Transfer Learning
  12. Reward augmentation / Reward shaping
  13. Actor Critic
  14. Monte Carlo Tree Search
  15. Human in the loop
  16. Deep RL
  17. Zero-shot / One-shot / Few-shot
  18. Differentiable
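
Most of this vocabulary shows up even in the simplest algorithm. Below is a minimal sketch of tabular Q-learning on a tiny, made-up MDP; the environment, reward scheme, and hyperparameters are all placeholders chosen only to illustrate the terms (state, reward, policy, discount factor, trajectory, value function, exploration vs. exploitation, Bellman update).

```python
import numpy as np

# A tiny hypothetical MDP: states 0..3 on a line, actions 0 (left) and 1 (right).
# Reaching state 3 ends the episode with reward 1; every other step gives reward 0.
n_states, n_actions = 4, 2
rng = np.random.default_rng(0)

def step(state, action):
    """Transition function: returns (next_state, reward, done)."""
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done

Q = np.zeros((n_states, n_actions))      # tabular value function Q(s, a)
alpha, gamma, epsilon = 0.1, 0.99, 0.1   # learning rate, discount factor, exploration rate

for episode in range(500):               # each episode generates one trajectory
    state, done = 0, False
    while not done:
        # epsilon-greedy policy: explore with probability epsilon, otherwise exploit
        action = int(rng.integers(n_actions)) if rng.random() < epsilon else int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # Q-learning update, derived from the Bellman optimality equation
        target = reward + gamma * (0.0 if done else np.max(Q[next_state]))
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state

print(Q)  # learned state-action values; the greedy policy is argmax over each row
```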

Baselines / environment

OpenAI and DeepMind have built environments that allow researchers to run and test their RL models. These environments are considered benchmarks for RL. To become familiar with how to run RL algorithms in these environments, I recommend reading Andrej Karpathy's blog post on Deep Reinforcement Learning and the OpenAI Gym documentation.
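
As a quick taste of the Gym interface, here is a minimal random-agent loop. It is only a sketch: it assumes the 2018-era `gym` API (`reset` returning an observation and `step` returning a 4-tuple), and CartPole-v0 is just an example environment.

```python
import gym

# CartPole-v0 is one of the classic control environments listed below.
env = gym.make("CartPole-v0")

for episode in range(3):
    observation = env.reset()            # initial state of a new episode
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()                 # random policy
        observation, reward, done, info = env.step(action)
        total_reward += reward
    print("episode", episode, "return:", total_reward)

env.close()
```

Any RL algorithm you implement plugs into this same reset/step loop; only the action-selection and learning-update logic changes.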

Here is the list of the environments from OpenAI and DeepMind.

OpenAI:

  1. Dota 2 by OpenAI https://blog.openai.com/dota-2/
  2. OpenAI Gym https://gym.openai.com/

There are many types of environments in OpenAI Gym. The current environment categories are:

  1. Box2D
  2. Algorithmic
  3. Atari
  4. Classic control
  5. Robotics
  6. MuJoCo
  7. Toy text

DeepMind:

  1. StarCraft II
  2. DeepMind Lab
  3. DeepMind Control Suite (similar to the MuJoCo environments in OpenAI Gym)

Datasets

DeepMind has released many datasets for researchers to run RL models on. Here is the list of datasets available from DeepMind:

  1. Kinetics
  2. AQuA
  3. dSprites
  4. RC
  5. Spaceship
  6. Card2code
  7. Unsup-queries

Choosing a dataset really depends on what model you would like to run. There are also a few useful datasets on Kaggle that I recommend considering for RL models.

Main blogs

Here is a list of some well-known blog posts. Many of them argue that RL does not (yet) work well in practice.

  1. Deep Reinforcement Learning Doesn’t Work Yet
  2. An Outsider’s Tour of Reinforcement Learning
  3. Greg Brockman on Resources — I recommend reading this post.
  4. Deep Deterministic Policy Gradients in TensorFlow
  5. Collection of Deep Learning resources
  6. Learning to Learn

Main-papers

I made this list based on the recommendations of a couple of friends and former colleagues, and on my own intuition.

  1. DQN: Nature paper
  2. A2C / A3C
  3. PPO
  4. TRPO
  5. HER
  6. Rainbow
  7. DDPG
  8. Feudal Networks
  9. Learning to learn by gradient descent by gradient descent
  10. AlphaGo Nature Paper

Fast RL Exploration/Exploitation

  1. Variational Information Maximizing Exploration
  2. The Many Faces of Optimism
  3. Deep Exploration via Bootstrapped DQN

Q-Learning

  1. DQN: Nature paper

Deep Q-Learning Papers

  1. Deep Reinforcement Learning with Double Q-learning
  2. Prioritized Replay
  3. Hindsight Experience Replay
  4. Rainbow

Policy Gradient

  1. A Natural Policy Gradient
  2. PPO
  3. TRPO
  4. Q-Prop
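
For orientation, the papers above all start from (and mostly work on reducing the variance or constraining the updates of) the basic score-function policy gradient. As a reminder, in standard notation (this is generic background, not a formula taken from any single paper listed here):

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[
      \sum_{t=0}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,
      \bigl(R_t - b(s_t)\bigr)
    \right],
\qquad
R_t = \sum_{k \ge t} \gamma^{\,k-t} r_k
```

Here \pi_\theta is the parameterized policy, R_t the discounted return from time t, and b(s_t) a baseline (for example a learned value function, which is where actor-critic methods come in). TRPO and PPO constrain or clip how far each update may move the policy.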

Monte Carlo Tree Search

  1. Monte-Carlo tree search and rapid action value estimation in computer Go
  2. AlphaGo Zero: Mastering the game of Go without human knowledge

Human in the loop

  1. Human-level control through deep reinforcement learning

Imitation Learning

  1. Maximum Entropy Inverse Reinforcement Learning
  2. Apprenticeship Learning via Inverse Reinforcement Learning

Hierarchy

  1. Feudal Reinforcement Learning
  2. Strategic Attentive Writer

Model Based

  1. Value Iteration Networks

Meta-RL

  1. Reinforcement Learning Neural Turing Machines

Courses

I personally started learning RL by watching Pieter Abbeel's bootcamp lectures. Here are the classes that I would recommend:

  1. Stanford CS 234
  2. Deep RL bootcamp by Pieter Abbeel
  3. Stanford CS 229 (the RL section)
  4. Berkeley Deep RL CS 294
  5. Nando de Freitas’ course on machine learning

I have never taken an online course on RL, but they seem like a good place to start. Many such courses exist; I have listed a few, and I am not sure which one is the best:

  1. Reinforcement Learning Explained
  2. Reinforcement Learning

Textbooks

The first of these is the most well-written textbook:

  1. Reinforcement Learning: An Introduction, by Sutton and Barto
  2. Algorithms for Reinforcement Learning
  3. Markov Decision Processes: Discrete Stochastic Dynamic Programming
  4. Approximate Dynamic Programming

Main Researchers

  1. Pieter Abbeel — Professor @ Berkeley
  2. David Silver — lead researcher on AlphaGo @ DeepMind
  3. Richard Sutton — known for the standard RL textbook
  4. John Schulman — works at OpenAI
  5. Volodymyr Mnih — initial DQN paper — worked under Geoffrey Hinton

Main Research labs

  1. OpenAI
  2. DeepMind
  3. Google Brain
  4. Facebook — New York team

Extra-Math

If you would like to explore some additional subjects, I recommend reviewing the following math topics. They are not required, but it is great to have a strong background in math.

  1. Knot Theory
  2. VC Dimension
  3. Homomorphism
  4. Mixture Model
  5. Directed Acyclic Graph
  6. Ambient Isotopy
  7. Random Matrix
