When to use Reinforcement Learning (and when not to)

Mauricio Fadel Argerich
Published in The Startup
5 min read · Apr 9, 2020

Reinforcement learning (RL) has achieved better-than-human performance in many video games and has beaten the world's top Go players. It is a general framework that can solve very different tasks with little or no task-specific prior knowledge, often with stellar performance. This is why there is so much hype around RL nowadays, and it is certainly an important framework that still has a lot of potential.

However, RL cannot solve every problem, at least not yet.

This is important to keep in mind, especially for RL enthusiasts like myself. To limit myself to applying RL only when it makes sense, I have written down three questions to ask before I start. If I can answer all of them satisfactorily, I can go ahead and apply RL to the task at hand. Otherwise, I should look elsewhere for a way to deal with it.

Hoping these questions might be useful for anyone else trying to decide if RL is the right way to go, I’ve listed them here:

Can I afford to make mistakes?

Possible effects of using RL when you shouldn’t [screenshot from this video].

RL can be sample inefficient, especially deep RL. This means that it can take a long time for the RL agent to learn which actions are good and which ones are bad (i.e., which actions yield a positive reward and which a negative one), and it will make many mistakes along the way. This…
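To make the cost of learning by trial and error concrete, here is a minimal sketch that is not from the original article. It assumes a hypothetical three-action task in which each action returns a reward of +1 or -1 with a fixed probability, and a simple epsilon-greedy agent that counts every time it picks an action other than the truly best one. The function name, the success probabilities, and the parameter values are all illustrative assumptions.

```python
import random


def run_epsilon_greedy(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent and count its suboptimal choices.

    success_probs[a] is the (assumed) probability that action a yields +1;
    otherwise the action yields -1.
    """
    rng = random.Random(seed)
    n_actions = len(success_probs)
    counts = [0] * n_actions        # how often each action was tried
    estimates = [0.0] * n_actions   # running average reward per action
    best_action = max(range(n_actions), key=lambda a: success_probs[a])
    mistakes = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action, which may well be a bad one.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < success_probs[action] else -1.0

        # Incremental update of the average reward for this action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

        if action != best_action:
            mistakes += 1

    return mistakes, estimates


if __name__ == "__main__":
    mistakes, estimates = run_epsilon_greedy([0.2, 0.5, 0.8])
    print(f"suboptimal actions taken while learning: {mistakes}")
    print(f"estimated average reward per action: {estimates}")
```

Even in this tiny setting the agent has to take bad actions to find out that they are bad, and it keeps taking some for as long as it explores. In a real system, each of those counted mistakes would be an actual decision with an actual cost.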
