Weekly review of Reinforcement Learning papers #8
Every Monday, I present 4 publications from my research area. Let’s discuss them!
Paper 1: Reinforcement Learning with Random Delays
Ramstedt, S., Bouteiller, Y., Beltrame, G., Pal, C., & Binas, J. (2020). Reinforcement Learning with Random Delays. arXiv preprint arXiv:2010.02966.
Delays between action and reward are common, and they are a central problem in RL. The same holds in the real world: an action can produce a reward immediately (e.g., the negative reward of pain right after a fall) or after a very long delay (doing well in school eventually gets you a job that keeps you out of financial trouble). The whole spectrum in between is covered too: an action can produce rewards arbitrarily distant in time. Conversely, a reward received at a given moment cannot always be attributed to a single past action. It more likely results from all the previous actions, each contributing to a greater or lesser degree.
In this paper, the authors introduce the following paradigm: a delayed environment is obtained by encapsulating an undelayed environment (one in which an action immediately produces the associated reward) in a delay…
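To make the encapsulation idea concrete, here is a minimal sketch in Python. It is only an illustration of the description above, not the authors' implementation: the class names `ImmediateRewardEnv` and `RandomDelayWrapper`, the `max_delay` parameter, and the choice to delay only the reward by a uniformly random number of steps are all assumptions made for this example, and the paper itself may model delays differently.

```python
import random


class ImmediateRewardEnv:
    """Toy undelayed environment (hypothetical): the reward for an action arrives right away."""

    def step(self, action):
        return float(action)


class RandomDelayWrapper:
    """Encapsulates an undelayed environment and delivers each reward after a random delay.

    Assumption for this sketch: the delay is drawn uniformly from 0..max_delay steps.
    """

    def __init__(self, env, max_delay=3, seed=0):
        self.env = env
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        self.t = 0
        self.in_flight = []  # (arrival_time, reward) pairs not yet delivered

    def step(self, action):
        reward = self.env.step(action)  # the inner environment reacts immediately
        delay = self.rng.randint(0, self.max_delay)  # inclusive on both ends
        self.in_flight.append((self.t + delay, reward))
        # The agent only sees the rewards whose arrival time has been reached.
        delivered = sum(r for due, r in self.in_flight if due <= self.t)
        self.in_flight = [(due, r) for due, r in self.in_flight if due > self.t]
        self.t += 1
        return delivered


def run(actions, max_delay=3, seed=0):
    """Play the actions, then flush with zero-reward actions until all rewards arrive."""
    env = RandomDelayWrapper(ImmediateRewardEnv(), max_delay=max_delay, seed=seed)
    observed = [env.step(a) for a in actions]
    observed += [env.step(0) for _ in range(max_delay)]
    return observed
```

Calling `run([1, 2, 3, 4])` returns the rewards as the agent would observe them: the total is the same as in the undelayed environment, but a reward observed at a given step may come from any of the last few actions, which is exactly the credit assignment difficulty discussed above.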