TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Weekly review of Reinforcement Learning papers #8

5 min read · May 10, 2021

Image by the author

Paper 1: Reinforcement Learning with Random Delays

Ramstedt, S., Bouteiller, Y., Beltrame, G., Pal, C., & Binas, J. (2020). Reinforcement Learning with Random Delays. arXiv preprint arXiv:2010.02966.

Delays between an action and its reward are common and are a central problem in RL. They are just as common in the real world: an action can produce a reward immediately (e.g., the negative reward of pain right after a fall) or after a very long delay (doing well in school eventually leads to a job that keeps you out of financial trouble). The whole spectrum in between is covered as well: an action can produce rewards arbitrarily far in the future. Conversely, a reward received at a given moment cannot, in general, be attributed to a single past action. It more likely results from all the previous actions, each contributing to a greater or lesser degree.

In this paper, the authors introduce the following paradigm: a delayed environment is obtained by encapsulating an undelayed environment (one in which an action immediately produces the associated reward) in a delay…
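
To make the encapsulation idea concrete, here is a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: `UndelayedEnv` is a hypothetical toy environment whose `step(action)` returns only a reward, `RandomRewardDelay` is a hypothetical wrapper name, and the delay is simply drawn uniformly between 0 and `max_delay` steps. The paper's actual delay model may well differ.

```python
import random
from collections import defaultdict


class UndelayedEnv:
    """Hypothetical toy environment: the reward for an action is returned immediately."""

    def step(self, action):
        # Arbitrary rule for illustration: action 1 is rewarded, anything else is not.
        return 1.0 if action == 1 else 0.0


class RandomRewardDelay:
    """Hypothetical wrapper: each reward reaches the agent after a random delay."""

    def __init__(self, env, max_delay, seed=None):
        self.env = env
        self.max_delay = max_delay
        self.rng = random.Random(seed)
        self.t = 0  # current time step
        self.pending = defaultdict(float)  # arrival step -> sum of rewards due then

    def step(self, action):
        # The wrapped environment still reacts immediately...
        reward = self.env.step(action)
        # ...but the reward is scheduled for a later step (assumed uniform delay).
        delay = self.rng.randint(0, self.max_delay)  # both bounds inclusive
        self.pending[self.t + delay] += reward
        # The agent only receives what is due at the current step.
        delivered = self.pending.pop(self.t, 0.0)
        self.t += 1
        return delivered


if __name__ == "__main__":
    env = RandomRewardDelay(UndelayedEnv(), max_delay=3, seed=0)
    for action in [1, 1, 0, 1, 0, 0, 0, 0]:
        print(action, env.step(action))
```

With `max_delay=0` the wrapper behaves exactly like the undelayed environment. With a larger value, rewards from several actions can arrive at the same step and are summed, so the agent cannot tell from a single reward which action produced it.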

Published in TDS Archive

Quentin Gallouédec