Monte Carlo Methods for Reinforcement Learning

Introduction

Shivam Mohan
Nerd For Tech


In this article, we will discuss Monte Carlo methods for Reinforcement Learning, one of the foundational concepts on which our understanding is built as we explore the more advanced topics and methods of Reinforcement Learning.

Let us start by understanding the meaning of the term ‘Monte Carlo’. Monte Carlo is a general term often used for any estimation method that involves a significant random component. In Reinforcement Learning, however, it refers specifically to methods based on averaging complete returns, where a return is the cumulative (possibly discounted) reward collected from a state until the end of an episode.
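To make "averaging complete returns" concrete, here is a minimal sketch in Python. The toy task, the function names, and the discount factor `gamma` are assumptions introduced purely for illustration; they do not come from any particular library.

```python
import random


def episode_return(rewards, gamma=1.0):
    """Complete return of one finished episode: r1 + gamma*r2 + gamma^2*r3 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def sample_toy_episode(rng):
    """Hypothetical episodic task: +1 reward per step; after each step
    the episode continues with probability 0.5, otherwise it ends."""
    rewards = [1.0]
    while rng.random() < 0.5:
        rewards.append(1.0)
    return rewards


def monte_carlo_value(num_episodes, gamma=1.0, seed=0):
    """Estimate the value of the start state by averaging complete returns."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_episodes):
        total += episode_return(sample_toy_episode(rng), gamma)
    return total / num_episodes
```

With `gamma = 1`, the expected return of this toy task is 2 (the episode length is geometric with mean 2), so `monte_carlo_value` should approach 2 as `num_episodes` grows.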

Monte Carlo methods in Reinforcement Learning belong to a class of methods that do not assume knowledge of the model of the environment. Instead, the agent works with the experience it gains while interacting with the environment, which can be an actual environment or a simulated one. Training in a simulated environment does require some information about the model in order to simulate it, but far less than techniques like Dynamic Programming, which require us to know the probability of every transition.
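The sketch below illustrates this point under stated assumptions. `RandomWalk` is a hypothetical simulated environment that only exposes a `step` method for sampling transitions; the agent never reads transition probabilities. The estimator shown is the first-visit variant of Monte Carlo prediction, chosen here for illustration, and all names are made up for this example.

```python
import random
from collections import defaultdict


class RandomWalk:
    """Hypothetical simulator: states 1..5, terminal states 0 and 6.
    Reaching state 6 gives reward 1; every other transition gives 0."""
    start = 3

    def step(self, state, rng):
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == 6 else 0.0
        done = next_state in (0, 6)
        return next_state, reward, done


def run_episode(env, rng):
    """Interact until termination; record (state, reward) pairs."""
    state, trajectory, done = env.start, [], False
    while not done:
        next_state, reward, done = env.step(state, rng)
        trajectory.append((state, reward))
        state = next_state
    return trajectory


def first_visit_mc(env, num_episodes, gamma=1.0, seed=0):
    """Average the return that follows the first visit to each state."""
    rng = random.Random(seed)
    return_sum = defaultdict(float)
    visit_count = defaultdict(int)
    for _ in range(num_episodes):
        g = 0.0
        first_visit_return = {}
        # Walk backwards; the last write per state is its earliest visit.
        for state, reward in reversed(run_episode(env, rng)):
            g = reward + gamma * g
            first_visit_return[state] = g
        for state, g_first in first_visit_return.items():
            return_sum[state] += g_first
            visit_count[state] += 1
    return {s: return_sum[s] / visit_count[s] for s in return_sum}
```

Calling `first_visit_mc(RandomWalk(), 5000)` returns a dictionary of estimated state values. By symmetry, the value of the middle state 3 is 0.5, and the estimate should land close to that even though the code only ever sampled experience.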

As mentioned, Monte Carlo methods are based on averaging complete returns. To ensure that those complete returns are available, Monte Carlo methods are generally defined for episodic tasks, which means that the value of a state is estimated only on completion of an episode, and…
