A Baby Robot’s Guide To Reinforcement Learning

State Values and Policy Evaluation in 5 minutes

Steve Roberts
5 min read · Jan 11, 2023

An Introduction to Reinforcement Learning

[All images by author]

This is a summary of the article State Values and Policy Evaluation. It distils all of the key terms and theory from that article down into a single cheat-sheet that can be read in 5 minutes or less. With that in mind, we’d better get started…

You can also open this article as a Jupyter Notebook on Binder which allows you to run the associated code samples using the Baby Robot Custom Gym Environment.

Reinforcement Learning

Reinforcement Learning can be thought of as a problem set in an environment made up of multiple independent states.

A simple example is a grid world, where each square in the grid represents a state.

State

A unique, self-contained stage in the environment that defines the current situation. Each state is independent of previous states: everything needed to decide what happens next is captured by the current state, so you don’t need to know or remember what has happened before.
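To make the idea of independent states concrete, here is a minimal sketch of a grid world in plain Python. It does not use the Baby Robot Custom Gym Environment; the grid size, the action names and the `next_state` helper are all assumptions made for illustration. The point to notice is that `next_state` takes only the current state and an action, with no history of earlier moves.

```python
from itertools import product

# Hypothetical 3x3 grid; the size is an assumption for this example.
WIDTH, HEIGHT = 3, 3

# Every square in the grid is one state, identified by its (x, y) position.
STATES = list(product(range(WIDTH), range(HEIGHT)))

# Hypothetical action set: each action is a change in (x, y).
ACTIONS = {
    "north": (0, -1),
    "south": (0, 1),
    "east": (1, 0),
    "west": (-1, 0),
}


def next_state(state, action):
    """Return the state reached by taking `action` in `state`.

    Only the current state and the action are needed; nothing about
    earlier states is used. A move that would leave the grid keeps
    the robot where it is.
    """
    dx, dy = ACTIONS[action]
    x, y = state[0] + dx, state[1] + dy
    if 0 <= x < WIDTH and 0 <= y < HEIGHT:
        return (x, y)
    return state
```

Calling `next_state((0, 0), "east")` moves one square to the right, while `next_state((0, 0), "north")` would step off the grid and so leaves the state unchanged.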

Episode

One complete execution of the environment.
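As a rough illustration of an episode, the sketch below runs a randomly moving agent through a small grid from a start square until it reaches an exit square. This is not the Baby Robot environment: the grid size, the `START` and `EXIT` positions, the `run_episode` function and the `max_steps` safety cap are all assumptions chosen for the example.

```python
import random

# Hypothetical 3x3 grid with an assumed start and exit square.
WIDTH, HEIGHT = 3, 3
START, EXIT = (0, 0), (2, 2)
MOVES = [(0, -1), (0, 1), (1, 0), (-1, 0)]  # north, south, east, west


def run_episode(rng, max_steps=1000):
    """Run one episode and return the list of states visited, in order.

    The episode begins at START and stops when EXIT is reached, or
    after `max_steps` moves as a safety cap (an assumption added here
    so the loop always finishes).
    """
    state = START
    visited = [state]
    while state != EXIT and len(visited) <= max_steps:
        dx, dy = rng.choice(MOVES)  # pick a random direction
        x, y = state[0] + dx, state[1] + dy
        if 0 <= x < WIDTH and 0 <= y < HEIGHT:
            state = (x, y)  # moves off the grid are ignored
        visited.append(state)
    return visited


episode = run_episode(random.Random(0))
```

Each call to `run_episode` is one complete execution; calling it again starts a fresh episode from `START`, and the list it returns records every state the agent passed through along the way.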

Terminal State
