A Baby Robot’s Guide To Reinforcement Learning
State Values and Policy Evaluation in 5 minutes
An Introduction to Reinforcement Learning
This is a summary of the article State Values and Policy Evaluation. It distils all of the key terms and theory from that article down into a single cheat-sheet that can be read in 5 minutes or less. With that in mind, we’d better get started…
You can also open this article as a Jupyter Notebook on Binder, which lets you run the associated code samples using the Baby Robot Custom Gym Environment.
Reinforcement Learning
Reinforcement Learning can be framed as a problem that takes place in an environment made up of multiple, independent states.
A simple example of this would be a grid world, where each square in the grid represents a state:
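As a minimal sketch of this idea, the squares of a grid world can be enumerated as a list of states. The 3x3 size and the (x, y) coordinate naming are assumptions made for illustration; they are not taken from the Baby Robot environment.

```python
from itertools import product

# Hypothetical grid dimensions, chosen only for illustration.
WIDTH, HEIGHT = 3, 3

# Each square in the grid is one state, identified by its (x, y) coordinates.
states = list(product(range(WIDTH), range(HEIGHT)))

print(len(states))  # one state per square: WIDTH * HEIGHT
print(states[:3])   # the first few states
```

Identifying a state by its coordinates is just one convenient choice; any unique label per square would work equally well.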
State
A unique, self-contained stage in the environment that defines the current situation. Each state is independent of previous states, so you don’t need to know or remember what happened before.
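One way to picture this independence is a transition function that takes only the current state and the chosen action, with no history argument at all. The sketch below assumes a hypothetical 3x3 grid and made-up action names (`N`, `S`, `E`, `W`); `next_state` is an illustrative helper, not part of any library.

```python
# Hypothetical 3x3 grid and action set, assumed for illustration only.
WIDTH, HEIGHT = 3, 3
MOVES = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}

def next_state(state, action):
    """Return the state reached from `state` by taking `action`.

    The result depends only on the current state and action,
    never on how the current state was reached.
    """
    x, y = state
    dx, dy = MOVES[action]
    nx, ny = x + dx, y + dy
    # Moves that would leave the grid keep the agent where it is.
    if 0 <= nx < WIDTH and 0 <= ny < HEIGHT:
        return (nx, ny)
    return state

# Two different histories that both arrive at (1, 1)...
via_east_then_south = next_state(next_state((0, 0), "E"), "S")
via_south_then_east = next_state(next_state((0, 0), "S"), "E")

# ...lead to the same outcome for the same next action.
print(next_state(via_east_then_south, "E") == next_state(via_south_then_east, "E"))
```

Because the function signature has no room for history, the path taken to reach a square cannot influence what happens next.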
Episode
One complete run of the environment, from its initial state through to a terminal state.
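A single episode can be sketched as a loop that starts in an initial state and keeps stepping until a terminal state is reached. Everything named here is an assumption for illustration: the 3x3 grid, the `START` and `EXIT` squares, the fixed `policy`, and the simplified `step` function are hypothetical and do not reflect the Baby Robot environment's actual API.

```python
# Hypothetical 3x3 grid with an assumed start and exit square.
WIDTH, HEIGHT = 3, 3
START, EXIT = (0, 0), (2, 2)

def policy(state):
    """Illustrative fixed policy: head east to the edge, then south."""
    x, _ = state
    return "E" if x < WIDTH - 1 else "S"

def step(state, action):
    """Apply an action; only "E" and "S" are needed for this sketch."""
    x, y = state
    if action == "E":
        return (min(x + 1, WIDTH - 1), y)
    return (x, min(y + 1, HEIGHT - 1))

def run_episode():
    """Run one episode and return the sequence of states visited."""
    state = START
    trajectory = [state]
    while state != EXIT:  # the episode ends at the terminal state
        state = step(state, policy(state))
        trajectory.append(state)
    return trajectory

episode = run_episode()
print(episode)
```

Running `run_episode()` again would start afresh from `START`, which is what makes each episode a separate, complete run.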