You learn from mistakes!

Reinforcement Learning — The Genesis of Intelligence

Rohan Saha
Published in Samur.AI
3 min read · Feb 12, 2019


Photo by SpaceX on Unsplash

I hope you are having a wonderful and productive week. Let’s end it with another shot of a cool machine learning concept. This time, the article is about the much-awaited topic of reinforcement learning. Don’t worry if you have never heard the term, or if you came across it but didn’t have time to learn more. In this article, you will learn the basic terminology of reinforcement learning and walk through an example that illustrates the concept.

Concept:
Reinforcement Learning is a common buzzword nowadays in the domain of machine learning.

Wikipedia defines it as follows:

Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.

The above definition simply means that you design a system whose algorithm learns by itself, guided by some kind of reward mechanism. If the system carries out an action with a good result, it receives a positive score, and if it carries out a harmful action, it receives a negative score. Let’s look at an example.

Imagine you are stuck in a maze. To your dismay, there is a monster hidden in it. Your job is to get out of the maze safely without being eaten by the monster. How should you accomplish this? Having no knowledge of the maze, you can depend only on your intuition and instincts. Here’s what you do. You try to escape by following various routes and checking whether each one leads to a new place. If it does, you get a positive reward (new territory unlocked) and you follow the new path. If you hit a wall or come back to the same place, you receive a negative reward, and it’s time to choose a different path.

Mazes are not a special case, since many games you can think of can be framed as reinforcement learning problems. You may realize that this is a trial-and-error procedure, and reinforcement learning is all about trial and error. Over time, however, the algorithm learns from its previous experience and gets better at choosing the actions that lead to rewards.
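To make the maze story concrete, here is a minimal sketch in Python. The article does not name a specific algorithm, so the sketch uses tabular Q-learning as one common choice. The maze layout, the reward values (including a small cost per step in place of a bonus for new territory), and the settings `alpha`, `gamma`, and `epsilon` are all assumptions made up for illustration.

```python
import random

# Hypothetical 4x4 maze: S = start, E = exit, # = wall, M = monster, . = free cell.
MAZE = [
    "S.#.",
    ".##.",
    "...M",
    "#..E",
]
START = (0, 0)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Environment: apply one move and return (next_state, reward, done)."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    inside = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
    if not inside or MAZE[new_row][new_col] == "#":
        return state, -1.0, False              # hit a wall: stay put, negative reward
    if MAZE[new_row][new_col] == "M":
        return (new_row, new_col), -10.0, True  # eaten by the monster
    if MAZE[new_row][new_col] == "E":
        return (new_row, new_col), 10.0, True   # escaped the maze
    return (new_row, new_col), -0.1, False      # ordinary move: small step cost


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn action scores by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated long-term score, defaults to 0.0
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the length of one attempt
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # occasionally try something random
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS
            )
            old = q.get((state, action), 0.0)
            # Nudge the old estimate toward reward + discounted best future score.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the best-scoring action from the start and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Early attempts bump into walls and sometimes meet the monster. As the scores in `q` are updated, the route returned by `greedy_path` should settle on one that reaches the exit.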

See how easy it was to understand reinforcement learning? If you have got the hang of the concept, let’s move on to the definitions.

But before that, you must know the two most important components of Reinforcement Learning.

1. Agent
2. Environment

They are explained below, along with the other key terms:

Terminology:

1. Agent — The RL algorithm that learns by trial and error and takes actions.

2. Environment — The world in which the agent moves. In response to each action, it returns the next state and a reward to the agent.

3. Action (A) — The set of all possible moves that the agent can make.

4. State (S) — The current situation of the agent in the environment, for example its position in the maze.

5. Reward (R) — The feedback returned by the environment to score the last action taken by the agent.

6. Policy (π) — The set of rules or strategy that the agent uses to choose its next action based on the current state.

7. Value (V) — The expected long-term return from a state, with future rewards discounted, as opposed to the short-term reward R.

8. Action-Value (Q) — Similar to ‘Value’, but it takes an extra parameter, which is the current action (A). In other words, ‘Q’ depends on the action as well as the state.
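The last three terms are easier to see with numbers. The sketch below is a minimal illustration, not a full algorithm: the reward sequence, the discount factor `gamma`, and the small table `q_values` are invented for the example. It shows how a discounted return is computed, and how a greedy policy and the value of a state can be read off a table of action-values.

```python
def discounted_return(rewards, gamma=0.9):
    """Long-term return: a reward k steps ahead is weighted by gamma**k."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def greedy_policy(q_for_state):
    """Policy: pick the action with the highest action-value in this state."""
    return max(q_for_state, key=q_for_state.get)


def state_value(q_for_state):
    """Value of a state under the greedy policy: the best available action-value."""
    return max(q_for_state.values())


# Hypothetical action-values Q(s, a) for a single state s.
q_values = {"left": 1.0, "right": 4.0, "up": -2.0}

if __name__ == "__main__":
    # 1 + 0.5 * 1 + 0.25 * 10 = 4.0
    print(discounted_return([1, 1, 10], gamma=0.5))
    print(greedy_policy(q_values))
    print(state_value(q_values))
```

Note that the immediate reward in the example is only 1, while the discounted return is 4.0 because of the larger reward two steps later. This is the difference between the short-term reward R and the long-term Value.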

Please make sure that you understand the terms mentioned above clearly because you will see them being used in various articles on reinforcement learning.

The next few articles will focus on more advanced topics in machine learning. If you like this article, consider buying me a coffee :)


I write byte-sized articles on machine learning and how to survive academia.