The Absolute Basics of Reinforcement Learning

Mansi Katarey
Published in Analytics Vidhya
4 min read · Nov 23, 2020


Reinforcement Learning

Reinforcement learning. What is it and what does it do? In this article, you’ll get a basic rundown of what reinforcement learning is.

First, let’s start with a basic definition:

Reinforcement learning is an area of machine learning.

It involves software agents learning to navigate an uncertain environment to maximize reward. An agent learns from interactive experience, using feedback from its own actions. Basically, the bot gets points for its actions, and it can gain or lose them. The way agents learn through RL is similar to the way we, as humans, learn by trial and error.

Think of it like a video game where you get punished or rewarded for your actions. In most video games you get rewarded by gaining more points or moving on to the next level and you get punished by losing a life or dying.

Inside the RL algorithm

We want to get the agent to learn for itself.

There are three basic elements of the reinforcement learning algorithm:

First, we’ve got the environment the agent is in. The environment gives feedback to the agent on whether what it did was right or wrong. In other words, the environment tells the agent if the action it took resulted in a reward or a punishment.

Next, we’ve got the agent. The agent is the one choosing the actions it takes.

And finally, we’ve got the reward. The reward is what the agent is aiming for. The agent’s incentive.

How the RL Algorithm learns

Now if we go back to our video game example, the environment would be the game screen that you see, the agent would be you as you’re the one making the decisions and playing the game, and the reward would be more points or moving on to the next level.
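To make the three elements concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration (the `TinyGame` class, the five-cell layout, the `random_agent` function); it is not a real game or library. The environment hands back a reward after every action, the agent picks actions, and the loop adds up the points.

```python
import random

class TinyGame:
    """Hypothetical environment: a row of 5 cells, numbered 0 to 4.

    The player starts in the middle (cell 2). Reaching cell 4 is like
    finishing the level (+1 point); reaching cell 0 is like losing a
    life (-1 point). Every other move gives 0 points.
    """

    def reset(self):
        self.position = 2
        return self.position

    def step(self, action):
        # action 1 moves right, anything else moves left
        self.position += 1 if action == 1 else -1
        if self.position == 4:
            return self.position, 1, True    # reward, game over
        if self.position == 0:
            return self.position, -1, True   # punishment, game over
        return self.position, 0, False       # nothing happened yet


def random_agent(state):
    """An agent that has not learned anything: it picks moves at random."""
    return random.choice([0, 1])


def play_episode(env, agent, max_steps=100):
    """Run one game: the agent acts, the environment answers with a reward."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = agent(state)                   # the agent chooses
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward
        if done:
            break
    return total_reward


score = play_episode(TinyGame(), random_agent)
print("Points earned by the random agent:", score)
```

The random agent here never improves. A learning agent would use the rewards it collects to change how it picks actions.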

So how does it compare to other machine learning techniques?

There are three basic machine learning techniques: supervised learning, unsupervised learning and, of course, reinforcement learning.

The main difference between each of these techniques is the goal.

The goal of unsupervised learning is to find similarities and differences between data points that have no labels, while the goal of supervised learning is to learn from labeled examples so that it can predict the right label for new data. And of course, as we know, the goal of reinforcement learning is to get the maximum reward.
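One common way to write "maximum reward" down, which this article does not spell out, is as the total reward collected over time. The symbols below are assumptions for illustration: r is the reward received at each step, and γ (a "discount factor") controls how much future points count compared to immediate ones.

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}, \qquad 0 \le \gamma < 1
```

Under this formulation, the agent tries to choose actions that make the expected value of G_t as large as possible.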

RL vs. other ML techniques

Where is RL the most useful?

Reinforcement learning techniques are particularly useful where there are many unknowns, since they don’t require lots of pre-existing knowledge or data to provide useful solutions.

Where is it being used today?

Currently, RL is being used in areas like robotics, air traffic control, data processing, the creation of training systems and more! The potential applications of RL are broad. Google’s DeepMind team has used RL to get an agent to learn to recognize digits and to play Atari games all on its own!

This is a video of Google’s DeepMind algorithm playing Atari.

Challenges of RL

Any new technology comes with its fair share of challenges, and it’s no different for RL. One of the biggest problems with RL is trying to use it on a big scale. It requires a lot of training time and a huge number of iterations to learn tasks.

RL learns by trial and error, and doing this in the real world is very difficult. Let’s take the example of an agent trying to navigate through an environment while avoiding people. The agent would try different actions and then proceed with the one that works best in that environment. This becomes hard to do in the real world, where the environment is constantly changing and mistakes made along the way can be costly.
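To get a feel for trial-and-error learning and why it needs many iterations, here is a small sketch using Q-learning. The article does not name a specific algorithm, so Q-learning is swapped in purely as one well-known example. The five-cell corridor, the `step` and `train` functions, and the settings (`alpha`, `gamma`, `epsilon`, 500 episodes) are all assumptions chosen for illustration.

```python
import random

N_CELLS = 5       # hypothetical corridor: cells 0..4
START = 2         # the agent starts in the middle
ACTIONS = [0, 1]  # 0 = move left, 1 = move right


def step(state, action):
    """Environment: +1 for reaching cell 4, -1 for reaching cell 0."""
    next_state = state + (1 if action == 1 else -1)
    if next_state == N_CELLS - 1:
        return next_state, 1.0, True
    if next_state == 0:
        return next_state, -1.0, True
    return next_state, 0.0, False


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: q[state][action] estimates future reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]
    for _ in range(episodes):
        state, done = START, False
        while not done:
            if rng.random() < epsilon:
                # trial: occasionally try a random action
                action = rng.choice(ACTIONS)
            else:
                # otherwise use the best-known action (ties broken randomly)
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # error: nudge the estimate toward what actually happened
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
for s in range(1, N_CELLS - 1):
    print(f"cell {s}: value of left = {q_table[s][0]:.2f}, "
          f"value of right = {q_table[s][1]:.2f}")
```

Even in this tiny world the agent plays hundreds of games before its estimates settle. In a real, constantly changing environment, each of those trials costs time and can have consequences, which is the scaling problem described above.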


