Reinforcement learning with AWS DeepRacer

Published in

hackgenius

4 min read · Jul 27, 2022

In this blog we will look at reinforcement learning and how it connects to AWS DeepRacer.

Reinforcement learning

Reinforcement learning is a machine learning training method based on rewarding desired behaviors and punishing undesired ones. In general, a reinforcement learning agent perceives and interprets its environment, takes actions, and learns through trial and error.

There are three things to keep in mind when doing reinforcement learning:

Action Space
Environment
Agent

Action Space:

In reinforcement learning, the set of all valid actions, or choices, available to an agent as it interacts with an environment is called an action space. In the AWS DeepRacer console, you can train agents in either a discrete or continuous action space.

In simple words, the action space is the set of actions the agent can perform in the environment. In DeepRacer, moving forward and turning left or right are some of the actions the car can take.
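To make the idea concrete, here is a minimal sketch of how a discrete action space could be written down in Python. The steering angles, speeds, and names such as `build_discrete_action_space` are assumptions chosen for illustration, not values taken from the DeepRacer console.

```python
from itertools import product

# Assumed example values; the real console lets you choose your own.
STEERING_ANGLES = [-30.0, -15.0, 0.0, 15.0, 30.0]  # degrees, negative = right, positive = left
SPEEDS = [1.0, 2.0]  # metres per second


def build_discrete_action_space(steering_angles, speeds):
    """Return every (steering angle, speed) combination as a list of dicts."""
    return [
        {"steering_angle": angle, "speed": speed}
        for angle, speed in product(steering_angles, speeds)
    ]


# Discrete: the agent picks one entry from a fixed list.
ACTION_SPACE = build_discrete_action_space(STEERING_ANGLES, SPEEDS)

# Continuous: the agent picks any value inside a range (hypothetical layout).
CONTINUOUS_ACTION_SPACE = {
    "steering_angle": {"low": -30.0, "high": 30.0},
    "speed": {"low": 0.5, "high": 2.0},
}

if __name__ == "__main__":
    for index, action in enumerate(ACTION_SPACE):
        print(index, action)
```

With five steering angles and two speeds, the discrete list holds ten actions. A smaller list is usually quicker to learn, while a larger one gives the car finer control.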

Environment:

The goal of Reinforcement Learning (RL) is to design agents that learn by interacting with an environment. In the standard RL setting, the agent receives an observation at every time step and chooses an action. The action is applied to the environment and the environment returns a reward and a new observation.

The environment is simply the place or surroundings in which the agent acts. The agent's actions are encouraged or discouraged by the rewards that the reward function provides.

Agent:

The goal of reinforcement learning is to train an agent to complete a task within an uncertain environment. At each time interval, the agent receives observations and a reward from the environment and sends an action to the environment.

The agent is the object that acts on the environment based on the training it has received.
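The way the agent, the environment, and the actions fit together can be sketched as a small loop. This is a toy example and not the DeepRacer simulator: `ToyTrackEnvironment`, `RandomAgent`, and `run_episode` are hypothetical names, and the "track" is just a number that says how far the car is from the center line.

```python
import random


class ToyTrackEnvironment:
    """Toy stand-in for a track: the state is the car's offset from the center line."""

    def __init__(self, half_width=3, episode_length=20):
        self.half_width = half_width          # offsets beyond this count as off track
        self.episode_length = episode_length  # maximum number of steps per episode
        self.offset = 0
        self.steps = 0

    def reset(self):
        """Put the car back on the center line and return the first observation."""
        self.offset = 0
        self.steps = 0
        return self.offset

    def step(self, action):
        """Apply an action (-1 = right, 0 = straight, +1 = left)."""
        self.offset += action
        self.steps += 1
        off_track = abs(self.offset) > self.half_width
        # Reward is highest on the center line and zero once the car leaves the track.
        reward = 0.0 if off_track else 1.0 - abs(self.offset) / (self.half_width + 1)
        done = off_track or self.steps >= self.episode_length
        return self.offset, reward, done


class RandomAgent:
    """An untrained agent that picks actions at random."""

    def __init__(self, actions, seed=0):
        self.actions = actions
        self.rng = random.Random(seed)

    def act(self, observation):
        return self.rng.choice(self.actions)


def run_episode(env, agent):
    """Run one episode: observe, act, receive a reward, repeat until done."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = agent.act(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    print(run_episode(ToyTrackEnvironment(), RandomAgent([-1, 0, 1])))
```

Training means replacing the random choice in `act` with a policy that improves as rewards come in.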

Real world comparison

A good example of reinforcement learning is training a dog. While training a dog, we try to teach it some actions. The dog then tries to learn and repeat what we taught it. If the dog performs the action correctly, we give it a reward.

The dog now understands: this is the correct action in this situation, and that is why I got a reward. So it tries to perform the action correctly to earn more rewards.

If the dog performs the wrong action, we reduce the reward or give no reward at all.

This real-world example closely matches reinforcement learning. We train our agent to act on the environment, and the agent tries to learn and repeat the steps. If it does them correctly we give it a reward, otherwise we reduce the reward. The agent then tries to take the proper action in each situation to earn more rewards.

But in some scenarios the right action changes, so our agent may sometimes fail to take the appropriate action. In that case we reduce the reward, and the agent tries the other actions available to it to earn more reward.
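The trial-and-error idea in the dog example can be sketched in a few lines of Python. This uses a simple epsilon-greedy rule purely as an illustration; it is not the algorithm DeepRacer trains with. The action names, the `dog_reward` function, and the parameter values are all made up for the example.

```python
import random


def learn_by_trial_and_error(reward_for_action, actions, episodes=500, epsilon=0.3, seed=0):
    """Estimate the average reward of each action by trying actions repeatedly."""
    rng = random.Random(seed)
    value = {action: 0.0 for action in actions}  # running average reward per action
    count = {action: 0 for action in actions}    # how often each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: try a random action, even one that looked bad so far.
            action = rng.choice(actions)
        else:
            # Exploit: repeat the action with the best average reward so far.
            action = max(actions, key=lambda a: value[a])
        reward = reward_for_action(action)
        count[action] += 1
        # Incremental update of the running average.
        value[action] += (reward - value[action]) / count[action]
    return value


def dog_reward(action):
    """Hypothetical trainer: a biscuit (1.0) for 'sit', nothing for anything else."""
    return 1.0 if action == "sit" else 0.0


values = learn_by_trial_and_error(dog_reward, ["bark", "sit", "roll"])
best_action = max(values, key=values.get)

if __name__ == "__main__":
    print(values, best_action)
```

The occasional random choice is what lets the learner discover that "sit" pays off. After that, it mostly repeats the rewarded action.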

AWS DeepRacer

In AWS DeepRacer, the agent is our DeepRacer car and the environment is the track and its surroundings. The car needs to act on the environment: it needs to move forward, turn right, and turn left.

The DeepRacer car tries to act in the environment based on our instructions, such as staying inside the track borders, following the center line, and avoiding zig-zag movements.

It then learns to drive properly to earn more rewards.

Reward Function

Throughout this blog we have seen the word REWARD. For the dog it is a biscuit; in DeepRacer we give it to the car as points.

In practice, we write a Python snippet called the reward function, which gives or reduces rewards based on the actions taken by the car.
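As a rough sketch, a reward function covering the three instructions mentioned earlier (stay inside the borders, follow the center line, avoid zig-zag movements) might look like this. The `params` keys are the DeepRacer input parameters as I understand them from the AWS documentation. The distance bands, the steering threshold, and the reward values are assumptions to tune for your own track.

```python
def reward_function(params):
    """Reward staying on the track, near the center line, without sharp steering."""
    # Input parameters supplied by DeepRacer at every step.
    all_wheels_on_track = params["all_wheels_on_track"]    # bool
    track_width = params["track_width"]                    # metres
    distance_from_center = params["distance_from_center"]  # metres
    steering = abs(params["steering_angle"])               # degrees, sign dropped

    # Stay inside the border: almost no reward once a wheel leaves the track.
    if not all_wheels_on_track:
        return 1e-3

    # Follow the center line: the closer to the center, the bigger the reward.
    # The band widths and reward values below are assumed, not official.
    if distance_from_center <= 0.1 * track_width:
        reward = 1.0
    elif distance_from_center <= 0.25 * track_width:
        reward = 0.5
    elif distance_from_center <= 0.5 * track_width:
        reward = 0.1
    else:
        reward = 1e-3

    # Avoid zig-zag: reduce the reward when the car steers too sharply.
    steering_threshold = 15.0  # degrees, assumed value
    if steering > steering_threshold:
        reward *= 0.8

    return float(reward)
```

For example, a car centered on a 1 m wide track and steering straight would receive 1.0, while the same car steering at 20 degrees would receive 0.8.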

Previous blog: https://medium.com/hackgenius/aws-deepracer-by-prijesh-6fb0b1700c8b

Written by Guhan prijesh