My Experience with DeepRacer: A Headstart and My Journey

Akshaya Varshieni
Published in kgxperience · 6 min read · Sep 11, 2023

DeepRacer πŸš— is an AWS event that is conducted both virtually and physically on a 🏁 track. The main motive is to learn machine learning, and specifically reinforcement learning, in a fun way where we get to enjoy the 😊 process.

DeepRacer uses reinforcement learning to enable autonomous driving for the AWS DeepRacer vehicle.

AWS DeepRacer

Reinforcement Learning (RL) is the 🧠 science of decision-making. It is about learning the πŸ’‘ optimal behavior in an environment to obtain maximum πŸ† reward.

RL doesn’t need any 🏷️ labeled data; the agent learns by interacting with its environment, identifying which actions lead to rewards, and using that experience to make better decisions.

Terminologies you need to know for sure about RL:

πŸ’‘ Agent β€” Our DeepRacer vehicle is the πŸš— agent that is going to perform all the 🏎️ actions.

🏎️ Environment β€” The place the agent interacts with; here it is our 🏁 race track.

πŸ›£οΈ State β€” The position the agent is currently in refers to the state of the agent.

🚦 Action β€” The decisions that result in a movement made by our agent.

πŸ”± Policy β€” The strategy that maps the agent’s current state to its next action. The policy gets updated as the agent learns from each move.

πŸ† Reward β€” They are points awarded to our agent based on the decisions they make.

πŸ“ˆ Value β€” An estimate of how good a state or action is, in terms of the reward the agent can expect to collect from it.
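To see how these πŸ”± terms fit together, here’s a minimal Python sketch of the agent-environment loop. It is purely illustrative: the toy track, states, and rewards are made up, and this is not DeepRacer’s actual API.

```python
import random

class ToyTrack:
    """A toy environment: the agent's state is its position on a 1-D track."""

    def __init__(self, length=10):
        self.length = length
        self.position = 0  # starting state

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        self.position += action            # the action moves the agent
        reward = 1 if action > 0 else -1   # reward progress, penalize standing still
        done = self.position >= self.length
        return self.position, reward, done

def policy(state):
    """A toy policy: decide the next action given the current state."""
    return random.choice([0, 1])  # 0 = stay, 1 = move forward

env = ToyTrack()        # environment (our "race track")
state, total_reward = 0, 0
done = False

while not done:
    action = policy(state)                  # the agent picks an action from its policy
    state, reward, done = env.step(action)  # the environment returns a new state and reward
    total_reward += reward                  # rewards accumulated over the episode

print("Episode finished with total reward:", total_reward)
```

In real RL, the policy would be updated after each episode so that actions leading to higher reward become more likely; here it stays random to keep the sketch short.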

DeepRacer works based on the βš–οΈ exploration-exploitation dilemma, a 🧠 fundamental concept in reinforcement learning: the choice between πŸ”± exploiting the current best-known policy and exploring new behaviour to improve πŸ† performance.

β†’πŸ’‘ Exploration is the process of trying new things and learning about the 🏎️ environment. This can be done by taking actions that the model has not taken πŸ”™ before or by exploring different parts of the 🏁 environment.

β†’πŸ”± Exploitation is the process of using the knowledge that the model has learned to make decisions that are likely to be πŸ† successful. This can be done by taking actions that have been successful in the πŸ”™ past or by taking actions that are predicted to be successful.
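A common way to balance the two is an epsilon-greedy rule: with a small probability the agent explores a random action, otherwise it exploits the best-known one. This is a generic RL sketch with made-up action values, not how DeepRacer exposes the trade-off to you (its training algorithms handle it internally).

```python
import random

EPSILON = 0.1  # explore 10% of the time

# Hypothetical estimated values for three steering actions (invented numbers).
action_values = {"left": 0.2, "straight": 0.8, "right": 0.4}

def choose_action(values, epsilon=EPSILON):
    if random.random() < epsilon:
        # Exploration: try something new, regardless of current estimates.
        return random.choice(list(values))
    # Exploitation: pick the action with the highest estimated value so far.
    return max(values, key=values.get)

print(choose_action(action_values))  # usually "straight", occasionally a random pick
```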

Now let’s dive into training your own πŸš— DeepRacer model, which is similar to πŸ‘©β€πŸ³ cooking: a πŸ’‘ creative process that involves trial and error, πŸ§ͺ experimentation, and a deep understanding of the 🧠 underlying principles.

Let’s πŸ₯˜ compare DeepRacer and cooking, and πŸ‘©β€πŸ³ cook our DeepRacer dish together.

The DeepRacer Dish

🍳 First, you need a pan to start your dish, which is your DeepRacer console πŸ’».

πŸ₯˜ The primary ingredients required for our DeepRacer dish are:

β†’ Agent β€” 1 πŸš—

→️ Model β€” 1 🏎️

β†’ Track β€” 1 🏁

β†’ Hyperparameters as required βš™οΈ

πŸ‘©β€πŸ³ Let’s start cooking, and here are a few steps that describe what you need to do, I’ll explain what they mean after we’re done cooking.

1️⃣ Step 1: Start your lab and enter the DeepRacer πŸ’» console.

2️⃣ Step 2: Design your vehicle with the required cameras and sensors according to your physical πŸš— model.

Note: Your physical model won’t work with a virtual model whose camera and sensor configuration doesn’t match it.

3️⃣ Step 3: Create your model πŸš—πŸ.
You need to build a model and train it before racing on the track.

To cook the curry base of your DeepRacer model, you need to select the track type 🏁, race type πŸ†, algorithm πŸ€–, hyperparameters πŸ”€, action space 🚦, agent 🏎️, reward function πŸ†, and training period ⏳.

4️⃣ Step 4: Let the model cook 🍳, i.e., train it, and we’ll taste it before serving ⏳πŸ₯˜.

My DeepRacer car, Supra

πŸ₯˜ Now let’s take a look at our curry base πŸ₯˜ ingredients:

πŸš— Once you have entered your console, you can start building your DeepRacer model by choosing the ingredients that you need. These ingredients include:

  • Track type: 🏁 The track type is the environment in which the πŸš— agent will learn to race. There is a wide range of 🏁 track types provided by the Deepracer platform and you can choose accordingly as per your πŸ₯˜ needs.
  • Race type: πŸ† The race type determines the goal of the πŸš— training.
    β€’ Time trial🏁: The goal of a time trial is to get the πŸ₯‡ fastest lap time possible. This is a good way to learn the basics of racing and to get a feel for how the πŸ’» DeepRacer platform works.
    β€’ Object avoidance🏁: The goal of an object avoidance race is to avoid obstacles while driving as fast as possible. This is a more challenging race type, but it can help you improve your agent’s ability to make decisions in difficult situations.
    β€’ Head-to-head racing🏁: The goal of a head-to-head race is to race against other agents and finish πŸ₯‡ first. This is the most challenging race type, but it can be a lot of fun.
  • Training Algorithm: πŸ€– The two training algorithms available in DeepRacer are Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). Both algorithms work by learning a policy that maps states to actions. However, PPO is a πŸ”± policy gradient algorithm, while SAC is a πŸ”± value-based algorithm.
  • Hyperparameters: βš™οΈ The hyperparameters are the settings that control the training process. You can adjust these settings to improve the performance of your model.
  • Action space: 🚦 The action space is the set of all possible actions that the agent can take. You can choose from a variety of action spaces, such as continuous or discrete.
  • Reward function: πŸ† The reward function is the metric that the agent will use to evaluate its performance. You can choose from a variety of reward functions, such as the lap time or the number of obstacles avoided.
  • Training period: ⏳ The training period is the amount of ⏱️ time that the agent will train. You can adjust this setting to improve the performance of your model

I guess our model is well cooked; let’s taste our freshly cooked model.

Evaluating the model πŸ₯˜ means measuring its performance πŸ†, which tells us how well the model is cooked πŸ₯˜. I’ll continue with evaluation in the next part.

My Experience with DeepRacer

My journey with DeepRacer πŸš— was a lot of fun 🀩. It started with the DeepRacer League conducted by KGiSL EDU, continued through AWS virtual community races 🏁, and most recently a workshop conducted by Thoughtworks, Coimbatore.

Tbh, it wasn’t super easy or fun in the beginning πŸ˜…, but eventually it got better and turned out well in the end πŸ‘. Now that we know what DeepRacer is, I’ll share the mistakes I made and what I learned over time and training πŸ“š.

DeepRacer πŸš— is among the few things I wouldn’t regret trying 🀩.
At the start, I just went to the console πŸ’» and trained the model randomly by adjusting the hyperparameters βš™οΈ and a few other requirements πŸ“, but I never touched the reward function πŸ†. I know that’s not the right way to practice πŸ˜…, but I learned from my mistake πŸ₯² and grew πŸ“ˆ.

I learned a lot about reinforcement learning πŸ”± and autonomous driving 🏎️ through this journey πŸ›£οΈ. I also had a lot of fun racing against other models 🏁. If you’re interested in machine learning πŸ€– or autonomous driving 🏎️, I highly recommend trying out DeepRacer πŸš—.

Don’t be afraid to experiment. There is no one right way to train a DeepRacer model. The best way to learn is by trying different things.

Let’s connect socially:

➑️https://www.linkedin.com/in/akshayavarshieni/

➑️https://github.com/AkshayaVarshieni14

➑️https://www.instagram.com/akshayavarshieni/
