AWS DeepRacer Montreal: A Quick Recap

Humza · TickSmith · Jan 20, 2020
Team TickSmith holding our best performing model.

With the advent of autonomous vehicles, the future holds opportunities and applications we have yet to unravel. Given how young the field is, and how many factors must be addressed to achieve any real level of autonomy on the roads, making it happen demands not only a great deal of effort but an equal amount of knowledge.

For beginners, that might seem daunting. Fortunately, there are organizations trying to put tools like this into the hands of the masses to accelerate R&D in the field. AWS DeepRacer is one such product from Amazon Web Services: an integrated learning system for users of all levels to learn and explore reinforcement learning and to experiment with building autonomous driving applications. After my own experience with it, I believe it can be a good starting point for any ambitious soul looking to explore the realm of autonomous driving. To further encourage competition, AWS runs a global tournament, the AWS DeepRacer League, where the models you train on their platform compete against others with one objective: the shortest lap time.

AWS DeepRacer recently came to Montreal, and my team and I got first-class, hands-on experience with the platform. As part of the tournament, we had to train our autonomous models on the AWS DeepRacer platform and bring them along on race day to witness the manifestation of our work in the physical world: the models we brought would be loaded into RC cars provided by AWS and run on a physical track. Watching your models compete in a physical environment was an exciting prospect, but the realization that a model could end up all over the place, driving like it was dodging non-existent bullets, was motivation enough to train and bring only the good ones; which, I must add, was an exhilarating process in its own right. So, armed with nothing but our ambition to win, a pledge to put no models-under-influence on the track, and little to no prior exposure to reinforcement learning, we were good to go and claim our rightful place as champions of the AWS DeepRacer League. At least, that was the plan.

The process of training a model is fairly simple. You have to specify values for some predefined parameters, such as the ones below (a sketch of how they combine follows the list):

Maximum Steering Angle: How sharply the car can steer.

Steering Angle Granularity: The number of maneuvers the car can make while staying within the specified angle.

Maximum Speed: The maximum speed it can achieve.

Speed Granularity: The number of increments the speed range is divided into.
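To make these four knobs concrete, here is a minimal sketch of how they expand into the car's discrete action space, assuming evenly spaced steering angles and speeds. The helper build_action_space below is illustrative, not part of the DeepRacer API:

def build_action_space(max_steering_angle, steering_granularity,
                       max_speed, speed_granularity):
    # Steering angles are spread symmetrically from -max to +max.
    if steering_granularity > 1:
        step = 2 * max_steering_angle / (steering_granularity - 1)
        angles = [-max_steering_angle + i * step
                  for i in range(steering_granularity)]
    else:
        angles = [0.0]
    # Speeds are evenly spaced steps up to the maximum.
    speeds = [max_speed * (i + 1) / speed_granularity
              for i in range(speed_granularity)]
    # Every (steering angle, speed) pair becomes one discrete action.
    return [(angle, speed) for angle in angles for speed in speeds]

# 5 steering angles x 2 speeds = 10 discrete actions.
print(build_action_space(30, 5, 3.0, 2))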

These are the main factors that determine how the car behaves on the track, but first it needs to learn to drive itself. That brings us to the most important piece of all: the reward function.

def reward_function(params):
    '''
    Example of rewarding the agent to follow the center line
    '''

    # Read input parameters
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    # Calculate 3 markers that are at varying distances away from the center line
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    # Give higher reward if the car is closer to center line and vice versa
    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely crashed / close to off track

    return float(reward)
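Before uploading a reward function, it can help to sanity-check it locally with a hand-built params dictionary. A quick, hypothetical check of the default function:

# Hypothetical local sanity check: feed the function a fake params dict.
sample_params = {'track_width': 1.0, 'distance_from_center': 0.05}
assert reward_function(sample_params) == 1.0  # within marker_1, full reward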

During training, the reward function tells the model exactly how well it is doing, and the model uses this feedback to adapt its behavior in the virtual environment, learning to go fast while staying on the track. The function above is the default one: it yields a reward based on the model's position, so the closer the car is to the center of the track, the more reward it earns. This function, combined with the parameters above, determines how your model will behave on the track.
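The same mechanism can emphasize other behaviors. As a minimal sketch of a speed-oriented reward (the kind of thing I was chasing, as you will see below), assuming the standard DeepRacer input keys 'speed' and 'all_wheels_on_track':

def speed_reward_function(params):
    '''
    Sketch: reward raw speed, but only while all wheels stay on the track.
    '''
    if params['all_wheels_on_track']:
        reward = params['speed']  # current speed in m/s
    else:
        reward = 1e-3  # heavy penalty for leaving the track
    return float(reward)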

The objectives were clear, with eyes on the prize! Each team member trained a model targeting a specific aspect of the race: I was training for speed, one of my teammates was training for stability, and so on. I personally trained around 30 models for this race; some of them evoked untold reminiscences of Le Mans, while others trained themselves specifically for headfirst collisions. Talk about intelligent machines. The anecdotes from training these models deserve an article of their own, but they were enthralling, to say the least. After a lot of experimentation, ruthlessly disposing of some models, and trying every promising combination of the parameters, we finally had something worthy of the physical track:

Final evaluation results for the on-track model.

October 23, 2019, was our big day. Armed with nothing but our models on USB sticks and company t-shirts that were clearly too big for us (we ordered them at the last minute), we were there on the track! The race was designed so that we first had 20 minutes to test different models on the track and get a feel for how they performed in the physical environment, followed by 10 minutes of actual racing. Each team's best lap time during the 10-minute race was compared against the other teams' best laps.

While the other teams evaluated their models, we held on to second place very dearly. It wasn't first, but we were still in the race. The last run of the day, however, took our second spot by a mere 700 ms, dropping us to third; which felt as horrible as it sounds, but given that this was our first attempt at autonomous racing with no prior relevant experience, ending up in the top three bodes well for future events. Look forward to next year's race, with models a little older, a little wiser.
