What is the AWS DeepRacer League, and how did we win the French edition?
On April 2, 2019, the Amazon Web Services (AWS) Summit took place in Paris. AWS is Amazon’s cloud platform, used by millions of developers worldwide. It offers more than 90 services, covering computing, storage, networking, databases, data analysis, application services and more.
This year AWS created a brand new competition: the “AWS DeepRacer League”. In this article, we will first explain the concept of this contest and then we will analyse how our approach enabled us to win!
I. Concept of the AWS DeepRacer League
The AWS DeepRacer League is a worldwide competition for developers. The goal is to develop an algorithm that lets a small autonomous car drive around a track. Candidates have to use reinforcement learning algorithms (the concept is explained later) to train their artificial intelligence. The winner is whoever completes a full lap in the shortest time.
The races take place at AWS Summits: there are 20 of them during the year, held in some of the world’s largest cities, and each winner is invited by AWS to the world final in Las Vegas in 2019!
You can also take part in the virtual competitions, held once a month, and win your place in Las Vegas.
II. How does the training of the model work?
AWS provides two methods to train the algorithm:
- Use the dedicated service on AWS: the developer focuses on the reward function and tunes the deep neural network to train the model.
- Combine three services (SageMaker, RoboMaker and S3): you have access to a larger number of variables to train the model, but the platform is more difficult to handle and, from what we observed during the different summits, the results are often worse than with the first method.
In our case, we had to use the second method because the first service was not yet released.
To understand how training works, we first need a short detour into how reinforcement learning works (this article is not intended to be technical, so we will not go into too much detail).
How does reinforcement learning work?
There are three major areas of machine learning:
- Supervised Learning: the data is labelled, which means that each row has been annotated beforehand with the class it belongs to. The algorithm then tries to learn which patterns characterise each class. For example, we could teach a car to drive by controlling it with a remote, recording the remote’s inputs and training the model on this data paired with the images collected from a camera.
- Unsupervised Learning: the data is not labelled, which means that we try to learn patterns from the relationships within the data.
- Reinforcement Learning: the data is not labelled. An agent (the car) interacts with an environment (the track). Depending on the actions chosen by its policy, the car gets a reward. We then update the policy in order to maximize the reward.
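The agent, environment, reward and policy update loop can be illustrated on a toy problem. What follows is a minimal sketch under loud assumptions: the "track" is just a lateral offset from the centre line, the three actions and the reward values are invented for the illustration, and tabular Q-learning is swapped in for the neural network used by DeepRacer because it fits in a few lines. It is not the algorithm used for the race.

```python
import random

ACTIONS = (-1, 0, 1)   # hypothetical actions: steer left, go straight, steer right
STATES = (-1, 0, 1)    # hypothetical lateral offsets from the centre line


def step(offset, action):
    """Toy environment: return (new_offset, reward, done)."""
    new_offset = offset + action
    if abs(new_offset) >= 2:              # the car left the track
        return new_offset, -1.0, True
    reward = 1.0 if new_offset == 0 else 0.5
    return new_offset, reward, False


def train(episodes=500, steps_per_episode=20, alpha=0.5, gamma=0.5,
          epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        offset = rng.choice(STATES)
        for _ in range(steps_per_episode):
            # Policy: mostly greedy, sometimes random to keep exploring.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(offset, a)])
            new_offset, reward, done = step(offset, action)
            # Policy update: move the estimate towards reward + discounted future value.
            future = 0.0 if done else max(q[(new_offset, a)] for a in ACTIONS)
            q[(offset, action)] += alpha * (reward + gamma * future - q[(offset, action)])
            if done:
                break
            offset = new_offset
    return q


def greedy_action(q, offset):
    """Best known action for a given offset."""
    return max(ACTIONS, key=lambda a: q[(offset, a)])


q_table = train()
print({s: greedy_action(q_table, s) for s in STATES})
```

After training, the learned policy steers back towards the centre line from either side and goes straight once it is there, which is the behaviour the reward was designed to encourage.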
The car has a limited set of N actions (for example speed up, turn left slightly, turn right sharply, etc.). A Deep Neural Network (DNN) takes an image captured by the onboard camera as input and predicts which of these N actions is the most appropriate.
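To make the idea of "N actions" concrete, here is a minimal sketch of how one action could be picked from the network’s output. Everything in it is an assumption made for illustration: the five (steering, speed) pairs, the sign convention for steering, and the example scores, which stand in for what a trained network might output for one camera frame.

```python
import math

# Hypothetical discrete action space: (steering angle in degrees, speed in m/s).
# Negative steering means "left" here; the sign convention is arbitrary.
ACTION_SPACE = [
    (-30.0, 1.0),  # turn left sharply, slow
    (-15.0, 2.0),  # turn left slightly
    (0.0, 3.0),    # go straight, fast
    (15.0, 2.0),   # turn right slightly
    (30.0, 1.0),   # turn right sharply, slow
]


def softmax(scores):
    """Turn raw network scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(scores):
    """Return the most probable action and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return ACTION_SPACE[best], probabilities[best]


# Stand-in for the scores a network might output for one camera frame.
example_scores = [0.2, 1.5, 3.0, 0.7, -1.0]
action, probability = choose_action(example_scores)
print(action, round(probability, 2))
```

During training, the action is usually sampled from these probabilities rather than always taking the most probable one, so that the car keeps exploring.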
To train the network, the car makes a large number of attempts in a virtual simulator. After each attempt, we give the car a reward. If, for example, it stayed in the middle of the track and went far, it earns more points than if it went off the track straight away. Then, every k steps, we backpropagate through the DNN to update its weights.
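The "stay in the middle and go far" example can be written as a small reward function. This is a minimal sketch for illustration only, not the reward function used for the race. The dictionary keys are modelled on the input parameters described in the AWS DeepRacer documentation; treat them, and the weights chosen here, as assumptions to check against the version you use.

```python
def reward_function(params):
    """Score one simulation step; higher is better."""
    # Leaving the track earns almost nothing.
    if not params["all_wheels_on_track"]:
        return 1e-3

    # 1.0 on the centre line, falling linearly to 0.0 at the track edge.
    half_width = 0.5 * params["track_width"]
    centering = max(0.0, 1.0 - params["distance_from_center"] / half_width)

    # Small bonus for how much of the lap is already covered (0 to 100).
    progress_bonus = params["progress"] / 100.0

    # The 0.5 weight is an arbitrary choice for this sketch.
    return float(centering + 0.5 * progress_bonus)


# Hypothetical step: the car is centred and halfway around the lap.
centred = reward_function({
    "all_wheels_on_track": True,
    "track_width": 1.0,
    "distance_from_center": 0.0,
    "progress": 50.0,
})
print(centred)
```

A car hugging the edge of the track at the same point of the lap would score lower, and a car off the track would score close to zero.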
As you can see below, AWS provides a virtual environment to train the car. Having a virtual environment gives us access to a large number of variables: distance from the centre of the track, speed, steering angle, etc. Since we reward the car’s policy based on these variables, using a virtual environment is essential. On the right, you can see a plot showing how the reward evolves step by step. A full bar means the car has completed a full lap.
You can see below one of our laps during the Paris summit.
III. What we learned from the race
We noticed that some parameters have a big impact on the quality of the model:
- You have to find the right balance when choosing the training time of your model: if you train it for too long, it will overfit the track, and if you don’t train it enough, it won’t be robust enough.
- Don’t focus on speed but on the reliability of your model: if it behaves well on the track, you can speed the car up from the tablet.
- It’s all about the trajectory: working on angles improved our model a lot.
We will not reveal more details about our solution as we are currently working on a new algorithm for the final in Las Vegas in December 2019. 😉
I hope this article has helped you learn more about the DeepRacer League. If you have any questions, do not hesitate to contact Arthur (pacearthur.github.io) or myself (matrousseau.github.io) and we will be happy to answer.
Thank you! 😃