Results and Lessons: DeepRacer Student League April 2023
Introduction
This is the second story in my new series, where I detail my month-by-month progress in the DeepRacer Student League. You can find the first story here: Results and Lessons: DeepRacer Student League March 2023 | by Aleksander Berezowski | Apr, 2023 | Medium
The Track
This month’s track was “Cosmic Circuit”.
If you thought last month’s track was long, just wait until you see this one. It’s over 60 meters long, incredibly curvy, and builds on all the concepts of last month’s track.
Overall Results
During the April 2023 season, my school’s club and I ran several models, and the results are shown here:
The best models ended up being the waypoint models, although there was not as much variety this season due to finals being an absolute mess.
I think the waypoint models went well because they expanded on my idea from last month of overfitting to the track without running into catastrophic forgetting (which I learnt is also called catastrophic interference). I’ll detail the process I used for this model below, under Model-By-Model Results.
Further, I learnt that training all at once tends to work a lot better. This came from a combination of testing and talking to professional DeepRacer racers. I have no idea why this is, but my guess is that it takes time to set up and tear down the environment every time you clone a model, so you get more training time in the environment if you don’t iterate on your model (i.e. train for 30min, clone, retrain, etc.).
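To make that guess concrete, here is a back-of-the-envelope sketch in Python. The 5 minutes of overhead per session is a made-up number purely for illustration (I have not measured it), and the function name is hypothetical; the point is only that a fixed overhead per clone eats into a fixed training budget.

```python
def effective_training_minutes(total_minutes, sessions, overhead_per_session):
    """Minutes left for actual training when a fixed budget is split into sessions.

    overhead_per_session is an ASSUMED setup/teardown cost, not a measured one.
    """
    return max(0, total_minutes - sessions * overhead_per_session)


# Hypothetical numbers: a 350-minute budget with 5 minutes of overhead per session.
single_run = effective_training_minutes(350, sessions=1, overhead_per_session=5)
many_clones = effective_training_minutes(350, sessions=10, overhead_per_session=5)
print(single_run, many_clones)
```

Under these assumed numbers, ten short sessions lose ten times as much of the budget to overhead as one long run does. Whether that is the real reason training all at once works better is still just my guess.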
Model-By-Model Results
For the model-by-model results, I’m only going through the best model because it’s the only one really different from the other models shown so far, and did really well.
This model was incredibly “inspired” by these sources:
- An Advanced Guide to AWS DeepRacer | by Daniel Gonzalez | Towards Data Science
- https://github.com/cdthompson/deepracer-k1999-race-lines
- dgnzlz/Capstone_AWS_DeepRacer: Code that was used in the Article “An Advanced Guide to AWS DeepRacer” (github.com)
waypoints_reward_v3: 200.118s, trained for 350min
import math


# Helper functions

def dist_2_points(x1, x2, y1, y2):
    # Euclidean distance between (x1, y1) and (x2, y2)
    return abs(abs(x1 - x2) ** 2 + abs(y1 - y2) ** 2) ** 0.5


def closest_2_racing_points_index(racing_coords, car_coords):
    # Calculate all distances to racing points
    distances = []
    for i in range(len(racing_coords)):
        distance = dist_2_points(
            x1=racing_coords[i][0], x2=car_coords[0],
            y1=racing_coords[i][1], y2=car_coords[1])
        distances.append(distance)

    # Get index of the closest racing point
    closest_index = distances.index(min(distances))

    # Get index of the second closest racing point
    # (mask out the closest one with a value larger than any real distance)
    distances_no_closest = distances.copy()
    distances_no_closest[closest_index] = 999
    second_closest_index = distances_no_closest.index(
        min(distances_no_closest))

    return [closest_index, second_closest_index]


def dist_to_racing_line(closest_coords, second_closest_coords, car_coords):
    # Calculate the distance between the 2 closest racing points
    a = abs(dist_2_points(x1=closest_coords[0],
                          x2=second_closest_coords[0],
                          y1=closest_coords[1],
                          y2=second_closest_coords[1]))

    # Distances between car and closest and second closest racing point
    b = abs(dist_2_points(x1=car_coords[0],
                          x2=closest_coords[0],
                          y1=car_coords[1],
                          y2=closest_coords[1]))
    c = abs(dist_2_points(x1=car_coords[0],
                          x2=second_closest_coords[0],
                          y1=car_coords[1],
                          y2=second_closest_coords[1]))

    # Calculate distance between car and racing line (goes through 2 closest racing points)
    # Fall back to b in case a=0 (rare bug in DeepRacer)
    try:
        distance = abs(-(a ** 4) + 2 * (a ** 2) * (b ** 2) + 2 * (a ** 2) * (c ** 2) -
                       (b ** 4) + 2 * (b ** 2) * (c ** 2) - (c ** 4)) ** 0.5 / (2 * a)
    except ZeroDivisionError:
        distance = b

    return distance


# Calculate which one of the closest racing points is the next one and which one the previous one
def next_prev_racing_point(closest_coords, second_closest_coords, car_coords, heading):
    # Virtually set the car more into the heading direction
    heading_vector = [math.cos(math.radians(heading)),
                      math.sin(math.radians(heading))]
    new_car_coords = [car_coords[0] + heading_vector[0],
                      car_coords[1] + heading_vector[1]]

    # Calculate distance from new car coords to 2 closest racing points
    distance_closest_coords_new = dist_2_points(x1=new_car_coords[0],
                                                x2=closest_coords[0],
                                                y1=new_car_coords[1],
                                                y2=closest_coords[1])
    distance_second_closest_coords_new = dist_2_points(x1=new_car_coords[0],
                                                       x2=second_closest_coords[0],
                                                       y1=new_car_coords[1],
                                                       y2=second_closest_coords[1])

    if distance_closest_coords_new <= distance_second_closest_coords_new:
        next_point_coords = closest_coords
        prev_point_coords = second_closest_coords
    else:
        next_point_coords = second_closest_coords
        prev_point_coords = closest_coords

    return [next_point_coords, prev_point_coords]


def racing_direction_diff(closest_coords, second_closest_coords, car_coords, heading):
    # Calculate the direction of the racing line based on the closest racing points
    next_point, prev_point = next_prev_racing_point(closest_coords,
                                                    second_closest_coords,
                                                    car_coords,
                                                    heading)

    # Calculate the direction in radians, arctan2(dy, dx), the result is (-pi, pi) in radians
    track_direction = math.atan2(
        next_point[1] - prev_point[1], next_point[0] - prev_point[0])

    # Convert to degrees
    track_direction = math.degrees(track_direction)

    # Calculate the difference between the track direction and the heading direction of the car
    direction_diff = abs(track_direction - heading)
    if direction_diff > 180:
        direction_diff = 360 - direction_diff

    return direction_diff


# Gives back indexes that lie between start and end index of a cyclical list
# (start index is included, end index is not)
def indexes_cyclical(start, end, array_len):
    if end < start:
        end += array_len
    return [index % array_len for index in range(start, end)]


def reward_function(params):

    #################### RACING LINE ######################

    # Pre-calculated racing line for the track
    # Each row: [x, y]
    racing_track = [[ 4.54890313, -6.44875722],
[ 5.05165897, -6.44385513],
[ 5.554082 , -6.44304691],
[ 6.05610054, -6.44723562],
[ 6.5576385 , -6.45738498],
[ 7.05859332, -6.47479244],
[ 7.55879317, -6.50162204],
[ 8.04497453, -6.53921751],
[ 8.50318492, -6.52099515],
[ 8.91096545, -6.4068567 ],
[ 9.24210041, -6.17519064],
[ 9.43668699, -5.79829582],
[ 9.54356453, -5.35613339],
[ 9.5896981 , -4.87768515],
[ 9.57467067, -4.38917939],
[ 9.4775016 , -3.93378988],
[ 9.28525264, -3.53428062],
[ 8.97978655, -3.22286991],
[ 8.59726154, -2.98867502],
[ 8.16326423, -2.81767509],
[ 7.69676479, -2.69471612],
[ 7.21504998, -2.59782398],
[ 6.73583535, -2.50472179],
[ 6.26389274, -2.39591295],
[ 5.80658119, -2.25568778],
[ 5.37140295, -2.06882179],
[ 4.96938687, -1.81582586],
[ 4.58467718, -1.53009182],
[ 4.21539479, -1.21582692],
[ 3.85208104, -0.86983916],
[ 3.45397054, -0.57473649],
[ 3.02206937, -0.38485921],
[ 2.59021732, -0.339005 ],
[ 2.20259901, -0.45931887],
[ 1.92439005, -0.76296962],
[ 1.74033474, -1.16352924],
[ 1.625075 , -1.61912237],
[ 1.56067536, -2.10281438],
[ 1.40496425, -2.52104036],
[ 1.14427587, -2.84767335],
[ 0.76380604, -3.03514398],
[ 0.32539952, -3.13986039],
[-0.1527899 , -3.18673646],
[-0.64884156, -3.17887599],
[-1.11793827, -3.09630667],
[-1.52836296, -2.9177963 ],
[-1.84865935, -2.64034638],
[-2.03155846, -2.26514463],
[-2.08953608, -1.83601305],
[-2.04669987, -1.38053226],
[-1.93646002, -0.91550601],
[-1.88391975, -0.44352611],
[-1.89489198, 0.01735881],
[-1.97849538, 0.4615439 ],
[-2.14719495, 0.8806741 ],
[-2.29189325, 1.31569104],
[-2.39962063, 1.75971008],
[-2.45878045, 2.21352648],
[-2.4459834 , 2.67714134],
[-2.38068466, 3.14415495],
[-2.27017599, 3.61159909],
[-2.12341404, 4.05468756],
[-2.03472357, 4.49670829],
[-2.05101322, 4.92852081],
[-2.21128444, 5.32362002],
[-2.49837028, 5.65819641],
[-2.86982657, 5.93381097],
[-3.29765005, 6.15800205],
[-3.76132717, 6.33029282],
[-4.23720355, 6.43871592],
[-4.71167906, 6.48048032],
[-5.17620402, 6.45162423],
[-5.62246821, 6.34848961],
[-6.03991388, 6.16452904],
[-6.40992677, 5.88777729],
[-6.73909633, 5.55081319],
[-7.03389887, 5.17086101],
[-7.30266889, 4.76193952],
[-7.55599451, 4.33716297],
[-7.79802882, 3.91685785],
[-8.04547401, 3.50097327],
[-8.30334747, 3.09313437],
[-8.5782074 , 2.69697189],
[-8.88037252, 2.31537652],
[-9.12063863, 1.91903065],
[-9.27153402, 1.51483425],
[-9.30636123, 1.11570727],
[-9.20244495, 0.74288468],
[-8.91273262, 0.441969 ],
[-8.51697717, 0.20002945],
[-8.05957947, 0.0198087 ],
[-7.58424634, -0.11235687],
[-7.13902986, -0.2561167 ],
[-6.72059662, -0.43966929],
[-6.34570692, -0.68344445],
[-6.07089921, -1.03641696],
[-5.88629669, -1.45318631],
[-5.7683417 , -1.9015999 ],
[-5.68439421, -2.36274215],
[-5.59137244, -2.83361104],
[-5.48526404, -3.30000114],
[-5.36001557, -3.759273 ],
[-5.20849686, -4.20764804],
[-5.02207248, -4.63911875],
[-4.79223894, -5.04425134],
[-4.51861421, -5.41453445],
[-4.20256348, -5.73897292],
[-3.85016375, -6.00539717],
[-3.47334352, -6.20555221],
[-3.0845406 , -6.35340245],
[-2.68691896, -6.46202554],
[-2.27939497, -6.53891722],
[-1.85901209, -6.58974026],
[-1.42195032, -6.61919628],
[-0.96391017, -6.63072373],
[-0.48326961, -6.62769882],
[ 0.01473693, -6.61330234],
[ 0.51834964, -6.59162115],
[ 1.02274509, -6.56641018],
[ 1.52704741, -6.54283508],
[ 2.03119684, -6.52106523],
[ 2.5351741 , -6.50134497],
[ 3.0389574 , -6.48394949],
[ 3.54252381, -6.46916895],
[ 4.04584759, -6.45732897],
[ 4.54890313, -6.44875722]]

    ################## INPUT PARAMETERS ###################

    # Read all input parameters
    all_wheels_on_track = params['all_wheels_on_track']
    x = params['x']
    y = params['y']
    progress = params['progress']
    speed = params['speed']
    track_width = params['track_width']

    ##################### OPTIMAL X,Y #####################

    # Get indexes of the two racing-line points closest to the car
    closest_index, second_closest_index = closest_2_racing_points_index(
        racing_track, [x, y])

    # Get optimal [x, y] for closest and second closest index
    optimals = racing_track[closest_index]
    optimals_second = racing_track[second_closest_index]

    ################ REWARD AND PUNISHMENT ################

    # Define the default reward
    reward = 1

    # Reward if car goes close to optimal racing line
    DISTANCE_MULTIPLE = 1
    dist = dist_to_racing_line(optimals[0:2], optimals_second[0:2], [x, y])
    distance_reward = max(1e-3, 1 - (dist / (track_width * 0.5)))
    reward += distance_reward * DISTANCE_MULTIPLE

    # Reward speed directly once it is above 0.8 (no per-point optimal speed is used)
    SPEED_MULTIPLE = 2
    if speed > 0.8:
        speed_reward = speed
    else:
        speed_reward = 1e-3
    reward += speed_reward * SPEED_MULTIPLE

    # Bonus for finishing the lap
    if progress == 100:
        finish_reward = 100
    else:
        finish_reward = 0
    reward += finish_reward

    # Near-zero reward if off track
    if not all_wheels_on_track:
        reward = 1e-3

    # Always return a float value
    return float(reward)
This function was very well explained by Daniel Gonzalez in his article, so I won’t go super deep into the code. I will, however, give an overview of how it works, how I implemented it, and my modifications to it.
This model works surprisingly similarly to my research model, in that it tries to stay close to a line and go fast. However, instead of staying close to the center line (like in my research model), it stays close to the pre-calculated racing line. This means it’s trying to follow the optimal path around the track, which I believe to be the best way to “overfit” to the track. Calculating the distance to the optimal path takes a lot of math, hence the longer reward function.
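For anyone curious about that math: the long expression in dist_to_racing_line is the expanded form of Heron’s formula. The two closest racing points and the car form a triangle with sides a, b and c; the expression under the square root is 16 times the triangle’s area squared, and dividing its root by 2a gives the triangle’s height over side a, which is the car’s distance to the line. The sketch below re-implements that idea and cross-checks it against the more common cross-product formula. The function names are mine (they are not in the original code), and the sample points are arbitrary.

```python
import math


def distance_heron(p1, p2, car):
    """Distance from car to the line through p1 and p2, via Heron's formula."""
    a = math.hypot(p1[0] - p2[0], p1[1] - p2[1])
    b = math.hypot(car[0] - p1[0], car[1] - p1[1])
    c = math.hypot(car[0] - p2[0], car[1] - p2[1])
    if a == 0:
        # Degenerate case: both racing points coincide
        return b
    # 16 * area^2 of the triangle with sides a, b, c
    sixteen_area_sq = (-(a ** 4) + 2 * a ** 2 * b ** 2 + 2 * a ** 2 * c ** 2
                       - b ** 4 + 2 * b ** 2 * c ** 2 - c ** 4)
    # height over side a = 2 * area / a = sqrt(16 * area^2) / (2 * a)
    return abs(sixteen_area_sq) ** 0.5 / (2 * a)


def distance_cross(p1, p2, car):
    """Same distance using the 2D cross product, as an independent check."""
    cross = ((p2[0] - p1[0]) * (car[1] - p1[1])
             - (p2[1] - p1[1]) * (car[0] - p1[0]))
    return abs(cross) / math.hypot(p2[0] - p1[0], p2[1] - p1[1])


# Arbitrary sample: a racing-line segment along the x axis, car 1 unit above it
p1, p2, car = (0.0, 0.0), (2.0, 0.0), (1.0, 1.0)
print(distance_heron(p1, p2, car), distance_cross(p1, p2, car))
```

Both functions should agree up to floating-point rounding. Note that this is the distance to the infinite line through the two points, not to the segment between them.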
I implemented this by first calculating the racing line in a Jupyter Notebook (code in the links above). However, I used a reduced track width: with only being able to train for 10 hours max, I was worried about the car going off the track too much.
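As a rough illustration of what “reduced track width” means, here is a minimal sketch that pulls both track borders toward the center line before a racing line is computed. This is not the notebook’s code: the function name, the keep_fraction parameter, and the 0.8 value are all assumptions for the example, and the toy track is just a two-point straight.

```python
def shrink_track(center, inner, outer, keep_fraction):
    """Pull both borders toward the center line.

    keep_fraction=1.0 keeps the full width; 0.8 keeps 80% of it.
    All three inputs are lists of (x, y) points of equal length.
    """
    def pull(border):
        return [
            (cx + (bx - cx) * keep_fraction, cy + (by - cy) * keep_fraction)
            for (cx, cy), (bx, by) in zip(center, border)
        ]
    return pull(inner), pull(outer)


# Toy straight: center line along y = 0, borders at y = +0.5 and y = -0.5
center = [(0.0, 0.0), (1.0, 0.0)]
inner = [(0.0, 0.5), (1.0, 0.5)]
outer = [(0.0, -0.5), (1.0, -0.5)]

# keep_fraction=0.8 is a hypothetical value, not the one I actually used
new_inner, new_outer = shrink_track(center, inner, outer, keep_fraction=0.8)
print(new_inner, new_outer)
```

A racing line optimized inside the narrower borders stays further from the real track edges, which trades a little lap time for a bigger safety margin.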
My main modifications to the code were removing the timestamps and editing the speed multiple. I removed the timestamps because, while they sounded really useful in the article, I found they did not accurately reflect the best times shown on the leaderboard, and in my opinion the calculations were not coherent with DeepRacer’s Student League. I raised the speed multiple because during initial testing I found the car was tracking the racing line very well, so the only area left to improve was speed. Raising it did improve the overall lap time, so I left it raised.
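To show how the speed multiple shifts the balance, here is a stripped-down sketch of the reward arithmetic from the function above, with the distance to the racing line passed in directly instead of being computed from waypoints. The function name and the sample inputs are illustrative assumptions; only the formula itself comes from the reward function.

```python
def reward_breakdown(dist, track_width, speed, progress, all_wheels_on_track,
                     distance_multiple=1, speed_multiple=2):
    """Same arithmetic as the reward function above, without the waypoint lookup."""
    if not all_wheels_on_track:
        return 1e-3
    reward = 1.0
    # Up to 1 point (times the multiple) for hugging the racing line
    reward += max(1e-3, 1 - dist / (track_width * 0.5)) * distance_multiple
    # Speed itself is the reward once it is above 0.8
    reward += (speed if speed > 0.8 else 1e-3) * speed_multiple
    # Bonus for finishing the lap
    if progress == 100:
        reward += 100
    return reward


# Illustrative inputs: car exactly on the racing line, mid-lap
fast = reward_breakdown(dist=0.0, track_width=1.0, speed=1.0,
                        progress=50, all_wheels_on_track=True)
slow = reward_breakdown(dist=0.0, track_width=1.0, speed=0.5,
                        progress=50, all_wheels_on_track=True)
fast_low_multiple = reward_breakdown(dist=0.0, track_width=1.0, speed=1.0,
                                     progress=50, all_wheels_on_track=True,
                                     speed_multiple=1)
print(fast, slow, fast_low_multiple)
```

With a speed multiple of 2, a car that is on the line and fast earns about twice what a slow car on the same line earns, so speed becomes the main thing left to optimize once line-following is good.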
My Gameplan for Next Season
1. Set up local training.
I thought I was going to do this last month, but finals were wild and every time I tried to get this set up it didn’t work. It’s a little frustrating, but I’m sure I’ll be able to set it up this month. Like I said last article, all the top racers that I’ve spoken to told me that they tested and trained stuff on their home computer before retraining it in the Student League. This gives them a big advantage as they have a lot more data about their models, and it’s something I’m looking forward to doing.
2. Plan out the different types of models I want to try, some of which include:
a) Waypoints Model: According to other racers, the waypoints for this last track were actually messed up, which would explain why the waypoints model did so incredibly poorly
b) Current Models That Worked: I want to try out the models that worked well this season to see if it was a fluke, or if they actually worked
c) Current Models That Didn’t Work: I want to try out the models that did terribly this season to also see if it was a fluke, or if they are built on bad principles
3. Test out my best models locally with different time intervals and hyperparameters.
When training locally, I want to figure out the best amount of time to train a model, to see if catastrophic forgetting is creating the problems above or if it’s something completely different. Further, when I say hyperparameters, I don’t mean stuff like RNN size (“hyperparameters” refers to a very specific set of things in the professional league). I mean stuff like adjusting constants, and the automated testing of said constants.
4. If I don’t have enough time to do local training as detailed above, I would love to try out the waypoints model for the full 10 hours and see how it does. It’s a bit of a “hail mary” play, but I think it could be fun.
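For the automated testing of constants mentioned in step 3, one possible starting point is a small factory that builds one reward function per combination of constants. This is only a sketch of the idea: the function names and grid values are made up, and it uses the distance_from_center parameter as a stand-in for the racing-line distance to stay short. It only generates the variants; each one would still have to be trained and timed to learn anything.

```python
import itertools


def make_reward_function(distance_multiple, speed_multiple):
    """Build a reward function with the given constants baked in."""
    def reward_function(params):
        if not params['all_wheels_on_track']:
            return 1e-3
        reward = 1.0
        # Stand-in for the racing-line distance: distance from the center line
        dist = params['distance_from_center']
        half_width = params['track_width'] * 0.5
        reward += max(1e-3, 1 - dist / half_width) * distance_multiple
        speed = params['speed']
        reward += (speed if speed > 0.8 else 1e-3) * speed_multiple
        return float(reward)
    return reward_function


# Hypothetical grid of constants to try
grid = itertools.product([0.5, 1, 2], [1, 2, 3])
candidates = {(d, s): make_reward_function(d, s) for d, s in grid}

# Sanity-check every variant on one made-up state before spending training time
sample_params = {'all_wheels_on_track': True, 'distance_from_center': 0.1,
                 'track_width': 1.0, 'speed': 1.0}
for (d, s), fn in sorted(candidates.items()):
    print(d, s, round(fn(sample_params), 3))
```

Keeping the constants as arguments rather than hard-coded values makes it easy to log which combination produced which lap time.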