Programming a self-driving robot

Min Gyo Kim
10 min read · Aug 5, 2018


Me with Elsa X, our self-driving robot!

UBC Snowbots

Last year, I joined the software team of UBC Snowbots, an engineering student design team at the University of British Columbia.

What does Snowbots do?

Each year, we apply our knowledge and creative talents to build robots capable of navigating challenging terrain without human input, and compete against robotics teams from around the world. http://snowbots.ca/

As I was transferring from Computer Engineering to Computer Science, joining Snowbots was a way for me to stay involved in engineering-related work. I was also interested in practical applications of machine learning, and autonomous navigation technology seemed to be a perfect fit.

Team organization

Building a robot capable of self-navigation requires a cross-disciplinary effort, mainly between the software team and the mechanical team.

The software team

The software team is responsible for developing the autonomous navigation technology; we write code that takes in information about the robot’s surroundings and makes decisions about where the robot should go next.

Members: Gareth Ellis, Robyn Castro, Raad Khan, Chris Heathe, William Gu, Marinah Zhao, Marcus Swift, Martin Freeman

The mechanical team

The mechanical team takes care of designing, machining, and constructing the robot. They design the drive-train to make sure that the robot has enough power to make it up hills and across tricky terrain.

Members: Emma Park, Sherry Wang, David Gill, Ivy Shi, Geoffrey Hanks, Collin Lucas Eng, Leo Wei, Remy Zhang, Arman Zhakypbekov

The competition: IGVC 2018

IGVC 2018 obstacle course feat. Elsa X, our robot

This year, we attended the Intelligent Ground Vehicle Competition (IGVC) at Oakland University in Rochester, Michigan.

Each participating team programs and builds a robot for roughly 8 months (two school semesters), and they all come together to present their work and demonstrate the performance of their robot by competing in a self-navigation obstacle course.

The obstacle course consists of white lanes and cones with various colours. The robot needs to stay within the lanes and steer around cones while traveling towards its goal.

The Software Side

Our software is developed in C++ using the Robot Operating System (ROS) framework (http://www.ros.org/).
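
To give a flavour of what that looks like, here is a minimal roscpp node sketch that subscribes to a point cloud topic and publishes velocity commands. The topic names and the trivial "decision" are placeholders for illustration only, not what our stack actually does.

```cpp
// Minimal ROS 1 (roscpp) node sketch: subscribe to a point cloud topic and
// publish velocity commands. Topic names are placeholders, not the team's real ones.
#include <ros/ros.h>
#include <sensor_msgs/PointCloud2.h>
#include <geometry_msgs/Twist.h>

ros::Publisher cmd_pub;

void pointCloudCallback(const sensor_msgs::PointCloud2::ConstPtr& cloud) {
  // In the real stack, this is where lane extraction and the rest of the
  // pipeline would start. Here we just publish a constant command.
  geometry_msgs::Twist cmd;
  cmd.linear.x = 0.5;   // drive forward at 0.5 m/s (placeholder decision)
  cmd.angular.z = 0.0;  // no turning
  cmd_pub.publish(cmd);
}

int main(int argc, char** argv) {
  ros::init(argc, argv, "example_navigation_node");
  ros::NodeHandle nh;
  cmd_pub = nh.advertise<geometry_msgs::Twist>("/cmd_vel", 1);
  ros::Subscriber sub = nh.subscribe("/camera/points", 1, pointCloudCallback);
  ros::spin();  // hand control to ROS; the callback fires as messages arrive
  return 0;
}
```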

The Software Stack

In order to navigate through an obstacle course, the robot needs to know about its surroundings. Our robot has 5 sensors for just that:

  • GPS, IMU, and encoder for determining the position of the robot in the world.

  • Realsense camera for capturing a point cloud of the scene in front of the robot, used to detect the lanes.

  • LiDAR for detecting the cones.

GPS estimates the longitude and latitude of the robot’s position, the IMU (short for inertial measurement unit) measures the robot’s specific force, angular rate, and magnetic field, and the encoder counts wheel rotations (serving as an odometer). None of these sensors are particularly accurate on their own; therefore, the outputs from all three are passed into an Extended Kalman Filter (EKF) to generate the final estimate of the robot’s position.
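
To illustrate the idea of sensor fusion, here is a heavily simplified, linear Kalman-style sketch using Eigen: predict the position from encoder velocity, then correct it with a noisy GPS fix. Our actual filter is an EKF over a richer state that also incorporates the IMU; the numbers below are made up.

```cpp
// Simplified (linear) Kalman-style fusion sketch with Eigen: predict the 2D
// position from wheel-encoder velocity, then correct it with a noisy GPS fix.
// The real pipeline uses an Extended Kalman Filter over a larger state
// (pose, heading, velocity) and also uses the IMU; this is only conceptual.
#include <Eigen/Dense>
#include <iostream>

int main() {
  Eigen::Vector2d x(0.0, 0.0);                      // state: position estimate (m)
  Eigen::Matrix2d P = Eigen::Matrix2d::Identity();  // estimate covariance
  const Eigen::Matrix2d Q = 0.01 * Eigen::Matrix2d::Identity();  // process noise
  const Eigen::Matrix2d R = 2.0  * Eigen::Matrix2d::Identity();  // GPS noise (GPS is coarse)

  const double dt = 0.1;
  Eigen::Vector2d encoder_velocity(1.0, 0.0);       // from wheel odometry (m/s)
  Eigen::Vector2d gps_measurement(0.12, -0.03);     // noisy GPS position (m)

  // Predict: integrate the encoder velocity; uncertainty grows.
  x += dt * encoder_velocity;
  P += Q;

  // Update: blend in the GPS measurement, weighted by the Kalman gain.
  Eigen::Matrix2d K = P * (P + R).inverse();        // gain: how much to trust GPS
  x += K * (gps_measurement - x);
  P = (Eigen::Matrix2d::Identity() - K) * P;

  std::cout << "fused position: " << x.transpose() << std::endl;
  return 0;
}
```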

Then, our software stack processes the information about the surrounding world to compute where the robot should go next.

Full Software stack

Full stack explained step by step, with visualizations

So, what does it look like in practice? Let’s briefly examine each processing step through a simulation.

1. First, the robot takes in a point cloud from the Realsense camera.

(Left) Elsa X in a simulation course, (Right) visualization of incoming Point Cloud from its sensor

This is the initial step; our stack takes in a point cloud of the view in front of the robot to calculate where the lanes are.

A point cloud is a 3D map of the environment captured by the Realsense. Each point in the cloud contains the 3D coordinates of its position relative to the camera, as well as the colour associated with that point.
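
For the curious, here is roughly what a coloured point cloud looks like in code. The sketch assumes the Point Cloud Library (PCL), which is a common choice alongside ROS; it is for illustration rather than a snippet from our stack.

```cpp
// Sketch of a coloured point cloud using the Point Cloud Library (PCL).
// Each point carries a 3D position (metres, relative to the camera) plus RGB colour.
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <iostream>

int main() {
  pcl::PointCloud<pcl::PointXYZRGB> cloud;

  pcl::PointXYZRGB p;
  p.x = 1.2f; p.y = -0.3f; p.z = 0.05f;  // position relative to the camera
  p.r = 250; p.g = 248; p.b = 245;       // nearly white, e.g. a painted lane
  cloud.push_back(p);

  for (const auto& pt : cloud) {
    std::cout << "point at (" << pt.x << ", " << pt.y << ", " << pt.z << ") "
              << "colour (" << int(pt.r) << ", " << int(pt.g) << ", " << int(pt.b) << ")\n";
  }
  return 0;
}
```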

2. Then, the lanes are isolated from the point cloud.

(Right) Everything but the lanes is filtered out from the point cloud

In this step, everything but the lanes is filtered out of the point cloud generated by the Realsense camera. This is done through HSV filtering, which removes all points that are not white, followed by height filtering, which removes anything above the ground (since the lanes are positioned on the ground).

We want to extract the lanes because eventually we want to ensure that the robot navigates to its goal while staying within the lanes.
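
Here is a sketch of what those two filters boil down to, assuming PCL point types. The thresholds and the choice of z as the height axis are illustrative assumptions, not our tuned values.

```cpp
// Sketch of the lane-isolation step: keep only points that are (a) close to
// the ground and (b) roughly white (low saturation, high value in HSV terms).
// Thresholds and the axis used for height are illustrative assumptions.
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <algorithm>

pcl::PointCloud<pcl::PointXYZRGB> filterLanes(
    const pcl::PointCloud<pcl::PointXYZRGB>& input) {
  pcl::PointCloud<pcl::PointXYZRGB> lanes;
  for (const auto& pt : input) {
    // Height filter: assume z is height above the ground plane.
    if (pt.z > 0.10f) continue;  // discard anything more than ~10 cm up

    // Whiteness test computed from RGB: value = max channel,
    // saturation = (max - min) / max.
    const float r = pt.r / 255.0f, g = pt.g / 255.0f, b = pt.b / 255.0f;
    const float value = std::max({r, g, b});
    const float saturation = value > 0.0f ? (value - std::min({r, g, b})) / value : 0.0f;

    if (value > 0.8f && saturation < 0.2f) {  // bright and nearly colourless
      lanes.push_back(pt);
    }
  }
  return lanes;
}
```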

3. The filtered point cloud is clustered into distinct lanes.

We have isolated the lanes, but we still don’t know the answers to critical questions such as how many lanes there are, or which points belong to which lane. As far as the computer is concerned, they’re still just a bunch of points in 2D space.

In this step, the points are clustered into distinct lanes using unsupervised learning.

In the illustration above, the blue cluster represents the left lane, and the red represents the right lane.

4. After the point cloud has been clustered into distinct lanes, a mathematical line is computed to represent each lane.

The turquoise lines represent the polynomial representation of each lane

After clustering the points into distinct lanes, we need to produce a suitable representation of the lanes so that the next processing steps can easily determine the position of the lanes relative to the robot.

In this step, we simplify the representation of each lane from a cluster of points to a polynomial equation, replacing tens of thousands of points with a handful of polynomial coefficients. This is beneficial for two main reasons: the polynomial equation is much simpler to work with in later processing steps, and the regression helps smooth out the line.
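
A least-squares polynomial fit is straightforward to express with Eigen. The sketch below builds the Vandermonde matrix and solves for the coefficients; the degree and the exact regression details in our stack may differ.

```cpp
// Least-squares polynomial fit sketch with Eigen: given the (x, y) ground
// coordinates of one lane cluster, solve for the coefficients of
// y = c0 + c1*x + c2*x^2 + ... The degree is an arbitrary choice here.
#include <Eigen/Dense>
#include <vector>

std::vector<double> fitPolynomial(const std::vector<double>& xs,
                                  const std::vector<double>& ys,
                                  int degree) {
  const int n = static_cast<int>(xs.size());
  Eigen::MatrixXd A(n, degree + 1);   // Vandermonde matrix
  Eigen::VectorXd b(n);
  for (int i = 0; i < n; ++i) {
    double power = 1.0;
    for (int j = 0; j <= degree; ++j) {
      A(i, j) = power;                // x_i^j
      power *= xs[i];
    }
    b(i) = ys[i];
  }
  // Solve the over-determined system A * c ~= b in the least-squares sense.
  Eigen::VectorXd c = A.colPivHouseholderQr().solve(b);
  return std::vector<double>(c.data(), c.data() + c.size());
}
```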

5. Cones are extracted from the LiDAR sensor.

(Right) The green square represents the cone in front of the robot

Cones are extracted from the LiDAR sensor by using a circle fitting algorithm, since the cones are the only cylindrical objects in the course.
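
One common way to do this is an algebraic (Kåsa-style) circle fit, sketched below with Eigen. Our exact implementation may differ, but the idea is the same: solve a small least-squares problem for the circle’s centre and radius.

```cpp
// Sketch of an algebraic (Kåsa-style) circle fit on a cluster of 2D LiDAR
// points, to estimate a cone's centre and radius. The actual algorithm used
// on the robot may differ.
#include <Eigen/Dense>
#include <vector>
#include <cmath>

struct Circle { double cx, cy, radius; };

Circle fitCircle(const std::vector<double>& xs, const std::vector<double>& ys) {
  const int n = static_cast<int>(xs.size());
  // Solve x^2 + y^2 = a*x + b*y + c in the least-squares sense.
  Eigen::MatrixXd A(n, 3);
  Eigen::VectorXd b(n);
  for (int i = 0; i < n; ++i) {
    A(i, 0) = xs[i];
    A(i, 1) = ys[i];
    A(i, 2) = 1.0;
    b(i) = xs[i] * xs[i] + ys[i] * ys[i];
  }
  Eigen::Vector3d sol = A.colPivHouseholderQr().solve(b);

  Circle circle;
  circle.cx = sol(0) / 2.0;                       // centre from the linear solution
  circle.cy = sol(1) / 2.0;
  circle.radius = std::sqrt(sol(2) + circle.cx * circle.cx + circle.cy * circle.cy);
  return circle;
}
```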

6. All lanes and cones that are extracted from the sensors get registered to the obstacle map.

(Right) The grey square is the obstacle map. The darker the pixels are, the higher the chance of the presence of an obstacle.

Once the polynomial lines for the lanes and the positions of the cones have been extracted from the sensors, they are registered into our one and only obstacle map. The map then gets passed into our path finder, which uses it to calculate the shortest path towards the goal.
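
Conceptually, the obstacle map is just a grid of cells, and registering an obstacle means marking the cell that a world-frame point falls into. The sketch below shows that idea; the resolution, origin, and encoding are illustrative, not our map’s actual layout.

```cpp
// Sketch of registering detected obstacles (sampled lane points, cone centres)
// into a simple 2D occupancy grid. Resolution, origin, and the occupancy
// encoding are illustrative assumptions.
#include <vector>

struct ObstacleMap {
  int width, height;             // grid size in cells
  double resolution;             // metres per cell
  double origin_x, origin_y;     // world coordinates of cell (0, 0)
  std::vector<float> occupancy;  // probability of an obstacle per cell, 0..1

  ObstacleMap(int w, int h, double res, double ox, double oy)
      : width(w), height(h), resolution(res), origin_x(ox), origin_y(oy),
        occupancy(w * h, 0.0f) {}

  // Mark a world-frame point (e.g. a sampled lane point or a cone centre) as occupied.
  void markObstacle(double wx, double wy) {
    int cx = static_cast<int>((wx - origin_x) / resolution);
    int cy = static_cast<int>((wy - origin_y) / resolution);
    if (cx < 0 || cy < 0 || cx >= width || cy >= height) return;  // off the map
    occupancy[cy * width + cx] = 1.0f;
  }
};
```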

7. Finally, the shortest path to the goal is calculated based on the obstacle map.

(Right) The shortest path towards the goal (off the map) is represented by the purple line

Given the obstacle map, Dijkstra’s algorithm is used to calculate the shortest path from the robot’s current position to the goal. The path is then used to generate a message consisting of velocity and acceleration commands for the robot’s motor.
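
Here is a compact sketch of Dijkstra’s algorithm over a 4-connected grid of blocked/free cells. For brevity it only returns the shortest distance; the real planner also reconstructs the path and works on the obstacle map described above.

```cpp
// Dijkstra's algorithm sketched over a 4-connected occupancy grid: cells
// marked as obstacles are impassable, and every step costs 1.
#include <queue>
#include <vector>
#include <limits>
#include <utility>

double shortestDistance(const std::vector<std::vector<bool>>& blocked,
                        std::pair<int, int> start, std::pair<int, int> goal) {
  const int h = blocked.size(), w = blocked[0].size();
  const double inf = std::numeric_limits<double>::infinity();
  std::vector<std::vector<double>> dist(h, std::vector<double>(w, inf));

  using Node = std::pair<double, std::pair<int, int>>;  // (cost so far, cell)
  std::priority_queue<Node, std::vector<Node>, std::greater<Node>> open;
  dist[start.first][start.second] = 0.0;
  open.push({0.0, start});

  const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
  while (!open.empty()) {
    auto [cost, cell] = open.top();
    open.pop();
    if (cell == goal) return cost;                        // goal reached
    if (cost > dist[cell.first][cell.second]) continue;   // stale queue entry

    for (int k = 0; k < 4; ++k) {
      int r = cell.first + dr[k], c = cell.second + dc[k];
      if (r < 0 || c < 0 || r >= h || c >= w || blocked[r][c]) continue;
      double next = cost + 1.0;                           // uniform step cost
      if (next < dist[r][c]) {
        dist[r][c] = next;
        open.push({next, {r, c}});
      }
    }
  }
  return inf;  // goal unreachable
}
```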

My contribution to the software stack

Our software stack was developed by our entire software team — but what was my role in it?

Lane extraction

My first major contribution to the software stack was the lane extraction. As mentioned previously, lane extraction consists of two steps — clustering the points into distinct lanes and calculating the polynomial equation for each lane.

For clustering, I decided to use density-based spatial clustering of applications with noise (DBSCAN). DBSCAN is a clustering algorithm that groups together points that are densely packed. I chose it over other clustering algorithms (e.g. k-means) for two main advantages: it can cluster arbitrarily shaped (non-convex) groups of points, and it is robust to outliers since it filters out points that are not closely packed together.
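
For reference, here is a compact DBSCAN sketch over 2D points with a naive neighbour search. The eps and min_pts parameters are placeholders; ours were tuned to the density of the lane points.

```cpp
// Compact DBSCAN sketch over 2D lane points (naive O(n^2) neighbour search).
// Returns one cluster label per point (-1 = noise, 0..k = cluster id).
#include <vector>
#include <cmath>

struct Point2D { double x, y; };

std::vector<int> dbscan(const std::vector<Point2D>& pts, double eps, int min_pts) {
  const int n = static_cast<int>(pts.size());
  const int UNVISITED = -2, NOISE = -1;
  std::vector<int> label(n, UNVISITED);

  auto neighbours = [&](int i) {
    std::vector<int> result;
    for (int j = 0; j < n; ++j) {
      double dx = pts[i].x - pts[j].x, dy = pts[i].y - pts[j].y;
      if (std::sqrt(dx * dx + dy * dy) <= eps) result.push_back(j);
    }
    return result;
  };

  int cluster = 0;
  for (int i = 0; i < n; ++i) {
    if (label[i] != UNVISITED) continue;
    std::vector<int> seeds = neighbours(i);
    if (static_cast<int>(seeds.size()) < min_pts) {
      label[i] = NOISE;               // too sparse: likely an outlier
      continue;
    }
    label[i] = cluster;               // start a new cluster (a new lane)
    for (std::size_t k = 0; k < seeds.size(); ++k) {
      int j = seeds[k];
      if (label[j] == NOISE) label[j] = cluster;   // border point, adopt it
      if (label[j] != UNVISITED) continue;
      label[j] = cluster;
      std::vector<int> more = neighbours(j);
      if (static_cast<int>(more.size()) >= min_pts) {
        seeds.insert(seeds.end(), more.begin(), more.end());  // core point: expand
      }
    }
    ++cluster;
  }
  return label;
}
```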

DBSCAN demo (credits)

Then, non-linear regression is performed to calculate a polynomial equation for each cluster.

(Left) Initial point cloud, (Middle) Point cloud clustered into distinct lanes, (Right) Polynomial equation calculated for each cluster

Path finding

I also implemented the path finding module, which finds the shortest path between the position of the robot to the goal given an obstacle map.

In path finding, there are two popular algorithms: A* and Dijkstra’s. The two are actually quite similar. Dijkstra’s is the generic shortest-path algorithm, and A* is a modification of Dijkstra’s that biases the search towards nodes closer to the goal, using a heuristic estimate of the remaining distance when choosing the next node to expand.
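
The difference shows up in a single place: the priority used to pick the next node to expand.

```cpp
// The only structural difference between the two algorithms: the priority used
// when choosing which node to expand next. heuristic_to_goal is an estimate of
// the remaining distance to the goal (e.g. straight-line distance).
double dijkstraPriority(double cost_so_far) {
  return cost_so_far;                       // Dijkstra's: cost from the start only
}

double aStarPriority(double cost_so_far, double heuristic_to_goal) {
  return cost_so_far + heuristic_to_goal;   // A*: biased towards the goal
}
```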

(Left) Dijkstra’s, (Right) A*, taken from Wikipedia
Exaggerated illustration of A* vs Dijkstra’s

A* is usually faster than Dijkstra’s, because it explores fewer nodes by greedily favouring nodes that are closer to the goal. However, this did not always result in the desired behaviour.

In A*, the path tries to go as straight as possible towards the goal while avoiding obstacles. As illustrated in the diagram above, the resulting path ends up very close to the edge of the lane.

This is dangerous because the closer the robot is to the lane, the higher the probability that it will cross it. This is especially problematic because the localization of the robot is highly unreliable. As mentioned before, the position of the robot is determined using three sensors: GPS, IMU, and encoders, all of which are generally inaccurate. When the robot is close to the lane, small localization errors can easily make the robot think it is on the other side of the lane. Therefore, the further away the robot stays from obstacles, the more localization error we can tolerate.

Although picking Dijkstra’s over A* was an improvement, this choice alone was still not enough to prevent generating paths that were really close to obstacles.

Obstacle manager (just a few tweaks)

In order to fully solve the problem, I decided to make some tweaks to the obstacle map generated by the obstacle manager.

The shortest path generated (purple line) is extremely close to the left lane (black line on the left)

As you can see in the simulation above, the path generated is extremely close to the left lane. This occurs because the goal is somewhere to the left of the left lane — and the shortest way to get there is to go around the left lane as tightly as possible, which is technically the expected behaviour.

First solution: Just inflate the obstacles on the map!

With this approach, the generated path will still hug the edge of the obstacles on the map, but since the inflated obstacles are much wider than the actual lanes, the path naturally ends up further away from the real lane.

This works fine but not without a major trade-off: the robot won’t be able to go through tight spaces between obstacles because the obstacles will appear to be a lot wider than they actually are.

The robot cannot pass between the cones because it thinks they are a lot bigger than they actually are

Final solution: Represent obstacles as probabilities, instead of binary values.

The final solution we chose was to represent each obstacle as a probability (i.e. how likely is it that this space is occupied?) instead of a binary value (i.e. is this space occupied or free?). Then, we inflate the obstacles on the obstacle map in such a way that the probability of an obstacle decreases with distance from the obstacle.
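
One simple way to realise "closer means more likely occupied" is to give every cell near an obstacle a probability that fades with distance, as in the sketch below. The linear falloff and the radius are illustrative assumptions, not necessarily the falloff we used.

```cpp
// Sketch of probabilistic obstacle inflation: every cell near an obstacle gets
// an occupancy probability that falls off with distance (here linearly, down
// to zero at inflation_radius_cells). The exact falloff is an assumption.
#include <vector>
#include <cmath>
#include <algorithm>

void inflateObstacle(std::vector<std::vector<float>>& probability,
                     int obstacle_row, int obstacle_col,
                     double inflation_radius_cells) {
  const int h = probability.size(), w = probability[0].size();
  const int r = static_cast<int>(std::ceil(inflation_radius_cells));
  for (int dr = -r; dr <= r; ++dr) {
    for (int dc = -r; dc <= r; ++dc) {
      int row = obstacle_row + dr, col = obstacle_col + dc;
      if (row < 0 || col < 0 || row >= h || col >= w) continue;
      double dist = std::sqrt(double(dr * dr + dc * dc));
      // Probability 1 at the obstacle itself, fading linearly to 0 at the radius.
      float p = static_cast<float>(std::max(0.0, 1.0 - dist / inflation_radius_cells));
      probability[row][col] = std::max(probability[row][col], p);
    }
  }
}
```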

This solution still has the benefit of the first solution: because we inflated the obstacles on the map, the shortest path generated will keep more distance from the lanes.

The path generated is further away from the lane thanks to obstacle inflation

The main advantage of this solution is that the path finding algorithm can use this information to decide whether the risk of going through a possibly occupied region is worth the shorter distance to the goal, instead of simply ruling it out. In essence, the algorithm will allow passing through regions where there is some probability of an obstacle, as long as the probability isn’t too high and doing so offers a much shorter route to the goal than the alternatives. This is important because it allows the robot to navigate through narrow passages even after inflating the obstacles.
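
In planner terms, this amounts to a step cost that mixes distance with risk, with near-certain obstacles treated as walls. The sketch below shows one such cost function; the threshold and weight are illustrative, not our tuned values.

```cpp
// Sketch of a step cost that trades distance against risk: moving into a cell
// costs the base step plus a penalty proportional to the cell's occupancy
// probability, while cells that are almost certainly occupied are impassable.
#include <limits>

double stepCost(float occupancy_probability) {
  const float impassable_threshold = 0.95f;  // treat near-certain obstacles as walls
  const double risk_weight = 20.0;           // how strongly risk is penalised vs. distance

  if (occupancy_probability >= impassable_threshold) {
    return std::numeric_limits<double>::infinity();  // never plan through it
  }
  return 1.0 + risk_weight * occupancy_probability;  // base step cost + risk penalty
}
```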

Path is generated in between obstacles even though there is a possibility of an obstacle, because it is shorter to the goal than its risk-free alternatives

Full simulation 🎉

And here it is in action!

Final words

I learned a lot from applying machine learning algorithms to our application. It wasn’t just a matter of rewriting the pseudo-code of each algorithm; a lot of modifications were needed to make them work well with our specific set of requirements. For example, we needed to adjust our path finding algorithm to return an optimal path that is both short and far away from obstacles. In order to make those modifications, I was forced to learn more about these algorithms and really understand what is going on under the hood.

It was definitely a big challenge to design and implement a robust, large-scale system with limited time and resources. Regardless, I’m proud of our team’s output, and it’s been a great learning experience.

If you are at UBC and you are interested in doing this kind of work, we will be opening applications on our website in September!

You can also access the Github repository here.

Min Gyo Kim (UBC Computer Science ’19) contributed to developing the software of a self-driving robot. You can follow him on Medium or LinkedIn. To learn more about UBC Snowbots, you can visit their website.
