Perception in Self-Driving Cars

Illuri Sandeep
Published in Analytics Vidhya
Apr 20, 2021

Perception in self-driving cars is how the car senses the environment around it. It is the most important and most complex part of the system. For humans, sensing the environment is an easy task because we have eyes, ears, and human intelligence, but for a car it is a difficult and complex task. In this article we will see how a car can sense the world in which it is driving.

Before we discuss the car, let us see how a human senses the world. We are equipped with eyes to see the environment, and we also have ears, noses, tactile sensors, and internal sensors that measure the deflection of our muscles. With these sensors we perceive the environment around us. All of this is made possible by the brain, which processes sensor data continuously. A large portion of the brain is dedicated to perception, especially visual perception, and to the subconscious sense of where we are in the world. Cars have somewhat different sensors: they have cameras instead of eyes, and they also have sensors like radar and lidar that measure raw distances directly. So instead of merely knowing that something is in front of it, the car gets the exact distance in centimeters. The complex task is to take the huge amount of data from these sensors and use computer intelligence to evaluate it and make something meaningful of it.

Computer vision: Computer vision is the technology the car uses to see and understand the world. As humans we automatically recognize objects in images and the relationships between those objects. But to a computer, an image is just a collection of red, green, and blue color values, stored as three stacked two-dimensional layers, one per color channel.
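To see what that means in code, here is a minimal sketch (using NumPy) of an RGB image as a computer stores it:

```python
# A minimal sketch of how a computer "sees" an image: three stacked
# two-dimensional layers of red, green, and blue intensity values.
import numpy as np

# A 480x640 RGB image: height x width x 3 color channels.
image = np.zeros((480, 640, 3), dtype=np.uint8)
image[:, :, 0] = 255                     # fill the red channel

print(image.shape)                       # (480, 640, 3)
print(image[0, 0])                       # one pixel: [255, 0, 0] = pure red
```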

Self-driving cars have four core tasks in perceiving the world:

1. Detection: finding out where exactly an object is in the environment (see the sketch after this list).

2. Classification: determining what exactly the object is.

3. Tracking: observing moving objects, such as other vehicles and pedestrians, over time.

4. Segmentation: matching each pixel in an image to a semantic category, such as road, sky, or vehicle.
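To make the first two tasks concrete, here is a minimal detection sketch using a pretrained detector from torchvision; the model choice and the random input frame are illustrative stand-ins, not what any particular production stack uses:

```python
# A minimal sketch of detection + classification with a pretrained
# torchvision model. "image" is a random stand-in for a camera frame.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()                             # inference mode

image = torch.rand(3, 480, 640)          # one RGB frame, values in [0, 1]
with torch.no_grad():
    predictions = model([image])[0]      # the model takes a list of images

# Each detection is a bounding box (where), a label (what), and a score.
print(predictions["boxes"].shape, predictions["labels"][:5])
```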

What is machine learning and how is it used in self-driving cars?

Machine learning is the field of computer science that uses special algorithms to train computers to learn from data. Often what is learned is stored in a data structure called a model. There are many types of models; in fact, a model is a data structure that can understand and make predictions about the world. Machine learning has been around since well before the 1970s, but it has only taken off in the last 20 years thanks to advances in computer hardware. For example, doctors use ML to assist in medical diagnosis.

The different types of machine learning are:

1) Supervised machine learning: These algorithms learn from a labeled training dataset. They keep learning until they reach the desired level of performance with minimal error. Supervised ML algorithms can further be categorized into classification, regression, and dimensionality-reduction algorithms.

2) Unsupervised machine learning: Unsupervised learning finds structure in unlabeled data and allows more complex processing tasks than supervised learning, although its results can be less predictable.

3) Semi-supervised machine learning: This is a combination of supervised and unsupervised learning. It uses a small amount of labeled data and a large amount of unlabeled data, which provides the benefits of both while avoiding the challenge of collecting a large labeled dataset. That means you can train a model without needing as much labeled training data.
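As a small illustration, here is a sketch of self-training, one common semi-supervised technique, using scikit-learn's SelfTrainingClassifier; the data is synthetic:

```python
# A minimal sketch of semi-supervised self-training with scikit-learn.
# The data here is synthetic; in a real pipeline the features would come
# from camera or lidar processing.
import numpy as np
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))            # 200 samples, 2 features
y = (X[:, 0] > 0).astype(int)            # true labels (two classes)

y_train = y.copy()
y_train[50:] = -1                        # scikit-learn convention: -1 = unlabeled

# Wrap a base classifier; it is iteratively retrained on its own
# most confident predictions for the unlabeled samples.
model = SelfTrainingClassifier(SVC(probability=True))
model.fit(X, y_train)
print(model.predict(X[:5]))
```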

4) Reinforcement learning: Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.
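To make the cumulative-reward idea concrete, here is a minimal sketch of tabular Q-learning, one of the simplest RL algorithms; the states, actions, and rewards are toy placeholders:

```python
# A minimal sketch of the core reinforcement-learning update (tabular
# Q-learning). States, actions, and the reward are hypothetical; a real
# driving agent would use a far richer state and a learned policy.
import numpy as np

n_states, n_actions = 10, 3
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.95                 # learning rate, discount factor

def q_update(state, action, reward, next_state):
    """One step of the Q-learning rule:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])

q_update(state=0, action=1, reward=1.0, next_state=2)
```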

Self-driving car machine learning algorithms are generally divided into four categories:

1) Regression Algorithms

Regression algorithms are used explicitly for predicting events. Bayesian regression, neural network regression, and decision forest regression are the three main types of regression algorithms used in self-driving cars.

In regression analysis, the relationship between two or more variables is estimated, and the effects of the variables are compared on different scales. Regression analysis is mainly dependent on three core metrics:

  • The number of independent variables
  • The type of dependent variables
  • The shape of the regression line.

Regression algorithms use the repetitive aspects of an environment to form a statistical model of the relation between a particular image and the position of a specific object within the image. The statistical model can provide speedy online detection through image sampling. Gradually, it can extend to learn about other objects as well, without requiring substantial human intervention.
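As a rough sketch of this idea, the following example fits a decision-forest regressor to predict an object's image position; the features and positions are synthetic stand-ins:

```python
# A minimal sketch of regression for object localization.
# RandomForestRegressor stands in for the "decision forest regression"
# mentioned above; the features and positions are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 16))           # e.g. features pooled from an image
positions = rng.uniform(0, 100, size=(500, 2))  # (x, y) of an object, in pixels

model = RandomForestRegressor(n_estimators=50)
model.fit(features, positions)
print(model.predict(features[:1]))              # predicted (x, y) for one image
```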

2) Pattern Recognition Algorithms (Classification)

Generally, the images obtained by the advanced driver-assistance systems (ADAS) are replete with an array of data from the surrounding environment. This data needs to be filtered to recognize the relevant images containing a specific category of objects. This is where pattern recognition algorithms enter.

Also known as data reduction algorithms, pattern recognition algorithms are designed to rule out unusual data points. Recognition of patterns in a data set is an essential step before classifying the objects.

These algorithms help in filtering the data obtained through the sensors by detecting object edges, and fitting line segments and circular arcs to the edges. Pattern recognition algorithms combine the line segments and circular arcs in many different ways to form the ultimate features for recognizing an object.

Support vector machines (SVM) with histograms of oriented gradients (HOG), principal component analysis (PCA), Bayes decision rule, and k-nearest neighbor (KNN) are some of the most commonly used pattern recognition algorithms in self-driving cars.
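Here is a minimal sketch of the HOG + SVM combination on synthetic image crops; a real pipeline would train on labeled camera data:

```python
# A minimal sketch of the classic HOG + SVM pattern-recognition pipeline
# mentioned above, using scikit-image and scikit-learn. The images are
# random stand-ins for real camera crops.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.random((100, 64, 64))       # 100 grayscale 64x64 crops
labels = rng.integers(0, 2, size=100)    # e.g. 1 = vehicle, 0 = background

# Histogram of oriented gradients turns each image into a feature vector.
X = np.array([hog(img, orientations=9,
                  pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2)) for img in images])

clf = SVC(kernel="linear")
clf.fit(X, labels)
print(clf.predict(X[:5]))
```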

3) Cluster Algorithms

Cluster algorithms excel at discovering structure from data points. It may happen that the images obtained by the ADAS aren’t clear, or it may also occur that classification algorithms have missed identifying an object, thereby failing to classify and report it to the system.

This may happen due to the images being of very low-resolution or with very few data points. In such situations, it becomes difficult for the system to detect and locate objects in the surroundings.

Clustering describes both a class of problems and a class of methods. Generally, clustering techniques are built using centroid-based and hierarchical modeling approaches. All clustering techniques focus on leveraging the inherent structures in the data to best organize the data into groups with the greatest commonality.

K-means and multi-class neural networks are the two most widely used clustering algorithms for autonomous cars.
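For illustration, here is a sketch of k-means grouping synthetic lidar-style points into object candidates:

```python
# A minimal sketch of k-means clustering on lidar-style points,
# grouping returns into object candidates. The points are synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic "objects": point blobs around (5, 5) and (20, 10).
points = np.vstack([rng.normal((5, 5), 0.5, size=(50, 2)),
                    rng.normal((20, 10), 0.5, size=(50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10).fit(points)
print(kmeans.cluster_centers_)           # approximate object centers
print(kmeans.labels_[:5])                # which cluster each point fell in
```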

4) Decision Matrix Algorithms

Decision matrix algorithms are essentially used for decision making. They are designed for systematically identifying, analyzing, and rating the performance of relationships between sets of values and information. The most widely used decision matrix algorithms in autonomous cars are gradient boosting machines (GBM) and AdaBoost.
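As a toy illustration, here is AdaBoost used as a simple decision classifier; the input signals and the braking rule are hypothetical:

```python
# A minimal sketch of a boosted ensemble (AdaBoost) used as a simple
# decision-making classifier: "brake" vs "don't brake" from two
# hypothetical input signals.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
# Hypothetical inputs: [distance_to_obstacle_m, closing_speed_mps]
X = rng.uniform([0, 0], [100, 30], size=(500, 2))
y = ((X[:, 0] / np.maximum(X[:, 1], 0.1)) < 2.0).astype(int)  # 1 = brake

clf = AdaBoostClassifier(n_estimators=100)
clf.fit(X, y)
print(clf.predict([[10.0, 15.0]]))       # close and fast: should brake
```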

Neural Networks in Self-Driving Cars

The artificial neural networks used in cars are inspired by the biological neurons that make up the human nervous system. Biological neurons connect together to form a network of neurons, a neural network. In a similar way, we can connect layers of artificial neurons to create neural networks for machine learning. An artificial neural network is a tool for learning complex patterns from data. Neural networks are composed of a large number of artificial neurons; like the neurons in the neurological systems of our bodies, these artificial neurons are responsible for delivering and processing information, and they have to be trained. We can use them to recognize images as vehicles whether the vehicles are black or white, big or small, and to distinguish between cars, pedestrians, traffic lights, and even telegraph poles.
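As a minimal sketch, here is a tiny classifier built from layers of artificial neurons in PyTorch; the layer sizes and the four classes are illustrative only:

```python
# A minimal sketch of a small image classifier in PyTorch. The layer
# sizes and the four classes are hypothetical; production perception
# networks are far larger and usually convolutional.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),                        # 3x32x32 image -> 3072 values
    nn.Linear(3 * 32 * 32, 128),         # a layer of artificial neurons
    nn.ReLU(),
    nn.Linear(128, 4),                   # 4 classes: car, pedestrian,
)                                        # traffic light, pole

image = torch.rand(1, 3, 32, 32)         # a dummy RGB image
logits = model(image)
print(logits.argmax(dim=1))              # index of the predicted class
```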

Radar and LiDAR

Radars have been in automobiles for years. You can find them in systems like adaptive cruise control, blind spot warning, collision warning, and collision avoidance. Even though radar is a mature technology, it still gets improved all the time to make it even more powerful. While other sensors measure velocity by calculating the difference between two readings, radar uses the Doppler effect to measure speed directly. The Doppler effect is the change in frequency of the radar waves depending on whether the object is moving away from you or toward you, much like how a fire engine siren sounds different depending on whether the engine is moving away from you or toward you. The Doppler effect is important for sensor fusion because it gives us velocity as an independently measured parameter, which makes the fusion algorithms converge much faster.

Radar can also be used for localization by generating radar maps of the environment. Because radar waves bounce off hard surfaces, they can provide measurements of objects without a direct line of sight. Radar can see underneath other vehicles and spot buildings and objects that would otherwise be obscured. Of all the sensors on the car, radar is the least affected by rain or fog and can have a wide field of view, about 150 degrees, or a long range, 200-plus meters. Compared to lidars and cameras, radars have low resolution, especially in the vertical direction. The low resolution also means that reflections from static objects can cause problems: for example, a manhole cover or a soda can lying on the street can have high radar reflectivity even though it is relatively small. This is called radar clutter, and it is why current automotive radars usually disregard static objects.
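The Doppler relationship itself fits in a few lines; the 77 GHz carrier frequency below is a typical automotive value, used here purely for illustration:

```python
# A minimal sketch of the Doppler relationship radar uses to measure
# speed directly: the frequency shift of the reflected wave is
# proportional to the target's radial velocity.
C = 3.0e8          # speed of light, m/s
F_RADAR = 77.0e9   # typical automotive radar carrier frequency, Hz

def radial_velocity(doppler_shift_hz: float) -> float:
    """v = (delta_f * c) / (2 * f). The factor of 2 accounts for the
    round trip of the wave to the target and back."""
    return doppler_shift_hz * C / (2 * F_RADAR)

# A +5 kHz shift corresponds to a target approaching at ~9.7 m/s.
print(radial_velocity(5_000.0))
```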

LiDAR stands for Light Detection and Ranging, just as radar stands for Radio Detection and Ranging. Unlike radar, which uses radio waves, LiDAR uses an infrared laser beam to determine the distance between the sensor and a nearby object. Most current LiDARs use light in the 900-nanometer wavelength range, although some use longer wavelengths, which perform better in rain and fog. In current LiDARs, a rotating swivel scans the laser beam across the field of view. The lasers are pulsed, and the pulses are reflected by objects. These reflections return a point cloud that represents the objects. LiDAR has a much higher spatial resolution than radar because of the more focused laser beam, the larger number of scan layers in the vertical direction, and the high density of LiDAR points per layer. The current generation of LiDARs cannot measure the velocity of objects directly and has to rely on the difference in position between two or more scans. LiDARs are also more affected by weather conditions and by dirt on the sensor, which requires keeping them clean. They are also much bulkier than other sensors and therefore more difficult to integrate, unless one just wants to mount a big scanner on the roof of the vehicle.
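The basic ranging principle is easy to sketch: the distance follows from the laser pulse's round-trip time:

```python
# A minimal sketch of lidar time-of-flight ranging: the distance to an
# object follows from how long a laser pulse takes to bounce back.
C = 3.0e8  # speed of light, m/s

def range_from_echo(round_trip_s: float) -> float:
    """distance = c * t / 2 (the pulse travels out and back)."""
    return C * round_trip_s / 2.0

# An echo after 200 nanoseconds means the object is ~30 m away.
print(range_from_echo(200e-9))
```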

Camera vs Lidar vs Radar

As the comparison above shows, every sensor has limits that vary with conditions. So here comes the necessity of fusing the data from these sensors to get accurate results. For fusing the sensors we use the Kalman filter algorithm.

Kalman filter algorithm: The Kalman filter is a two-step estimation algorithm. The first step predicts the state, and the second step updates that prediction with new measurements. It runs as an endless loop of predict and update steps.
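Here is a minimal one-dimensional sketch of that predict/update loop in NumPy; the motion model and noise values are illustrative, not tuned for any real sensor:

```python
# A minimal sketch of a 1-D Kalman filter's predict/update loop,
# e.g. tracking the position of a car ahead. The noise values are
# illustrative, not tuned.
import numpy as np

x = np.array([0.0, 0.0])                 # state: [position, velocity]
P = np.eye(2) * 500.0                    # state uncertainty (covariance)
F = np.array([[1.0, 1.0],                # motion model: pos += vel * dt
              [0.0, 1.0]])               # (dt = 1 for simplicity)
H = np.array([[1.0, 0.0]])               # we only measure position
R = np.array([[10.0]])                   # measurement noise
Q = np.eye(2) * 0.01                     # process noise

for z in [1.0, 2.1, 2.9, 4.2]:           # incoming position measurements
    # Predict: project the state and its uncertainty forward.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: blend the prediction with the measurement.
    y = z - H @ x                        # residual
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P

print(x)                                 # estimated [position, velocity]
```

In a real car the state would include 2-D position, velocity, and possibly acceleration, and the same loop would fuse measurements from radar, lidar, and cameras.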

Architecture

The general architecture of the perception module is shown:

The detailed perception modules are displayed below:

The perception module inputs are:

  • 128 channel LiDAR data
  • 16 channel LiDAR data
  • Radar data
  • Image data
  • Extrinsic parameters of radar sensor calibration (from YAML files)
  • Extrinsic and Intrinsic parameters of front camera calibration (from YAML files)
  • Velocity and Angular Velocity of host vehicle

The perception module outputs are:

  • The 3D obstacle tracks with the heading, velocity and classification information.
  • The output of traffic light detection and recognition.

I hope this article gave you some information about perception in self-driving cars. If you want more of my articles on self-driving cars, here is the link
