How Perception Stack Works in Autonomous Driving Systems

A General Framework for Perception — an Introduction to Self-Driving Cars (Part 5)

Moorissa Tjokro
Self-Driving Cars

--

In the previous part of the Introduction to Self-Driving Cars series, we discussed a core visual functionality of the perception stack in an autonomous vehicle (AV): computer vision. There we focused on perception sensing, which involves collecting data from vehicle sensors and processing that data into an understanding of the world around the vehicle, much like the sense of sight in a human driver.

Perception of the environment is a crucial task in the autonomous driving pipeline. Using perception sensors such as cameras, lidar, and radar, a vehicle can localize itself within a static environment map. In this article, we will zoom out to a higher-level view of the perception stack, along with the fusion strategy it uses to detect and classify the traffic participants in its surroundings so the vehicle can navigate safely. Because each sensor has its own strengths and weaknesses, fusing their signals yields higher detection quality than any single sensor alone.
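To make the idea of fusion concrete, here is a minimal sketch of one common strategy, late fusion, where each sensor produces its own object detections and the results are merged afterwards. This is purely illustrative and not Apollo's implementation: the `Detection` class, the `fuse_detections` helper, the 1.5 m matching threshold, and the simple averaging of positions are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x: float           # position in the vehicle frame, metres (forward)
    y: float           # position in the vehicle frame, metres (left)
    label: str         # e.g. "car", "pedestrian"
    confidence: float  # detector score in [0, 1]

def fuse_detections(camera_dets, lidar_dets, max_dist=1.5):
    """Late fusion: pair camera and lidar detections of the same class that
    lie within max_dist metres of each other, then combine their scores.
    Unmatched detections are kept, so a miss by one sensor does not drop
    the object entirely."""
    fused, used = [], set()
    for cam in camera_dets:
        best, best_d = None, max_dist
        for i, lid in enumerate(lidar_dets):
            if i in used or lid.label != cam.label:
                continue
            d = ((cam.x - lid.x) ** 2 + (cam.y - lid.y) ** 2) ** 0.5
            if d < best_d:
                best, best_d = i, d
        if best is not None:
            used.add(best)
            lid = lidar_dets[best]
            fused.append(Detection(
                x=(cam.x + lid.x) / 2,  # a real system would weight by each
                y=(cam.y + lid.y) / 2,  # sensor's noise model, not average
                label=cam.label,
                confidence=1 - (1 - cam.confidence) * (1 - lid.confidence),
            ))
        else:
            fused.append(cam)
    fused.extend(d for i, d in enumerate(lidar_dets) if i not in used)
    return fused

# A car seen by both sensors, a pedestrian seen only by lidar.
camera = [Detection(20.1, 3.2, "car", 0.80)]
lidar = [Detection(19.8, 3.0, "car", 0.70), Detection(8.5, -1.0, "pedestrian", 0.60)]
print(fuse_detections(camera, lidar))
```

Notice how the fused car detection ends up with a higher confidence than either sensor reported on its own, which is exactly the benefit of combining complementary sensors.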

Image source: Udacity’s Self-Driving Fundamentals featuring Apollo

There are four core tasks a self-driving software stack performs to perceive the world around it:

  1. Detection to recognize and figure out where an object…
