Unlocking the Potential: How Computer Vision Is Revolutionizing Autonomous Vehicles
What is Computer Vision?
Computer Vision vs. Human Eyes
Although computer vision and human eyes can both recognize and analyze a picture or video, the two differ enormously in how they do it. That said, they do share some similarities: both process what they see based on previous knowledge, and both need light to do so.
The key difference is that humans use their eyes to see and their brain to process, whereas computer vision uses cameras or lenses to see and machine learning to process.
- Camera vs. Human Eyes: When a human looks at something, only a small central area is in sharp focus, and objects farther from that focal point grow progressively blurrier, whereas for a camera every pixel is in the same focus.
- Human Brain vs. Machine Learning: The human brain's processing is complicated: we perform identification, recognition, imagination, and much more. ML, by contrast, focuses only on recognition, processing images through a model that has undergone training.
How Does Computer Vision (CV) Work?
In the same way that AI allows computers to think, CV allows computers to see. Computer vision is a multidisciplinary field that involves the development of algorithms and techniques to enable computers to extract, analyze, and understand information from digital images or videos in a manner similar to human vision. While the specifics can be complex, the basic process of how computer vision works can be broken down into several key steps:
- Acquisition: The process begins with the acquisition of images or videos using various devices such as cameras, sensors, or other imaging equipment. These devices capture visual data from the real world, which is then converted into digital signals that can be processed by a computer.
- Preprocessing: Once the images are acquired, they often undergo preprocessing steps to enhance the quality of the data. This can include tasks such as noise reduction, image enhancement, and normalization, which aim to improve the accuracy and reliability of subsequent analysis.
- Analysis and Understanding: The core of computer vision is the analysis and understanding of visual data. This involves applying algorithms that detect patterns, objects, and features within the images. Techniques such as edge detection, feature extraction, and pattern recognition are employed to identify and categorize the different elements present in an image.
- Feature Detection and Representation: Computer vision algorithms extract meaningful features from the images, which can include edges, corners, textures, shapes, and colors.
- Image Classification and Recognition: Once the features are detected and represented, computer vision systems use machine learning algorithms, such as neural networks or support vector machines, to classify and recognize objects or patterns within the images.
- Interpretation and Decision Making: After the objects are recognized, computer vision systems can interpret the context of the visual data and make decisions based on the identified information.
- Output and Action: The final step involves generating an output based on the analysis and interpretation of the visual data. This can range from generating visual representations or annotations to triggering specific actions or responses, such as autonomous vehicle navigation, controlling robotic movements, or providing insights for decision-making in various applications.
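The steps above can be sketched in miniature. The toy pipeline below runs a synthetic 8×8 "frame" through acquisition, preprocessing, a crude gradient-based feature detector, and a trivial rule-based classifier. Every value and threshold here is invented for illustration; a real system would use a camera feed, a library such as OpenCV, and a trained model instead:

```python
import numpy as np

def acquire() -> np.ndarray:
    # Acquisition: stand in for a camera frame with a synthetic 8x8 image
    # containing a bright square on a dark background.
    img = np.zeros((8, 8), dtype=float)
    img[2:6, 2:6] = 1.0
    return img

def preprocess(img: np.ndarray) -> np.ndarray:
    # Preprocessing: normalize pixel values to the range [0, 1].
    rng = img.max() - img.min()
    return (img - img.min()) / rng if rng > 0 else img

def detect_edges(img: np.ndarray) -> np.ndarray:
    # Feature detection: horizontal/vertical gradients as a crude edge map.
    gx = np.abs(np.diff(img, axis=1, prepend=0))
    gy = np.abs(np.diff(img, axis=0, prepend=0))
    return np.maximum(gx, gy)

def classify(edges: np.ndarray) -> str:
    # Classification: a trivial hand-written rule standing in for a trained model.
    return "object present" if edges.sum() > 4 else "empty scene"

frame = acquire()
label = classify(detect_edges(preprocess(frame)))
print(label)  # → object present
```

The same acquire → preprocess → detect → classify shape reappears, at vastly greater scale, in the autonomous-vehicle pipeline discussed below.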
Through those steps and the continuous refinement of algorithms and techniques, computer vision systems can achieve a level of understanding and analysis of visual data that enables them to perform tasks ranging from simple image recognition to complex real-time decision-making in various practical applications. Notably, this advancement has resulted in the crucial application of computer vision in the realm of autonomous vehicles.
Autonomous Vehicles
Autonomous vehicles, also known as self-driving cars, rely heavily on CV, which gives them several advantages over human drivers. An autonomous system never gets drunk, tired, or distracted, and its combination of advanced machine learning, sensor fusion, and computer vision enables it to make quicker and better-informed decisions than a human can.
While the functioning and processes of autonomous vehicles are complex, they can be categorized into three key components: sensing, planning, and acting. This system parallels the way humans process information, yet the approach employed by autonomous vehicles markedly differs from human cognitive processes.
Prior to deployment, it is essential to establish predefined rules for the autonomous vehicle (AV) and provide comprehensive training using a dataset with minimal bias. This ensures that the autonomous vehicle is well-informed about the permissible actions on the road, guaranteeing the safety of all individuals involved.
Sensing
The initial step is sensing. The sensing system is responsible for creating a detailed and accurate representation of the vehicle’s immediate surroundings. Sensing is the foundation on which all the other steps are built, which makes it vital. This process requires sensors, mapping and localization technologies, and computing hardware.
- Sensors: The vehicle uses a comprehensive array of sensors, including cameras, LiDAR, and ultrasonic sensors, to gather real-time data about its surroundings. These sensors provide essential information about obstacles, road conditions, and other vehicles, enabling the autonomous vehicle to make informed decisions.
- Mapping and Localization Technology: High-precision mapping and localization technologies, including GPS, HD mapping, and SLAM, are essential for accurately determining the vehicle’s position and enabling precise navigation in complex urban environments.
- Computing Hardware: Powerful onboard computing systems equipped with advanced processors and high-performance GPUs (Graphics Processing Units) are essential for processing the vast amount of data collected by the vehicle’s sensors.
This part of the AV pipeline relies on the first five steps of CV: acquisition, preprocessing, analysis and understanding, feature detection and representation, and image classification and recognition.
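To illustrate how readings from multiple sensors might be combined, here is a minimal sensor-fusion sketch: two hypothetical distance estimates for the same obstacle, one from a camera and one from LiDAR, are merged by inverse-variance weighting so the less noisy sensor dominates. The readings and variances are made-up numbers, and production systems use far richer techniques (e.g. Kalman filtering); this only shows the basic idea:

```python
def fuse(estimates, variances):
    # Inverse-variance weighting: each sensor's estimate is weighted by
    # 1/variance, so the sensor with less noise contributes more.
    weights = [1.0 / v for v in variances]
    total = sum(w * x for w, x in zip(weights, estimates))
    return total / sum(weights)

# Hypothetical readings for one obstacle: the camera is noisier than the LiDAR.
camera_m, lidar_m = 21.0, 20.2      # distance estimates in meters
camera_var, lidar_var = 4.0, 0.25   # assumed noise variances

fused = fuse([camera_m, lidar_m], [camera_var, lidar_var])
print(round(fused, 2))  # → 20.25
```

Note that the fused estimate lands much closer to the LiDAR reading, reflecting its smaller assumed variance.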
Planning
The planning phase uses the data collected by the sensing system to formulate an appropriate course of action. The planning system must account for various factors such as traffic conditions, road rules, and potential hazards. This stage requires software and algorithms, as well as connectivity.
- Software and Algorithms: Robust software and algorithms are fundamental for interpreting sensor data and making navigation decisions. These components include perception, mapping, path planning, and decision-making algorithms that enable the vehicle to navigate and operate autonomously.
- Connectivity: AVs often require a reliable network connection to access real-time traffic data, and update navigation maps. Connectivity allows the vehicle to stay updated with the latest information, enhancing its navigation capabilities and ensuring better decision-making.
The CV step required at this stage is interpretation and decision-making.
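As a simplified sketch of what a path-planning algorithm does with the sensed map, the example below runs breadth-first search over a small occupancy grid. Real planners reason over continuous space, vehicle dynamics, and traffic rules; the grid and obstacle layout here are purely illustrative:

```python
from collections import deque

def plan_path(grid, start, goal):
    # Breadth-first search over a 2D occupancy grid (0 = free, 1 = blocked).
    # Returns the shortest list of (row, col) cells from start to goal,
    # or None if the goal is unreachable.
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:      # walk parents back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None

# A tiny map: the 1s mark an obstacle the planner must route around.
grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
path = plan_path(grid, (0, 0), (3, 3))
print(path)
```

BFS returns a shortest route (7 cells here), detouring around the blocked region rather than driving through it.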
Acting
The acting phase is the execution of the planned actions based on the decisions made during the planning stage. This includes controlling the vehicle’s acceleration, braking, and steering mechanisms. The acting system ensures that the vehicle adheres to the planned trajectory, adjusting its speed according to the surrounding traffic conditions and responding to any environmental obstacles or changes. This phase is crucial for ensuring the safe and efficient operation of the autonomous vehicle during its journey, and it corresponds to the output and action phase of computer vision.
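A minimal sketch of the acting loop, assuming a simple proportional speed controller: the throttle command is proportional to the gap between the planned speed and the measured speed. Real vehicles use full PID controllers plus steering and braking logic, and the gains, plant model, and numbers below are invented for illustration:

```python
def act(target_speed, speed, kp=2.0, dt=0.1, steps=40):
    # Proportional control: throttle proportional to the speed error.
    # A toy stand-in for the acting phase; real controllers add integral
    # and derivative terms (PID) and actuate steering and brakes too.
    for _ in range(steps):
        error = target_speed - speed    # planner's target vs. measured state
        throttle = kp * error           # control command
        speed += throttle * dt          # toy plant: speed responds directly
    return speed

# Hypothetical scenario: cruise from 8 m/s up to a planned 10 m/s
# with a 10 Hz control loop.
final = act(10.0, 8.0)
print(round(final, 2))  # → 10.0
```

Each iteration shrinks the error by a fixed fraction, so the speed converges smoothly onto the planned value instead of jumping to it.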
The Next Step
Despite the promising advancements in artificial intelligence and computer vision, several key hurdles remain. Challenges that still need to be overcome include:
- Data gathering and labelling: An AI model’s performance depends on the dataset it was trained on. Given that, high-quality datasets and pixel-perfect labelling are critically important for the model.
- Object detection and tracking: Convolutional neural networks (CNNs) are used to detect and track objects, but they struggle with images containing multiple objects, as the model may fail to capture all of them.
- Environmental factors and road conditions: Despite CV having LiDAR and 3D maps as real-time object detectors, the potential obstructions of traffic lights or signs, such as dirt, shadows caused by trees, or alterations, can potentially disrupt the accuracy of the vehicle’s technology.
- Stereo vision and depth estimation: Using multiple cameras heightens the complexity of the depth estimation system, creates challenges in camera arrangement, and can produce mismatched representations and distorted perspectives.
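The last point can be made concrete with the standard stereo triangulation formula for a rectified camera pair, Z = f·B/d: depth equals focal length times camera baseline divided by pixel disparity. The values below are assumptions chosen for illustration. Notice that because disparity appears in the denominator, distant objects (small disparity) make depth extremely sensitive to small pixel errors, which is one reason depth estimation is hard:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    # Triangulated depth for a rectified stereo pair: Z = f * B / d.
    # focal_px: focal length in pixels; baseline_m: distance between the
    # two cameras in meters; disparity_px: horizontal pixel shift of the
    # same point between the left and right images.
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A point shifted 35 px between views, 700 px focal length, 0.5 m baseline:
z = depth_from_disparity(700.0, 0.5, 35.0)
print(z)  # → 10.0 (meters)
```

With the same setup, a one-pixel disparity error at this range already shifts the estimate by roughly a quarter meter, and the error grows quadratically with distance.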
In the realm of advanced technology, computer vision has emerged as an innovative force, reshaping industries and paving the way for greater efficiency and understanding. Built on cameras and machine learning algorithms, it has made remarkable strides, yet there is still room for improvement, underscoring the need for ongoing research and development in the domain.
Hi! I’m Dylan, an enthusiastic and curious high school student who is passionate about using cutting-edge technologies to create a positive impact. Whether you have suggestions, questions, critiques, or simply wish to have a conversation, please feel free to reach out to me via email at dy.dylanyang@gmail.com. Thank you for taking the time to read my message, and I sincerely hope you found it informative and engaging.
