Navigating the Future: The Role of Prediction Software in Autonomous Driving

5 min read · Jan 17, 2024

Prediction: A Pivotal Challenge in Autonomous Driving

As autonomous driving technology strides forward, one of the most complex challenges limiting the scalability of the technology remains the development of robust prediction algorithms. These algorithms are responsible for anticipating the future movements of other road users, including vehicles, cyclists, and pedestrians, which is essential for safe and efficient navigation. Experts in the field have ranked prediction as one of the most critical components for autonomous driving.

The Complex Nature of Prediction in Autonomous Driving

The real world is overwhelmingly complex, and road users can execute many possible behaviors, including corner cases such as traffic-rule violations, so finding a scalable approach to behavior prediction is a major challenge. In addition, agents influence one another, which multiplies the number of possible future scenarios.

To effectively predict the behaviors of various road users, autonomous systems must consider a range of factors, from immediate physical attributes such as an agent’s position, velocity, and acceleration capability to more complex aspects such as historical behavior patterns, surrounding scenarios, and map features.

Approaches to Model and Predict Human Behaviour

The complex nature of the problem requires a combination of machine learning and domain knowledge. Below, I provide an overview of the existing strategies.

1. Physical Rules and Object Attributes

The simplest way to represent the immediate future actions of road users surrounding an autonomous vehicle is to consider physical rules along with the type and attributes of the object. For example, some robotics systems assume that a surrounding object can accelerate or decelerate at the highest possible rate at every point in time, which yields the set of end states the object can reach.
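To make the worst-case assumption concrete, here is a minimal one-dimensional sketch. The `Agent1D` class, its fields, and the acceleration limits are hypothetical names and values chosen for illustration; a real system would reason in two dimensions and use limits specific to each object type.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Agent1D:
    """Hypothetical minimal agent state along its direction of travel."""
    position: float   # m
    speed: float      # m/s, assumed non-negative
    max_accel: float  # m/s^2, assumed upper bound on acceleration
    max_decel: float  # m/s^2, assumed upper bound on braking (positive number)


def reachable_interval(agent: Agent1D, horizon: float) -> Tuple[float, float]:
    """Return the (nearest, farthest) position reachable after `horizon` seconds."""
    # Farthest point: accelerate at the maximum rate for the whole horizon.
    farthest = (
        agent.position
        + agent.speed * horizon
        + 0.5 * agent.max_accel * horizon ** 2
    )

    # Nearest point: brake at the maximum rate, but never drive backwards.
    time_to_stop = agent.speed / agent.max_decel
    braking_time = min(horizon, time_to_stop)
    nearest = (
        agent.position
        + agent.speed * braking_time
        - 0.5 * agent.max_decel * braking_time ** 2
    )
    return nearest, farthest


car = Agent1D(position=0.0, speed=10.0, max_accel=2.0, max_decel=5.0)
nearest, farthest = reachable_interval(car, horizon=3.0)
```

Everything between `nearest` and `farthest` is treated as a possible end state, which is safe but conservative: the interval grows quadratically with the horizon.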

2. Advanced Traffic Simulators

Some traffic simulators go a step further by adding rule-based behavioral models that describe how a certain agent type would act in response to the environment and other agents, such as car-following models and lane-changing behavior. Such approaches usually require fine-tuning the parameters for every use case, limiting their generalizability.
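As an illustration of a rule-based car-following model, the sketch below uses the Intelligent Driver Model (IDM), one well-known example of this family; the article does not name a specific model, so this choice is mine. The default parameter values are illustrative assumptions, and they are exactly the kind of numbers that need retuning for each use case.

```python
import math


def idm_acceleration(
    speed: float,                 # follower speed, m/s
    lead_speed: float,            # leader speed, m/s
    gap: float,                   # bumper-to-bumper distance to the leader, m (> 0)
    desired_speed: float = 30.0,  # m/s, assumed
    time_headway: float = 1.5,    # s, assumed
    min_gap: float = 2.0,         # m, assumed standstill distance
    max_accel: float = 1.0,       # m/s^2, assumed
    comfort_decel: float = 2.0,   # m/s^2, assumed
    exponent: float = 4.0,
) -> float:
    """Acceleration of the following vehicle according to the IDM."""
    closing_speed = speed - lead_speed
    # Gap the driver would like to keep at the current speed and closing rate.
    desired_gap = min_gap + max(
        0.0,
        speed * time_headway
        + speed * closing_speed / (2.0 * math.sqrt(max_accel * comfort_decel)),
    )
    free_road_term = (speed / desired_speed) ** exponent
    interaction_term = (desired_gap / gap) ** 2
    return max_accel * (1.0 - free_road_term - interaction_term)


# Follower at 20 m/s closing in on a leader at 10 m/s only 10 m ahead.
braking = idm_acceleration(speed=20.0, lead_speed=10.0, gap=10.0)
```

With a small gap and a high closing speed the interaction term dominates and the model commands strong braking; on an empty road the acceleration tapers to zero as the vehicle approaches `desired_speed`.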

3. Machine Learning Models for Prediction

Similar to other components of the autonomous driving stack, data-based approaches have made significant progress over the last few years.

There are two major classes of models used for the prediction:

  • Memoryless models: Information about previous states must be fed into the model explicitly, in the form of features.
  • Memory models such as RNNs and LSTMs: These models take only the current state as input for the prediction while “remembering” the historical information internally.
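The difference between the two classes lies in where the history lives. The toy sketch below contrasts them: `memoryless_features` packs past observations into the input, while `RunningMemory` is a deliberately simplified stand-in for a recurrent cell (an exponential moving average, not a real RNN or LSTM). All names here are hypothetical.

```python
from typing import List, Sequence, Tuple

Observation = Tuple[float, float]  # hypothetical (vx, vy) velocity sample in m/s


def memoryless_features(history: Sequence[Observation], window: int) -> List[float]:
    """Flatten the last `window` observations into one fixed-size feature vector."""
    if not history or window < 1:
        raise ValueError("need at least one observation and a positive window")
    recent = list(history[-window:])
    # Short tracks are padded by repeating the oldest observation we have.
    while len(recent) < window:
        recent.insert(0, recent[0])
    return [value for observation in recent for value in observation]


class RunningMemory:
    """Toy stand-in for a recurrent cell: history lives in `hidden`, not in the input."""

    def __init__(self, decay: float = 0.5) -> None:
        self.decay = decay
        self.hidden: Observation = (0.0, 0.0)

    def step(self, current: Observation) -> Observation:
        """Update the hidden state from the current observation only."""
        self.hidden = (
            self.decay * self.hidden[0] + (1.0 - self.decay) * current[0],
            self.decay * self.hidden[1] + (1.0 - self.decay) * current[1],
        )
        return self.hidden


track = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
features = memoryless_features(track, window=2)  # history passed explicitly as features

memory = RunningMemory(decay=0.5)
for observation in track:  # one observation at a time; the cell keeps the rest
    summary = memory.step(observation)
```

In both cases a downstream predictor would consume the result (`features` or `summary`); the memoryless variant has a hard cut-off at `window` samples, whereas the memory variant lets old observations fade gradually.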

Traffic prediction using machine learning models has been approached as a two-step problem:

Step 1: Classification problem for determining spatial goals and corresponding categorical behaviors (like lane changes or turns). This step requires structuring the environment through, for instance, the introduction of line segments.

Step 2: Regression problem for generating future trajectories to reach previously determined goals.
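The two steps can be sketched as follows. Both functions are hand-written placeholders standing in for learned models, and the goal names and coordinates are made up: `classify_goal` scores candidate goals by how close they are to a constant-velocity extrapolation, and `regress_trajectory` simply interpolates toward the chosen goal.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify_goal(
    position: Point, velocity: Point, goals: Dict[str, Point], horizon: float
) -> Dict[str, float]:
    """Step 1: assign a probability to every candidate goal."""
    projected = (
        position[0] + velocity[0] * horizon,
        position[1] + velocity[1] * horizon,
    )
    names = list(goals)
    # Placeholder score: goals near the constant-velocity projection rank higher.
    scores = [-math.dist(projected, goals[name]) for name in names]
    return dict(zip(names, softmax(scores)))


def regress_trajectory(position: Point, goal: Point, steps: int) -> List[Point]:
    """Step 2: generate waypoints that end at the selected goal."""
    return [
        (
            position[0] + (goal[0] - position[0]) * k / steps,
            position[1] + (goal[1] - position[1]) * k / steps,
        )
        for k in range(1, steps + 1)
    ]


# Hypothetical goals placed on two lane centerlines 30 m ahead.
goals = {"keep_lane": (30.0, 0.0), "change_left": (30.0, 3.5)}
probabilities = classify_goal((0.0, 0.0), (10.0, 0.0), goals, horizon=3.0)
best_goal = max(probabilities, key=probabilities.get)
trajectory = regress_trajectory((0.0, 0.0), goals[best_goal], steps=3)
```

Note how Step 1 depends on a discrete set of goals being available, which is why the environment has to be structured first.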

In more recent works, some approaches adaptively combine the two steps into one using direct-regression methods.

4. Estimating the Behaviour Based on Pose

Another way to enhance the realism of trajectory prediction of pedestrians is to reason about their future actions based on their pose. A recent study by researchers from TUM and DeepMind proposed a method that can accurately predict the next move based only on one picture of a human, which makes immediate predictions after occlusions possible.

More Realistic Modelling Of Human Behaviour Through Generative AI And Transformer Architecture

The latest advancements in autonomous driving prediction algorithms have seen a significant shift towards leveraging the transformer architecture. The transformer is a breakthrough in machine learning originally developed for sequence-to-sequence problems like text generation, leading to generative AI products such as ChatGPT.

  • Enhancement of individual agent’s behaviour prediction

The prediction of an individual agent’s behavior can be improved with more realistic and efficient models based on transformer architectures. Such models make it possible to treat motion prediction as a combined process of global intention localization and local movement refinement, merging the two steps described earlier into one and improving both efficiency and accuracy.

Another way to leverage generative AI is to model attributes of a human’s planning process, such as anticipation. In addition, domain knowledge about human behavior can be incorporated into the models by approximating concepts rooted in neuroscience, such as gaze direction or emotions.

I believe generative AI will further accelerate the development of human behavior prediction and context understanding technologies, leading to safer and more scalable autonomous driving. See another of my articles on this topic.

  • Joint prediction using transformer architectures

In addition to the advantages of a longer memory window and better handling of prediction on an individual level, transformers make it possible to structure the behavior prediction problem as a “conversation” between road users, enabling a joint prediction that accounts for interactions between agents.

This method, detailed in studies by NVIDIA and Waymo researchers, can analyze historical data, traffic conditions, and map features, providing a comprehensive understanding of potential future movements. The transformer models excel in capturing the interactions between road users, allowing for modeling more realistic multi-agent traffic scenarios.
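The core mechanism behind this “conversation” is attention. The sketch below is a stripped-down illustration, not the architecture from the cited studies: it applies scaled dot-product self-attention across agents without the learned query, key, and value projections a real transformer would have. The feature vectors are made-up numbers.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Turn arbitrary scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def agent_attention(features: List[Vector]) -> List[List[float]]:
    """Row i holds how strongly agent i attends to every agent, itself included."""
    dim = len(features[0])
    weights = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights.append(softmax(scores))
    return weights


def mix_context(features: List[Vector], weights: List[List[float]]) -> List[Vector]:
    """Replace each agent's features with an attention-weighted sum over all agents."""
    dim = len(features[0])
    return [
        [sum(w * other[d] for w, other in zip(row, features)) for d in range(dim)]
        for row in weights
    ]


# Hypothetical per-agent embeddings: agents 0 and 1 behave alike, agent 2 differs.
agents = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = agent_attention(agents)
context = mix_context(agents, weights)
```

After mixing, every agent’s representation contains information about the others, so a prediction head that reads `context` is conditioned on the whole scene rather than on one agent in isolation.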

Advantages and Challenges of Machine Learning Models

The advantage of a machine learning-based model in behavior prediction and trajectory generation is that it can learn from actual historical trajectories and aim to replicate them. Provided there is enough diverse data to train on, such a model can outperform rule-based models.

The challenge of a data-based approach is to acquire enough data to train the models capable of generalizing to all corner cases. Generative AI can help by either producing adverse synthetic training data or by enabling more intelligent behavior models through novel architectures and unlocking additional data sources and types.

Conclusion

Humans’ ability to seamlessly anticipate other humans’ actions and negotiate access to the free space on the road is a key skill for safe and smooth driving. It is based not only on learning the driving task but also on knowledge acquired in other situations.

Similarly, autonomous vehicles must learn to navigate complex situations and understand humans to become truly scalable. This ability will be developed by leveraging different technologies and diverse data sources, combining the latest advancements in machine learning with an understanding of humans’ perception, world representation, and resulting actions.

The evolution of prediction software, from simple rule-based algorithms to sophisticated transformer models, reflects both the growing difficulty of the prediction problem and the importance of solving it.

As autonomous vehicles enter the phase of broader adoption, scaling, and deployment in environments with people, the need for advanced prediction algorithms will only increase, making this field a crucial area of research and development in the journey toward fully autonomous transportation systems.

Thanks to Ahsan Ahmed and Shane Wade for reading this article and providing feedback.

References

A. Dinparastdjadid et al., “Measuring Surprise in the Wild”, 2023;

B. Varadarajan et al., “Multipath++: Efficient Information Fusion And Trajectory Aggregation For Behavior Prediction”, 2021;

J. Ngiam et al., “Scene transformer: a unified architecture for predicting multiple agent trajectories”, 2022;

J. Philion et al., “Trajeglish: Learning the Language of Driving Scenarios”, 2023;

S. Liu et al., “Creating Autonomous Vehicle Systems, Second Edition”, 2022;

S. Shi et al., “Motion Transformer with Global Intention Localization and Local Movement Refinement”, 2023;

T. Salzmann et al., “Robots That Can See: Leveraging Human Pose for Trajectory Prediction”, 2023.
