PRANK: Motion Prediction based on RANKing

Yandex Self-Driving Team
Yandex Self-Driving Group
Dec 17, 2020

Predicting the trajectories of other moving objects in traffic is crucial for safe and comfortable driving: the safety and comfort of a passenger in a driverless car depend on how well the car solves the motion prediction problem. What makes this problem so challenging is that the range of potential motion paths for any given object is practically unlimited. This variability comes, on the one hand, from uncertainty about the object’s intent: will it turn right or left, or continue moving along the same path? On the other hand, it is compounded by uncertainty in how that intent will materialize: if the object turns right, which lane will it choose?

The motion prediction problem in self-driving is generally approached from two angles: predicting the intent of an object, or predicting its future trajectory. Intent-based approaches provide useful information to the motion planning system of a self-driving vehicle, but they don’t specify exactly which trajectories the object might take. That’s why intent-based approaches are often either dropped entirely in favor of trajectory prediction methods or used in combination with them.

Motion trajectory prediction methods can predict either the most likely future trajectories of an object, or a probability distribution over possible object trajectories, or a set of distributions over object location at each point in time. While these methods can potentially capture the complex nature of future motion prediction, the cost of sophisticated training and inference procedures involved in the process is very high. Conversely, a relatively simple generative model is good at capturing frequently occurring motion patterns, such as forward movement or smooth turns, but it does not perform so well in complex scenarios.

Yandex’s Self-Driving team aims to devise a computationally efficient method that would be able to overcome these limitations and predict sophisticated maneuvers at a low cost.

Main idea: a scoring approach

One of the approaches to future trajectory prediction we have developed is called PRANK, which stands for motion Prediction based on RANKing. We’ve published a paper summarizing our method at NeurIPS 2020. The key idea is that, rather than having a neural network synthesize predictions from scratch, we can select them from a very large set of motion trajectories that have been observed by the perception system running on our fleet. This approach has several advantages over traditional generative modeling:

  • The trajectories we predict have actually been followed by some real cars, so our predictions tend to be physically plausible;
  • Our trajectory set contains a lot of complicated maneuvers, which can get picked by the model;
  • It is generally accepted that grading a solution is often computationally simpler than generating one, so our neural network might be solving an easier problem and, thus, can train better.
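To make the scoring idea concrete, here is a minimal sketch in Python. It assumes that some trained network has already assigned a score to every trajectory in the bank; the bank labels, the score values, and the function names are all hypothetical and only illustrate ranking and normalization, not the actual PRANK model.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rank_trajectories(bank, scores, top_k=2):
    """Return the top_k (name, probability) pairs, most likely first."""
    probs = softmax(scores)
    ranked = sorted(zip(bank, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]

# Hypothetical trajectory bank: in practice this would hold recorded
# trajectories (sequences of waypoints), not just labels.
bank = ["go_straight", "left_turn", "u_turn", "right_turn"]

# Made-up scores standing in for the output of a trained scoring network.
scores = [0.2, 2.1, 1.9, -1.0]

top = rank_trajectories(bank, scores)
print(top)
```

Because the model only has to grade existing candidates, every returned trajectory is one that was actually driven, and two closely scored candidates (here a left turn and a U-turn) naturally show up as competing modes.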

A few qualitative examples of PRANK’s performance are shown in the images below. In these images, we draw trajectories from the trajectory set with transparency representing their posterior probabilities. Red and blue mark different modes according to our model, and the green line shows the ground truth trajectory. PRANK successfully captures multimodal posterior distributions, where the modes arise from uncertainty in the desired direction of movement or in the planned speed along that direction.

Left: uncertainty between a possible U-turn and a left turn, with a lot of variance in how the maneuver is going to be implemented. Right: staying in a slow lane vs changing to a lane with more space.
Left: yielding to a pedestrian vs trying to go first. Middle: yielding to a car vs going first. Right: stopping on yellow vs trying to pass before the red.

As for computational efficiency, we structure our neural network in a way, borrowed from computer vision and NLP, that lets us pick the right trajectory out of millions of candidates in a matter of milliseconds using approximate nearest neighbor search.
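The post does not spell out the architecture, so the sketch below rests on an assumption: that the scene and each candidate trajectory are mapped to vectors in a shared space and that the score is their inner product. Under that assumption, trajectory vectors can presumably be computed ahead of time, and prediction reduces to a nearest neighbor lookup. The code uses an exact brute-force search as a stand-in for an approximate index, and all vectors and names are hypothetical.

```python
import heapq
import math

def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k_by_inner_product(scene_embedding, trajectory_embeddings, k=2):
    """Return the k best (score, index) pairs, highest score first.

    This is an exact O(N) scan. A production system would replace it
    with an approximate nearest neighbor index over the same vectors.
    """
    query = normalize(scene_embedding)
    scored = (
        (dot(query, normalize(t)), i)
        for i, t in enumerate(trajectory_embeddings)
    )
    return heapq.nlargest(k, scored)

# Hypothetical precomputed embeddings, one per trajectory in the bank.
trajectory_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]

# Hypothetical embedding of the current scene around one object.
scene_embedding = [0.9, 0.1, 0.0]

best = top_k_by_inner_product(scene_embedding, trajectory_embeddings, k=2)
print(best)
```

The design point is that only the scene embedding has to be computed at inference time; the expensive part of the search is a lookup over vectors that do not change between frames.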

PRANK can produce predictions for 50 objects at a rate of 10 Hz, with most of the time spent on PRANK-agnostic components such as scene encoding and feature rasterization.

To compare the performance of PRANK with other methods, we evaluated it on the public Argoverse challenge created by our colleagues at Argo AI. At the time of the research paper’s publication, our method ranked in the top 3 out of a hundred entries on this dataset in terms of displacement error metrics. This is despite using a rather simple rasterized representation of the scene, in contrast to other top-scoring entries that use a more sophisticated vector scene description.
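For readers unfamiliar with displacement error metrics, the sketch below computes the two most common ones: average displacement error (ADE), the mean Euclidean distance between predicted and true positions over all timesteps, and final displacement error (FDE), the distance at the last timestep. The best-of-K helper follows a common convention for multimodal predictors; the exact protocol used by the challenge may differ, and the trajectories here are made up.

```python
import math

def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) points."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(dists) / len(dists), dists[-1]

def best_of_k(predictions, ground_truth):
    """Return (min ADE, min FDE) over several candidate trajectories."""
    errors = [displacement_errors(p, ground_truth) for p in predictions]
    return min(a for a, _ in errors), min(f for _, f in errors)

# Made-up example: the object actually moved diagonally.
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
diagonal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

ade, fde = displacement_errors(straight, ground_truth)
min_ade, min_fde = best_of_k([straight, diagonal], ground_truth)
print(ade, fde, min_ade, min_fde)
```

Taking the minimum over several candidates rewards a model for covering the true outcome with at least one of its modes, which suits a method that outputs a ranked set of trajectories.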

Like any technology, PRANK has room for improvement, and we keep making its predictions more precise and robust while exploring more sophisticated scene representation methods and inference procedures. In the meantime, our fleet of over a hundred cars is always on the road, gathering new trajectories for PRANK 24/7 in the highly dense traffic of Moscow.

Prediction has proven to be an essential part of the self-driving pipeline for any type of vehicle, be it a car or a delivery bot, that shares the road with people. PRANK has already been implemented in our vehicles, helping them safely navigate challenging urban traffic and provide smooth and pleasant rides with no one behind the wheel for our robotaxi clients in Innopolis. We believe that our motion prediction method will contribute to a safe and comfortable transportation system in each of our current testing locations and beyond.

Written and published by Yulia Shveyko
