Paper Review: Reciprocal Velocity Obstacles for Real-Time Multi-Agent Navigation

Suraj Bonagiri
Oct 18, 2018 · 12 min read


(Left) Results from the original paper; (Right) Replicated results. Python code for modelling any number of agents is in this repo, and here are a few results.

This blog talks about my experience in understanding and implementing the paper: Reciprocal Velocity Obstacles for Real-Time Multi-Agent Navigation. The research was conducted by Jur van den Berg, Ming Lin, and Dinesh Manocha at the University of North Carolina at Chapel Hill in 2008.

This paper talks about multiple agents navigating an environment by avoiding local collisions without any explicit communication. Though there were similar approaches earlier, this approach is decentralized and also expects other agents to reciprocate (share the duty of avoiding a collision). It also considers the dynamic nature of the other agents in the environment.

Before we dive into understanding the Reciprocal Velocity Obstacle (RVO) concept, there are a few prerequisites:

P1. The concept of Velocity Obstacle (VO):

We will see that RVO is an extension of VO with a few added advantages. Formally,

The velocity obstacle of an obstacle B to an agent A is the set of all velocities of agent A that will result in a collision with obstacle B at some moment in time, given that B moves with a particular velocity.

It means every agent in the given environment should know which velocities will result in a collision with the other agents/obstacles. By avoiding those velocities, the agent can successfully navigate and reach its goal.

Let’s begin with a very simplified scenario: Consider 2 agents A and B to be point objects with (position, velocity) as shown.

An example of 2 point objects navigating.

By looking at the image, we can say that if agent A continues with the same velocity, it will collide with agent B. That means,
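$$\mathbf{v}_A \in VO^A_B(\mathbf{v}_B)$$

where $VO^A_B(\mathbf{v}_B)$ denotes the velocity obstacle of B to agent A when B moves with velocity $\mathbf{v}_B$.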

For any velocity outside the VO, agent A won’t collide with agent B. And by symmetry, for a pair of agents, being on a collision course is mutual, i.e.,
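$$\mathbf{v}_A \in VO^A_B(\mathbf{v}_B) \iff \mathbf{v}_B \in VO^B_A(\mathbf{v}_A)$$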

Now, let’s scale it up a notch by considering agent B to be a disc while agent A is still a point. The VO now becomes a whole range of velocities, as shown.

An example of 1 point object and 1 non-point object.

The boundaries of this range are the velocities whose trajectories are tangent to the disc. This is why we get a cone of velocities, and every velocity inside this cone belongs to the VO.
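To make the cone concrete, here is a minimal sketch (my own illustration, not code from the paper or the linked repo) that checks whether a candidate velocity of a point agent A falls inside the VO induced by a disc-shaped obstacle B:

import numpy as np

def in_velocity_obstacle(p_A, v_A, p_B, v_B, r):
    """True if the candidate velocity v_A of point agent A lies inside the
    VO induced by the disc obstacle B (radius r) moving with velocity v_B."""
    p_rel = np.asarray(p_B, float) - np.asarray(p_A, float)  # from A towards B
    v_rel = np.asarray(v_A, float) - np.asarray(v_B, float)  # relative velocity
    dist = np.linalg.norm(p_rel)
    if dist <= r:                      # already overlapping
        return True
    speed = np.linalg.norm(v_rel)
    if speed < 1e-9:                   # no relative motion, no future collision
        return False
    half_angle = np.arcsin(r / dist)   # half-opening angle of the collision cone
    cos_angle = np.dot(v_rel, p_rel) / (speed * dist)
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return angle <= half_angle

# A at the origin moving right at 1 m/s; B standing still 2 m ahead, radius 0.5 m.
print(in_velocity_obstacle([0, 0], [1, 0], [2, 0], [0, 0], r=0.5))   # True

The half-opening angle of the cone is arcsin(r / ‖p_B − p_A‖): any relative velocity within that angle of the line of centres leads to a collision.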

Before we define a mathematical model for real-life situations (where the physical dimensions of all agents are non-zero), we should understand what the Minkowski sum is and its significance in navigation and path planning.

P2. Minkowski Sum:

In geometry, the Minkowski sum (also known as dilation) of two sets of position vectors A and B in Euclidean space is formed by adding each vector in A to each vector in B.

Let’s take an example and understand what’s going on. Consider 2 sets of position vectors Q1 and Q2 as shown in the picture.

The Minkowski sum of these 2 sets of vectors, centered at Q2, can be defined as:
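$$Q_1 \oplus Q_2 = \{\, q_1 + q_2 \mid q_1 \in Q_1,\ q_2 \in Q_2 \,\}$$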

i.e., we take the vector sum of all the points in the green square with all the points in the red square. The result would be the blue square. Informally, it can be seen as sliding the corner of the green square along the edges of the red square.
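As a quick numerical sanity check (my own sketch, not from the repo), the same operation can be done by brute force on two finite point sets:

import numpy as np

def minkowski_sum(P, Q):
    """Brute-force Minkowski sum of two finite 2-D point sets (n x 2 arrays):
    every point of P is added to every point of Q."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    return (P[:, None, :] + Q[None, :, :]).reshape(-1, 2)

# Corners of two unit squares; the sum contains the corners of a 2 x 2 square.
red   = [[0, 0], [1, 0], [1, 1], [0, 1]]
green = [[0, 0], [1, 0], [1, 1], [0, 1]]
print(minkowski_sum(red, green))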

So, what is its significance in this paper, and why do we need it? Well, in real-life scenarios, agents are not points: they have physical dimensions. But for designing mathematical models, it is desirable to treat agents as points so that the model stays simple and general. The Minkowski sum is what lets us do this. In this case, it can be seen as:

A simple transfer of the physical dimensions of the agent of interest onto the obstacles in the environment, after which the agent itself can be treated as a point. The result is the new configuration space for the “point-ified” agent of interest.

Minkowski sum explained

This is done using the Minkowski sum, but we take the negation of Q1 and perform the same process as above. We now get an enlarged region that defines the configuration space for Q1 or, in our case, for the agent of interest.
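In symbols, with $-Q_1 = \{\, -q \mid q \in Q_1 \,\}$:

$$Q_2 \oplus -Q_1 = \{\, q_2 - q_1 \mid q_1 \in Q_1,\ q_2 \in Q_2 \,\}$$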

This is under the assumption that the agent has 2 DOF, i.e., it only translates. If rotation is also considered (3 DOF), the configuration space would look different.

P3. The agent’s trajectory:

Given an agent’s position and its velocity, we can pinpoint its location at any time t. And if we plot the points for all time steps, we construct the trajectory the agent would take. In the equation below, λ(p, v) consists of all the points of that trajectory.
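$$\lambda(\mathbf{p}, \mathbf{v}) = \{\, \mathbf{p} + t\,\mathbf{v} \mid t \geq 0 \,\}$$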

P4. Collision: The math!

Now, how do we define the term “collision” mathematically? The following section talks about the mathematical definition and model for collision.

The intuition: if the ray cast along the agent’s relative velocity vector intersects the Minkowski sum, the agent is on a collision course with the obstacle.

For an agent to be on a collision course, the intersection of the agent’s (relative) trajectory and the Minkowski sum must NOT be the null set, i.e.,
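$$VO^A_B(\mathbf{v}_B) = \{\, \mathbf{v}_A \mid \lambda(\mathbf{p}_A, \mathbf{v}_A - \mathbf{v}_B) \cap (B \oplus -A) \neq \emptyset \,\}$$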

So, when the agent realizes its current velocity is in the VO, it chooses another velocity outside the VO to avoid the collision. At every simulation step, the velocity obstacle of the agent of interest is calculated from the position and velocity information of all other agents, and the agent navigates the environment without any collisions.

This is the result of VO:

2-agents navigating using Velocity Obstacle

One point to notice is that the VO concept acts in the relative frame: if both agents apply the same change to their velocities, the relative velocity is unchanged, so even the newly chosen velocity will still fall inside the (shifted) VO. This is called translational invariance.

So, we were able to navigate by avoiding collisions. But something doesn’t seem right. Take a look at the trajectory. Notice that it isn’t smooth. As in, there is some kind of oscillation present as the agents approach each other. Though the collision is being avoided, the motion is unnatural. This is one of the drawbacks of VO.

So, why does oscillation occur?

Consider an agent heading towards its goal with a particular velocity. When the agent realizes this velocity is inside the VO at a given time step, it chooses a velocity outside the VO. In the next time step, both agents have changed velocities and are momentarily clear of each other, so the old, goal-directed velocity lies outside the new VO; the agent switches back to it, which puts it on a collision course again, and this process keeps repeating.

There are 2 different types of oscillations that occur in VO:

Oscillation due to drawbacks of VO concept:

Oscillation due to Velocity Obstacle concept

Reciprocal Dance: When agents don’t agree on which side to pass

A 2-agent experiment demonstrating Reciprocal Dance

Below are selected logs of one agent, named “turtle1”, going through the reciprocal dance in a 3-agent experiment.

The other 2 agents are named “turtle2” and “turtle3”. Below are the ranges of relative-velocity headings that would cause a collision with the respective turtle.

{‘turtle3’: array([0.24863371, 1.81863371]), ‘turtle2’: array([-0.34746361, 0.39253639])}

The desired heading is the heading that will take “turtle1" to its goal.

desired heading :
0.7806123660039597

But notice the direction it is choosing. It is doing so because the desired heading will cause a collision with “turtle3". So, it chooses a direction that is outside the VO and nearest to its desired heading.

choosen direction is
1.82

It happens to alternate between the extreme sides of the RVO. This is because this agent and the other agents are having trouble coming to an agreement on which side to pass.

- — -
{‘turtle3’: array([0.24863371, 1.81863371]), ‘turtle2’: array([-0.34746361, 0.39253639])}
===
desired heading :
0.7806123660039597
choosen direction is
1.82

===
— — -
{‘turtle3’: array([0.25274557, 1.82274557]), ‘turtle2’: array([1.32468295, 2.08468295])}
===
desired heading :
0.7780515365549873
choosen direction is
0.25

===
— — -
{‘turtle3’: array([-0.13773937, 1.43226063]), ‘turtle2’: array([-0.56889533, 0.20110467])}
===
desired heading :
0.7743985668906327
choosen direction is
1.43

===
— — -
{‘turtle3’: array([0.3 , 1.87]), ‘turtle2’: array([-0.27262558, 0.52737442])}
===
desired heading :
0.7755256224396609
choosen direction is
-0.27

===

Solution: Reciprocal Velocity Obstacle:

Reciprocal Velocity Obstacles (RVO) address the first kind of oscillation. The second kind of oscillation, the reciprocal dance mentioned in the previous section, is addressed by Hybrid Reciprocal Velocity Obstacles, which combine RVO and VO and are out of the scope of this blog.

Look at the below 2 snapshots. What do you see?

4-agent experiment using VO(left) and RVO(right). Notice the trajectories are smoother and less sudden in RVO than VO.

To avoid the unnatural oscillatory motion, the RVO is built from averages: it contains every velocity that is the average of the agent’s current velocity and a velocity inside the obstacle’s VO.

Indirectly, what we do here is, for a pair of agents, instead of assigning the whole responsibility to avoid the collision to just 1 agent, we divide it among both the agents. This division of responsibility can be done equally or unequally.

Due to the averaging, the RVO is a shifted version of the VO (its apex sits at the average of the two agents’ velocities). Hence, if the agent were to switch back to its previous velocity, that velocity would still fall inside the RVO, so the agent won’t choose it. This way, the oscillation doesn’t occur.

RVO of an agent A for an obstacle B with velocities v_A and v_B is:
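$$RVO^A_B(\mathbf{v}_B, \mathbf{v}_A) = \{\, \mathbf{v}'_A \mid 2\mathbf{v}'_A - \mathbf{v}_A \in VO^A_B(\mathbf{v}_B) \,\}$$

i.e., a velocity $\mathbf{v}'_A$ lies in the RVO exactly when its reflection $2\mathbf{v}'_A - \mathbf{v}_A$ (the velocity whose average with $\mathbf{v}_A$ is $\mathbf{v}'_A$) lies in the ordinary VO.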

Guarantees:

1. Collision-free:

This theorem states that when the agents choose velocities outside their RVOs, a collision does not occur. It is proved using the lemmas, by translating the statement back into the VO setting.

2. Same side:

It says that when both agents choose the velocity closest to their current one outside each other’s RVO, they automatically choose the same side to pass each other.

3. Oscillation Free:

It says that the old velocity, which happens to be inside the RVO, remains inside it even after the next step, unlike in VO, where the old velocity would end up outside the VO. This way, the agents don’t switch back and the oscillation does not occur.

Generalizing:

All the above was done under the assumption that every agent takes an equal share of the responsibility. To accommodate the concept of priority into this framework, i.e., to generalize RVO, a weighting factor α (the share of the avoidance effort taken by agent A with respect to agent B) is introduced.

Till now, alpha was assumed to be 0.5. That means the agent takes the sum of two halves: half of its current velocity and half of the new velocity. If alpha is not 0.5,

it means we are giving different weightage/priorities to the two velocities.
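With this weighting factor, the generalized RVO from the paper can be written as (using $\alpha^A_B$ for the share taken by agent A with respect to agent B):

$$RVO^A_B(\mathbf{v}_B, \mathbf{v}_A, \alpha^A_B) = \left\{\, \mathbf{v}'_A \;\middle|\; \frac{1}{\alpha^A_B}\,\mathbf{v}'_A + \left(1 - \frac{1}{\alpha^A_B}\right)\mathbf{v}_A \in VO^A_B(\mathbf{v}_B) \,\right\}$$

Setting $\alpha^A_B = \tfrac{1}{2}$ recovers the equal-responsibility RVO above.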

Multi-agent Navigation (for #agents > 2):

12-agent experiment using VO(left) and RVO(right). Notice the trajectories are smoother and less sudden in RVO than VO.

Combined RVO:

It says that the combined RVO of an agent is the union of the RVOs induced by all other agents and obstacles, both static and dynamic.
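$$RVO^A = \bigcup_{B \neq A} RVO^A_B(\mathbf{v}_B, \mathbf{v}_A, \alpha^A_B)$$

In particular, for passive or static obstacles the agent takes the full responsibility ($\alpha^A_B = 1$), and the corresponding RVO reduces to the plain VO.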

Animation: multiple agents navigating using the combined RVO.

Kinematic and Dynamic Constraints:

This equation says that the set of admissible velocities is bounded: a new velocity’s magnitude is limited by the agent’s maximum speed, and it can deviate from the current velocity by at most the maximum acceleration times the time step.
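$$AV^A(\mathbf{v}_A) = \{\, \mathbf{v}'_A \mid \|\mathbf{v}'_A\| < v^{\max}_A \ \wedge\ \|\mathbf{v}'_A - \mathbf{v}_A\| < a^{\max}_A \,\Delta t \,\}$$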

Penalizing:

In the case of multiple agents, there might be scenarios where the combined RVO covers all admissible velocities. In that case, we allow the agent to choose a velocity inside the RVO, but we penalize that choice. The size of the penalty is related to:

  • The distance between the chosen and desired velocity
  • Time to collide

Once we have calculated the penalties of the candidate velocities, we choose the one with the minimal penalty.
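In the paper, both factors are combined into a single penalty, weighted by an aggressiveness factor $w_A$:

$$\text{penalty}_A(\mathbf{v}'_A) = w_A\,\frac{1}{tc_A(\mathbf{v}'_A)} + \|\mathbf{v}^{\text{pref}}_A - \mathbf{v}'_A\|$$

where $tc_A(\mathbf{v}'_A)$ is the expected time to collision if $\mathbf{v}'_A$ is chosen and $\mathbf{v}^{\text{pref}}_A$ is the preferred (goal-directed) velocity.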

Neighbor Region:

Instead of considering all the agents in the environment, we compute the RVOs of only the few that are near the agent of interest. This is because far-away agents will not cause a collision in the near future.

So, we consider a region around the agent of interest, which decides which agents’ RVOs are taken into account. This minimizes the computation needed without affecting the performance, provided the region is chosen well based on the speeds of the agents and the simulation time step.
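A minimal sketch of this pruning step (the radius heuristic and the function are my own; the paper only prescribes restricting attention to a neighbouring region):

import math

def nearby_agents(my_pos, others, horizon, v_max):
    """Keep only the agents whose RVOs are worth computing: those close enough
    that a collision within `horizon` seconds is possible, assuming every agent
    moves at most at speed v_max (the factor 2 covers both agents closing in)."""
    radius = 2.0 * v_max * horizon
    return [(name, pos) for name, pos in others
            if math.hypot(pos[0] - my_pos[0], pos[1] - my_pos[1]) < radius]

# turtle2 is close enough to matter; turtle3 is not.
others = [("turtle2", (1.0, 0.5)), ("turtle3", (9.0, 9.0))]
print(nearby_agents((0.0, 0.0), others, horizon=2.0, v_max=1.0))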

Implementation Details:

The GitHub repo of this implementation can be found here: https://github.com/suraj2596/RVO_rospy

  • This is a Python implementation of RVO in ROS, using TurtleSim and the rospy library.
  • Coded entirely from scratch by developing an intuition from the paper.
  • An agent class is defined, and every agent is an instance of that class.
  • Every agent publishes its information (name, position, and velocity) to a common ROS topic.
  • This implementation assumes the agents’ shape to be a circle.
  • Multiprocessing is used to spawn any number of agents on demand (a simplified sketch of the heading-selection step follows this list).
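For reference, the heading-selection step that produces the logs shown earlier can be sketched roughly as follows (a simplification with illustrative names, not the exact code from the repo):

import numpy as np

def choose_heading(desired, forbidden_intervals, resolution=0.01):
    """Pick the heading closest to `desired` that lies outside every forbidden
    [lo, hi] interval (the per-obstacle ranges of colliding headings)."""
    def blocked(h):
        return any(lo <= h <= hi for lo, hi in forbidden_intervals)
    if not blocked(desired):
        return round(desired, 2)
    # search outward from the desired heading, alternating sides
    for delta in np.arange(resolution, np.pi, resolution):
        for h in (desired + delta, desired - delta):
            if not blocked(h):
                return round(h, 2)
    return round(desired, 2)   # everything blocked; a penalty-based choice is needed

# Values from the first log entry above:
intervals = [(0.24863371, 1.81863371), (-0.34746361, 0.39253639)]
print(choose_heading(0.7806123660039597, intervals))   # 1.82, as in the log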

Critique:

In my opinion, these are the pros and cons of this paper:

Pros:

  • Collision-free velocities are calculated without any explicit communication between agents.
  • The concept considers the reactive nature of the environment.
  • Since this is a decoupled, decentralized process, unlike earlier centralized approaches, it is easily scalable.
  • The issue of oscillation was addressed by distributing the avoidance effort and expecting the other agent to reciprocate.
  • The neighbor region addresses the cost of calculating the combined RVO, which is O(N) per agent, and matters especially when the number of agents in the environment is very high.

Cons:

  • Oscillation still persists in the form of the reciprocal dance.
  • In this study, the agents’ shapes are assumed to be symmetric (discs). This might not reflect real-world scenarios.
  • When a new velocity is chosen, the agent jumps instantaneously to the new heading. Accounting for the time lag of rotating to the new heading might introduce new complexity.
  • The admissible velocities are calculated only from the agent’s own capabilities. They could also depend on the agent density in the environment when it is very high.

Future Works/Improvements that can be done:

Here are a few ideas that could be worked on.

  • A more optimized procedure aimed at minimizing the overall energy expenditure of all agents together.
  • A machine learning framework trained on the positions and velocities of agents to predict the actions to be taken by individual agents.

Conclusion:

  • In a multi-agent environment with both static and dynamic agents, this algorithm has been shown to navigate successfully without any collisions.
  • The oscillations are diminished compared to the Velocity Obstacle approach, thanks to sharing the task of avoiding the collision.


Suraj Bonagiri

M.S. (Research), Robotics Research Center, IIIT-H. Interested in modeling, dynamics, and control of multi-agent robotics.