Multi-Agent Pathfinding & ILP: Moving warehouse robots with Integer Linear Programming

Published in

Data Reply IT | DataTech

6 min readMar 16, 2023

What is the first image coming to your mind when thinking about robots in warehouses? Many would probably answer those tiny cubic-colored Amazon robots, relentlessly carrying tens of products from point to point.

Only ten years ago, this was just fiction. How could machines move across crowded areas without colliding?

While effective results were applied recently, the first literature studies on the multi-robot planning problem go back to the 80s. A crucial improvement consisted in using AGVs on grid-modeled environments, exploiting not only innovative robotic systems and sensors but also path planning algorithms from graph theory.

These path planning algorithms abound. Some are strongly related to Mathematical Optimization. In the following, I want to show you a powerful and easy-to-implement way to find an optimal solution to the multi-robot planning problem using ILP, which is the simplest mathematical optimization model, and network flow optimization. Two of the first literature papers to combine mathematical optimization formalization and MAPF are Yu and LaValle, 2012, Yu and LaValle, 2013.

Some Amazon mobile robots in a typical distribution center — Image Credits to: Azcentral

What is MAPF?

Formally, Multi-Agent Pathfinding (MAPF) is the problem of planning the trajectory of many agents, from their start location to their respective destination, while sharing the same environment and avoiding collisions.

The best-known application of MAPF is warehouse automation. But agents don’t have to be necessarily robots. Indeed, other interesting MAPF examples regard road/air traffic management and gaming.

Representation of mobile robots inside a warehouse

Researchers defined many objective functions to evaluate the MAPF solution. The most common are makespan and sum of costs.

Makespan. It is the number of time steps required for all the agents to reach their target.
Sum of costs (a.k.a flowtime). This objective function is given by the sum of time steps required for every agent to reach its target.

We will develop an ILP model for solving time-optimal and distance-optimal MAPF problems. Preliminarily, let us understand the equivalence between MAPF and multi-flow.

Network Flow Optimization

A network 𝒩 = (𝒢, c₁, c₂, S) consists of :

a directed graph 𝒢 = (𝒱, E),
c₁, c₂: E → ℝ₊ as the maps defining the capacities and costs on edges, respectively,
S = S₊ ∪ S₋ ⊆ 𝒱 as the set of sources (S₊) and sinks (S₋).

A feasible static flow on this network is a map f: E → ℝ₊ that satisfies:

edge capacity constraints,
flow conservation constraints.

The classic single-commodity maximum flow problem asks: “How much flow F can we push through a network 𝒩?”

Two solutions to the same single-commodity maximum flow problem

A further question might be: “Given a network 𝒩 and multiple single-commodity maximum flows, which one is minimizing the overall cost?”. This is called a single-commodity minimum cost maximum flow problem.

Solution to the single-commodity minimum cost maximum flow problem

In real situations flowing commodities through edges take some time, so the problem becomes a dynamic network flow problem (a.k.a. network flow over time). In a discrete-time model, for a given edge e = (u, v) ∈ E, we may view the cost cₑ as the time required to move an amount of flow (not exceeding the capacity) from the tail u to the head v of the edge e.

Link between Network Flow Optimization and Multi-Agent Pathfinding

The above formulation concerns a single commodity, which corresponds to all robots being inter-exchangeable. For MAPF, we must treat robots as different commodities. Multi-commodity flow or multi-flow captures the problem of flowing different types of commodities through a network. Instead of having a single flow function f, we have a flow function fᵢ for each commodity i. The maximum flow problem can be posed for a multi-flow setup.

Translating a multi-agent grid system into a multi-commodity network

We can transform MAPF into a multi-commodity minimum cost maximum flow problem:

sources and sinks correspond to the agents’ start and goal positions;
we consider a dynamic network flow, hence the edge cost is the time required to traverse that specific edge;
in the time-expanded graph, edge capacities are smaller than or equal to 1 (no more than one robot can traverse the same edge at the same time).

We still need a way to avoid collisions between robots.

Avoid swap conflict. Given two robots at nodes u(t)’ and v(t)’, we don’t want them to exchange their position at time t +1. To avoid this edge conflict, we build a specific graph structure connecting time t’ and t +1, where we impose some appropriate capacities, as shown below.

By considering the time dimension, we obtain:

This structure doesn’t prevent two agents from occupying the same node at time t +1, i.e., vertex conflict.

Example of vertex conflict in the time-expanded graph

(Note, in the two-dimensional space, this corresponds to one robot staying at its node and the other moving to that node).

Avoid vertex conflict. We use an intermediate time step and some edges (blue colored), guaranteeing one single agent to stay at that node.

Portion of dynamic network, avoiding vertex conflict

This is all we need to translate MAPF into a network flow optimization problem. The final time-expanded graph will look like this:

Multi-agent grid system transformed into a multi-commodity dynamic network, avoiding swap and vertex conflicts

MAPF with Integer Linear Programming

Thanks to the network flow concept, we can solve the MAPF problem using the Integer Linear Programming framework.

Time optimality can be obtained using a maximum multi-flow formulation and easily implementing an integer linear program (ILP) to solve the problem. Since the optimal time T is a parameter of the optimization model, we want to find the minimal T that yields a feasible solution. To do this, we start with T being the maximum (over all robots) single-robot shortest possible path length (ignoring all other robots). We then build the ILP model for this T and test for a feasible solution. If the model is unfeasible, we increase T and try again. The first feasible T is the optimal T.

Distance optimality objective can be encoded using a minimum cost maximum multi-flow, where T denotes the time optimal value produced by the previous optimization problem.

Below, we show a simple example of the MAPF solution and the corresponding network flow.

Result. Transforming MAPF into a Network Flow Optimization Problem and solving it with Integer Linear Programming — Transforming MAPF into a Network Flow Optimization Problem and solving it with Integer Linear Programming

MAPF & ILP: Why/Why Not?

The wide applications of MAPF combined with the technological improvements make this field one of the most flourishing of the last decade. From logistics to traffic management, companies are increasing the use of this solution, highly decreasing costs and bottlenecks.

MAPF & ILP is a specific centralized method, often with good theoretical properties like optimality.
On the other hand, the computational complexity may limit its application to small teams of agents with few obstacles.
Should we discard all the effort we made? The answer is NO! Have you ever heard about Accelerated Computing?…