The reasons behind the biased comparison of logistic planners and how to deal with it

Published in VeeRoute, Jan 28, 2020

VeeRoute has been on the logistics market since 2014. Over that time we have introduced a model of providing services that address logistics challenges through an API. In this article, we share our experience in developing such services, our market observations and our perspective on the problem.

Problem Solving Steps for Logistics

Let’s define the key terms.

Client: a business representative with a unique vision and data.
API: a description of the methods (a set of classes, procedures, functions, structures or constants) of program-to-program communication.
Planner: a system for solving logistics problems that operates in abstract, client-independent terms unknown to the client.
Targeted result: a planning result acceptable to the client. Evaluation criteria are either domain-related (the number of employees involved, their workload, fault-tolerant scheduling) or service-related (computation time, feedback during computation, the ability to influence decisions through various settings).

In reality, a logistics task changes considerably between the moment the client decides to automate the process and the moment a solution is obtained.

We have identified the main stages of interaction with the client:

Arrangements
Together with the client, we elicit and agree on the formal requirements set out in the contract and the informal requirements for solving the problem, mainly those for which there is no agreed method of computation.
Clarifications, updates or changes to the arrangements may occur at any time and at any step of solving the problem.
Task formalization
The client prepares the data for planning and sends it to VeeRoute.
Supplementing information
Most likely, the client does not have the necessary data on road conditions, weather, public transport schedules, the location of gas stations and so on. These factors greatly influence decisions made when building routes. To solve the problem, we add the missing information. Even so, part of the task information is lost during planning or is not yet available.
For example, an event may occur in the future that affects the driver’s route (an accident on the road, a sudden traffic jam, a recipient who changes the order).
Such events can be predicted or taken into account indirectly, for example by building routes so that a change in an order has the least impact on route feasibility.
But computing power is limited, and it is difficult to take into account every way of travelling between locations at every possible point in time.
Getting the result
Our product solves the planning problem, converts the result into the required form and returns it to the client.
Processing the result
A logistics specialist is usually responsible for the routes and makes the final decision. For their own purposes, the logistician can manually make corrections that they consider minor, but these corrections may plainly contradict the requirements agreed with VeeRoute. At the same time, manual corrections can close significant gaps caused by imperfect interaction.

Client and developer. Solution quality

The key metric for the client is the value of the result for the company, or the solution quality. It is defined by the following:

Meeting the agreed formal requirements.
The adequacy of the data added by VeeRoute: how closely our routing corresponds to the current situation on the road, whether the selected petrol station is working or was closed six months ago, or whether we mistakenly predict a flood during a drought.
Compliance with the informal requirements of the client. One of these requirements may be the time the logistician takes to make changes to the received routes, or the general feeling of interaction. These factors are hard to measure in advance or without communicating with the client. Other examples are ‘route density’ and the ‘evenness’ of order distribution among workers. Such requirements often conflict with each other or with the formal restrictions.

It is important to note that planning is not a one-time process for the client: the planner does not exist just for the moment of a single computation.

Similar to the client’s ‘solution quality’ metric, VeeRoute, as a developer, defines a ‘quality of the planner’ metric.

The concept includes:

Flexibility: the ability to work with various formal and informal requirements, and the rate at which new requirements are implemented.
Performance: how quickly a result of acceptable quality is achieved under given restrictions (memory consumption, hard disk space, network requirements, etc.).
Scalability: the ability to work in a variety of configurations and to make use of additional capacity without losing performance.
Consistency: how well the missing information is added and how well the resulting optimization problem is solved.

Why does comparing planners through model problems not work?

In an academic environment, model logistics problems (‘benchmarks’) are used to compare approaches to solving logistics problems. Popular datasets for such tasks include the Solomon and Gehring & Homberger instances.

Best known results for Gehring & Homberger’s 1000 customer instances

But in practice, the comparison of the planning quality through benchmarks does not make much sense, and here’s why:

1. Not all aspects are taken into account
Some companies present success on model problems as direct evidence of their planner’s superiority, but this is not the case.

Examples of solving model problems

Success on fixed, publicly available data that has not changed for years cannot objectively demonstrate the quality of a planner, because the most important aspects are left out:

Flexibility. The data never changes, so it is difficult to measure how a planner adapts to new data.
Performance. The time and computing power required to get the result are neither published nor fixed.
Scalability. It is impossible to assess it on the basis of benchmarks.
Routing. The routing of a model problem is known in advance, and the routing model used is too primitive compared to the actual traffic situation. For example, the route from point A to point B does not depend on the start time. A logistician cannot afford such a simplification.
Compliance with informal requirements. It cannot be assessed, because model problems have no informal requirements.
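The routing point can be illustrated with a minimal sketch. It contrasts a benchmark-style travel time, which depends only on distance, with a time-dependent one. The speed profile, the rush hours and the function names are hypothetical and chosen only for illustration; this is not the routing model of any particular planner.

```python
import math

# Hypothetical speed profile in km/h by hour of day (illustrative values only).
SPEED_BY_HOUR = {hour: 50.0 for hour in range(24)}
for rush_hour in (8, 9, 17, 18):  # assumed rush hours
    SPEED_BY_HOUR[rush_hour] = 20.0


def static_travel_time(a, b, speed_kmh=50.0):
    """Benchmark-style model: travel time in hours depends only on distance."""
    return math.dist(a, b) / speed_kmh


def time_dependent_travel_time(a, b, depart_hour):
    """Travel time in hours when the speed depends on the hour of day.

    The trip is walked through hour slot by hour slot, so a trip that
    starts in free-flowing traffic and runs into rush hour slows down midway.
    """
    remaining_km = math.dist(a, b)
    clock = depart_hour
    while remaining_km > 1e-12:
        speed = SPEED_BY_HOUR[int(clock) % 24]
        hours_left_in_slot = math.floor(clock) + 1 - clock
        reachable_km = speed * hours_left_in_slot
        if reachable_km >= remaining_km:
            clock += remaining_km / speed
            remaining_km = 0.0
        else:
            remaining_km -= reachable_km
            clock = math.floor(clock) + 1
    return clock - depart_hour


depot, customer = (0.0, 0.0), (30.0, 40.0)  # 50 km apart, coordinates in km
noon_trip = time_dependent_travel_time(depot, customer, depart_hour=12)
rush_trip = time_dependent_travel_time(depot, customer, depart_hour=8)
```

Under these assumed speeds, the same 50 km trip takes more than twice as long when it starts at 8:00 as when it starts at noon, while the static model returns the same value for both.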

Needless to say, model problems can be solved efficiently, but solutions for such simple cases often do not transfer to more complex problems, just as a solution for transporting from one warehouse is not suitable for transporting from several warehouses.

Such problems are a good testbed for new ideas and approaches because of their low entry threshold and formal complexity, but having an idea is not the same as having a finished product.

2. A solution has no single author
Besides the conceptual problem of oversimplifying the task, there are other features that do not hinder academic research but make a comparison of planners impossible.

To prove the existence of a solution with the specified characteristics, it must be made publicly available. The problem is that you can use other people’s work or the current best solution as a black box. Anyone can take it as a basis, modify, improve and publish it as their own solution.

So a published result does not demonstrate a single planner capable of producing it. In fact, it is more like teamwork in which only the last contributor gets the authorship.

3. Business prefers other metrics
Another difference between real problems and model ones is that in real problems the best possible result does not differ much from the ‘second best’.

For a business, the ultimate quality metric is not distance, speed or time, but the service provided and the optimization of resources. If we compare the best results by quantitative indicators, a difference of 0.01% may not be noticeable at the customer’s volumes and may be smaller than the error introduced by formalizing the task. So a solution that is worse than the ‘best’ may be indistinguishable in quality for the end user. Benchmarks take into account only the best result, devaluing all the rest.
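A short worked example makes the scale of such a difference concrete. All numbers here are hypothetical: a daily mileage of 42,000 km and an assumed formalization error of 1% are picked only to show the arithmetic.

```python
def relative_gap(value, best):
    """Relative difference between a result and the best known one (lower is better)."""
    return (value - best) / best


def indistinguishable(value, best, formalization_error):
    """True if the gap to the best result is within the assumed modelling error."""
    return relative_gap(value, best) <= formalization_error


# Hypothetical figures, for illustration only.
best_km = 42_000.0              # best known total distance per day
second_km = best_km * 1.0001    # a result that is 0.01% worse
assumed_error = 0.01            # assumed 1% error from formalizing the task

extra_km = second_km - best_km  # roughly 4.2 km across the whole fleet
same_for_client = indistinguishable(second_km, best_km, assumed_error)
```

With these assumptions the ‘second best’ plan costs about 4 extra kilometres per day over the whole fleet, far below the assumed uncertainty of the model itself.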

Common benchmarks do not measure the quality of a planner, because the best solutions do not meet the criteria by which a real business will evaluate it. Competition on benchmarks is akin to high-performance sport.

How to get an objective system for comparing planners?

We identified and described the requirements for such a system.

1. Independence of solutions
It is necessary to check that the planner finds a solution on its own, without building on other people’s work. It is worth hiding some of the results and showing only general information about them, opening them for verification only after a long time. Verification should be the prerogative of the checking system, and participants should trust this system. For that trust not to be unconditional, the checking system must be able to prove that it is not cheating, although it does not have to do so right away.
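One possible way to hide a result now and still prove later that it was not altered is a commit-reveal scheme. This is our illustrative assumption rather than a prescribed design: the checking system publishes only a hash of the solution, and reveals the solution and a random salt when verification is opened. The function names and the solution layout below are hypothetical.

```python
import hashlib
import hmac
import json
import secrets


def commit(solution, salt=None):
    """Return (commitment, salt). Only the commitment is published at first."""
    salt = salt or secrets.token_hex(16)
    # Canonical JSON so that the same solution always produces the same hash.
    payload = json.dumps(solution, sort_keys=True) + salt
    return hashlib.sha256(payload.encode("utf-8")).hexdigest(), salt


def verify(solution, salt, commitment):
    """Check a revealed solution and salt against the earlier commitment."""
    recomputed, _ = commit(solution, salt)
    return hmac.compare_digest(recomputed, commitment)


# Hypothetical solution layout: routes as lists of location ids plus a total cost.
solution = {"routes": [[0, 3, 1, 0], [0, 2, 0]], "total_distance": 123.4}
commitment, salt = commit(solution)

# Later, when the result is opened, anyone can check it against the commitment.
honest = verify(solution, salt, commitment)
tampered = verify({**solution, "total_distance": 120.0}, salt, commitment)
```

The salt prevents guessing a hidden solution by hashing candidates, and the published hash lets participants confirm afterwards that the revealed result is the one that was recorded.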

2. Adding datasets
It must be possible to add new datasets while preserving the old datasets and their results for future use.

3. Using real-life parameters
It is necessary to bring the routing and other task parameters close to real-world ones. Otherwise, good results may be due only to the specifics of the data. For example, a planner that works well on a plane may no longer be as good on a sphere, let alone on the surface of the Earth, and especially in a specific city. In the case of routing, it is also worth considering models in which travel time and distance depend on the start time.
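The difference between a plane and a sphere can be shown with a small sketch. It compares a naive planar distance, which treats latitude and longitude as flat coordinates, with the haversine great-circle distance. The coordinates are hypothetical, and real road distances would differ from both.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def planar_distance(p, q):
    """Naive model: treat (lat, lon) in degrees as points on a plane."""
    return math.dist(p, q)


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Hypothetical locations at latitude 60, where a degree of longitude
# is about half as long as a degree of latitude.
depot = (60.0, 30.0)
east = (60.0, 30.9)    # 0.9 degrees to the east
north = (60.6, 30.0)   # 0.6 degrees to the north

planar_picks_north = planar_distance(depot, north) < planar_distance(depot, east)
sphere_picks_east = haversine_km(depot, east) < haversine_km(depot, north)
```

On the plane the northern point looks closer, while on the sphere the eastern point is closer, so even a simple nearest-customer decision changes with the distance model.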

4. Standard operating conditions of planners
For the purity of the experiment, it is essential to maintain similar operating conditions for planners:

The same hardware and computing resources (RAM, disk space, CPU, etc.)
The same time frame
The same interaction protocol

5. Using various modes of operation
Planners should be tested on fast and slow devices, with large and small time limits. A planner may quickly get a ‘not bad’ result but be unable to get a ‘good’ one even with unlimited time. In the real world, different systems are needed for different operating conditions.
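Requirements 4 and 5 can be sketched as a small test harness that runs every planner on every instance under every time limit and records cost and elapsed time. The planner interface (a function taking points and a time limit and returning a visiting order) and both toy planners are hypothetical. The sketch only records whether a run stayed within its limit; it does not enforce identical hardware, which would require running all planners on the same machine or in identical containers.

```python
import math
import time


def route_length(points, order):
    """Length of the closed tour that visits the points in the given order."""
    return sum(math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
               for i in range(len(order)))


def input_order(points, time_limit_s):
    """Toy planner: visit the points in the order given."""
    return list(range(len(points)))


def nearest_neighbour(points, time_limit_s):
    """Toy planner: greedy nearest-neighbour tour starting from point 0."""
    unvisited = set(range(1, len(points)))
    order = [0]
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


def run_benchmark(planners, instances, time_limits_s):
    """Run every planner on every instance under every time limit."""
    results = []
    for limit in time_limits_s:
        for instance_name, points in instances.items():
            for planner_name, planner in planners.items():
                started = time.perf_counter()
                order = planner(points, limit)
                elapsed = time.perf_counter() - started
                results.append({
                    "planner": planner_name,
                    "instance": instance_name,
                    "time_limit_s": limit,
                    "cost": route_length(points, order),
                    "elapsed_s": elapsed,
                    "within_limit": elapsed <= limit,
                })
    return results


planners = {"input_order": input_order, "nearest_neighbour": nearest_neighbour}
instances = {"toy": [(0, 0), (10, 0), (0, 1), (10, 1)]}
results = run_benchmark(planners, instances, time_limits_s=[1.0, 10.0])
```

Keeping one result row per planner, instance and time limit makes it possible to compare planners separately for each mode of operation instead of reducing everything to a single best score.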

6. Work with a variety of data
Planners should be compared on a variety of data and within various ‘categories’. There are general systems that can solve any problem well under any conditions, and there are highly specialized solutions that work well in a particular area. It makes no sense to compare a mobile offline application with a cloud service that uses cluster power for computations.

7. Providing detailed history
The system should show how, by whom and when each result (not necessarily the best one) was obtained. This may indicate how far one planner is behind another in development, although such conclusions are, of course, subjective.

Who needs a comparison system for planners, and why?

Any business connected with logistics in one way or another needs such a system to make objective, well-founded decisions rather than be led by untested marketing claims.

Different business representatives have their own criteria for a planner. A transparent comparison system will help determine the best solution for a particular scenario.

A planner development company needs such a system to get insight into its niche, to work with vacant niches, to improve the product and to back words with deeds.