When writing software for on-demand mobility, one of the most important tasks is optimising whatever is moving: from packages on a van to people on a bus, finding the most optimal way to pickup and deliver them is very important. Our bus-on-demand project is no different: we need to pick up people on predefined stops, drive them, and drop them at their home, and to do all of it efficiently.
This problem is a close relative of traditional optimisation problems such as travelling salesman and vehicle routing problem. Many scientific papers and dissertations have been written about the subject. This is not one of them…
Our goal is to explain our pragmatic approach to resolve the problem. It solves our real-world situation, in a way that works even if you don’t have a PhD.
Let’s start with a description of the problem: Assuming that we have only two mini-buses, each already with two stops scheduled (a stop can be a pickup [P] or a drop [D])
Bus1 -> D1 -> D2
Bus2 -> P3 -> D3
If another passenger wants to be picked up, a good way to find her the best bus is to use a brute-force approach: try all possible permutations of the new 2 stops (P4 and D4) on bus 1 and on bus 2, measure the impact of each, and select the best.
This brute-force approach is of order O(n!), which is not good. If a bus has two stops already scheduled and two more being requested (one pickup and one drop of a passenger) we will have to consider 24 permutations (4!) . But factorials grow quickly: ten factorial is 3,628,800.
Clearly the brute-force method is feasible for a small number of stops, and it has the advantage of giving the best result. But “best” in the real world is a relative measure, and in the middle of a cold night, when you are waiting for a bus, “good enough” will do. So we use a hybrid approach: for fewer than N stops we use brute-force. For more than that we simplify a lot and use what we call the insertion approach (more on that below).
“Brute-force” or “insertion” are just two different techniques to generate a set of permutations. The bulk of the work occurs after that, when we try to very quickly eliminate the invalid ones, reducing the set to a manageable size. After that, we can apply whatever cost function we want, to ensure that we select the best solution.
All three key words highlighted above (“invalid”, “cost”, “best”) are entirely subjective, and driven by the use case. In ours, it is obvious that a permutation that has a customer being dropped before being picked up is invalid, but how about one that leaves a customer waiting 10 minutes on a stop?
Reducing the set
To achieve any level of performance, our main goal is to reduce the size of the set quickly, before any costly operation is performed on it. In our case, we have three basic validation rules:
- Every customer must be picked up before dropped: Yes, it is obvious in English, but still needs to be coded. This rule eliminates more than 50% of the entries.
- No passenger should wait more than N minutes on a stop
- No passenger should travel more than X times it would take a car to travel.
These last two rules depend on us knowing how long it takes for the bus to travel between stops, which is the crucial time-consuming aspect of the whole thing. Since the goal here is to work through a lot of entries quickly, we use “as crow flies” distance and some fixed speed (30 kph) to do our calculations. We will refine the selection later, with better measurement, once we have filtered out most of the entries.
Most of the permutations will violate at least one business rule, so as long as we organise the code well, we can get rid of pointless permutations quickly, and only operate on a much reduced set.
This is the key to the system: each rule is built as a boolean function, which exams one permutation, and returns false if it violate the rule. Since we are using Kotlin, we can use the filter functions that are applied to the entire list, and progressively reduce it, as in the pseudo-code example below:
Of course all these calls could be chained together, but this setup allow us to check if the perms list is empty after a rule is applied, and to declare an error right there.
Once all the filters have been applied, the set should have been reduced to a manageable size, and we can apply a cost function to each entry. As I said, the cost function is entirely subjective. It normally includes measuring the current cost and calculating the marginal cost of the new pair of stops.
In the end, each valid entry is identified by:
bus, proposed order of stop (permutation), cost
All valid entries are added together (regardless of bus) and sorted by cost. We then select the top N (maybe 20) and refine the travel information, running the rules for waiting time and travel time again, but this time using a proper routing provider (Google or OSRM in our case). What we are after here is not another selection process. It is rather simpler than that: since the entries are already ordered by cost, the first one that does not violate the waiting and travel time rules “win”.
In the end, what you optimise for is defined by the cost function, which should attach different values and weights to different aspects of the journey. This is where you need your trusted “data person” behind you. The rest is just code, but here you are actually interpreting the data, and deciding the entire outcome of the system.
In our case we look at:
- Total distance covered on the new route
- Total and max distance for each passenger
- Increment compared to previous (current) route.
Naturally, as this is a bus, and the goal is to get more passengers, we favour the solutions that minimise the distance covered and the distance of each passenger. We also try to minimise the time of the passenger staying the longest in the bus.
Our “insertion” pattern
We mention a couple of times the hybrid nature of our solution: we use a brute-force approach if we have up to N (where N is probably 9 or 10) stops, and an insertion approach if more than that. We now describe this simple approach, with a caveat: this is more common sense than science. It seems reasonable to assume that if a route is optimised with 10 stops, adding another 2 should not require a complete reshuffle again (it “feels” reasonable, but don’t ask me to prove it). To make things simpler to describe, we look at a comparison between adding two new stops (X, Y) to a bus that already has two stops (A,B) using both methods.
When using the insertion approach, all we do is literally insert both stops (pickup and drop) in all possible positions, without disturbing the current order. While the brute-force generates 24 permutations (4! Or 4 x 3 x 2 x 1), the insertion solution generates 6 (or 3 + 2 + 1). The preservation of the order means that if we insert the new stops in a logical order (pickup before drop) we don’t have to worry about the order rule, as all other entries, by definition, are valid.
The origins of our solution was a “R” function written by our colleagues at the VW DataLab in Munich. After the improvements described here we rewrote the solution as a library in Kotlin. Kotlin might not be the fastest thing on the planet, but in comparison to R (running as a server on a separate AWS instance), it flies. And that is “fine” for our current needs. It takes the system a few seconds to select the best bus for a customer.
But there are clear areas for improvement when performance becomes an issue. The most obvious one is parallelisation: when the set of all permutations is generated, each permutation is entirely independent of all others. So, on a multicore server, it would be highly efficient to divided the original list into X lists (where X is probably the number of cores-1 (or 2) ) and apply all filters and cost function to each “chunk” at the same time.
There is also more sophisticated improvements: investigating the plethora of published papers with solutions varying from greedy algorithms to Ant Colony Systems, or using solvers such as Google OR Tools (or even using Google Directions API itself).
I hope we have covered enough information to help others when thinking about optimisation problems. Also, if this is the kind of problem you may enjoy dealing with, check us out at www.metropolis-lab.io. Maybe we can work together in the near future.