Inside Glovo’s delivery optimization
Intro
In the instant-delivery business, efficiency is everything: a large volume of orders with relatively small delivery fees is sustainable only if the system can leverage that scale to reduce the cost the company sustains for each individual order.
That’s why at Glovo we have invested a lot of resources in optimization, and today I’d like to show you how the Matching team combines software and math to boost efficiency in our company.
The main goal of the team is to solve the assignment problem: deciding, for each order, which courier is best suited to handle it, improving the efficiency of the system while respecting the business constraints. By “business constraints” we mean any kind of rule that the assignment logic needs to follow: for example, an order that needs to be delivered in a limited-traffic area can’t be assigned to a courier who is driving a car.
Bonus: what is efficiency?
In our context, “efficiency” is strictly linked to the time the courier has to wait at the pickup point. Let’s explain it with a simple example from our domain:
In Figure A we have 2 couriers, courier 1 with a scooter and courier 2 with a bike, and 3 orders: a pizza, a taco and sushi. The first courier is already handling the pizza, and we need to decide whom to assign the others to. We estimate the preparation time (ETP in the figure) as 10 minutes for the taco and 3 minutes for the sushi.
A naive approach would be to assign the closest free courier to each order. Imagine we went with this simple approach; looking back after the orders were delivered, you can see in Figure B what happened.
You can see the travel time of the couriers, how long they waited at the store and the total time from the customer’s point of view.
The main problem here is that courier 2 is too close and will need to wait a long time while the restaurant prepares the taco! To make things worse, the taco’s delivery point is far away for a bike, so the total time for the customer would be 25 minutes, while for the sushi it would be 16.
But if we instead try to minimize the waiting time, you can see in Figure C what happens this time.
The final situation looks much better: no courier had to wait, and all the orders actually arrived faster!
But the most attentive of you will have noticed a little drawback: the taco was ready before courier 1 arrived, and it “waited” 1 minute more. In this case that’s not a real problem, since it’s compensated by the scooter’s higher speed in the delivery phase, but we have to take this and many other “constraints” into account, while at the same time dealing with far more orders and couriers.
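To make the intuition concrete, here is a minimal sketch of how waiting time enters the picture (the numbers are invented, not our real estimations): a courier idles whenever they reach the store before the order is ready.

```python
# Invented numbers: a courier idles at the pickup point whenever their travel
# time to the store is shorter than the order's preparation time (ETP).
def wait_at_store(travel_to_store: float, etp: float) -> float:
    """Minutes the courier waits before the order is ready."""
    return max(0.0, etp - travel_to_store)

def total_customer_time(travel_to_store: float, etp: float,
                        travel_to_customer: float) -> float:
    """Pickup happens when both the courier and the food are ready."""
    return max(travel_to_store, etp) + travel_to_customer

# Taco (ETP 10'): a courier 2 minutes away idles 8 minutes at the restaurant.
print(wait_at_store(2, 10))            # 8.0
print(total_customer_time(2, 10, 15))  # 25.0
```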
There are already many algorithms to solve the assignment problem (some of the ones we used: the Hungarian algorithm, minimum-cost flow and mixed-integer programming). Still, for every specific domain you need to build and tune a cost function representing the metrics you want to optimize.
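For example, once a cost matrix is built, an optimal matching can be computed exactly; here is a minimal sketch with SciPy’s linear_sum_assignment, using invented cost values inspired by the example above:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Rows = couriers, columns = orders (sushi, taco); every entry is an invented
# cost-function value mixing travel time and waiting time.
cost = np.array([
    [20.0, 17.0],   # courier 1 (scooter): sushi, taco
    [16.0, 25.0],   # courier 2 (bike):    sushi, taco
])
rows, cols = linear_sum_assignment(cost)  # exact optimal matching
print(list(zip(rows.tolist(), cols.tolist())))  # [(0, 1), (1, 0)]
print(cost[rows, cols].sum())  # 33.0, vs 45.0 for the opposite pairing
```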
Now I’d like to talk about how we added a new dimension to this optimization, considering not only the best possible assignments but also the best ways to combine orders together: something that strongly depends on the real-time conditions of a city. We’ll explore the problem first and then see an overview of our software architecture.
The p̶r̶o̶b̶l̶e̶m̶ idea
Once upon a time, there was Bundling: a simple, intuitive idea and yet the biggest efficiency booster we know to this day.
Bundling, in its original definition, simply consists in offering a courier the chance to take 2 orders prepared at the same place.
As always, implementing the idea is trickier than it seems. It’s not enough to consider orders with nearby delivery locations; we also have to ensure we respect all the business constraints, like:
- the maximum overall distance assignable to a courier
- food getting cold due to the extra waiting time
- meeting the promised delivery deadline
- …
Finally, we needed to do all of this efficiently and at a big scale.
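To give a rough idea, a per-bundle feasibility check could look like the sketch below; the field names and thresholds are invented for illustration and are not Glovo’s actual rules.

```python
from dataclasses import dataclass

# Illustrative only: fields and thresholds are invented, not Glovo's rules.
@dataclass
class Order:
    prep_minutes: float        # estimated time to prepare (ETP)
    deadline_minutes: float    # promised delivery time, minutes from now
    is_food: bool

def bundle_is_feasible(orders: list[Order], route_minutes: float,
                       max_route_minutes: float = 45.0,
                       max_cold_wait: float = 12.0) -> bool:
    """Check a candidate bundle against simplified business constraints."""
    if route_minutes > max_route_minutes:       # max overall distance
        return False
    slowest_prep = max(o.prep_minutes for o in orders)
    for o in orders:
        # Orders ready early sit around until the slowest one is done.
        if o.is_food and slowest_prep - o.prep_minutes > max_cold_wait:
            return False                        # food would get cold
        if slowest_prep + route_minutes > o.deadline_minutes:
            return False                        # promised deadline missed
    return True
```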
That worked out pretty well, and thus, shortly after, we wanted more: the same concept, but also from different pickup points (Figure 1).
Again, a simple idea but very tough to implement, especially because the number of possible combinations “exploded”.
Finally, we developed yet another kind of bundle, with any number of orders: the Multibundle.
Again, what seems like an obvious idea hides the great complexity of efficiently and effectively solving the Vehicle Routing Problem (VRP from now on) at scale. We had two main problems: the load and the ambiguity.
The (computational) load: we are committed to providing assignment decisions for every city at most every 10 seconds, but the VRP is an NP-hard problem and thus can be really time-consuming, so it’s hard to execute in such a short time.
The ambiguity: we didn’t want to deprecate the old bundling algorithm, because it could still find better bundles in some cases; but this means we may now find the same order in more than one “potential” bundle, so we need to pick one!
Let’s picture it with an example. In Figure 2 you can see a bunch of orders on the left: those orders can then be processed by the bundling algorithms we mentioned (and even more than that, since the same algorithm can be run with different parameters), and each one will produce a list of bundles.
Let’s also add the “default option”, which simply consists in not creating a bundle and just delivering the order “alone”.
Finally, let’s also introduce the concept of a transport constraint, since a bundle could become too heavy for certain vehicles, or one of the orders could need to be delivered in a restricted traffic area. As an example, {(O1,O2,O3), CAR} combines the first 3 orders, but this bundle can only be assigned to a courier with a CAR.
Now you can see that we can’t simply pick them all: if we take, say, the first bundle from each of the first two algorithms, {(O1,O2,O3), CAR} and {(O1,O2,O4), CAR}, we would have a conflict, since both claim O1 and O2.
Choosing the best bundles is not an easy task. Even simplifying the problem by assuming we only want to minimize the total delivery time while meeting the promised deadlines: is it better to pick {(O1,O2,O3)} and {(O4)} alone, or {(O1,O2)} and {(O3,O4)}?
There is no obvious answer. For example, you may think that O3 will be delivered sooner with the second option, but what if our estimations say that O4 will take a long time to prepare? Bundling them together would mean that the courier needs to wait for both orders to be ready. On the other hand, what if O2’s delivery location is very far from O3’s? Maybe in that case the first solution is better, but only if we assign it to a CAR. We should also consider how many couriers (and with which vehicles) are active in the area right now, the promised deadlines we need to meet, the chances of food getting cold and many other factors; so the best option here is to extend our optimization model to consider all the constraints and select the best non-conflicting bundles.
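To give a flavor of that last step, here is a minimal set-packing sketch using the PuLP library (not necessarily what we use in production): the candidate bundles and scores are made up, and the real model has many more constraints.

```python
import pulp

# Invented candidate bundles (including "alone" options) and invented scores:
# a higher score means more efficiency gained. Each order must end up in
# exactly one chosen bundle, which rules out conflicts such as O1 twice.
candidates = {
    "b1": ({"O1", "O2", "O3"}, 9.0),   # e.g. {(O1,O2,O3), CAR}
    "b2": ({"O1", "O2", "O4"}, 8.0),   # conflicts with b1 on O1 and O2
    "b3": ({"O3", "O4"}, 5.0),
    "b4": ({"O1", "O2"}, 6.0),
    "alone3": ({"O3"}, 1.0),
    "alone4": ({"O4"}, 1.0),
}
orders = {"O1", "O2", "O3", "O4"}

prob = pulp.LpProblem("bundle_selection", pulp.LpMaximize)
pick = {name: pulp.LpVariable(name, cat="Binary") for name in candidates}
prob += pulp.lpSum(score * pick[name]
                   for name, (_, score) in candidates.items())
for o in orders:  # every order is delivered by exactly one chosen bundle
    prob += pulp.lpSum(pick[name]
                       for name, (members, _) in candidates.items()
                       if o in members) == 1
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(sorted(name for name in candidates if pick[name].value() == 1))
# ['b3', 'b4']: with these scores, {(O1,O2)} plus {(O3,O4)} wins
```

With these invented scores the solver picks {(O1,O2)} and {(O3,O4)}: exactly the kind of non-obvious trade-off described above.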
[We are working on another article from our data scientists to dive deeper into the optimization problem, but here we’ll focus on the engineering part and thus we’ll continue by showing the software architecture]
The technical solution — Architecture
Let’s get to the software architecture part. We know the problem we want to solve; now let’s add some data:
- We publish ~20 million order assignment decisions per month.
- All our calculations are based on estimations, like the two we saw in the introduction: the time to prepare an order and the time required to travel. That’s why we ask our estimation services for ~5 billion estimations each month.
- Our processing is expected to delay the order assignment by no more than 10 seconds, so we work with a time window of 10 seconds in each city.
- The assignment matrix (free couriers against orders/bundles to be assigned) can grow up to 1k×2k, resulting in ~2M potential combinations to consider in each time window (every 10 seconds).
Let’s refer to Figure 3 for a high-level overview of the system.
The service consumes a lot of events from the rest of the Glovo ecosystem to maintain the state of all the orders and couriers in the world. It is not exposed to the public internet but can be accessed from a back-office web app, the Admin panel.
It is deployed as multiple instances behind a load balancer, scaling up during peak times and down to save energy. Finally, it saves and publishes assignment decisions that are consumed by other services in the Glovo ecosystem.
Its only explicit dependency is on the estimator services, used to calculate the cost function of each assignment that feeds the optimization engine; of course, it also depends on all the order and courier events it consumes (but not on the specific services publishing them).
It requires a relational database to store the state of the “world” (mainly active orders and couriers) and the decisions taken by previous cycles. However, the overall data volume can be kept small due to low retention requirements.
A Redis cluster is also essential, acting as a caching system and providing geo-indexing capabilities that greatly reduce the combinations of orders to consider.
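For instance, geo-indexing lets us pair an order only with orders delivering nearby; a minimal sketch with redis-py (the key name and coordinates are invented; GEOSEARCH needs Redis 6.2+):

```python
import redis

# Hypothetical key name and invented coordinates, for illustration only.
r = redis.Redis()
r.geoadd("delivery_points:BCN", (2.1686, 41.3874, "O1"))
r.geoadd("delivery_points:BCN", (2.1744, 41.4036, "O2"))
r.geoadd("delivery_points:BCN", (2.2100, 41.4200, "O3"))

# Only orders delivering within 2 km of O1 are worth pairing with it.
nearby = r.geosearch("delivery_points:BCN", member="O1", radius=2, unit="km")
print(nearby)  # e.g. [b'O1', b'O2']
```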
Now that we saw the high-level components, let’s dive deeper into what we can find in every instance of the service (Figure 4).
Subscribers and API are pretty obvious components with little business logic; then we have:
Bundle Generator(s)
A bundle generator’s goal is to combine orders in a smart way to create Bundles.
As already mentioned, there are different implementations of this, so you can see the bundle generator as a composition of many single implementations behind a common interface. These implementations are loosely coupled with the rest of the system: they can be added or removed without changing anything else. Multibundling is just one of them, the latest and most complex we implemented.
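In code, that common interface could look like this minimal Python sketch (the names are illustrative, not our actual codebase):

```python
from typing import Iterable, List, Protocol

# Illustrative names only; the point is the shared contract that lets
# generators be added or removed without touching the rest of the system.
class BundleGenerator(Protocol):
    def generate(self, orders: Iterable["Order"]) -> List["Bundle"]:
        """Combine the city's open orders into candidate bundles."""
        ...

def collect_candidates(generators: List[BundleGenerator],
                       orders: List["Order"]) -> List["Bundle"]:
    """Run every registered generator and gather all proposed bundles."""
    return [bundle for g in generators for bundle in g.generate(orders)]
```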
For multibundling, we have a dedicated scheduler that continuously checks for cities that need bundles. The locking mechanism is based on Redlock, and when the lock is acquired, an execution is scheduled with a timeout equal to the lock’s.
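A simplified, single-node version of that acquisition might look like the sketch below (real Redlock spans several Redis nodes; the key name and scheduler hook are hypothetical):

```python
import uuid
import redis

r = redis.Redis()

def schedule_execution(city: str, timeout_ms: int) -> None:
    """Hypothetical hook: submit a multibundling run bounded by timeout_ms."""

def try_schedule_multibundling(city: str) -> bool:
    """Acquire a per-city lock and schedule one execution under its TTL."""
    token = str(uuid.uuid4())
    ttl_ms = 60_000  # lock TTL == execution timeout
    # SET NX PX: succeeds only if no other instance holds the city lock.
    if not r.set(f"multibundling:lock:{city}", token, nx=True, px=ttl_ms):
        return False
    schedule_execution(city, timeout_ms=ttl_ms)
    return True
```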
In execution, the multibundling generator will:
- load all the city’s order data from a read replica, to avoid putting pressure on the main node;
- run the VRP algorithm with all the constraints applied to calculate the best combinations of orders into bundles;
- save the bundles in temporary Redis storage, overwriting the previous execution’s results for the same city, if any.
As the VRP algorithm is very CPU-demanding and we don’t have the same strict requirements as the optimizer, we relaxed the time window to 1 minute.
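We won’t name a specific solver here, but to give an idea of the formulation, here is a minimal pickup-and-delivery VRP sketch using Google OR-Tools (travel times invented): one courier, two orders, each with a pickup node and a delivery node.

```python
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Invented travel times (minutes). Node 0 is the courier's position; order A
# is picked up at node 1 and delivered at node 2, order B at nodes 3 and 4.
times = [
    [0, 5, 9, 7, 12],
    [5, 0, 6, 4, 10],
    [9, 6, 0, 5, 6],
    [7, 4, 5, 0, 8],
    [12, 10, 6, 8, 0],
]
pickups_deliveries = [(1, 2), (3, 4)]

manager = pywrapcp.RoutingIndexManager(len(times), 1, 0)  # 1 vehicle, depot 0
routing = pywrapcp.RoutingModel(manager)

def transit(i: int, j: int) -> int:
    return times[manager.IndexToNode(i)][manager.IndexToNode(j)]

cb = routing.RegisterTransitCallback(transit)
routing.SetArcCostEvaluatorOfAllVehicles(cb)
# A cumulative "Time" dimension lets us force pickups before deliveries.
routing.AddDimension(cb, 0, 120, True, "Time")
time_dim = routing.GetDimensionOrDie("Time")
for pickup, delivery in pickups_deliveries:
    p, d = manager.NodeToIndex(pickup), manager.NodeToIndex(delivery)
    routing.AddPickupAndDelivery(p, d)
    routing.solver().Add(routing.VehicleVar(p) == routing.VehicleVar(d))
    routing.solver().Add(time_dim.CumulVar(p) <= time_dim.CumulVar(d))

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = (
    routing_enums_pb2.FirstSolutionStrategy.PARALLEL_CHEAPEST_INSERTION)
solution = routing.SolveWithParameters(params)

index, route = routing.Start(0), []
while not routing.IsEnd(index):
    route.append(manager.IndexToNode(index))
    index = solution.Value(routing.NextVar(index))
print(route)  # e.g. [0, 1, 3, 2, 4]: both pickups, then both deliveries
```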
The Optimizer
The optimizer is the real core of the system, whose goal is to publish assignment decisions.
A scheduler continuously checks for cities that need assignments, acquires a lock with a timeout through the RDS DB (select * for update …) and schedules an “assignment cycle”.
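As a sketch of what that row-level locking could look like (the table, columns and connection string are invented; only the FOR UPDATE mechanism is the point):

```python
import psycopg2

def run_assignment_cycle(city_code: str) -> None:
    """Hypothetical hook: run one optimization cycle for the locked city."""

# Table, columns and connection string are invented for illustration.
conn = psycopg2.connect("dbname=matching")
with conn, conn.cursor() as cur:
    # The row lock is held until the transaction commits or rolls back.
    cur.execute(
        "SELECT city_code FROM city_locks"
        " WHERE city_code = %s FOR UPDATE SKIP LOCKED",
        ("BCN",),
    )
    if cur.fetchone():
        run_assignment_cycle("BCN")
```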
In each cycle, it:
- Loads the status of the city;
- Loads and validates the bundles proposed by the generators (validation is required because bundle generation happens concurrently and is based on the asynchronous read replica, so it could be out of date);
- Creates the “matrix” of assignments (orders vs. available couriers), checking all the eligibility constraints;
- Loads (or requests, if not yet available) all the required estimations;
- Calculates the cost function for every eligible assignment;
- Translates the problem into a standard optimization formulation (minimum-cost flow or a mixed-integer program);
- Solves it and translates the solution back into the service domain;
- Applies custom domain logic, like manual assignment requests;
- Saves and publishes decisions: this step commits the chosen bundles, so the rest of the Glovo ecosystem will know that these bundles have been created (they can still be modified in the future).
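To illustrate the translation step, here is a minimal minimum-cost-flow sketch with NetworkX: couriers supply one unit each, tasks demand one unit each, and the costs are invented cost-function values.

```python
import networkx as nx

# Invented cost-function values: rows of the assignment "matrix" become arcs
# of a flow network, and the optimal flow is the cheapest assignment.
couriers, tasks = ["c1", "c2"], ["bundle_A", "order_B"]
G = nx.DiGraph()
for c in couriers:
    G.add_node(c, demand=-1)  # each courier supplies one unit (one task)
for t in tasks:
    G.add_node(t, demand=1)   # each task needs exactly one courier
costs = {("c1", "bundle_A"): 7, ("c1", "order_B"): 3,
         ("c2", "bundle_A"): 4, ("c2", "order_B"): 9}
for (c, t), w in costs.items():
    G.add_edge(c, t, capacity=1, weight=w)

flow = nx.min_cost_flow(G)
print([(c, t) for c in couriers for t in tasks if flow[c].get(t)])
# [('c1', 'order_B'), ('c2', 'bundle_A')]: total cost 7 instead of 16
```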
Conclusion: the hard reality
The assignment problem in a delivery platform is core to the efficiency and thus sustainability of the whole service.
The complexity to handle is huge, and it needs to be tackled at a big scale and fast, so the system needs to be designed accordingly, with a combined effort of engineers and data experts.
In this article, we wanted to give you an overall idea of how we are dealing with this at Glovo, but of course, the reality is always more complex. To mention some of the requirements we left out for simplicity:
- some stores have couriers dedicated to them; depending on the specific store configuration, they may be able to use both the dedicated couriers and the regular ones, or even share their dedicated couriers with nearby stores;
- the constraints mentioned on the eligibility of a courier for a particular order are just a small subset of the real ones;
- We have to deal with many different types of orders, food and non-food, with potentially different optimisation requirements (for example: “getting cold” is a food-only constraint);
- the estimations can’t always be accurate, so there are features in place, like hot-swapping of the order, to compensate for that;
- a courier is free to reassign any order, so the same order can be considered in the optimization cycles many times, not just once;
- a decision taken too early is likely suboptimal: estimations try to “predict the future”, and the closer that future, the more accurate they are, so we always try to postpone every decision as much as possible;
- our system makes decisions, but there’s no guarantee that they’ll be applied in reality: a millisecond after a decision is taken, the courier’s phone may run out of battery, a bike tire can burst or the customer could cancel the order… we have to keep following the “actual” status of the world and keep making decisions to chase efficiency;
I hope this overview of our daily world was interesting for you; I’m looking forward to hearing your comments!