Secret for a simple yet elegant matching algorithm made from scratch

Fabien Moreno
May 12 · 6 min read

Operational Research and optimization with Python

Image for post
Image for post
Photo by why kei on Unsplash

Matching customer needs with the most relevant suppliers is crucial in many industries. This is particularly true for platforms which put in touch independent workers with clients. Companies like Uber or Lyft put in a lot of efforts to efficiently match drivers with riders. This is key for user satisfaction on both sides.

But what is the secret behind such a complex problem ? Let me answer this question by presenting you a simple yet elegant solution I have implemented from scratch.

Keep scalability in mind

To make it simple, you would say, let’s try every possible permutations of client/driver duos and keep the cheapest one. In essence that is the brute force idea. But the total number of permutation to asses increases steeply. If you start with 5 drivers for 2 clients requests, you have 20 permutations. If your business grow by 10 folds (50 drivers to be match with 20 clients), you end up with more than 10³² permutations ! Definitely not the definition of scalability.

So let’s be more clever … and that is where you open the door to an amazing, but not renowned, field of mathematics and data: Operation Research or Combinatorial optimization.

Application to car delivery service

They rely on car delivery companies such as hiflow.com which are offering a premium delivery service where the car is driven by an independent professional driver from store location to the end-client.

As any logistics company, building a more intelligent network leading to cost optimization is fundamental. The cost of such car transfer can be broken-down to :

  • Driver transportation costs from home to car pick-up point (mostly public transport)
  • Driving costs (petrol, toll, insurance, …)
  • Driver transportation costs from car delivery point back to home (mostly public transport)
  • Driver’s contract price proposed for the overall trip

Our main optimization goal will be to reduce overall transportation costs by performing an intelligent matching. The easiest solution would be to match the drivers the closest to pick-up points for each car transfer request. But don’t forget many local optimums do not always lead to global optimum.

Let’s build our optimization model

Image for post
Image for post
Optimization variables

Our goal will be to minimize the overall cost :

Image for post
Image for post
Optimization equation

Without any additional constraints, we will end up with a trivial solution where no matching is proposed (all x = 0). We need to add a minimum number of matches for each transfer requested (we don’t want any client to be left behind) :

Image for post
Image for post
Constraint equation for transfers

Finally, we need to limit the number of matches proposed to each driver otherwise at some point it will become irrelevant to receive dozens of job propositions :

Image for post
Image for post
Constrain equation for drivers

Theses constraints can be adjusted depending on the situation of each driver or transfer. For example, a transfer located in the middle of nowhere, should be proposed several times to find a match. Knowing this, you can raise the minimum number of matches proposed for this particular transfer.

Define what drives your cost

The primary criteria which will be used is the distance between driver’s home and transfer pick-up point. This is the criteria most correlated to transfer direct costs.

There are several ways to compute this estimation (distances are as the crow flies, public transport network nodes, accurate multi-modal transportation costs, machine-learning regression). Improving this cost estimation is a never-ending challenge. The secret recipe you will use mostly depends on what you want to improve. In the peer-to-peer ridesharing example, that will be the client waiting time for a driver to reach him/her.

Boost certain matches

In our case, to reduce driver churn, we decided to boost those who have done few transfers in the last couple of weeks. So for each of them we define a coefficient (Bi) which is applied to the cost estimation as such :

Image for post
Image for post
Boosted cost matrix

Boost for a given transfer or for a specific driver/transfer duo can also be applied. At the end of the day, this is a matching proposition, drivers remain always in control on which transfer they would like to make an offer.

Define your constraints

In our case, we have the same problematic. Some drivers which are not certified for a given client transfer process will not be matched accordingly. Thus, an exclusion matrix id defined and a new constraint is added to our model :

Image for post
Image for post
Exclusion constraint

Time to solve the model

The most accessible, and one that you may know is the Excel Solver. This is basic but can help you quickly solve a problem without any line of code. Obviously, this is not an option applicable in production.

The easiest solution we found to solve this model in a reliable manner is by using a Python programming language based library named PuLP. With a few lines of code, PuLP allows you to generate any Linear Programming model. Once the model generated, you can select among many different solvers (such as Open Source GLPK or CBC, or a paid one like Gurobi) to solve it.

I will go into the details of the code programming of these solvers and how to use them in production in a following article. In the meantime, I recommend this article :

Conclusion

Anytime you are facing a matching problem, you should consider this solution, whether for ridesharing or dating.

The Startup

Medium's largest active publication, followed by +734K people. Follow to join our community.

Thanks to Mado Hemery and Sophie Catonné

Fabien Moreno

Written by

Product / Data / Performance @Hiflow.com

The Startup

Medium's largest active publication, followed by +734K people. Follow to join our community.

Fabien Moreno

Written by

Product / Data / Performance @Hiflow.com

The Startup

Medium's largest active publication, followed by +734K people. Follow to join our community.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store