At Gousto Data Science we often find ourselves using Combinatorial Optimisation in a variety of contexts. A few examples are:

• Order routing — deciding which stations to send a box to in order to optimise our line throughput.
• Warehouse space optimisation — deciding the allocation of space to stock in our warehouse to facilitate optimal stock flows.
• Recipe to site allocation — deciding which recipes to host at each of our fulfilment sites to balance operational flexibility with operational complexity.

Constraint Programming

While we always try to use the best tool for the job at hand, in practice we very often find Constraint Programming to be the right choice, for a few reasons.

Constraint Programming is a declarative paradigm: you declare a model and pass it to a pre-existing solver to solve. This has a few advantages:

• It allows the user to concentrate on defining an appropriate model without worrying about how to actually solve it.
• It allows us to leverage custom solvers built by people with deep expertise in the field, such as Google’s fantastic ortools CP-SAT solver.
• It makes it easy to add in additional constraints as needed, since we simply need to define them in our model, rather than change our whole algorithm.

CP is based on constraint propagation and doesn’t rely on assumptions about the solution space, such as convexity or linearity. This makes it easy to add a range of constraints such as:

• Logical constraints such as and, or, xor, if and only if, etc.
• Arithmetic expressions, including nonlinear expressions such as variable multiplications, min and maxes, absolute values etc.

This gives us significant flexibility to model complex problems.

Exact optimisation methods can give us a provably optimal solution. This is in contrast to heuristic or metaheuristic methods which cannot guarantee optimality. The tradeoff here is that difficult or large optimisation problems can be so complex that the solver cannot prove optimality, or even give a feasible solution in a reasonable timescale.

Because CP can struggle with more complex problems, a variety of tricks and techniques can be used to drastically improve the ability of your program to find a good or even optimal solution. To demonstrate some of these techniques, I will use a famous problem in computer science — the graph colouring problem.

Job scheduling and the Graph Colouring Problem

Imagine you have a set of jobs to perform, each taking a similar time to execute, and each requiring some kind of shared resource to complete — meaning that some jobs cannot be completed at the same time since they use some of the same resources and hence create conflicts. You are keen to finish as soon as possible, so want to find a job schedule that you can complete in the quickest possible time.

For example, suppose we have five Gousto warehouse employees with different warehousing skill licences, and six complex jobs to complete. The dependence of jobs on employees is shown in the table below:

We can see that employee A is in demand — they are required for three separate jobs, so we know it will take at least three time slots to complete all the jobs. However, is this the maximum time, or could it take longer? To help us find out, we can start by representing the problem as a graph as follows:

Here, the jobs are represented by the nodes of the graph, and the job conflicts are shown by the edges between nodes — e.g. if we are doing job one, then we cannot do jobs four and five at the same time, as they all require employee A. We create the graph by generating edges between all job nodes that are in the same column in our job dependence table above.

The problem can then be modelled as a Graph Colouring problem — a classic problem in computer science, formally stated as:

Graph colouring is the procedure of assigning colours to each node of a graph G such that no adjacent nodes are the same colour. The objective is to minimise the number of colours while colouring a graph.

In our problem, we can think of the colours as each representing a different time slot — hence the number of colours on the graph is the number of time slots needed. If two jobs are in conflict then they can’t be done in the same time slot — they can’t have the same colour. We can see a solution represented below:

Looking at the graph we can see that we can indeed complete all the jobs within three time slots.

The GCP is an NP-complete problem which can be very challenging for large and dense graphs — this makes it a great toy problem to demonstrate some advanced CP modelling techniques.

A formulation for the graph colouring problem can be seen below:
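One standard formulation (a reconstruction in my own notation; the post’s exact symbols may differ) uses binary variables $x_{v,c} = 1$ if node $v$ is assigned colour $c$, and $y_c = 1$ if colour $c$ is used at all, with $C$ an upper bound on the number of colours:

```latex
\begin{aligned}
\min\ & \sum_{c=1}^{C} y_c \\
\text{s.t.}\ & \sum_{c=1}^{C} x_{v,c} = 1 && \forall v \in V \\
& x_{u,c} + x_{v,c} \le 1 && \forall (u,v) \in E,\ c = 1, \dots, C \\
& x_{v,c} = 1 \implies y_c = 1 && \forall v \in V,\ c = 1, \dots, C \\
& x_{v,c},\, y_c \in \{0, 1\}
\end{aligned}
```

The objective minimises the number of colours used; the first constraint gives every node exactly one colour, and the second forbids adjacent nodes from sharing one.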

It’s interesting to note that the constraint linking the two variable sets is a logical constraint rather than an algebraic one — this is something easy to achieve in CP which might be harder using another paradigm such as Integer Programming. Example code for the vanilla GCP model is below:

Improving on our vanilla GCP model

The model above shows a basic implementation of CP to solve the GCP. The following sections will detail some tricks to improve the power of our model.

Symmetry breaking

Symmetry in combinatorial optimisation problems occurs when we can permute either the variables of a solution, or the values of those variables, and retain an equivalent solution.

Looking at the simple graphs above, we can see that while each solution has the colours uniquely arranged, the solutions are in fact equivalent. In general, if a solution has k colours, then there are k! equivalent solutions, dramatically increasing the size of the solution space for larger problem instances.

In our toy problem, the symmetry comes from the fact that each time slot is equivalent — the ordering of tasks doesn’t matter as long as the conflicts are respected.

We can leverage this symmetry to improve the solve time of our model. We can add a constraint:
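With $y_c$ a binary variable indicating that colour $c$ is used, a common form of this constraint (my reconstruction; the post’s exact version may differ) is:

```latex
y_{c+1} \le y_c \qquad c = 1, \dots, C - 1
```

In words, colour $c + 1$ may only be used once colour $c$ already is.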

This constraint ensures that a lower indexed colour has to be used before a higher indexed one can be, breaking the value symmetry of our model.

Variable search ordering

Now this might get a bit technical... CP problems are solved using search techniques that use logical inferences gained from the model constraints to iteratively tighten variable domains. When the solver can no longer tighten any domains based on the constraints, it has to make a guess at a variable’s value and continue the search from there, backtracking if that particular line of inference results in infeasibility.

It is possible to provide the solver with rules for choosing an ordering of variables to assign and branch on. The motivation here is that it is beneficial to assign certain variables first — typically variables which are hard to assign while maintaining feasibility of the overall solution.

For the GCP, the hardest variables to assign are ones in dense parts of the graph, as these are the most constrained by the colour of their neighbouring nodes. By assigning these nodes colours first, we reduce the chance of the solver finding a solution to be infeasible when almost all colours are assigned.

In our toy example we can order our nodes by the average degree of their neighbours and search in this order, ensuring that the hardest parts of the search space are explored first.
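A sketch of this ordering in plain Python follows; the commented CP-SAT wiring assumes a hypothetical model with one integer colour variable per node (names are mine):

```python
def neighbour_degree_order(num_nodes, edges):
    """Order nodes by the average degree of their neighbours, hardest first."""
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    degree = [len(a) for a in adj]
    avg = [sum(degree[n] for n in adj[v]) / len(adj[v]) if adj[v] else 0.0
           for v in range(num_nodes)]
    return sorted(range(num_nodes), key=lambda v: -avg[v])

# Hypothetical usage inside a CP-SAT colouring model, where colour[v] is an
# integer variable giving node v's colour:
#   order = neighbour_degree_order(num_nodes, edges)
#   model.AddDecisionStrategy([colour[v] for v in order],
#                             cp_model.CHOOSE_FIRST, cp_model.SELECT_MIN_VALUE)

print(neighbour_degree_order(4, [(0, 1), (1, 2), (0, 2), (2, 3)]))
```

`CHOOSE_FIRST` tells the solver to branch on the variables in the order given, which is how the hand-crafted ordering is injected.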

Greedy bounding

Like many classic optimisation problems, the GCP has a lot of well-researched heuristic/greedy algorithms to solve it, which we can leverage to speed up our CP solver. To understand why, it is important to know that CP solvers optimise by iteratively finding better solutions and re-solving with the latest objective value as an upper bound.

A good greedy algorithm can find decent solutions in a few milliseconds or less, which we can then use to set an upper bound for our CP model. This means the solver doesn’t need to waste time investigating any solutions that provide worse objective values than we have set by our upper bound.

Some code for a greedy algorithm follows:
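The snippet below is a plain-Python sketch of one standard approach, a descending-degree (Welsh-Powell style) greedy colouring; it is a reconstruction rather than the post’s original code:

```python
def greedy_colouring(num_nodes, edges):
    """Greedily colour nodes in descending-degree order.

    Returns a colour index per node; the number of distinct colours used
    gives an upper bound that can be fed to the CP model's objective.
    """
    adj = [set() for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    colour = [-1] * num_nodes
    for v in sorted(range(num_nodes), key=lambda v: -len(adj[v])):
        used = {colour[n] for n in adj[v] if colour[n] != -1}
        c = 0
        while c in used:  # take the lowest colour not used by a neighbour
            c += 1
        colour[v] = c
    return colour

colours = greedy_colouring(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
upper_bound = max(colours) + 1
print(colours, upper_bound)
```

The resulting bound can then be imposed on the CP model, e.g. with a constraint of the form `model.Add(sum(y) <= upper_bound)` in the formulation discussed earlier.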

A GitHub Gist containing a GCP model with all of the above implemented can be found here.

Testing our improvements

Now we have seen some of the potential tricks to improve our model, let’s try them out on some test cases.

The code below can be used to create test instances of the GCP, with a parameter to set the number of nodes, a parameter to set the density of edges using a power distribution, and some upper and lower bounds:
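A sketch of such a generator is given below. This is a reconstruction and the parameter names are mine: target node degrees are drawn from NumPy’s power distribution, clipped to the given bounds, then edges are sampled to roughly match them:

```python
import numpy as np

def generate_gcp_instance(num_nodes, power=0.5, min_degree=1,
                          max_degree=None, seed=0):
    """Generate a random GCP test instance (illustrative reconstruction)."""
    rng = np.random.default_rng(seed)
    if max_degree is None:
        max_degree = num_nodes - 1
    # Power distribution on [0, 1]; small `power` skews towards low degrees.
    raw = rng.power(power, size=num_nodes)
    degrees = np.clip((raw * max_degree).astype(int), min_degree, max_degree)
    edges = set()
    for v in range(num_nodes):
        others = [u for u in range(num_nodes) if u != v]
        # Sample distinct neighbours for v; dedupe via a (min, max) edge key.
        for u in rng.choice(others, size=degrees[v], replace=False):
            edges.add((min(int(u), v), max(int(u), v)))
    return sorted(edges)

edges = generate_gcp_instance(50, power=0.6, seed=1)
print(len(edges))
```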

The histograms below show the degree of the nodes in two test instances — one pretty trivial and one very difficult.

In the easy test instance we have 50 nodes and 339 edges. Each optimisation was run for 60 seconds before stopping. The results can be seen below:

Interestingly, only the greedy algorithm wasn’t able to find the optimal solution of 9 (although it was roughly 1000 times faster than the best other model ¯\_(ツ)_/¯).

The vanilla and VSO models weren’t able to prove optimality in the 60 second limit. Of the two other models, symmetry breaking proved better than greedy bounding, with the combination of all three solving to optimality in an impressive 0.62 seconds. This would be of particular importance if you have to solve the model fast or many times over (like a certain Order Routing Algorithm mentioned earlier).

Next we look at a significantly harder problem instance: 500 nodes and a whopping 53,000+ edges. Each optimisation was run for 10 minutes before stopping. The results can be seen below:

Here we have little hope of proving optimality — we are more interested in how much we can improve the objective value. The greedy algorithm is predictably the worst, although of course solves much quicker than all the other models. The vanilla model only performs slightly better than the greedy one.

Counterintuitively, both the VSO model and the greedy bounding perform better than all three tricks combined in one model. This illustrates something that anyone who has worked with Combinatorial Optimisation will know well: you may have some intuitions about what to do, but you won’t truly know what works best without a bit of trial and error.

Summary

We’ve gone through a few different techniques to boost the power of your CP model and help you solve more complex problems. The blog post is of course non-exhaustive — CP is a deep area of research after all. In particular, if you enjoyed this blog post and are thinking of getting started using CP, look into Global Constraints and how they can be used to improve your modelling skills. It would be remiss of me to not give a shout out to Coursera’s fantastic Discrete Optimisation course which directly and indirectly taught me many of the concepts discussed in this post.

A final note — if this kind of work is of interest to you then have a look at our Careers page as we are often recruiting for a variety of roles in Data.

Gousto Engineering & Data
