An Intro to Integer Programming for Engineers: Simplified Bus Scheduling
This article was originally part of Remix’s series on the software engineering problems they face. In this installment, Dan Hipschman walks through bus driver scheduling and a simplified approach to solving it using integer programming.
I studied NP-Hardness in school but never tackled any problems in real life. Until now.
At Remix, I finally encountered NP-Hardness, adding a powerful new tool called integer programming to my tool belt. This is an intro to integer programming for engineers who haven’t used it. The goal is for you to start thinking about some ways you can use integer programming and wanting to learn more.
The Bus Driver Scheduling Problem
Let’s start with a motivating example. At Remix we improve public transportation, especially bus systems. Have you ever wondered how bus drivers coordinate their lunch breaks? Or if you aren’t on the bus that often, maybe you’ve wondered how flight attendants and pilots fly around the world and end up back home, which is a similar problem. We’re going to solve the problem of scheduling bus driver shifts in this post. I’m going to include working code, so for the full experience, please break out your shell and editor to follow along.
Let’s frame the problem. Say we have four bus routes, called A, B, C, and D. Let’s assume that each route takes an hour to complete. They’re loops, so they start and end in the same place. Each route has one bus running on it from 9 AM until 9 PM. Finally, the routes all converge at the same point every hour on the hour, so it would be easy for drivers to switch shifts and take their lunch breaks at that location. I drew some demo routes that all converge in the top right of the map to make this clear.
Since our bus service runs from 9 AM until 9 PM, and each route takes an hour to complete, we’ll have 12 full trips for each route (48 total trips). Let’s represent each trip as a tuple: (route_name, hour_of_day)
. Let’s write a function to generate these. Don’t worry if you don’t know Python, we’ll keep it simple.
Here’s our resulting 48 trips:
>>> generate_trips('ABCD', 9, 12 + 9)
[('A', 9), ('A', 10), ('A', 11), ('A', 12), ('A', 13), ('A', 14), ('A', 15), ('A', 16), ('A', 17), ('A', 18), ('A', 19), ('A', 20), ('A', 21), ('B', 9), ('B', 10), ('B', 11), ('B', 12), ('B', 13), ('B', 14), ('B', 15), ('B', 16), ('B', 17), ('B', 18), ('B', 19), ('B', 20), ('B', 21), ('C', 9), ('C', 10), ('C', 11), ('C', 12), ('C', 13), ('C', 14), ('C', 15), ('C', 16), ('C', 17), ('C', 18), ('C', 19), ('C', 20), ('C', 21), ('D', 9), ('D', 10), ('D', 11), ('D', 12), ('D', 13), ('D', 14), ('D', 15), ('D', 16), ('D', 17), ('D', 18), ('D', 19), ('D', 20), ('D', 21)]
The driver scheduling problem is that we have some work that needs to be done, and we need to assign it to drivers. In other words, we want to come up with a set of duties for the bus drivers, where a duty is simply a list of trips the need to drive. Every trip must be part of exactly one driver’s duty. Let’s further assume we have some rules from payroll:
- Each duty over four hours should allow the driver to have a break
- Drivers get paid for each hour they work, but not for breaks
- They get a minimum pay of eight hours for coming into work (known as guaranteed pay)
- They get 1.5x pay for each hour they work over eight hours (overtime pay)
We’d like not only to determine a set of duties, but the payroll department would like us to minimize driving costs. How do we do this? Well, as you might have guessed, we use integer programming… somehow. But first let me explain what integer programming is.
What Is Integer Programming Anyway?
As a precursor to integer programming, I’m going to define linear programming. Linear programs are mathematical models and there are algorithms to solve them. Let’s start with a linear expression. This is called the objective function:
The goal of linear programming is to minimize some objective function. By “some” objective function, I mean that n and the a’s are all known for a particular problem. E.g., 3 * x1 + 14 * x2
.
Now, of course, if that was all there was to it, we could just set all the x to zero. However, linear programs are also allowed to have one or more constraints. Constraints are each simply another linear expression, set less than or equal to zero. Note that the coefficients of the x’s can be different than the objective, and there can be a constant term as well. E.g.,
And so on. There can be as many constraints as you like. Note that some factors in the constraints could be zero, effectively meaning that not every variable needs to be in every constraint.
Equality is allowed, using a sort of “trick” which you can verify. The following:
is equivalent to:
OK, so given the objective and the constraints, the goal of linear programming is to minimize the objective.
Let’s Jump Into a Shell
Let’s make this concrete and jump into a shell. We can install and use a Python integer programming solver library to demonstrate all this (it also solves linear programs). The Python package is called Pulp. Here’s a demo:
$ pip install pulp
$ python
>>> from pulp import *
>>> x1 = LpVariable('x1')
>>> p = LpProblem('p', LpMinimize)
>>> p += x1
>>> p.solve()
1
>>> LpStatus
{0: 'Not Solved', 1: 'Optimal', -2: 'Unbounded', -1: 'Infeasible', -3: 'Undefined'}
>>> x1.value()
0.0
Let’s walk through this line by line.
$ pip install pulp
This simply installs pulp. If you’re not in a virtualenv
, you might need to use sudo
.
>>> from pulp import *
This imports Pulp. I’m going to be lazy and import everything into the global namespace using *
. Everything in Pulp starts with “Lp” or “lp” so it should be clear what came from the module.
>>> x1 = LpVariable('x1')
This creates a variable. The argument is the name, which is arbitrary (just make it unique if you create more than one variable).
>>> p = LpProblem('p', LpMinimize)
This creates a problem, which is Pulp’s way of keeping pieces of the linear program together. The first argument is another arbitrary name. The second is LpMinimize
because the default is to maximize the objective. (Minimizing and maximizing are mathematically equivalent — multiply the objective by -1 — so linear programming can do both).
We’ll “add” the objective function with the following syntax (Pulp uses a lot of operator overloading):
>>> p += x1
Hence our objective function is as follows:
Not very exciting, but let’s see what happens.
>>> p.solve()
1
>>> LpStatus
{0: 'Not Solved', 1: 'Optimal', -2: 'Unbounded', -1: 'Infeasible', -3: 'Undefined'}
The solve
function returns a status. The different statuses are described in LpStatus
. In this case, Pulp is telling us it found an optimal solution to the problem. After calling solve
, the variables will have their value set to whatever minimizes the objective. So let’s inspect:
>>> x1.value()
0.0
Pulp tells us that the assigning 0
to x1
minimizes the objective (remember that all the variables are nonnegative). In this case the problem was trivial.
So let’s add a constraint:
>>> p += x1 >= 1
>>> p.solve()
1
>>> x1.value()
1.0
This adds a constraint that x1 >= 1
, and then re-solves. Obviously the minimal value of x1
is now 1
, since 0
is not allowed by the constraint. Now our complete linear program is:
Progress! We’ve learned how to write and solve linear programs using Pulp.
Solving the Driver Scheduling Problem
Let’s move on to integer programming, and how it gets us closer to solving our driver scheduling problem.
An integer program is exactly the same as a linear program, with the additional constraint that all x are integers. I.e,
Without getting into theory, I’ll just mention that solving integer programs is an NP-hard problem. However, there are algorithms that can often solve them much faster than brute force.
Pulp can also solve integer programs, which I’ll show you in a bit. But first, let’s discuss the high level approach to how integer programs will solve our original driver scheduling problem. We’ll solve our problem by modeling it as a weighted set partitioning problem. This image gives an intuitive idea of the problem:
The idea is that we have a set of items, represented by the points in the lefthand image above. We also have a set of sets, each represented by a squiggly loop in the left image. In this case, for example, there are seven sets (trust me, since counting them is probably bad for your eyes). The set partitioning problem is to find the smallest number of sets that contain all the items (points), without any overlap between sets (hence a “partition”). Every point must be in one and exactly one set. The solution to the problem above is shown on the righthand side.
The weighted set partitioning problem is when we assign a cost to each set. Then, instead of minimizing the number of sets, we minimize the sum of the values of the sets we choose. You can see that if we set all the weights to one, we get the original set partitioning problem.
Set partitioning is NP-complete (both weighted and unweighted). However, we can use integer programming to solve it optimally. NP-complete doesn’t mean that problems are always slow to solve. It only means that we can construct degenerates cases that cause any general algorithm to run slowly some of the time. At least, to the best of our knowledge (unless someone proves P=NP). Integer programs can often be solved relatively quickly.
In our problem, the dots in the set partitioning diagram are trips a bus takes from its origin to its destination (our tuples), and the sets are potential driver duties. In other words, each duty is a set of trips that a driver could potentially do. We’ll assign a weight to those sets using the driver pay rules we defined above (overtime, guarantee pay, etc.). When we solve the weighted set partitioning problem, we’ll get the set of duties that cover all the trips and minimize the sum of individual drivers’ costs. This is exactly what we want.
So here’s how we’ll do it. Each potential duty will be a variable (an x) in our integer program. We’ll further constrain all the x to be between zero and one. Since they’re integers, this actually restricts them to be exactly zero or one. The idea behind this is that a particular x will be one in the solution if the solver decides that using the duty is part of the optimal solution, and zero if the duty is discarded as not being part of the optimal solution.
The coefficient of each x is the cost of that duty in terms of driver pay.
Hence our integer program looks like this (let’s use p for pay instead of the a’s we were using above):
This means we have n potential duties. Each p is the driver pay (hence payroll cost) of the duty. We want Pulp to choose some x’s to be zero and some to be one. By doing this, our objective function will turn into the sum of the payments we have to make. And we want to minimize this.
Of course, as is, there are no constraints that some x need to be selected (set to one) at all. Hence, Pulp will helpfully set all x to zero to minimize the objective. What constraints do we need to add?
Well, every trip needs to be in exactly one duty. So we can add a constraint for every trip, so that the x’s corresponding to the duties that trip appears in sum up to 1. For example, we might have a constraint like this:
This assumes for simplicity’s sake that the first five x’s correspond to the duties containing trip ('A', 9)
. Likewise, we’d have a constraint like this for every other trip.
These constraints say that for a given trip, exactly one duty must include it. No more, no less.
Time to Write Some Code!
OK, so we’re ready to write some code to solve our problem! Well, we’re close! But there’s one more hitch. Where are we going to get all the possible duties from? Generating every possible duty explodes exponentially with the number of trips. Hence, for now, we’ll randomly generate duties, and see what happens.
We already have code written above to generate the set of trips. Let’s write a function to randomly generate some duties from those trips. We’ll make sure that we don’t generate a duty with multiple trips that occur at the same time.
So let’s write it like this:
- We’ll randomly choose the number of trips in the duty
- Then we’ll randomly choose the hours for those trips
- Then we’ll randomly choose the routes for each trip
Here’s the fully functioning code to generate trips and duties (extending the code above):
When I run this, here’s what I get:
$ ./duty_generation.py
[[('A', 14), ('A', 18)],
[('A', 9), ('B', 11), ('B', 12), ('D', 14), ('A', 15), ('A', 16), ('D', 18)],
[('C', 17), ('B', 18)],
[('C', 18)],
[('B', 9), ('C', 11), ('A', 16), ('B', 18), ('D', 19), ('C', 20)]]
Great! Now let’s write a function that tells us how much each duty costs:
And here’s what the output looks like when I run it (I actually ran it a few times to get an interesting mix of costs):
[(1000000.0,
[('C', 9),
('A', 10),
('C', 11),
('C', 12),
('B', 13),
('A', 14),
('B', 15),
('C', 16),
('A', 17),
('C', 18),
('B', 19),
('B', 20)]),
(8, [('B', 12), ('A', 13), ('C', 17)]),
(8, [('B', 11), ('D', 13), ('D', 14), ('A', 16), ('A', 17), ('C', 19)]),
(8, [('C', 16)]),
(11.0,
[('C', 10),
('A', 11),
('C', 12),
('D', 13),
('C', 14),
('D', 15),
('A', 16),
('A', 18),
('B', 19),
('C', 20)])]
The Fun Part
OK! Now for the really fun part, solving the problem! Here’s the code. Recall the Pulp code we wrote above, as we’re building on top of it.
Here’s the complete runnable code.
Let’s run it and see what happens. The output will be fairly long because I’m asking Pulp to print a representation of the problem. I’ll truncate some repetitive parts to make it shorter:
driver_scheduling:
MINIMIZE
12.5*x1 + 8*x10 + 8*x11 + 8*x12 + 8*x13 + 8*x14 + 8*x15 + 8*x16 + 8*x17 + 8*x18 + 8*x19 + 9.5*x2 + 8*x20 + 8*x21 + 8*x22 + 8*x23 + 8*x24 + 8*x25 + 8*x26 + 8*x27 + 8*x28 + 8*x29 + 8*x3 + 8*x30 + 8*x31 + 8*x32 + 8*x33 + 8*x34 + 8*x35 + 8*x36 + 8*x37 + 8*x38 + 8*x39 + 1000000.0*x4 + 8*x40 + 8*x41 + 8*x42 + 8*x43 + 8*x44 + 8*x45 + 8*x46 + 8*x47 + 8*x48 + 8*x49 + 11.0*x5 + 8*x50 + 8*x51 + 8*x52 + 8*x53 + 8*x6 + 8*x7 + 8*x8 + 8*x9 + 0.0
SUBJECT TO
_C1: x7 = 1_C2: x33 = 1_C3: x1 + x45 = 1_C4: x14 + x2 + x5 = 1_C5: x1 + x20 + x4 = 1..._C47: x25 + x4 = 1_C48: x24 = 1VARIABLES
0 <= x1 <= 1 Integer
...
0 <= x9 <= 1 IntegerOptimal
Cost: 239.5
[[('B', 9),
('B', 10),
('B', 11),
('D', 12),
('D', 13),
('D', 14),
('D', 16),
('B', 17),
('D', 18),
('D', 19),
('B', 20)],
[('D', 10),
('C', 11),
('B', 13),
('B', 14),
('D', 15),
('C', 16),
('A', 17),
('B', 18),
('A', 19),
('D', 20)],
[('A', 9)],
[('A', 10)],
[('A', 11)],
[('A', 12)],
[('A', 13)],
[('A', 14)],
[('A', 15)],
[('A', 16)],
[('A', 18)],
[('A', 20)],
[('B', 12)],
[('B', 15)],
[('B', 16)],
[('B', 19)],
[('C', 9)],
[('C', 10)],
[('C', 12)],
[('C', 13)],
[('C', 14)],
[('C', 15)],
[('C', 17)],
[('C', 18)],
[('C', 19)],
[('C', 20)],
[('D', 9)],
[('D', 11)],
[('D', 17)]]
So for all our hard work, we have a solution! Hooray! But it’s a pretty crummy one. Most drivers do one trip, get paid the minimum of 8 hours, and go home. Well, it’s good for them, but payroll won’t be happy. This is because we’re only generating 5 duties, of course. Let’s increase that to 100 duties. And let’s also time how long it takes. Here’s the result (without Pulp output this time):
Optimal
Cost: 120
[[('A', 9), ('C', 11), ('D', 13), ('A', 14), ('D', 15)],
[('A', 10), ('D', 11), ('C', 13), ('A', 16), ('B', 18), ('D', 19)],
[('C', 17)],
[('D', 14), ('C', 15), ('D', 16), ('D', 17), ('A', 19)],
[('C', 9), ('B', 10), ('D', 18), ('C', 19), ('A', 20)],
[('B', 11), ('C', 14), ('C', 16), ('A', 18), ('B', 19), ('B', 20)],
[('D', 10), ('A', 12), ('A', 13), ('B', 16), ('A', 17), ('C', 18), ('C', 20)],
[('D', 9), ('C', 10), ('D', 12), ('B', 13), ('B', 17), ('D', 20)],
[('A', 11)],
[('A', 15)],
[('B', 9)],
[('B', 12)],
[('B', 14)],
[('B', 15)],
[('C', 12)]]real 0m1.297s
user 0m0.110s
sys 0m0.000s
That’s much better! The cost is almost half! But it’s still not great. Let’s generate 1000 duties.
Optimal
Cost: 70.0
[[('C', 9), ('C', 11), ('C', 12), ('A', 15), ('C', 18), ('A', 19)],
[('A', 10), ('A', 11), ('D', 12), ('A', 13), ('A', 16), ('D', 18), ('C', 20)],
[('A', 9),
('C', 10),
('A', 12),
('C', 13),
('B', 14),
('D', 15),
('D', 16),
('C', 17),
('B', 19)],
[('D', 10),
('B', 12),
('B', 13),
('A', 14),
('B', 16),
('A', 17),
('B', 18),
('C', 19),
('B', 20)],
[('B', 9),
('B', 10),
('B', 11),
('D', 13),
('D', 14),
('B', 15),
('C', 16),
('B', 17),
('A', 18),
('D', 19)],
[('D', 9), ('D', 11), ('C', 14), ('D', 17), ('A', 20)],
[('D', 20)],
[('C', 15)]]real 4m41.586s
user 3m11.270s
sys 0m0.990s
Phew, that’s much better. But also much slower! Well, we are solving an NP-complete problem after all. In practice, we could speed this up by generating duties more cleverly. E.g., there’s no reason for drivers to jump back and forth between routes all the time. If we limited that, we could generate more “good” duties, while ignoring “bad” duties without passing them to the solver.
There are also much more advanced techniques to speed up and improve results, but the intent of this article is to whet your appetite.
Integer Programming as a Flexible Framework
So that’s it! We’ve used integer programming to solve our problem. And we’ve saved the effort of thinking up an algorithm for this specific problem, which would probably be either slow, or suboptimal, or both. Integer programming also gives us a very flexible framework for making changes. E.g., if we can customize the cost function to penalize some duties (like those that switch routes too often because that’s confusing). When the problem gets more complicated, such as routes ending in different locations, or trips starting and ending at different times, integer programming can handle that really easily. If you don’t want something to happen you can either just add a severe penalty (like we did with the driver break rule), or add a constraint.
There are many other interesting classes of problems that can be solved using integer programming as well. Since solving an integer program is NP-hard, it can in fact be used to solve many NP and NP-complete problems (as long as you can model it as an integer program without exponential growth). This is a very powerful tool, because it’s often much easier and effective to model an NP or NP-complete problem as an integer program than it is to write an algorithm to solve it. You can also solve polynomial time algorithms using integer programming, and this is still often useful as a first cut solution, and it often turns out to be fast enough.
To name just a few other examples of problems that integer programming is well suited to solve:
- Transportation problems (such as package delivery, or resource delivery such as energy distribution)
- Warehouse and container packing problems
- Matching problems (such as what stock brokerages do to match buyers with sellers, or scheduling which sports teams play each other)
- Map coloring
By the way, Pulp is a Python wrapper around back end solvers written in C. By default it uses CBC, which is an open source solver. There are commercial solvers as well. These are often used for harder problems in industry because they can be much faster. One well-known commercial solver is called Gurobi, and they have many example usages of integer programming. I encourage you to look through some Pulp case studies as well.
I hope you’ve found this interesting! Please comment or share this if you did. We’re planning to write another post going into more depth, so follow us if you’re interested.
Want to work on these types of problems? Remix is hiring!