Integer Programming for Graph Theory and Others with Python: 01 - Introduction / Knapsack

Alex Brou
Analytics Vidhya
Published in
5 min readApr 5, 2020

I have been fascinated with Operation Research and AI ever since I was introduced to it. Looking back to my classes with Professor João Pedro Pedroso, I was fascinated with the contents while everyone else was bored with all the numbers and polynomial equations written on the board.

Professor João Pedro Pedroso (University of Porto). The best teacher I have ever had.

Those equations changed my life. The fact that you could model a problem as a set of equalities and inequalities and just let an algorithm find the best possible solution was mindblowing. After I finished my BSc Degree and started looking for work I would tell everyone during an interview that I had skills regarding Linear Programming and Heuristics most of the interviewers would look at me clueless and ask me what I was talking about. Fortunately, I had the chance to face an optimization project on my own while working for a Software House called Waterdog, based in Portugal. They said “you told us you are good with algorithms and such, do this” and I was faced with a flux optimization problem with no help from anyone else in the company, no papers to read, nothing. And I solved it, and that is how I got my 2 next jobs working for Mobility-oriented companies.

The Railway Map of the Great Western Railway (GWR) , which can be generated using Integer Programming as a Cost Minimization Problem

What will this series of blog posts consist of?

I will include some tutorials on generic Graph Theory problems such as Shortest Path, Minimum Spanning Tree, Travelling Salesman, and Flux problems, include Python code for every solution and, at the end of the series, show how to write a complete Carpool Optimization Solution.

What is Integer Programming?

Integer Programming is a subset of Linear Programming where you use Integer Variables. Linear Programming is, putting it simply, an optimization method where you write an optimization function and a set of constraints for an algorithm to find one solution that has the highest (or lowest, we’ll get to that) value for the optimization function, or tell you that a solution for that problem does not exist. All functions must be polynomial of degree 1 or zero.

A visual representation of a Linear Programming model and Solution using only 2 variables.

Here’s an example of Linear Programming:

Optimization Function ( z ) = 7*x + 5*y , to maximizewhere:x + y ≤ 602x + 0.6y ≤ 65x ≤ 30 and y ≤55 and x ≥ 0 and y ≥ 0

If we run this problem through a Linear Programming Solver, we will find that the solution x=20.714286 and y=39.285714 maximizes the optimization function, reaching a value of 341.428572.

Here is the Python code to solve this problem, using PuLP:

Python Code for the LP model, using the PuLP package

Just to make something clear, we cannot write an optimization function like the following:

z = x*y

This function is not a polynomial function of degree 1 or zero, therefore it is not allowed. This is a big problem when it comes to writing LP models, and some problems can’t be tackled with LP at all.

An Integer Programming example, the Knapsack Problem:

Can we fit all of this in the backpack? If not, what items do we take? Integer Programming can solve this!

Let’s imagine you are going grocery shopping for your family during the COVID-19 Pandemic. You take your backpack with you, but your backpack can only hold so much weight without ripping itself. You need to choose carefully which items to purchase.

For simplification purposes, each item will have an Importance Value and a Weight Value. Here are the items:

Data for this problem to be solved with Integer Programming

Your bag can only hold 50kg. Which items should you purchase?

In this scenario, our variables must only be Integers (because we cannot purchase half a can of tuna), so we must define them as such. The objective function is the sum of all variables multiplied by their respective Importance value, and the constraint is that the sum of the variables multiplied by their respective weights is smaller or equal than the weight limit of the backpack.

We can model the problem like this:

Python Code for the IP model

The solution to this problem is 2 tuna cans, zero bottles of water and 10 rice bags, reaching an objective of 86.

Why use Integer Programming when we have other methods?

This is a good point. We solve a problem using IP by modeling an objective function and constraints, ad we can always add more constraints! One of the problems this series will address is the Shortest Path Problem (stay tuned, it’s in the next episode), and it can easily be solved using Djikstra’s algorithm. But what if you are driving from city A to city B and say “oh, I want to check out that new road the just built, but I don’t want any repeated roads or streets in my whole path”, you might need Dijkstra to jump out of his grave to help you on this one… or you can use Integer Programming!

A young Edsger Dijkstra

Some other problems, like Travelling Salesman, don’t have any algorithms that guarantee the best solution (all they have is heuristics), but LP and IP guarantee the best solution.

In my opinion, learning IP is a skill that may come in handy when you least expect it. If you want to be different, don’t learn what everyone else is learning.

Next episode: Shortest Path Problem and BigN

--

--