Optimization Modelling in Python: SciPy, PuLP, and Pyomo

Published in

Analytics Vidhya

6 min readJan 26, 2020

Optimization modelling is one the most practical and widely used tools to find optimal or near-optimal solutions to complex decision-making problems. Optimization modelling, most of the time used as simply ‘optimization’, is a part of broader research field called Operations Research.

In this article I will give brief comparison of three popular open-source optimization libraries: SciPy, PuLP, and Pyomo. We will try to solve single use-case to highlight implementation and syntax differences of mentioned libraries.

Let’s consider simplified transportation type problem. We have set of customers I = [1,2,3,4,5] and set of factories J = [1,2,3]. Each customer has some fixed product demand d_i and each factory has fixed production capacity M_j. There are also fixed transportation costs to deliver one unit of good from factory j to customer i.

Mathematically this optimization problem can be described as follows:

Our task is to deliver necessary amount of goods to each customer (satisfy customer demand and factories production capacity) at minimal total transportation cost. To formulate this situation as optimization problem we must separate it into 3 main components:

decision variables — quantities of goods to be sent from factory j to customer i (positive real numbers)
constraints — total amount of goods must satisfy both customer demand and factory production capacity (equalities/inequalities that have linear expression on the left-hand side)
objective function — find such values of decision variables that total transportation cost is the lowest (linear expression in this case)

In optimization terms this particular situation is Mixed-Integer Linear Programming problem, because decision variables are not restricted to integers (Integer Programming), and according to business logic all constraints and objective function are linear. Presence of only one business objective makes it a single-objective optimization problem (multi-objective optimization is also possible).

Let’s start implementing solution in python. First we prepare all data structures:

import sys
import numpy as npd = {1:80, 2:270, 3:250, 4:160, 5:180}  # customer demand
M = {1:500, 2:500, 3:500}               # factory capacityI = [1,2,3,4,5]                         # Customers
J = [1,2,3]                             # Factoriescost = {(1,1):4,    (1,2):6,    (1,3):9,
     (2,1):5,    (2,2):4,    (2,3):7,
     (3,1):6,    (3,2):3,    (3,3):3,
     (4,1):8,    (4,2):5,    (4,3):3,
     (5,1):10,   (5,2):8,    (5,3):4
   }                                    # transportation costs

SciPy

# to be used in SciPy we must transform cost dictionary into 2D aray
cost2d = np.empty([len(I), len(J)])
for i in range(len(I)):
    for j in range(len(J)):
        cost2d[i,j] = cost[i+1,j+1]

Pyomo

from pyomo import environ as pe# ConcreteModel is model where data values supplied at the time of the model definition. As opposite to AbstractModel where data values are supplied in data file
model = pe.ConcreteModel()# all iterables are to be converted into Set objects
model.d_cust_demand = pe.Set(initialize = d.keys())
model.M_fact_capacity = pe.Set(initialize = M.keys())# Parameters
# Cartesian product of two sets creates list of tuples [((i1,j1),v1),((i2,j2),v2),...] !!!
model.transport_cost = pe.Param(
    model.d_cust_demand * model.M_fact_capacity,
    initialize = cost,
    within = pe.NonNegativeReals)model.cust_demand = pe.Param(model.d_cust_demand, 
    initialize = d,
    within = pe.NonNegativeReals)model.fact_capacity = pe.Param(model.M_fact_capacity, 
    initialize = M,
    within = pe.NonNegativeReals)

Let’s initialize decision variables.

SciPy

x0 = np.ones(len(cost))*100
bounds = list((0,max(d.values())) for _ in range(cost2d.size))

PuLP

import pulpx = pulp.LpVariable.dicts("amount of goods", ((i, j) for i in I for j in J), lowBound = 0, cat = 'Continuous')

Pyomo

model.x = pe.Var(
    model.d_cust_demand * model.M_fact_capacity,
    domain = pe.NonNegativeReals,
    bounds = (0, max(d.values())))

Let’s declare objective function.

SciPy

def objective(x):
    obj_func = sum(x[idx]*cost2d[idx//len(J), idx%len(J)] for idx in range(cost2d.size))
    return obj_func

PuLP

objective = pulp.LpAffineExpression(e = [(x[i,j],cost[i,j]) for i,j in x], name = 'Objective function')
model = pulp.LpProblem(name = "Transportation cost minimization", 
                        sense = pulp.LpMinimize)
model += pulp.lpSum(objective)

Pyomo

model.objective = pe.Objective(
    expr = pe.summation(model.transport_cost, model.x),
    sense = pe.minimize)

Let’s define constraints.

SciPy

# Constraints: sum of goods == customer demand
def const1():
    tmp = []
    for idx in range(0, cost2d.size, len(J)):
        tmp_constr = {
            'type': 'eq',
            'fun': lambda x, idx: d[idx//len(J) + 1] - np.sum(x[idx: idx + len(J)]),
            'args': (idx,)
            }
        tmp.append(tmp_constr)
    return tmp# Constraints: sum of goods <= factory capacity
def const2():
    tmp = []
    for idx in range(0, cost2d.size, len(I)):
        tmp_constr = {
            'type': 'ineq',
            'fun': lambda x, idx=idx: M[idx//len(I) + 1] - np.sum(x[idx: idx + len(I)])
            }
        tmp.append(tmp_constr)
    return tmplist_of_lists = [const1(), const2()]
constraints = [item for sublist in list_of_lists for item in sublist]

PuLP

# Constraint: sum of goods == customer demand
for i in I:
    tmpExpression = pulp.LpAffineExpression(e = [(x[i,j], 1) for j in J if (i,j) in x])
    tmpConstraint = pulp.LpConstraint(e = pulp.lpSum(tmpExpression),
        sense = pulp.LpConstraintEQ,                                
        rhs = d[i])
    model.addConstraint(tmpConstraint)# Constraint: sum of goods <= factory capacityy
for j in J:
    tmpExpression = pulp.LpAffineExpression(e = [(x[i,j], 1) for j in J if (i,j) in x])
    tmpConstraint = pulp.LpConstraint(e = pulp.lpSum(tmpExpression),
        sense = pulp.LpConstraintLE,
        rhs = M[j])
    model.addConstraint(tmpConstraint)

Pyomo

# Constraints: sum of goods == customer demand
def meet_demand(model, customer):
    sum_of_goods_from_factories = sum(model.x[customer,factory] for factory in model.M_fact_capacity)
    customer_demand = model.cust_demand[customer]
    return sum_of_goods_from_factories == customer_demand
model.Constraint1 = pe.Constraint(model.d_cust_demand, rule = meet_demand)# Constraints: sum of goods <= factory capacity
def meet_capacity(model, factory):
    sum_of_goods_for_customers = sum(model.x[customer,factory] for customer in model.d_cust_demand)
    factory_capacity = model.fact_capacity[factory]
    return sum_of_goods_for_customers <= factory_capacity
model.Constraint2 = pe.Constraint(model.M_fact_capacity, rule = meet_demand)

Now, let’s actually solve the optimization problem. To do this reader will need to have GLPK solver installed on his/her machine. SciPy module will use inbuilt solver SLSQP.

GLPK can be installed for example like this:

$sudo apt-get install glpk-utils

SciPy

from scipy.optimize import minimizesolution = minimize(fun = objective,
                x0 = x0,
                bounds = bounds,
                method = 'SLSQP',
                constraints = constraints,
                tol = None,
                callback = None,
                options = {'full_output':False, 'disp':False, 'xtol': 1e-8}
                )

PuLP

solver = pulp.solvers.GLPK_CMD(msg=0)
results = model.solve(solver)

Pyomo

solver = pe.SolverFactory("glpk")
solution = solver.solve(model)

Lets check the results

SciPy

if (solution.success) and (solution.status == 0):
    print("Solution is feasible and optimal")
    print("Objective function value = ", solution.fun)
elif solution.status != 0:
    print("Failed to find solution. Exit code", solution.status)
else:
    # something else is wrong
    print(solution.message)if solution.success:
    EPS = 1.e-6
    for i,_ in enumerate(solution.x):
        if solution.x[i] > EPS:
            print("sending quantity %10s from factory %3s to customer %3s" % (round(solution.x[i]), i%len(J) + 1, i//len(J) + 1))

PuLP

if model.status == 1:
    print('Solution is optimal: %s' %pulp.LpStatus[model.status])
else:
    print('Failed to find solution: %s' %pulp.LpStatus[model.status])print('Objective function value =', pulp.value(model.objective))EPS = 1.e-6
for (i,j) in x:
    if x[i,j].varValue > EPS:
        print("sending quantity %10s from factory %3s to customer %3s" % (x[i,j].varValue,j,i))

Pyomo

from pyomo.opt import SolverStatus, TerminationConditionif (solution.solver.status == SolverStatus.ok) and (solution.solver.termination_condition == TerminationCondition.optimal):
    print("Solution is feasible and optimal")
    print("Objective function value = ", model.objective())
elif solution.solver.termination_condition == TerminationCondition.infeasible:
    print ("Failed to find solution.")
else:
    # something else is wrong
    print(str(solution.solver))assignments = model.x.get_values().items()
EPS = 1.e-6
for (customer,factory),x in sorted(assignments):
    if x > EPS:
        print("sending quantity %10s from factory %3s to customer %3s" % (x, factory, customer))

As we can see all three optimization modules found the same value of objective function 3350. However, SLSQP solver that was used in SciPy achieved this with slightly different values of decision variables than GLPK solver that was used by PuLP and Pyomo.

Three optimization modules analysed here are quite different in both syntax and implementation philosophy. SciPy is probably the most supported, has the most capabilities, and uses plain python syntax. However, as far as I know it doesn’t support binary optimization problems.

PuLP and Pyomo have somewhat similar syntax structure. I intentionally implemented solutions for these two modules fully wrapping every possible variable or function into pulp or pyomo objects. The same solution could be achieved using plain python.

PuLP is arguably the easier module to learn from the three, however it can deal only with linear optimization problems. Pyomo seems to be more supported than PuLP, has support for nonlinear optimization problems, and last but not the least, can do multi-objective optimization.

Optimization Modelling in Python: SciPy, PuLP, and Pyomo

Written by Igor Shvab