OPL vs Python docplex

8 min read · Oct 31, 2019

Decision Optimization for Watson Studio (DO for WS) lets you formulate models with either OPL (Optimization Programming Language) or docplex (the Python package for CPLEX).

Decision Optimization is now available in Watson Studio. You can use DO from Jupyter notebooks, and a dedicated model builder is now also available, where you can formulate models with either Python or OPL.

Choosing between Python (docplex) or OPL…

Choosing between a dedicated language like OPL and a generic language like Python (with docplex) is a matter of personal preference. Alex Fleischer has written this complete analysis.

Other considerations aside, let’s focus on the different model formulation steps and how each language supports them, using a simple example of pricing optimization.

Hence you will be able to compare:

how to access and pre-process data,
how to create the optimization model,
how to create decision variables,
how to create KPIs and objectives,
how to create constraints,
how to solve the model,
how to access the solution and post-process it.

Hopefully, you will be better informed to make your choice, if this is not already the case.

You can directly look at the complete OPL and docplex models.

Insurance Pricing

For this analysis we will use an Insurance Pricing problem. The data describes a set of customers to whom we are selling insurance.

In this model, the data mainly consists of two input tables:

rangeAsSet.csv is the list of customers with their previous price, along with lower and upper bounds on the new price.
rawData.csv contains price, probability and revenue for all combinations of customers and possible prices.

The aim is to decide which price to set in order to maximize the revenue.

In this model, we create an objective to optimize revenue, volume and/or price increase, and parameters include the weights to be used so that an application can use the model in different ways.

Constraints include lower and upper bounds, along with some price elasticity constraints.

Everything you need (data and models) to reproduce what is described here can be found in this GitHub repository.

You can read a more complete introduction to price optimization in this blog.

Input data

The first section of a Decision Optimization model connects to the data.

With the DO for WS model builder, all data imported into the scenario is made available to the model, whether it comes from CSV files or databases.

With OPL, you define the tuple structure and the tuple set corresponding to one input data table as follows:

tuple TRawData {  
  int index;  
  key int customer;  
  key int priceIndex;  
  float price;  
  float probability;  
  float revenue;
};

{TRawData} rawData = ...;

With docplex, the input data tables are made available to the model as pandas dataframes, stored as elements of an inputs dictionary. For example, the same input data table is accessed using:

rawData = inputs['rawData']

The structure of the pandas dataframe comes from the structure of the input table. You can use pandas functions to modify the structure of the data, extract IDs, etc. In this case, you can easily extract the unique customers and priceIndices, and then use them as indices for the dataframe so that elements are easy to access when formulating KPIs and constraints.

customers = rawData['customer'].unique().tolist()
priceIndices = rawData['priceIndex'].unique().tolist()
rawData = rawData.set_index(['customer', 'priceIndex'])
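
To see what the two-level index buys you, here is a minimal sketch on a hand-made table. The column names follow the rawData table described above; the values are invented for illustration, and the dataframe is built by hand instead of being read from the inputs dictionary.

```python
import pandas as pd

# Toy stand-in for inputs['rawData']; all numbers are invented.
rawData = pd.DataFrame({
    'customer':    [1, 1, 2, 2],
    'priceIndex':  [0, 1, 0, 1],
    'price':       [100.0, 110.0, 200.0, 220.0],
    'probability': [0.9, 0.7, 0.8, 0.5],
    'revenue':     [90.0, 77.0, 160.0, 110.0],
})

customers = rawData['customer'].unique().tolist()
priceIndices = rawData['priceIndex'].unique().tolist()
rawData = rawData.set_index(['customer', 'priceIndex'])

# One row is addressed by its (customer, priceIndex) pair...
row = rawData.loc[(2, 1)]
# ...and a single column becomes a series addressed the same way.
price = rawData.price
print(row.price, price[2][1])
```

Both lookups return the price of customer 2 at price index 1, which is the access pattern used in the pre-processing code further down.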

Pre-processing

In general, data does not arrive in exactly the form you need to formulate the optimization model. You may want to run some code to extract more suitable structures.

In OPL, you can either use OPL statements, for example:

{int} priceIndiceSubset =  { pi | pi in priceIndices : ord(priceIndices,pi)<card(priceIndices)-1};

OPL is loved for the power of its own data processing statements. You can create aggregated tuple sets or arrays in a single line, using powerful slicing with comprehensions and filters. Or you can use more generic JavaScript code (but it is being deprecated!). For example, you could pre-process rawData to get data on the prices just below and above the previous price with a script such as the following, which looks for the previous price among all price points:

execute PRE_PROCESS{
  writeln("executing PRE_PROCESS");
  for(var c in customers){
   for(var pi in priceIndiceSubset){
    var lowIndex =  rawData.find(c,pi);
    var uppIndex =  rawData.find(c,pi+1);   
    if( lowIndex.price <= previousPrice[c] && previousPrice[c] < uppIndex.price){
      var delta = price[c][Opl.first(priceIndiceSubset)+1]-price[c][Opl.first(priceIndiceSubset)];
      var k = Opl.floor(previousPrice[c]/delta);
      var e = (k+1)*delta - previousPrice[c];
      var lp = e / delta ;
      lambdaPrev[c] = lp;
      tauPrevLow[c] = lowIndex.probability;
      tauPrevUpp[c] = uppIndex.probability; 
      pricePrevLow[c] = lowIndex.price;
      pricePrevUpp[c] = uppIndex.price; 
    }  
   }
  } 
 }

With docplex, pre-processing is very easy, as you rely on a powerful general-purpose programming language, with the support of the pandas package as well!

You can simply slice some of the dataframes:

price = rawData.price

You can also use comprehensions (with filters if required).

priceIndiceSubset =  [ pi for pi in priceIndices if priceIndices.index(pi)<len(priceIndices)-1 ]
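
When the price indices are unique, the same subset is simply the list without its last element. A small check with made-up indices (the two variable names are only there for the comparison):

```python
priceIndices = [0, 1, 2, 3]  # made-up values for illustration

# Comprehension from the text: keep every index except the last one.
subset_by_filter = [pi for pi in priceIndices
                    if priceIndices.index(pi) < len(priceIndices) - 1]

# Equivalent slice, valid as long as the entries are unique.
subset_by_slice = priceIndices[:-1]
print(subset_by_filter == subset_by_slice)
```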

And you can write some Python code (equivalent to the JavaScript code above):

import math

# previousPrice, lambdaPrev, tauPrevLow, tauPrevUpp, pricePrevLow and pricePrevUpp
# are containers indexed by customer, assumed to be defined earlier in the model.
for c in customers:
    for pi in priceIndiceSubset:
        lowIndex = rawData.loc[(c, pi)]
        uppIndex = rawData.loc[(c, pi + 1)]
        # Keep the pair of price points that brackets the previous price.
        if lowIndex.price <= previousPrice[c] < uppIndex.price:
            delta = price[c][priceIndiceSubset[0] + 1] - price[c][priceIndiceSubset[0]]
            k = math.floor(previousPrice[c] / delta)
            e = (k + 1) * delta - previousPrice[c]
            lp = e / delta
            lambdaPrev[c] = lp
            tauPrevLow[c] = lowIndex.probability
            tauPrevUpp[c] = uppIndex.probability
            pricePrevLow[c] = lowIndex.price
            pricePrevUpp[c] = uppIndex.price
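
What the loop computes is the interpolation weight of the previous price between its two neighbouring price points. Here is a worked sketch with invented numbers, assuming (as the formula itself does) that price points are evenly spaced multiples of delta:

```python
import math

delta = 10.0            # spacing between two consecutive price points (invented)
previous_price = 23.0   # lies between the price points 20 and 30 (invented)

k = math.floor(previous_price / delta)   # rank of the lower price point
e = (k + 1) * delta - previous_price     # distance up to the upper price point
lambda_prev = e / delta                  # weight put on the lower price point

lower, upper = k * delta, (k + 1) * delta
# The weights lambda_prev and 1 - lambda_prev rebuild the previous price.
rebuilt = lambda_prev * lower + (1 - lambda_prev) * upper
print(k, lambda_prev, rebuilt)
```

The closer the previous price is to the lower price point, the closer the weight gets to 1.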

Model creation

In OPL, there is nothing special to do to create a model. By default, the model uses CPLEX Mathematical Programming; you only need to add the statement using CP; if you want Constraint Programming instead.

In docplex, you create the model from the package corresponding to your type of algorithm, for example, with Mathematical Programming (MP):

from docplex.mp.model import Model 
mdl = Model(name='InsurancePricing')

One benefit is that you can easily create multiple models in the same Python code, and chain them or combine them. With OPL, you need to use a separate file to do this.

Decision Variables

In both cases, you can easily create decision variables which are binary, integer or continuous, with 1, 2 or more dimensions.

With OPL:

dvar  float lambda1[c in customers][pi in priceIndices] in 0..1;
dvar  float lambda2[c in customers][pi in priceIndices] in 0..1;
dvar  boolean z[c in customers][pi in priceIndiceSubset];

With docplex:

lambda1 = mdl.continuous_var_matrix(customers, priceIndices, lb=0, ub=1, name='lambda1')
lambda2 = mdl.continuous_var_matrix(customers, priceIndices, lb=0, ub=1, name='lambda2')
z = mdl.binary_var_matrix(customers, priceIndiceSubset, name='z')

KPIs

In DO for WS, OPL models treat as KPIs all dexpr expressions declared after the subject to block.

So KPIs are defined as follows:

dexpr float Revenue = revenue;
dexpr float Volume = volume ;
dexpr float AvgPriceIncrease = averagePriceIncrease;

If you want to use the KPIs to define the objectives, then you might have to define them initially as simple expressions and then as KPIs in post-processing.

With docplex, a specific method is used to add KPIs to the model, so you can create an expression, and then use it both as a KPI and to create other expressions or constraints.

averagePriceIncrease = mdl.sum(((priceApplied[c]-previousPrice[c])/previousPrice[c])/len(customers) for c in customers)
mdl.add_kpi(averagePriceIncrease, publish_name="KPI.AvgPriceIncrease")
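
As a quick sanity check of the expression itself, the same formula can be evaluated on plain numbers. The prices below are invented, and ordinary floats stand in for the solution values of priceApplied:

```python
previousPrice = {1: 100.0, 2: 200.0}   # invented data
priceApplied = {1: 110.0, 2: 190.0}    # floats standing in for solved variables
customers = list(previousPrice)

# Relative increase per customer, averaged over all customers.
averagePriceIncrease = sum(
    ((priceApplied[c] - previousPrice[c]) / previousPrice[c]) / len(customers)
    for c in customers
)
# +10% for customer 1 and -5% for customer 2 average out to +2.5%.
print(averagePriceIncrease)
```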

Objectives

Writing objectives is very simple both in OPL and docplex.

This model uses a simple weighted sum to combine all KPIs in the objective. By playing with the weights input table, you can select which combination of KPIs to optimize. Other types of objectives are supported both in OPL and Python (I will cover that in another post).

OPL:

dexpr float resRevenue = - revenueWeight * revenue;
dexpr float resVolume = - volumeWeight * volume ;
dexpr float resAvgPriceIncrease = avgIncWeight * averagePriceIncrease;

minimize resRevenue + resVolume + resAvgPriceIncrease;

docplex:

resRevenue = - revenueWeight * revenue
resVolume = - volumeWeight * volume
resAvgPriceIncrease = avgIncWeight * averagePriceIncrease

mdl.minimize(resRevenue + resVolume + resAvgPriceIncrease)
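
To see how the weights steer the optimization, the sketch below scores two invented candidate outcomes with the same weighted sum. The function names and KPI values are illustrative only; in the real model these terms are expressions over decision variables, and the solver does the search.

```python
def weighted_objective(kpis, revenueWeight, volumeWeight, avgIncWeight):
    """Value to minimize: KPIs we want to grow enter with a minus sign."""
    return (- revenueWeight * kpis['revenue']
            - volumeWeight * kpis['volume']
            + avgIncWeight * kpis['averagePriceIncrease'])

# Two invented outcomes: one favours revenue, the other favours volume.
candidates = {
    'high_revenue': {'revenue': 1000.0, 'volume': 60.0, 'averagePriceIncrease': 0.05},
    'high_volume':  {'revenue': 900.0,  'volume': 80.0, 'averagePriceIncrease': 0.01},
}

def best(**weights):
    # The candidate with the smallest objective value wins.
    return min(candidates, key=lambda name: weighted_objective(candidates[name], **weights))

revenue_only = best(revenueWeight=1, volumeWeight=0, avgIncWeight=0)
volume_only = best(revenueWeight=0, volumeWeight=1, avgIncWeight=0)
print(revenue_only, volume_only)
```

Changing only the weights flips which outcome is preferred, which is what lets an application reuse the same model in different ways.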

Constraints

OPL has forall and sum aggregate statements, which make constraints very easy to write. For example:

forall( c in customers, pi in priceIndiceSubset) {            
  ctConvexityCondition:        
    lambda1[c][pi] + lambda2[c][pi] - z[c][pi] == 0;  
}

With docplex, you can also use Python for statements, together with mdl.sum() or mdl.add_constraint(), to easily create constraints. You can combine them with if filters. For example, this same constraint is formulated as follows:

for c in customers:    
  for pi in priceIndiceSubset:        
    mdl.add_constraint(lambda1[c, pi] + lambda2[c, pi] - z[c, pi] == 0, 'ctConvexityCondition')

Parameters

Setting parameters from OPL and Python is very easy.

OPL

execute PARAMS {
  cplex.tilim = 100;
}

docplex

mdl.parameters.mip.tolerances.mipgap = 0.2

Solve

In OPL, as it is a dedicated language for decision optimization models, there is no specific statement to solve the model, as the model builder will solve the complete model as a whole.

In docplex, you solve the model by calling:

ok = mdl.solve()
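
mdl.solve() returns a solution object when one is found and None otherwise, so the result is worth checking before any post-processing. Here is a small sketch of that check; the helper name is hypothetical, and a stub object replaces a real solution so the snippet runs without a CPLEX engine:

```python
from types import SimpleNamespace

def describe_solve(solution):
    """Summarize the value returned by mdl.solve()."""
    if solution is None:
        # No solution was found, or the solve failed.
        return 'no solution found'
    return 'objective = %g' % solution.objective_value

# In a real run: message = describe_solve(mdl.solve())
stub = SimpleNamespace(objective_value=-1234.5)  # stand-in for a docplex solution
message_ok = describe_solve(stub)
message_ko = describe_solve(None)
print(message_ok)
print(message_ko)
```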

Post-processing Output Solution

As with pre-processing, the decision variables and expressions generally do not map exactly to the structure of what you want to get out of your model, and changing the structure of the variables would complicate the model and make it much harder to solve. Some post-processing is then required.

In OPL, all decision variables and expressions used after the constraints block will be mapped to the solution value. You can create an output table doing something similar to:

tuple TResult {
  key int customer;
  float volume;
  float price;
  float previousPrice;
  float delta;
};

{TResult} result;

execute POPULATE_RESULTS{
 var delta = 0;
 writeln("POPULATE_RESULTS"); 
 for(var c in customers){
  delta = priceApplied[c] - previousPrice[c];
  result.add(c,volumePerCust[c],priceApplied[c],previousPrice[c],delta);
 }
}

With OPL models used in DO for WS, all tuple sets created after the constraints block will be published as output tables, along with the decision variables.

With docplex, the member solution_value is used to access the value of a decision variable or expression from the current solution. Solution tables are returned through the outputs Python dictionary as follows:

result = [
    [c,
     volumePerCust[c].solution_value,
     priceApplied[c].solution_value,
     previousPrice[c],
     priceApplied[c].solution_value - previousPrice[c]]
    for c in customers
]
outputs['result'] = pd.DataFrame(data=result, columns=['customer', 'volume', 'price', 'previousPrice', 'delta'])

Again, it is not necessary to specify the structure of the output tables, which will be extracted from the dataframe structure.

Get more!

You can find more about formulating decision optimization models with OPL or Python docplex in the following tutorials and manuals.

OPL

Python docplex

Reference Manuals: MP and CP
Linear Programming Tutorial

Conclusions

In this article, I focused on the decision optimization modeling syntax differences between OPL and docplex. There are other differences, such as the fact that OPL is supported in the CPLEX Optimization Studio IDE while docplex is not, or the fact that you can easily use docplex in a Jupyter notebook, along with other technologies such as Machine Learning.

And to conclude on a personal note, let me comment on my preference. As one of the developers and then project manager of the major OPL rewrite and redesign 15 years ago, I have been, as you can imagine, and still am an OPL enthusiast. But recently, as I had to work more with Python, I have come to think that while Python, and in particular pandas, is not always so intuitive (you always forget how to do this or that with a dataframe), it is very powerful for embedding optimization with other technologies, and when I start a new model today, I go with Python docplex.

I would be very happy to hear about your story and your preferences, and what we should add to one or the other interface to make it easier to use.

email: alain.chabrier@ibm.com

linkedin: https://www.linkedin.com/in/alain-chabrier-5430656/

twitter: https://twitter.com/AlainChabrier