Start Combining ML and DO Today!

Jul 24, 2019

It is now easier than ever to start combining Decision Optimization with Machine Learning.
What’s the difference between Machine Learning (ML) and Decision Optimization (DO)?
What are the benefits of combining these two techniques?
How can you get started? A commonly-used marketing campaign problem is introduced here to get you started. Follow this step-by-step guide which shows you how to set up this sample with Decision Optimization on Watson Studio.

ML and DO

Machine Learning (ML) and Decision Optimization (DO) are two different sets of techniques which come from Data Science (DS) and Artificial Intelligence (AI). They rely on different mathematical tools to support Business Analytics.

You can read about the main difference between these techniques using a simple example here.

The power of ML is to extract unknown data characteristics based on reproducible patterns. For example, after training a model on a large historical set of telco customer behaviors, including customers who have churned, you might expect your model to predict the future churn of current customers based only on their characteristics and behavior. ML can predict which of a set of 10,000 customers might churn, but it will not tell you how to spend your marketing budget to best prevent that churn.

The power of DO is to prescribe sets of decisions that constrain one another. For example, a model can be formulated to schedule the operations of a factory where units have to flow through different sets of machines with different characteristics to be produced. DO can prescribe the optimal schedule to satisfy a given demand of 10,000 units, but it cannot tell you what this demand will be, even given the production history.

Each set of techniques has its own application sweet spot. It is important to understand the type of data and question you have in order to choose between them.

Campaign Marketing with ML + DO

ML and DO are stronger when combined. So it is important to use a platform where both sets of techniques are available and can be used together.

A typical use case is to use ML for predictions based on historical data and then DO for decisions based on constraint formulations.

A typical application, in place in many companies, is campaign marketing.

With unchanged buying behaviors and after training an ML model on a large number of historical customers, you can expect your model to correctly predict the individual behavior of new customers. Then, if the actions to be taken, based on these individual behaviors, are constrained among the overall sets of customers (i.e. what you can do for one customer is constrained by what you do for others), you might formulate a DO model to prescribe the optimal way to spend your marketing budget.

Some people will argue that DO is not necessary to solve the decision part of the problem and that a greedy algorithm is enough. A greedy algorithm is a simple, intuitive algorithm that makes the locally best choice at each step, hoping that these choices add up to the best overall solution. It reproduces the simple human approach to problem solving. Let’s see how it behaves on a simple marketing campaign problem.

Imagine the following very small set of data:

You have only 10 customers and 3 products. ML has predicted the expected revenue for promoting each product to each customer. All expected revenues are positive, so assuming a zero-cost and unlimited marketing campaign, you can just promote each product to each customer. This is a simple and efficient greedy algorithm, and here it does find the optimal solution.

But in practice, the world is limited. You have a limited budget, not all products can be promoted to all customers, etc. Let’s take the very simple case where:

• at most one product can be promoted to a given customer
• a given product can be promoted to at most 3 customers
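
These two rules can be written down as a small integer program. The following is only a sketch: the symbols are notation introduced here for illustration, not taken from the sample notebook. $C$ is the set of customers, $P$ the set of products, $r_{cp}$ the revenue predicted by ML for promoting product $p$ to customer $c$, and $x_{cp}$ a binary decision equal to 1 when that promotion is made.

```latex
\begin{aligned}
\max \quad & \sum_{c \in C} \sum_{p \in P} r_{cp}\, x_{cp} \\
\text{s.t.} \quad & \sum_{p \in P} x_{cp} \le 1 && \forall c \in C \\
& \sum_{c \in C} x_{cp} \le 3 && \forall p \in P \\
& x_{cp} \in \{0, 1\} && \forall c \in C,\ p \in P
\end{aligned}
```

The first constraint says each customer receives at most one product, the second that each product reaches at most 3 customers. Both link decisions across customers, which is why the choices cannot be made one customer at a time.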

What is the optimal solution?

A greedy algorithm would probably sort all expected revenues and start with Customer 6, assign it Travel, then take the next highest revenue that is still allowed, and so on. This algorithm leads to a valid solution, but not an optimal one. In fact, the optimal solution (\$770) does not include this Customer 6 Travel combination.

With this small didactic example, you lose only \$30 and might hence feel that the greedy algorithm is good enough. But in real life, banks have millions of customers and dozens of products. Greedy algorithms might then do worse than that 5% gap to optimal, and even a small percentage is worth capturing when it applies to millions of dollars.
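
To make the comparison concrete, here is a minimal Python sketch of both approaches. It does not use the data from the example above: the `REVENUE` table, the customer and product names, and the functions `greedy` and `exhaustive` are all hypothetical, and the exhaustive search only stands in for a real DO solver on a toy-sized problem.

```python
from itertools import product

# Hypothetical expected revenues (NOT the article's data): REVENUE[customer][product].
REVENUE = {
    "c1": {"Savings": 100, "Travel": 90},
    "c2": {"Savings": 90, "Travel": 10},
    "c3": {"Savings": 80, "Travel": 70},
}


def greedy(revenue, max_customers_per_product):
    """Take the highest remaining revenue that still respects both rules."""
    ranked = sorted(
        ((r, c, p) for c, row in revenue.items() for p, r in row.items()),
        reverse=True,
    )
    plan, load, total = {}, {}, 0
    for r, c, p in ranked:
        # Skip if the customer already has an offer or the product is full.
        if c not in plan and load.get(p, 0) < max_customers_per_product:
            plan[c] = p
            load[p] = load.get(p, 0) + 1
            total += r
    return total, plan


def exhaustive(revenue, max_customers_per_product):
    """Try every assignment; only viable for very small instances."""
    customers = list(revenue)
    products = sorted({p for row in revenue.values() for p in row})
    best_total, best_plan = 0, {}
    # Each customer gets one product or None (no offer).
    for choice in product([None] + products, repeat=len(customers)):
        if any(choice.count(p) > max_customers_per_product for p in products):
            continue
        plan = {c: p for c, p in zip(customers, choice) if p is not None}
        total = sum(revenue[c][p] for c, p in plan.items())
        if total > best_total:
            best_total, best_plan = total, plan
    return best_total, best_plan
```

With a limit of one customer per product, `greedy(REVENUE, 1)` collects 170 while `exhaustive(REVENUE, 1)` finds 180: grabbing the single best pair first (c1 with Savings) blocks the better combination of c1 with Travel and c2 with Savings. Exhaustive search grows exponentially with the number of customers, which is exactly why a dedicated optimization engine is needed at real scale.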

Anyone who thinks that greedy algorithms are good enough to solve optimization problems has completely missed the point of Data Science.

Decision Optimization for Watson Studio

I posted earlier this year that Decision Optimization is now available on the cloud as part of Watson Studio. You can, without any cost, create a Python notebook and start using DO with the docplex API.

It is now even easier to get started, as IBM has introduced the capability to create a new project from a sample template. The result is a working notebook that you can start from and adapt to your needs.

Let’s see the few steps to do this:

There is a video with some hints to do so if needed.

The URL is https://dataplatform.cloud.ibm.com/

3- Create a “new project”

4- Select “Create a project from a sample or file”

5- Select “From sample”

6- Select the Marketing Campaign sample

7- Change project name if needed and click Create

The project will be created automatically, with the data sets and notebook imported and associated with the right project token and Jupyter environments.

You can then open the notebook and run it as indicated in the readme box:

To run the notebook:

1. Go to the Assets page.
2. In the Notebooks section, click the pencil icon in the same row as the notebook name to open the notebook in edit and run mode.
3. Run all the cells in the notebook from beginning to end.

The notebook is self-documented so that you can understand it and modify it to your needs to start your own ML + DO project.

The main parts are:

• Understand the historical data: load historical data from a project data-set and do some visual analysis to understand which features should be used for model training,
• Train a model to predict customer behavior: train one model per product,
• Predict the new customer behavior: load new data with unknown behavior and apply the trained models to predict expected revenue for each new customer,
• Prescribe the best business decisions: formulate the DO model and run it on the unknown behavior data augmented with the predicted expected revenue to decide what actions to take,
• Run some what-if analysis: one benefit of DO is the ability to do what-if analysis, looking at the optimal solution for different data or constraint scenarios. Here, we look at optimal solutions depending on the marketing budget.
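
To give an idea of what the prescriptive step can look like, here is a minimal docplex sketch. It is not the code of the sample notebook: the `REVENUE` table, the names, and the function `solve_campaign` are hypothetical, the predicted revenues are hard-coded instead of coming from trained models, and the budget constraint is replaced by a simple limit on customers per product. It assumes the `docplex` package and a CPLEX runtime are available, as they are in a Watson Studio Python environment with DO.

```python
# Hypothetical expected revenues (NOT the sample's data), standing in for
# the output of the ML step: REVENUE[customer][product].
REVENUE = {
    "c1": {"Savings": 100, "Travel": 90},
    "c2": {"Savings": 90, "Travel": 10},
    "c3": {"Savings": 80, "Travel": 70},
}


def solve_campaign(revenue, max_customers_per_product):
    """Return (total expected revenue, {customer: product}) for the best plan."""
    # Imported here so the data above can be reused without docplex installed.
    from docplex.mp.model import Model

    customers = list(revenue)
    products = sorted({p for row in revenue.values() for p in row})
    pairs = [(c, p) for c in customers for p in products]

    mdl = Model(name="campaign_sketch")
    # offer[c, p] == 1 when product p is promoted to customer c.
    offer = mdl.binary_var_dict(pairs, name="offer")

    # At most one product per customer.
    for c in customers:
        mdl.add_constraint(mdl.sum(offer[c, p] for p in products) <= 1)
    # At most max_customers_per_product customers per product.
    for p in products:
        mdl.add_constraint(
            mdl.sum(offer[c, p] for c in customers) <= max_customers_per_product
        )

    # Maximize the total predicted revenue of the selected offers.
    mdl.maximize(mdl.sum(revenue[c][p] * offer[c, p] for c, p in pairs))

    solution = mdl.solve()
    if solution is None:
        raise RuntimeError("no feasible campaign found")
    plan = {c: p for c, p in pairs if solution.get_value(offer[c, p]) > 0.5}
    return solution.objective_value, plan
```

Calling `solve_campaign(REVENUE, limit)` in a loop over several values of `limit` gives a very simple what-if analysis: the model stays the same and only the constraint data changes between runs.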

The comparison between a DO model and a greedy algorithm is investigated in another version of this notebook available in the community.

Another typical application of ML + DO is predictive maintenance, where ML predicts when some assets may fail and DO plans the maintenance activities. A notebook showing this use case and with a very similar structure is available in the community.

Note that in addition to notebooks, a dedicated experience to build DO models is also available in beta.

Conclusion

Decision Optimization is the ideal companion for Machine Learning. A lot of confusion exists, sometimes reducing AI or DS to ML and ignoring DO. There is no doubt that as of today data-driven techniques such as ML are not able to solve optimization problems where multiple decisions are linked by global constraints. Only DO can solve these problems. The good news is that DO is now easier to use and easier to combine with ML to create complete algorithms to predict data and prescribe decisions.

Alain.chabrier@ibm.com

@AlainChabrier
