From Data to Predictions to Actions with Watson Studio in CPD 2.5

Thomas Schaeck
Dec 10, 2019


How can decisions and actions in your business be optimized
by intelligent predictions based on trusted data?

Businesses take many actions every day — often ad hoc or based on fixed rules, leading to suboptimal results.

There is a large opportunity to improve business results by optimizing decisions and actions in a business more systematically.

Watson Studio, with Machine Learning and Decision Optimization on Cloud Pak for Data 2.5, spans the Data and AI lifecycle from data to predictions to optimal actions, empowering your teams to:

  • Connect Data from diverse data sources on clouds or in data centers
  • Refine, visualize, and analyze data to gain insights
  • Train predictive ML Models based on relevant data
  • Deploy ML Models in production to make predictions
  • Create Decision Optimization Models with objectives and constraints
  • Deploy DO Models to take optimal actions based on predictions and data
  • Use ML + DO Models from Apps or Processes to drive their actions

The following diagram shows how Watson Studio (WS), Watson Machine Learning (WML), and Decision Optimization (DO) support the above as part of the data and AI life cycle.

In the remainder of this article, we outline how to get from data to predictions to optimal actions, step by step.

Create a Project

A project owner creates a project and adds data scientists (like Mike in the image above), subject matter experts, and optimization experts to work on it as a team. Only project members can access the secure environment provided by the project.

Create a new Project — optionally associate it with a Git repo

Connect Data

Project members can now connect to data in databases, object stores, and other sources via Connections, and reference or copy subsets of that data as Data Assets in the project.

Connect your Project to a variety of Data Sources, on premises or on clouds

Refine, Visualize, Analyze Data

Project members can refine data as needed, and interactively visualize and analyze data in Dashboards or Notebooks to better understand it.
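As a rough illustration of the kind of refinement and analysis a project member might do in a Notebook, the sketch below drops incomplete rows from a small CSV sample and summarizes one column per group. The sample data and column names are invented for illustration; in a real project the data would come from a project Data Asset, and a library such as pandas would typically replace the hand-written loops.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a project data asset.
RAW_CSV = """store,week,units_sold
A,1,120
A,2,
B,1,95
B,2,110
"""

def refine(raw_text):
    """Parse CSV text and keep only rows where units_sold is present."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        if row["units_sold"]:
            row["units_sold"] = int(row["units_sold"])
            rows.append(row)
    return rows

def summarize(rows):
    """Return the mean units sold per store."""
    by_store = {}
    for row in rows:
        by_store.setdefault(row["store"], []).append(row["units_sold"])
    return {store: statistics.mean(vals) for store, vals in by_store.items()}

clean_rows = refine(RAW_CSV)
summary = summarize(clean_rows)
print(summary)
```

The row with the missing value is dropped before the per-store means are computed, which is the sort of cleanup that usually precedes visualization or model training.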

Use your favorite open source libraries to visualize data in your project

New: In Watson Studio in Cloud Pak for Data 2.5, we added JupyterLab as a richer way to work with Notebooks, in addition to base Jupyter Notebooks. We integrated JupyterLab into Projects, so that it has access to project data assets and can work with a project’s associated Git repo.

New in Watson Studio: JupyterLab, integrated with project data assets via insert-to-code

Train predictive Models

Data scientists can create and train ML models using AutoAI (new), Notebooks, or SPSS Flows.

AutoAI provides an easy way to create a set of model pipeline candidates: you provide a raw data set, and AutoAI performs model selection, feature engineering, hyperparameter optimization, and related steps for each candidate. Data scientists can then explore various metrics of the resulting models, pick the models they like best, and save them to the project.
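To make the "explore metrics and pick a model" step concrete, here is a minimal sketch of ranking pipeline candidates by a holdout metric. It does not use or imitate the AutoAI API; the candidate names, metric values, and the tie-breaking rule are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class PipelineCandidate:
    name: str
    accuracy: float       # holdout accuracy, higher is better
    train_seconds: float  # training time, lower is better

# Invented candidates, standing in for a leaderboard of pipelines.
candidates = [
    PipelineCandidate("P1", accuracy=0.81, train_seconds=12.0),
    PipelineCandidate("P2", accuracy=0.84, train_seconds=30.0),
    PipelineCandidate("P3", accuracy=0.84, train_seconds=18.0),
    PipelineCandidate("P4", accuracy=0.79, train_seconds=5.0),
]

def rank(cands):
    """Sort by accuracy (descending), breaking ties by shorter training time."""
    return sorted(cands, key=lambda c: (-c.accuracy, c.train_seconds))

leaderboard = rank(candidates)
best = leaderboard[0]
print([c.name for c in leaderboard], "-> save", best.name)
```

In practice a data scientist would weigh several metrics rather than a single sort key; the point is only that candidate selection reduces to comparing recorded metrics.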

New: AutoAI in Watson Studio - pick and save your favorite Model to the Project for further use

Notebooks offer an alternative: data scientists can train models in Jupyter or JupyterLab Notebooks, then save the trained models to the Project using the WML Python library or push them to the project’s associated Git repo.

SPSS Flows allow multiple personas including domain experts to create and train models by defining model training flows in a visual editor and running these flows to create, train, and save models to the Project.

Deploy ML Models

Models can be promoted from a Project to a Space, where authorized users can create Model Deployments to serve the models for online or batch scoring.

A model that was generated using AutoAI, deployed in a Space

This makes the ML models accessible through the WML public REST APIs. Applications or business processes can now invoke the models through the WML REST APIs to get predictions.
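The sketch below shows roughly what an online scoring call from an application could look like. It only builds the HTTP request and does not send it. The host, deployment ID, token, and field names are placeholders, and the URL layout and `input_data` payload shape are assumptions modeled on the WML v4 API; check them against the API reference for your installation.

```python
import json
import urllib.request

def build_scoring_request(host, deployment_id, token, fields, rows):
    """Build (but do not send) a POST request for online scoring."""
    # Assumed URL layout and payload shape; verify against your WML API docs.
    url = f"https://{host}/v4/deployments/{deployment_id}/predictions"
    payload = {"input_data": [{"fields": fields, "values": rows}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_scoring_request(
    host="cpd.example.com",          # placeholder host
    deployment_id="my-deployment",   # placeholder deployment ID
    token="my-token",                # placeholder bearer token
    fields=["age", "income"],        # hypothetical model input fields
    rows=[[42, 55000], [31, 72000]],
)
body = json.loads(request.data)
print(request.get_method(), request.full_url)

# To actually call the service, an application would do something like:
# with urllib.request.urlopen(request) as response:
#     predictions = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before pointing it at a live deployment.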

Once models are deployed, payload logs containing the model input data and the model prediction output can be recorded in a database table, which can be continuously monitored and analyzed for fairness using Watson OpenScale. This makes it possible to detect bias in model scoring automatically and to take quick corrective action if needed.
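To give a feel for what a fairness check over payload logs can involve, the following sketch computes a disparate impact ratio, one common fairness measure. It is not a description of how Watson OpenScale computes its metrics; the log records, group labels, and the 0.8 threshold are assumptions chosen for illustration.

```python
def favorable_rate(records, group):
    """Share of records in `group` that received the favorable prediction."""
    members = [r for r in records if r["group"] == group]
    if not members:
        raise ValueError(f"no records for group {group!r}")
    return sum(r["prediction"] == "approved" for r in members) / len(members)

def disparate_impact(records, monitored, reference):
    """Ratio of favorable rates: monitored group over reference group."""
    # Raises ZeroDivisionError if the reference group has no favorable outcomes.
    return favorable_rate(records, monitored) / favorable_rate(records, reference)

# Invented payload log rows: one model input attribute and the model output.
payload_log = [
    {"group": "x", "prediction": "approved"},
    {"group": "x", "prediction": "denied"},
    {"group": "x", "prediction": "denied"},
    {"group": "x", "prediction": "denied"},
    {"group": "y", "prediction": "approved"},
    {"group": "y", "prediction": "approved"},
    {"group": "y", "prediction": "approved"},
    {"group": "y", "prediction": "denied"},
]

ratio = disparate_impact(payload_log, monitored="x", reference="y")
needs_review = ratio < 0.8  # threshold is an assumption, not a product default
print(f"disparate impact = {ratio:.2f}, needs review: {needs_review}")
```

A ratio well below 1 means the monitored group receives the favorable outcome less often than the reference group, which is the kind of signal that would prompt a closer look at the model.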

Create Decision Optimization Models

To progress from predictions to optimal actions, Watson Studio lets you combine machine learning with decision optimization: predictions from an ML Model, together with other input data, feed into a Decision Optimization Model that determines the optimal actions for that input.

From Data to Predictions to optimal Actions with ML + DO

When there is a large number of predictions and related data, it is very hard to determine the optimal actions based on those inputs. To solve such problems, an optimization expert or a data scientist familiar with optimization can create and test a Decision Optimization model in a Notebook or in the DO Model Builder, using data from the project and predictions created by an ML model, to solve for optimal actions. Decision Optimization in Watson Studio leverages the advanced DOCPLEX engine to solve optimization problems.
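The idea of an objective plus constraints over predicted inputs can be sketched with a toy allocation problem. A real DO model would be written with the Decision Optimization tooling (for example the docplex Python package) and solved by the engine; the brute-force search below is only a stand-in that runs without extra libraries. The stores, demand figures, margins, and costs are invented.

```python
from itertools import product

def solve_allocation(demand, stock, margin, ship_cost):
    """Exhaustively search integer shipments for the highest profit."""
    stores = sorted(demand)
    best_plan, best_profit = None, float("-inf")
    for amounts in product(range(stock + 1), repeat=len(stores)):
        if sum(amounts) > stock:  # constraint: cannot ship more than we have
            continue
        # Objective: margin on units actually sold minus cost of units shipped.
        profit = sum(
            margin[s] * min(qty, demand[s]) - ship_cost[s] * qty
            for s, qty in zip(stores, amounts)
        )
        if profit > best_profit:
            best_plan, best_profit = dict(zip(stores, amounts)), profit
    return best_plan, best_profit

# Hypothetical inputs: demand would come from an ML model's predictions.
predicted_demand = {"north": 6, "south": 4}
plan, profit = solve_allocation(
    demand=predicted_demand,
    stock=8,
    margin={"north": 5, "south": 7},
    ship_cost={"north": 1, "south": 2},
)
print(plan, profit)
```

Exhaustive search only works for tiny problems like this one, which is exactly why a dedicated optimization engine is needed once the number of predictions and decision variables grows.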

DO Model Builder with input data and model definition

Deploy Decision Optimization Models

Like ML Models, DO Models can be promoted from a Project to a Space, where authorized users can create Model Deployments to serve them for solving optimization problems. This makes the DO models accessible to applications and business processes through the WML public REST APIs. Given data and ML predictions as input, the DOCPLEX engine solves the problem and returns optimal actions as the solution.

Use ML + DO Models in Apps or Processes

Deploying both ML and DO Models on the same WML service makes it easy for applications or processes to call ML models to generate predictions based on provided data, and then call DO models with data + predictions to determine the optimal actions to take.
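The call sequence described above can be sketched as a small function that invokes a predictor and then passes data plus predictions to an optimizer. The two local functions stand in for the deployed ML and DO models and contain made-up logic (a naive forecast and a greedy allocation); in an application they would be replaced by REST calls to the two deployments.

```python
from typing import Callable, Dict, List

Predictor = Callable[[List[dict]], List[float]]
Optimizer = Callable[[List[dict], List[float]], Dict[str, int]]

def predict_then_optimize(rows, predict: Predictor, optimize: Optimizer):
    """Call the ML model first, then feed data plus predictions to the DO model."""
    predictions = predict(rows)
    if len(predictions) != len(rows):
        raise ValueError("expected one prediction per input row")
    return optimize(rows, predictions)

# Local stand-ins for the two deployments (hypothetical logic).
def fake_ml_deployment(rows):
    """Naive forecast: last week's sales plus one unit."""
    return [float(row["last_week_sales"] + 1) for row in rows]

def fake_do_deployment(rows, predictions, capacity=10):
    """Greedy stand-in: serve the highest predicted demand first."""
    plan, remaining = {}, capacity
    for row, demand in sorted(zip(rows, predictions), key=lambda p: -p[1]):
        qty = min(remaining, round(demand))
        plan[row["store"]] = qty
        remaining -= qty
    return plan

rows = [
    {"store": "north", "last_week_sales": 5},
    {"store": "south", "last_week_sales": 8},
]
actions = predict_then_optimize(rows, fake_ml_deployment, fake_do_deployment)
print(actions)
```

Passing the two models in as callables keeps the orchestration logic independent of how each model is served, so the stand-ins can be swapped for real deployment calls without changing the flow.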

Deploying ML + DO Models in a Space and using them together from apps or processes

Share Data and Assets for re-use across your Enterprise

When useful data sets, notebooks, ML Models, or DO Models are created in a project, it can make sense to share them for re-use in other projects in your enterprise. To enable this, assets in Projects can be shared to Catalogs, where other users who have been granted access to the Catalog can find them, and add them to their own projects for re-use.

Learn more about . . .

Introducing AutoAI for Watson Studio

Combine Machine Learning and Decision Optimization in Cloud Pak for Data

Try Watson Studio and ML on the IBM Cloud


Thomas Schaeck

Distinguished Engineer, IBM Watson Studio — Leading architecture for Watson Studio on Cloud Pak for Data and IBM Cloud at IBM Data and AI