3 Scenarios for Machine Learning on Multicloud

Published in Inside Machine learning · 3 min read · Dec 20, 2017

More and more cloud-computing experts are talking about “multicloud”. The term refers to an architecture that spans multiple cloud environments in order to take advantage of different services, different levels of performance, security, or redundancy, or even different cloud vendors. But what sometimes gets lost in these discussions is that multicloud is not always public cloud. In fact, it’s often a combination of private and public clouds.

As machine learning (ML) continues to pervade enterprise environments, we need to understand how to make ML practical on multicloud — including those architectures that span the firewall.

Let’s look at three possible scenarios.

Scenario 1: Train with On-Prem Data, Deploy on Cloud

Data science teams often need to build and train an ML model on sensitive customer data even though the model itself will be deployed on a public cloud. Data gravity and security requirements mean that the model must be trained behind the firewall, where the data lives. However, the model may need to be invoked by cloud-native applications. To keep the latency of scoring calls low, the model should be deployed close to the consuming app: near the edge of the network, outside the firewall.
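As a rough illustration of this split, here is a minimal Python sketch in which training happens next to the data and only the learned parameters are exported for deployment elsewhere. The `LinearModel` class, the function names, and the sample numbers are all hypothetical stand-ins chosen for this example; a real project would use its own ML library and model format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LinearModel:
    """A one-feature linear model: prediction = slope * x + intercept."""
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept


def train_on_prem(xs, ys) -> LinearModel:
    """Fit by ordinary least squares. Meant to run behind the firewall."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)


def export_model(model: LinearModel) -> str:
    """Serialize only the learned parameters; no training rows are included."""
    return json.dumps(asdict(model))


def load_in_cloud(artifact: str) -> LinearModel:
    """Rebuild the model on the cloud side from the exported artifact."""
    return LinearModel(**json.loads(artifact))


# Hypothetical sensitive data that stays in the on-prem environment.
sensitive_x = [1.0, 2.0, 3.0, 4.0]
sensitive_y = [3.0, 5.0, 7.0, 9.0]

# Only this small JSON string needs to cross the firewall.
artifact = export_model(train_on_prem(sensitive_x, sensitive_y))

# On the cloud side, the model is rebuilt and used for scoring.
cloud_model = load_in_cloud(artifact)
print(cloud_model.predict(5.0))
```

The point of the sketch is the boundary: the training rows are consumed only inside `train_on_prem`, while the artifact that travels to the cloud holds nothing but the fitted parameters.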

Scenario 2: Train on Specialized Hardware, Deploy on Systems of Record

Deep learning models, as well as some types of classic ML models, can be trained significantly faster on specialized hardware. For example, a data science team might build and train the model on a PowerAI machine, which couples Power processors to GPUs through high-speed NVLink connections. The PowerAI machine is designed to speed up the training process, but the model itself may need to be consumed in a system of record like an on-premises z System.

Scenario 3: Train on Cloud with Public Data, Deploy On-Prem

The third scenario is becoming more common as public data grows in both availability and quality. Imagine a financial firm doing arbitrage on agricultural commodities. The data science team gathers a variety of publicly available data including weather and climate data, crop yield data, currency data, and more. Because the data is high-volume and non-proprietary, they aggregate it on a public cloud, where they also train their ML model. They then pull down the latest version of the model and integrate it into a proprietary application that the firm has developed to predict the prices of the commodities it trades.

IBM’s approach

Each of these scenarios calls for a fit-for-purpose, multicloud architecture for flexibly training, deploying, and consuming the machine learning models. IBM takes an enterprise approach by making our Data Science Experience (DSX) platform available both on-prem and in the cloud — with intuitive interfaces designed to let users easily move from one to the other. With the same REST APIs, you can save, publish, and consume models across environments — on the mainframe, on a private cloud, or on the public cloud, including on non-IBM public clouds, like AWS and Azure. These two videos demonstrate how easy this is: AWS / Azure.

A Kubernetes-based implementation of the DSX platform gives you the flexibility to run DSX Local within a variety of infrastructure options. For example, you can stand up a multi-node cluster with two separate infrastructure vendors, and then build and train models wherever it’s most convenient, and move your models from one vendor infrastructure to the other.

In DSX, each deployed model gets an external and an internal endpoint. To invoke the model, make a REST API call to the endpoint. You can build and train the model on-prem and deploy the model to the cloud, where an external application like a chatbot can consume the model by making a REST API call to the model's external endpoint.
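To make the idea of a scoring call concrete, the following Python sketch builds an HTTP POST request for a deployed model's endpoint using only the standard library. The URL, the bearer token, and the `fields`/`values` payload layout are assumptions made for illustration, not the documented DSX API; consult the platform's own API reference for the actual request format.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, token, fields, rows):
    """Build (but do not send) a POST request for a model scoring endpoint.

    The payload layout below is a hypothetical example, not a real API contract.
    """
    payload = {"fields": fields, "values": rows}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Hypothetical endpoint and credentials, for illustration only.
request = build_scoring_request(
    "https://models.example.com/v1/deployments/churn/score",
    "EXAMPLE_TOKEN",
    ["age", "balance"],
    [[42, 1850.0]],
)

# Sending the request would look like this against a real endpoint:
# with urllib.request.urlopen(request) as response:
#     prediction = json.load(response)
```

Because the consuming application only needs the endpoint URL and credentials, the same client code could call a model regardless of which environment hosts it.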

When multicloud flexibility lets you pick and choose the cloud environments that best fit your needs, you can align with the principle of data gravity and let your consumption channels dictate where you deploy the machine learning models that will transform your organization.

Visit us to learn more about the Data Science Experience.

Written by John Thomas