Enterprise AI with IBM Cloud Private and Power Servers

Enterprise AI requires a hardware and software stack that enables cutting-edge AI innovation, with the agility and dependability that IT requires. Read more about Why hardware matters in the cognitive enterprise.

In this article, you’ll learn how to set up an on-premises GPU farm for Enterprise AI using IBM Cloud Private and Power servers.

If you are wondering why Power servers for Enterprise AI, check the following article:

Prerequisites
  1. You need at least one master and one worker node. The worker nodes should have GPUs.
    You can use a mix of Intel and Power nodes, depending on your workload requirements. For example, run an API gateway on Intel nodes and AI workloads on Power servers with GPUs.
  2. Ensure that the correct NVIDIA driver is installed on all the GPU hosts.
  3. Verify that the worker nodes are properly set up. See the instructions to verify your GPU node setup in the IBM Cloud Private documentation.
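As a quick sanity check for step 2, you can run nvidia-smi on each GPU host. The following sketch (which assumes shell access to the host) simply reports whether the driver is installed and which GPUs it sees:

```shell
# Run on each GPU worker host: confirms the NVIDIA driver is installed
# and lists the detected GPUs (e.g. "GPU 0: Tesla P100-SXM2-16GB").
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L
else
  echo "nvidia-smi not found: install the NVIDIA driver on this host first"
fi
```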


IBM Cloud Private (ICP) supports GPUs out of the box. You just need to add GPU-based Power servers as worker nodes.

To set up the cluster, follow the install instructions in the product documentation.

If you are adding a worker node to an existing ICP cluster, follow the steps in Adding or removing cluster nodes in the product documentation.

Additionally, install the Kubernetes command line client, ‘kubectl’, and configure it to connect to your IBM Cloud Private instance. See Accessing your IBM Cloud Private cluster by using the kubectl CLI.
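The ICP console provides the exact kubectl configuration commands for your cluster; conceptually, they build a kubeconfig along the lines of the following sketch. The cluster name, master address, and token below are placeholders, not values from this article:

```yaml
apiVersion: v1
kind: Config
clusters:
- name: mycluster                      # placeholder cluster name
  cluster:
    server: https://<master-ip>:8001   # placeholder master API address
    insecure-skip-tls-verify: true
contexts:
- name: mycluster-context
  context:
    cluster: mycluster
    user: admin
    namespace: default
current-context: mycluster-context
users:
- name: admin
  user:
    token: <bearer-token>              # placeholder; copy from the console
```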


After the cluster is set up, you can use the following steps for a quick verification and sanity test of the cluster.

After you successfully log in to the ICP Dashboard, you’ll see output similar to the following screenshot. Notice the GPU section showing the available GPUs and related data:

If you are using the IBM Cloud Private command line interface (CLI), run the following command to list the cluster nodes and associated labels:

$ kubectl get nodes --show-labels

Notice the GPU model (nvidia-TeslaP100-SXM2-16GB) in the node label. This shows that the cluster is able to detect the GPUs and that they are available for use.
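Once the GPU model appears in the node labels, you can also use those labels for scheduling. Here is a hedged sketch of a pod pinned to P100 nodes; the label key and value are assumptions and must match whatever `kubectl get nodes --show-labels` reports in your cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-on-p100                     # hypothetical pod name
spec:
  nodeSelector:
    gpu/nvidia: TeslaP100-SXM2-16GB     # assumed label key/value; check your nodes
  containers:
  - name: cuda
    image: nvidia/cuda-ppc64le:8.0-runtime
    command: ["nvidia-smi"]
```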

Learn more about the CLI at Managing your cluster with the IBM® Cloud Private CLI.

Next, you’ll deploy a sample GPU application for a final verification. You can either use the CLI or the Create Resource option in the top-right corner of the Management Console (often referred to as the User Interface).

See the following sample gpu-demo.yaml file that we’ll use for verifying a deployment leveraging GPUs:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: gpu-demo
spec:
  replicas: 1
  template:
    metadata:
      labels:
        run: gpu-demo
    spec:
      containers:
      - name: gpu-demo
        image: nvidia/cuda-ppc64le:8.0-runtime
        command: ["/bin/sh", "-c"]
        args: ["nvidia-smi && tail -f /dev/null"]
        resources:
          limits:
            alpha.kubernetes.io/nvidia-gpu: 1

Run the following command to create the deployment:

$ kubectl create -f gpu-demo.yaml

Run the following command to check the container logs (the pod name suffix shown here will differ in your cluster):

$ kubectl logs gpu-demo-79646b7b79-mqdtk

Your GPU cluster is now ready for use for AI workloads!

Advanced Configuration

In many enterprise deployments, the same cluster is used by different teams for different workloads, both AI and non-AI. In such situations, you often need to enforce resource quotas. IBM Cloud Private supports setting resource quota limits for GPUs, in addition to CPU and memory. This enables effective sharing of GPU resources: you can decide which users are allowed to use GPUs.

The following example configuration can be used to set a GPU limit quota of 2 GPUs for the ‘test’ namespace.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-resources-test
  namespace: test
spec:
  hard:
    limits.alpha.kubernetes.io/nvidia-gpu: "2"

The following configuration disables GPU access for the ‘default’ namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-resources-default
  namespace: default
spec:
  hard:
    limits.alpha.kubernetes.io/nvidia-gpu: "0"

Any pod requesting GPUs will fail to be created in the ‘default’ namespace.
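To see the quota in action, you could attempt a pod like the following sketch (the pod name is hypothetical) in the ‘default’ namespace; with a zero GPU quota applied there, creation is rejected with a quota-exceeded error:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-quota-check    # hypothetical name
  namespace: default
spec:
  containers:
  - name: cuda
    image: nvidia/cuda-ppc64le:8.0-runtime
    resources:
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1
```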

Complete Example

Using the previous information, let’s look at a complete, end-to-end flow. Say that, as an admin, you want to provide GPU resources only to users who belong to the ‘cognitive’ group.

1. Create a namespace ‘cognitive-ns’

You can either use the CLI or the Dashboard > Create Resources.

{
  "kind": "Namespace",
  "apiVersion": "v1",
  "metadata": {
    "name": "cognitive-ns"
  }
}

2. Configure LDAP and create Team

Configure LDAP in Manage > Authentication and create ‘cognitive’ team by navigating to Manage > Teams. Add relevant users and configure the appropriate roles.

After the team is created, configure the team to provide access to ‘cognitive-ns’ namespace.

Similarly, configure other teams for different users and provide access to the appropriate namespaces.

Details on setting up LDAP authentication and creating teams are available from the following links to the product documentation:

3. Create Resource Quotas

Set a hard limit of ‘0’ for all other namespaces.

You can either use the CLI or the Dashboard > Create Resources.

# Max GPUs quota for cognitive-ns
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-resources-cognitive
  namespace: cognitive-ns
spec:
  hard:
    limits.alpha.kubernetes.io/nvidia-gpu: "4"
---
# Disable GPU allocation for the generic namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-resources-generic
  namespace: generic
spec:
  hard:
    limits.alpha.kubernetes.io/nvidia-gpu: "0"

With these settings in place, any user who belongs to the ‘cognitive’ team can deploy GPU-based workloads, whereas the rest of the users cannot.

Hopefully this will be helpful in your journey towards Enterprise AI!

Watch the demo. A short video that shows the complete flow is available from the following link:

Originally published at developer.ibm.com.