Multiuser Kubeflow on IBM Cloud OpenShift

Published in

IBM Data Science in Practice

6 min read · Feb 17, 2021

image of hexagon composed of multiple other shapes, a circle with two pieces projecting into three d and a cloud icon

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

Kubeflow tries to bridge the gap between data science experiments and production-ready ML models that bring real business value. By providing a platform for model creation, pipelines, notebooks, hyperparameter optimization, model serving, and many more features, Kubeflow makes the job of data scientists and ML engineers easier.

This article is about setting up and using multiuser Kubeflow on an IBM Cloud OpenShift cluster. Multiuser Kubeflow brings multi-tenancy to Kubeflow. In a production environment, it is often necessary to share the same pool of resources across different teams and users. These users need a reliable way to isolate and protect their own resources, so that they do not accidentally view or change each other’s resources.

Prerequisites

Sign up for IBM Cloud if you don’t have an account
Log in to IBM Cloud using the IBM Cloud Dashboard
Create and access an OpenShift cluster on IBM Cloud
To deploy Kubeflow on IBM Cloud OpenShift, you need an OpenShift 4.5+ cluster running on IBM Cloud. If you don’t have a cluster running, you can create one here
Once your cluster is ready, you can open the OpenShift web console from the Overview window.
In the OpenShift console, click the drop-down in the top right and select Copy Login Command. This will give you a command like this:

oc login --token=<some-api-token> --server=<some-server>

Paste this command into your terminal and you are ready to go.

This guide assumes you already have the oc command-line tool installed. If not, install the CLI.
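Before continuing, it can help to confirm that the oc CLI is on your PATH and that the login succeeded. The following is a minimal sketch; require_cmd is a hypothetical helper name introduced here for illustration and is not part of oc.

```shell
# require_cmd is a hypothetical helper (not part of oc): it fails with a
# message on stderr when the named command is not found on the PATH.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing required command: $1" >&2
    return 1
  fi
}

# Usage, after pasting the login command:
#   require_cmd oc && oc whoami
```

oc whoami prints the user you are logged in as and returns an error if the session is not authenticated.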

Understanding the Kubeflow deployment process

The deployment process is controlled by the following commands:

build - (Optional) Creates configuration files defining the various resources in your deployment. You only need to run kfctl build if you want to edit the resources before running kfctl apply.
apply - Creates or updates the resources.
delete - Deletes the resources.
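The three commands can be combined into a build, edit, apply cycle. The sketch below illustrates that order under some assumptions: run, kf_customize_and_deploy, kf_teardown, and the DRY_RUN switch are hypothetical names introduced here, and CONFIG_FILE is assumed to point at the configuration file described later in this article. With DRY_RUN=1 (the default in this sketch) the commands are only printed, not executed.

```shell
# Hypothetical wrapper: with DRY_RUN=1 the command is printed instead of run.
DRY_RUN=${DRY_RUN:-1}
CONFIG_FILE=${CONFIG_FILE:-kfctl_ibm_multi_user.yaml}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Optional build step, then apply.
kf_customize_and_deploy() {
  run kfctl build -V -f "$CONFIG_FILE"   # generate the kustomize directory only
  # ... edit the generated manifests here before applying ...
  run kfctl apply -V -f "$CONFIG_FILE"   # create or update the resources
}

# Remove the resources of this deployment.
kf_teardown() {
  run kfctl delete -f "$CONFIG_FILE"
}
```

If you do not need to edit anything, kfctl apply on its own is enough, since the build step is optional.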

App layout

Your Kubeflow application directory ${KF_DIR} contains the following files and directories:

diagram of Kubeflow application directory

CONFIG_FILE (kfdef) is a YAML file that defines configurations related to your Kubeflow deployment.
This file is a copy of the GitHub-based configuration YAML file that you used when deploying Kubeflow, such as IBM Config KFDEF.
When you run kfctl apply or kfctl build, kfctl creates a local version of the configuration file. You can further customize this file if necessary.
kustomize is a directory that contains the kustomize packages for Kubeflow applications.
The directory is created when you run kfctl build or kfctl apply.
You can customize the Kubernetes resources (modify the manifests and run kfctl apply again).

Kubeflow installation

Note: kfctl is currently available for Linux and macOS users only. If you use Windows, you can install kfctl on Windows Subsystem for Linux (WSL). Refer to the official instructions for setting up WSL.

Run the following commands to set up and deploy Kubeflow:

Download the latest kfctl release from the Kubeflow Release Page

Note: It is strongly recommended to install kfctl v1.2 or above. kfctl v1.2 addresses several critical bugs that can break the Kubeflow deployment.

Extract the archived TAR file:

tar -xvf kfctl_<version>_<platform>.tar.gz

Make kfctl binary easier to use (optional). If you don’t add the binary to your path, you must use the full path to the kfctl binary each time you run it.

export PATH=$PATH:<path to where kfctl was unpacked>

Run the following steps to deploy Kubeflow with IBM Cloud AppID as an authentication provider.

In this scenario, a Kubeflow cluster admin configures Kubeflow as a web application in AppID and manages user authentication with built-in identity providers (Cloud Directory, SAML, social login with Google or Facebook, etc.) or custom providers.

Set up environment variables:

export KF_NAME=<your choice of name for the Kubeflow deployment>
# Set the path to the base directory
# where you want to store one or more
# Kubeflow deployments. For example, use `/opt/`.
export BASE_DIR=<path to a base directory>
# Then set the Kubeflow application directory for this deployment.
export KF_DIR=${BASE_DIR}/${KF_NAME}

Set up configuration files:

export CONFIG_FILE=kfctl_ibm_multi_user.yaml
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/master/kfdef/kfctl_ibm_custom_openshift_multiuser.yaml"
# Create the application directory and download the configuration file:
mkdir -p ${KF_DIR}
cd ${KF_DIR}
curl -L ${CONFIG_URI} > ${CONFIG_FILE}
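Because curl -L writes whatever the server returns, a mistyped URL can leave an error page in ${CONFIG_FILE} instead of a configuration. A hedged sketch of a sanity check follows; looks_like_kfdef is a hypothetical helper introduced here, and it assumes the downloaded file declares kind: KfDef at the top level.

```shell
# looks_like_kfdef is a hypothetical helper: it succeeds only if the file is
# non-empty and contains a top-level "kind: KfDef" line.
looks_like_kfdef() {
  [ -s "$1" ] && grep -q '^kind:[[:space:]]*KfDef' "$1"
}

# Usage, after the download step:
#   looks_like_kfdef "${CONFIG_FILE}" || echo "unexpected content in ${CONFIG_FILE}" >&2
```

Adding the -f flag to curl (curl -fL) is another option: it makes curl exit with an error on HTTP failures instead of saving the error body.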

Note: By default, the IBM configuration uses the Kubeflow pipeline with the Tekton backend. If you want to use the Kubeflow pipeline with the Argo backend, uncomment the argo and kfp-argo-multi-user applications inside kfctl_ibm_multi_user.yaml and remove the kfp-tekton-multi-user, tektoncd-install, and tektoncd-dashboard applications.
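To see which pipeline backend entries are present in the configuration file before editing it, a simple search is usually enough. This is only a sketch: show_pipeline_backends is a hypothetical helper, and it makes no assumption about the file layout beyond the application names mentioned in the note above.

```shell
# show_pipeline_backends is a hypothetical helper: it prints, with line
# numbers, every line of the given file that mentions argo or tekton.
# In the output, entries whose text begins with '#' are commented out.
show_pipeline_backends() {
  grep -nE 'argo|tekton' "$1"
}

# Usage:
#   show_pipeline_backends "${CONFIG_FILE}"
```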

Deploy Kubeflow:

kfctl apply -V -f ${CONFIG_FILE}

Wait until the deployment finishes successfully. For example, all pods should be in the Running state when you run the command:

oc get pod -n kubeflow
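With many pods in the namespace, scanning the list by eye gets tedious. The sketch below filters the listing down to pods that are not yet in a settled state. pods_not_ready is a hypothetical helper; it assumes the default column layout of oc get pod (NAME, READY, STATUS, ...) and treats Completed pods as finished.

```shell
# pods_not_ready is a hypothetical helper: it reads `oc get pod --no-headers`
# output on stdin and prints each pod whose STATUS column (3rd field) is
# neither Running nor Completed.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " (" $3 ")" }'
}

# Usage:
#   oc get pod -n kubeflow --no-headers | pods_not_ready
```

An empty result means every pod has reached Running or Completed.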

AppID Setup

1. Follow the Creating an App ID service instance on IBM Cloud guide to set up Kubeflow authentication. You can also learn how to use App ID with different authentication methods.

2. Follow the Registering your app section of the App ID guide to create an application with type regularwebapp under the provisioned AppID instance. Make sure the scope contains email. Then retrieve the following configuration parameters from your AppID: clientId, secret, oAuthServerUrl

3. Register the Kubeflow OIDC redirect page. The Kubeflow OIDC redirect URL is http://<kubeflow-FQDN>/login/oidc, where <kubeflow-FQDN> is the endpoint for accessing Kubeflow. By default, the <kubeflow-FQDN> on IBM Cloud is <worker_node_external_ip>:31380. If you don't have much experience with Kubernetes, you can expose the Kubeflow endpoint as a LoadBalancer and use the EXTERNAL_IP for your <kubeflow-FQDN>.

oc patch svc -n istio-system istio-ingressgateway -p '{"spec": {"type": "LoadBalancer"}}'

4. Add the Kubeflow OIDC redirect URL under Manage Authentication > Authentication settings > Add web redirect URLs.

5. Create the namespace istio-system if it does not exist:

oc create namespace istio-system

6. Create a secret prior to Kubeflow deployment, filling in the parameters from step 2 accordingly:

oc create secret generic appid-application-configuration -n istio-system \
--from-literal=clientId=<clientId> \
--from-literal=secret=<secret> \
--from-literal=oAuthServerUrl=<oAuthServerUrl> \
--from-literal=oidcRedirectUrl=http://<kubeflow-FQDN>/login/oidc

<oAuthServerUrl> - fill in the value of oAuthServerUrl
<clientId> - fill in the value of clientId
<secret> - fill in the value of secret
<kubeflow-FQDN> - fill in the FQDN of Kubeflow. If you don't know it yet, use a placeholder such as localhost and change it once you have the real value.

Note: If any of these parameters change after the initial Kubeflow deployment, you need to manually update them in the secret appid-application-configuration. Then restart authservice by running the command oc rollout restart sts authservice -n istio-system.
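The same redirect URL has to be entered in two places, the App ID web redirect URLs and the oidcRedirectUrl value in the secret, so it is worth building it once. The sketch below does that; oidc_redirect_url is a hypothetical helper, and the jsonpath in the usage comment assumes the LoadBalancer reports an IP address (some load balancers report a hostname instead).

```shell
# oidc_redirect_url is a hypothetical helper: it prints the Kubeflow OIDC
# redirect URL for the given <kubeflow-FQDN> (host or host:port).
oidc_redirect_url() {
  printf 'http://%s/login/oidc\n' "$1"
}

# Usage with the default NodePort endpoint:
#   oidc_redirect_url "<worker_node_external_ip>:31380"
#
# Usage after exposing the gateway as a LoadBalancer (assumes an IP is reported):
#   EXTERNAL_IP=$(oc get svc istio-ingressgateway -n istio-system \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   oidc_redirect_url "$EXTERNAL_IP"
```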

Verify multi-user installation

Check that the pod authservice-0 is in the Running state in the istio-system namespace:

oc get pod authservice-0 -n istio-system
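The pod may take a little while to start, so a single check can fail even on a healthy deployment. One way to handle this is to poll, as in the sketch below. retry and authservice_running are hypothetical helpers introduced here, and the attempt count and delay in the usage comment are arbitrary choices.

```shell
# retry <attempts> <delay_seconds> <command...>
# Hypothetical helper: runs the command until it succeeds or the attempts
# are used up, sleeping between attempts.
retry() {
  _retry_left=$1
  _retry_delay=$2
  shift 2
  while [ "$_retry_left" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    _retry_left=$((_retry_left - 1))
    if [ "$_retry_left" -gt 0 ]; then
      sleep "$_retry_delay"
    fi
  done
  return 1
}

# Succeeds when the authservice-0 pod reports the Running phase.
authservice_running() {
  phase=$(oc get pod authservice-0 -n istio-system \
    -o jsonpath='{.status.phase}' 2>/dev/null)
  [ "$phase" = "Running" ]
}

# Usage: poll up to 30 times, 10 seconds apart.
#   retry 30 10 authservice_running && echo "authservice is running"
```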

Access Kubeflow Dashboard

Find the Kubeflow dashboard URL with:

oc get route -n istio-system istio-ingressgateway -o=jsonpath='{.spec.host}'

Once you visit this URL, you will be redirected to AppID to authenticate.

screen shot of appID login page — AppID Login

Once that is done, you will be redirected back to your Kubeflow Dashboard. Each user is given their own namespace, which isolates each user's resources from those of other users.

screenshot of Kubeflow dashboard — Kubeflow Dashboard

Caveats

Multiuser Kubeflow in OpenShift makes use of Istio 1.3 for RBAC and profile control. Istio 1.3 does not work out of the box on OpenShift, as documented here. To get Istio to work properly on OpenShift, we had to relax some security settings on OpenShift. Work is currently under way to enable the use of OpenShift Service Mesh.

Next Steps

You can enable HTTPS on your Kubeflow dashboard URL
You can try the E2E Kubeflow Tutorial on IBM Cloud

Conclusion

This guide will get you started with installing Kubeflow on IBM Cloud OpenShift. The multi-user installation of Kubeflow is great for teams that want to run ML experiments while keeping user-level isolation.