An end-to-end use case with Kubeflow

hg liu
Nov 19, 2019 · 3 min read

Kubeflow 0.7, the precursor to 1.0 (due in early January), was released last week. This post shows how to run a machine learning project in a Kubernetes-native way with Kubeflow 0.7, using the classic MNIST project as the example.

Install Kubeflow

  1. Install Kubeflow as described here.
  2. Due to an issue, after installation you need to add the KFServing inferenceservice apiGroups serving.kubeflow.org to the pipeline-runner clusterrole, as in the screenshot below, by running kubectl edit clusterrole pipeline-runner -n kubeflow
Update pipeline-runner clusterrole
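
For readers who cannot see the screenshot, here is a minimal Python sketch of the kind of rule that step adds. Only the apiGroup serving.kubeflow.org comes from this post; the resource name, the verb list, and the trimmed-down role object are assumptions made for illustration, so check them against your own cluster before editing the real clusterrole.

```python
import copy

# Rule granting access to KFServing InferenceService objects.
# The apiGroup is the one named in the post; the resource name and
# the verb list are assumptions, not copied from the screenshot.
KFSERVING_RULE = {
    "apiGroups": ["serving.kubeflow.org"],
    "resources": ["inferenceservices"],
    "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"],
}


def add_rule(cluster_role, rule):
    """Return a copy of cluster_role with rule appended, unless it is already there."""
    updated = copy.deepcopy(cluster_role)
    rules = updated.get("rules") or []
    if rule not in rules:
        rules.append(copy.deepcopy(rule))
    updated["rules"] = rules
    return updated


# Hypothetical, heavily trimmed stand-in for the real pipeline-runner clusterrole.
role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "pipeline-runner"},
    "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]}],
}

patched = add_rule(role, KFSERVING_RULE)
```

The helper is idempotent: applying it twice leaves a single copy of the rule, which mirrors what you want when editing the clusterrole by hand.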

Brief Kubeflow introduction

The picture below gives a brief overview of Kubeflow, including the components used in this end-to-end use case:

kubeflow components
  1. Users can create a Jupyter notebook through the Kubeflow Jupyter hub, and then build an ML project for hyperparameter tuning, model training and inference in the notebook.
  2. Kubeflow pipelines are reusable end-to-end ML workflows built with the Kubeflow Pipelines SDK, using Argo under the hood to orchestrate Kubernetes resources. With a pipeline, the whole ML lifecycle (such as data pre-processing, hyperparameter tuning, model training and model inference) can be managed and reused.
  3. Model training: Kubeflow supports multiple ML frameworks in a Kubernetes-native way (TFJob for TensorFlow, PyTorchJob for PyTorch, XGBoostJob for XGBoost and so on).
  4. Katib is a Kubernetes-native system for hyperparameter tuning and neural architecture search. It supports multiple ML/DL frameworks (e.g. TensorFlow and PyTorch). See my other blog post on how Katib works.
  5. KFServing provides a Kubernetes Custom Resource Definition named InferenceService for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high-abstraction interfaces for common ML frameworks like TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX.
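
To make the last item more concrete, below is a rough Python sketch that builds an InferenceService manifest as a plain dictionary. It assumes the v1alpha2 API layout (a TensorFlow predictor under spec.default); the service name, namespace and storage URI are hypothetical placeholders, not values taken from my notebook.

```python
import json


def inference_service(name, namespace, storage_uri):
    """Build an InferenceService manifest for a TensorFlow model.

    Assumes the serving.kubeflow.org/v1alpha2 layout; adjust the
    structure if your KFServing version uses a different API version.
    """
    return {
        "apiVersion": "serving.kubeflow.org/v1alpha2",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "default": {
                "predictor": {
                    "tensorflow": {"storageUri": storage_uri},
                }
            }
        },
    }


# All three arguments are hypothetical placeholders.
manifest = inference_service("mnist", "kubeflow", "gs://example-bucket/mnist/export")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and handed to kubectl apply -f, since kubectl accepts JSON as well as YAML.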

Run end to end MNIST

  1. Open the Kubeflow central dashboard. You can reach it at the public IP of any of your Kubernetes nodes on port 31380.
kubeflow UI

2. Create a notebook from the Notebook Servers tab (the notebook image in this case is gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0 )

Create a notebook

3. Upload my existing notebook file and the test image file from here to your notebook

Upload mnist pipeline

4. Run the notebook to test the end-to-end MNIST pipeline. The pipeline includes hyperparameter tuning by Katib, model training by TFJob and model inference by KFServing.

Run notebook

5. While the pipeline is running, you can monitor it by clicking the Run link in the notebook.

6. After the pipeline finishes, you can also check the Katib hyperparameter tuning result in the Katib -> HP -> Monitor tab

Hyperparameter tuning result
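
The inference step of the pipeline ends with a served model that accepts prediction requests. As a hedged sketch of what such a request can look like, the snippet below builds (but does not send) a request in the TensorFlow Serving REST format, where the body is a JSON object with an "instances" list. The ingress address, host name, model name and the 28x28 input layout are all assumptions here; the real input shape depends on the serving signature of the exported model.

```python
import json
import urllib.request


def build_predict_request(ingress, service_host, model_name, image):
    """Build a POST request in the TensorFlow Serving REST predict format.

    ingress      -- "ip:port" of the cluster ingress (assumed reachable)
    service_host -- host name of the InferenceService, sent as Host header
    image        -- one input instance; its layout must match the model
    """
    url = "http://{}/v1/models/{}:predict".format(ingress, model_name)
    body = json.dumps({"instances": [image]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Host": service_host, "Content-Type": "application/json"},
    )


# Hypothetical values: a blank 28x28 image, an example node IP with the
# 31380 port used for the dashboard, and a made-up service host name.
blank_image = [[0.0] * 28 for _ in range(28)]
req = build_predict_request(
    "203.0.113.10:31380", "mnist.kubeflow.example.com", "mnist", blank_image
)
```

Passing req to urllib.request.urlopen would send it. The request is only constructed here so that the sketch runs without a cluster.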

About Me

My name is Hou Gang Liu, and I work at IBM as an Advisory Software Developer contributing to Kubeflow. I have been contributing to Katib and other Kubeflow components since December 2018. I currently serve as a Katib maintainer, manifest owner and KFServing reviewer, and I also drive the Kubeflow contribution for IBM in China. Kubeflow is growing fast, and we look forward to more and more contributors and users joining this great community. Stay tuned for the coming Kubeflow 1.0 release; I will share more updates then.

My github: https://github.com/hougangliu
