Kubeflow 0.4: Release Update & What’s Coming

Published in kubeflow, Dec 11, 2018

The Kubeflow Product Management Team

Kubeflow is building the leading Kubernetes-based open source community for Machine Learning (ML) software application development. This post details the new features and functionality committed to the upcoming Kubeflow 0.4 release, and how they will benefit AI development. As of December 10, development on the release's P0 and P1 priority items is ~65% complete, and a preview of the release is expected to be available by the end of the month.

Over the last year, the Kubeflow Community has effectively organized and delivered 90-day software releases that have significantly increased AI researcher productivity, while improving the platform’s stability, fit and finish. Thanks to the rapid innovation made possible by our Community’s 100+ contributors across 20+ organizations, Kubeflow users are gaining strategic advantages in the race to deliver production-quality AI applications.

Here are some key features and updates you can look forward to:

Kubeflow Pipelines (more on this below) for orchestrating ML workflows.
An updated JupyterHub UI that makes it easy to spawn notebooks with PVCs.
Katib support for using TFJob.
An alpha release of fairing, a library that makes it easy for data scientists to build and start training jobs directly from a notebook.
An initial release of a CRD for managing Jupyter notebooks.
TFJob and PyTorch are going beta.

Below, we’ll dive into more details of Kubeflow 0.4’s development progress, Kubeflow Pipelines, learning resources, and opportunities to get involved.

Kubeflow 0.4 — By the numbers

Per the Kubeflow Community’s software release process, Kubeflow’s open issues are tracked in GitHub. In addition, the Community uses a GitHub-based Kanban board to group features into themes and to track the development progress of each theme.

Our Kanban themes and their open issue counts (as of 12/10):

PyTorch/TFJob beta — (1)
Train Deploy From Notebook — (3)
More robust inference management — (3)
Make ML easier to manage — Model Tracking + HP tuning — (5)
Kubeflow infra improvement — Doc/stabilization — (7)

A full list of the open issues, by Priority Level (P0 = most important):

P0s: 6 open, 5 closed
P1s: 37 open, 73 closed
P2s: 51 open, 11 closed (lower priority)
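
The ~65% completion figure quoted in the introduction appears consistent with these counts. The post does not state how that figure was derived, so the sketch below assumes it is simply closed issues divided by total issues across P0 and P1; the variable names are illustrative only.

```python
# Open/closed issue counts as of 12/10, copied from the priority list above.
issues = {
    "P0": {"open": 6, "closed": 5},
    "P1": {"open": 37, "closed": 73},
}

# Assumption: "completion" means closed issues over all P0 + P1 issues.
closed = sum(counts["closed"] for counts in issues.values())
total = sum(counts["open"] + counts["closed"] for counts in issues.values())
completion = closed / total

print(f"P0+P1: {closed} of {total} issues closed ({completion:.1%})")
```

Under that assumption, 78 of 121 P0 and P1 issues are closed, which rounds to roughly 65%.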

New: Kubeflow Pipelines

Kubeflow 0.4 includes several new deliverables, among them Pipelines. Kubeflow Pipelines provides tooling to compose, deploy and manage end-to-end machine learning workflows. It speeds up experimentation by letting researchers reuse components and by simplifying the end-to-end orchestration of pipelines, which helps keep ML experiments reproducible.

Pipelines, and their containerized components, can be defined and managed using the Python Pipelines SDK in notebooks or a local IDE. This simplifies the steps and parameters required for pipeline submission and monitoring. Additionally, Pipelines provides a single console to manage all your machine learning experiments, as well as an intuitive UI to compare experiments.
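
To make the idea of reusable components concrete, here is a toy sketch in plain Python. It is not the Pipelines SDK and uses none of its API: the `Component` and `Pipeline` classes and the `preprocess` and `train` steps are hypothetical stand-ins, and real components would each run in their own container. It only illustrates how the same components can be composed once and re-run with different parameters.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Component:
    """Hypothetical stand-in for a reusable, containerized pipeline step."""
    name: str
    run: Callable[[dict], dict]  # receives the run context, returns new outputs


@dataclass
class Pipeline:
    """Hypothetical pipeline: components executed in order over a shared context."""
    name: str
    steps: List[Component] = field(default_factory=list)

    def execute(self, params: dict) -> dict:
        # Each step sees the run parameters plus the outputs of earlier steps.
        context = dict(params)
        for step in self.steps:
            context.update(step.run(context))
        return context


# Illustrative components only: scale the data, then "train" by averaging it.
preprocess = Component(
    "preprocess", lambda ctx: {"rows": [x * ctx["scale"] for x in ctx["data"]]}
)
train = Component(
    "train", lambda ctx: {"model": sum(ctx["rows"]) / len(ctx["rows"])}
)

# The same pipeline definition is reused for two runs with different parameters.
pipeline = Pipeline("demo", [preprocess, train])
run_a = pipeline.execute({"data": [1, 2, 3], "scale": 1})
run_b = pipeline.execute({"data": [1, 2, 3], "scale": 10})
print(run_a["model"], run_b["model"])
```

Because both runs share one definition and differ only in their parameters, their results can be compared directly, which is the kind of experiment comparison the Pipelines UI is described as supporting.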

The Pipelines code is available on GitHub.

Learn More

If you’d like to try out Kubeflow, we have a number of options for you:

Use sample walkthroughs hosted on Katacoda.
Follow a guided tutorial with existing models from the examples repository. We recommend the GitHub Issue Summarization example for a complete end-to-end (E2E) walkthrough.
Start a cluster on your own and try your own model. Any Kubernetes-conformant cluster will support Kubeflow, including those from contributors Alibaba Cloud, Caicloud, Canonical, Cisco, Dell, Google, Heptio, Intel, Mesosphere, Microsoft, IBM, Red Hat/OpenShift and Weaveworks.
Several companies are incorporating Kubeflow into their value-adding solutions, including Arrikto and others. Engage with these market leaders for a review of their use of Kubeflow and how it helps to deliver solutions faster.

Join the Leading Community for Kubernetes-Based Machine Learning

The Kubeflow Community is powered by 100+ contributors in ~20 different organizations, and has received more than 4,300 GitHub stars. Many supporting projects are collaborating with the Kubeflow community to extend and expand the value of the ecosystem. (Look forward to a more granular summary of these efforts with the 0.4 release!)

To everyone who has contributed so far, we’d like to offer a huge thank you! We are getting ever closer to realizing our vision: letting data scientists and software engineers focus on the things they do well by giving them an easy-to-use, portable and scalable ML stack.

For those eager to join in, we’re listening. Please tell us about the feature (or features) you’d really like to see that aren’t there yet. Some options for making your voice heard include:

Kubeflow Slack channel
The Kubeflow-discuss Mailing list
Kubeflow Twitter
Our weekly community meeting
Please download and run Kubeflow, and submit bugs!

Thanks to Josh Bottum (Canonical), Abhishek Gupta (Google), Jeremy Lewi (Google), Edd Wilder-James (Google), Anand Iyer (Google), Fei Xue (Google), David Aronchick (Microsoft).

Written by Thea Lamkin