The Containerization of Machine Learning: TensorFlow, Kubernetes and Kubeflow

This blog post was written by Syed Ahmed and was originally published on CloudOps’ blog. Find the original blog post here.

Machine learning (ML) is a method of data analysis for identifying patterns and predicting future outcomes. It is a branch of artificial intelligence (AI) research. By feeding data with known answers into mathematical models, computers can train themselves to predict answers for inputs they have not seen before.
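To make the idea of training on data with known answers concrete, here is a minimal sketch in plain Python. It assumes a deliberately tiny model, a straight line fitted by gradient descent, and made-up training data; the function names `train` and `predict` are illustrative and do not come from any particular library.

```python
# Toy supervised learning: fit y = w*x + b to labeled examples with gradient descent.
# The data, learning rate, and epoch count are made up for illustration.

def train(examples, learning_rate=0.01, epochs=5000):
    """Return (w, b) that approximately minimizes mean squared error over (x, y) pairs."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(model, x):
    """Apply the fitted line to a new input."""
    w, b = model
    return w * x + b

# Inputs paired with known answers (generated here by y = 2x + 1).
labeled = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
model = train(labeled)

# Prediction for an input that was not in the training set.
print(round(predict(model, 10.0), 2))
```

Real ML models have far more parameters and far more data, which is where the scaling concerns discussed in the rest of this post come in.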

While ML has so far been successful in solving specific tasks, the analysis of data with more complex parameters requires models that can be deployed at scale with simplified operations. Such machine learning would enable computers to find and automate solutions from greater quantities of information. For those reasons, it is estimated that AI and ML will be the leading catalysts driving the adoption of cloud computing by 2020. ML will need to learn efficiently at scale and integrate with cloud native technologies, especially containerization, in order to process the extent of information available in the cloud.

To that end Google recently announced the creation of Kubeflow, a composable, portable, and scalable ML stack built on top of Kubernetes. It provides an open source platform for ML models to attach themselves to containers, performing computations alongside the data instead of within a superimposed layer.

Kubeflow helps solve the inherent difficulty of implementing ML stacks. Building production-grade ML solutions requires importing, transforming, and visualizing data, and then building, validating, training, and deploying the models at scale. These stacks have frequently been built with different tools, making the algorithms complicated to manage and the results inconsistent. The packages provided by Kubeflow 1.0 bring together various ML tools, notably TensorFlow and JupyterHub, into one stack that can be easily transported across multi-cloud environments with Kubernetes.


Kubeflow relies on TensorFlow, an open source programming system, to build machine learning models. Its software library uses tensors, geometric objects that express linear relations between data, and represents computations as stateful dataflow graphs. It abstracts the hardware platform, allowing models to run on CPUs (central processing units), GPUs (graphics processing units), or TPUs (tensor processing units). Altogether, these form the base for high throughput of low-precision arithmetic. This flexible architecture lets it run on a variety of platforms, ranging from desktops to clusters of servers to mobile and edge devices. While difficult and complex to use, TensorFlow is ideal for creating ML models with a level of sophistication that necessitates portable and scalable data management.
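The phrase "stateful dataflow graph" can be illustrated with a toy sketch in plain Python. This is not TensorFlow's API: the `Node` and `Variable` classes and the `placeholder`, `matvec`, `add`, and `assign_add` helpers are hypothetical names invented for this example. The point is only to show nodes that are wired together first and executed later, with some nodes keeping state between runs.

```python
# Toy stateful dataflow graph. NOT TensorFlow's API; every name here is hypothetical.

class Node:
    """An operation whose inputs are other nodes; nothing runs until evaluate() is called."""
    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = tuple(inputs)

    def evaluate(self, feed):
        args = [node.evaluate(feed) for node in self.inputs]
        return self.op(feed, *args)

class Variable(Node):
    """A node holding mutable state that persists across graph executions."""
    def __init__(self, value):
        super().__init__(lambda feed: self.value)
        self.value = value

def placeholder(name):
    """A node whose value is supplied at execution time through the feed dictionary."""
    return Node(lambda feed: feed[name])

def matvec(matrix, vector):
    """Matrix-vector product: a linear relation between input and output data."""
    return Node(
        lambda feed, m, v: [sum(a * b for a, b in zip(row, v)) for row in m],
        [matrix, vector],
    )

def add(left, right):
    return Node(lambda feed, a, b: [x + y for x, y in zip(a, b)], [left, right])

def assign_add(variable, delta):
    """Add delta to the variable's stored value and return the new value."""
    def op(feed, d):
        variable.value = [x + y for x, y in zip(variable.value, d)]
        return variable.value
    return Node(op, [delta])

# Build the graph once: y = W x + b, then accumulate y into a running total.
x = placeholder("x")
W = Variable([[1.0, 2.0], [3.0, 4.0]])
b = Variable([0.5, -0.5])
y = add(matvec(W, x), b)
total = Variable([0.0, 0.0])
accumulate = assign_add(total, y)

# Execute the same graph twice with different inputs; `total` remembers the first run.
first = list(accumulate.evaluate({"x": [1.0, 1.0]}))
second = list(accumulate.evaluate({"x": [1.0, 0.0]}))
print(first, second)
```

Separating graph construction from execution is what lets a runtime decide where each operation runs, which is the property that hardware abstraction builds on.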


Kubeflow executes TensorFlow computational graphs directly from Jupyter notebooks. Jupyter notebooks are container-friendly and can run on Kubernetes or any kind of open source infrastructure. They provide users with environments and resources for ML models to be easily implemented without the overhead of installation and maintenance. Their document-style format embeds both code and markdown in the same files, providing visibility into the computations. JupyterHub allows engineers to execute TensorFlow graphs immediately or store them for later use, granting more control over the configuration of TensorFlow models. Kubeflow relies on JupyterHub for collaborative and interactive training.

Kubeflow’s stack incorporates several other solutions that complement the execution of TensorFlow models. Argo is used to schedule workflows, Seldon Core is used for complex inference and non-TensorFlow Python models, and Ambassador is used as a reverse proxy. Integrated with Kubernetes, this stack allows engineers to efficiently develop, train, and deploy ML models at scale.


Kubernetes is a reliable open source container orchestration tool. It standardizes application design into modular, portable, and scalable microservices that deploy complicated workloads in diverse environments. It employs rich APIs that automate numerous operational functions. Kubeflow’s platform leverages Kubernetes to simplify the operations of TensorFlow models and make their execution cloud native.

Portability and Scalability — Kubernetes allows TensorFlow models to be managed modularly as microservices, making them highly portable and scalable. They can be easily moved between different environments, platforms, and cloud providers. Traditionally ML stacks were immobile, and the process for moving models and their associated dependencies from laptops to cloud clusters required significant re-architecture. Kubeflow allows these algorithms to access data as quickly as they are executed.

Automation and Ease of Operations — Kubernetes helps applications adopt end-to-end automation by offering a rich library of declarative APIs for managing microservices. Kubernetes takes care of resource management, job allocation, and other operational problems that have traditionally been time-consuming. Kubeflow allows engineers to focus on writing ML algorithms instead of managing their operations.
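The declarative style mentioned above can be sketched in a few lines of plain Python. This is a toy model, not the Kubernetes API: the `reconcile` and `apply_actions` functions and the workload names are hypothetical. The idea it illustrates is that the engineer states the desired end state, and a control loop works out the steps needed to get there.

```python
# Toy declarative reconciliation loop. NOT the Kubernetes API; all names are hypothetical.

def reconcile(desired, observed):
    """Return the actions that move observed replica counts toward the desired ones."""
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that is no longer declared gets stopped.
    for name in sorted(set(observed) - set(desired)):
        if observed[name] > 0:
            actions.append(("stop", name, observed[name]))
    return actions

def apply_actions(actions, observed):
    """Simulate carrying out the actions and return the resulting state."""
    state = dict(observed)
    for verb, name, count in actions:
        delta = count if verb == "start" else -count
        state[name] = state.get(name, 0) + delta
    return {name: count for name, count in state.items() if count > 0}

# The engineer declares what should exist, not how to get there.
desired = {"trainer": 3, "model-server": 2}
observed = {"trainer": 1, "notebook": 1}

actions = reconcile(desired, observed)
new_state = apply_actions(actions, observed)
print(actions)
print(new_state)
```

Once the observed state matches the declared one, `reconcile` returns no further actions, so the loop can run continuously without the engineer scripting each operational step.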

There is a vast pool of information available in the cloud, but its extent has not been fully accessible to machine learning. Kubeflow 1.0 promises ML the ability to keep up with the constant growth of data in the cloud. It integrates ML into the layer of container orchestration, providing models with greater ease of operations, scalability, and portability. It provides a complete, containerized stack that can be deployed quickly and simply. Kubeflow 1.0 allows computers to train themselves with many more sets of data using a reliable and comprehensive stack. Sign up for a workshop on machine learning to learn more.
