Kubernetes Storage Considerations for AI Workloads

Falk Pollok
AI Platforms Research
1 min read · Nov 20, 2018

Falk Pollok, Parijat Dube

In this three-part blog post series, we discuss storage options for AI workloads, particularly deep learning, and show how to avoid a few common pitfalls, especially regarding small-file support. The series is structured as follows:

  1. Fundamentals of IBM Cloud Storage Solutions for Kubernetes including Cloud Object, File and Block Storage as well as Intel’s specialized vck driver
  2. Deploying Large, Data-Intensive AI Applications Using Kubernetes on IBM Cloud
  3. Performance optimization of deep learning training on large datasets consisting of small files

We first introduce the basics of Kubernetes storage and the general procedure for provisioning it, with concrete examples for IBM Cloud. We then take a concrete use case, dataset visualization, and show how to deploy an AI application to a Kubernetes cluster that uses cloud storage. Finally, we return to the problem of small files in deep learning training and show how to achieve at least a 36x speedup through data transformation and shared memory.
