K8s will not solve your storage problems

And it shouldn’t, so don’t let it

At ProdOps, we get to see and work with a wide range of clients. A common assumption about k8s is that you can scale anything to solve any problem. Yes, even if you’re running databases as pods. It’s important to stress that k8s is a great orchestrator that can manage your applications with resiliency, speed and agility; locking mechanisms, sharding and data replication, however, are not among its features.

Many companies live under the impression that k8s can do it all, whether that means running micro-services, periodic tasks or even persistent storage solutions. Having an orchestrator is a great way to manage the resiliency of applications, but it has its caveats: not all use cases fit, and those that do may run into unplanned trouble along the way. This doesn’t mean that k8s is bad or incompetent; it does its job pretty f***g great. BUT, it’s not a throw-everything-at-me platform.

K8s adds a few layers of abstraction for storage management, so that storage can be claimed via the API (PersistentVolumeClaim / PVC) and managed at the cluster level for data persistence. On a deeper level, storage can be described in a deployment for pods to use throughout their lifecycle, whether for sharing among containers in the same pod or for asynchronous data-related jobs that will follow. Having the option to request storage on the fly and manage it on different levels of the orchestration system is a blessing, but it can’t do everything.
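To make the claim-and-mount flow concrete, here is a minimal sketch in Python that builds a PVC and a pod that mounts it, and prints both as JSON (which kubectl accepts alongside YAML). The manifest fields are standard Kubernetes ones; the names demo-data and demo-app, the image, the size and the mount path are made up for illustration.

```python
import json


def build_pvc(name: str, size: str, access_mode: str = "ReadWriteOnce") -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume can be mounted read-write by one node.
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},
        },
    }


def build_pod(name: str, image: str, claim_name: str, mount_path: str) -> dict:
    """Return a Pod manifest that mounts the given claim at mount_path."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The mount refers to the volume by its pod-local name.
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                # The volume refers to the claim by the claim's name.
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names and sizes, chosen only for this example.
    pvc = build_pvc("demo-data", "10Gi")
    pod = build_pod("demo-app", "nginx:stable", "demo-data", "/var/lib/demo")
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pvc, pod]}, indent=2))
```

Note what is missing from both manifests: nothing in them says how concurrent writers are coordinated or how data is copied anywhere. The claim only asks for disk space with a given access mode.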

One area where k8s is sometimes trusted when it shouldn’t be is solving the problems of a single storage namespace: a single file system accessible to one or more instances.
Let’s quickly review the major storage options for production applications:

  1. NAS (“Network Attached Storage”) is a file system accessed over a network connection. As such, it allows connections from a large number of compute resources that can write to it simultaneously. A locking mechanism makes sure that a file being written to is locked against other connections, making it inaccessible to them.
    Although that’s the obvious desired behaviour, you can’t run a database’s files on a file system where such a mechanism is in play and expect several DB instances to share them. In order to do that, you’d have to implement your own clustering solution.
  2. SAN (“Storage Area Network”) provides block-level data storage. As such, a volume is typically attached to a single VM at a time, which effectively creates the same problem encountered with the NAS’s locking mechanism: only one instance can connect and write data to files. If that instance happens to be a DB instance, e.g. a MySQL node, and other nodes live elsewhere in the cluster, the MySQL server should take care of sharding, replication and all other clustering aspects of DB management. Obvious as this may seem, k8s cannot and shouldn’t solve these problems. The storage abstraction layer it provides helps manage pods’ basic disk space requirements, but nothing beyond that.
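The locking behaviour described in the list can be illustrated with a small Python sketch. It is an assumption-laden stand-in: it uses an advisory flock on a local temporary file (Unix only) instead of a real NAS mount, and two file descriptors in one process instead of two separate instances. The helper names are made up for the example.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(fd: int) -> bool:
    """Try to take an exclusive advisory lock without blocking."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        # Someone else already holds a conflicting lock on this file.
        return False


def demo() -> tuple:
    """Open the same file twice, the way two independent writers would."""
    handle, path = tempfile.mkstemp()
    os.close(handle)
    first = os.open(path, os.O_RDWR)
    second = os.open(path, os.O_RDWR)
    try:
        first_got_it = try_exclusive_lock(first)
        # Refused while the first descriptor still holds the lock.
        second_got_it = try_exclusive_lock(second)
        fcntl.flock(first, fcntl.LOCK_UN)
        # Succeeds only once the first writer has released the lock.
        second_after_release = try_exclusive_lock(second)
        return first_got_it, second_got_it, second_after_release
    finally:
        os.close(first)
        os.close(second)
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The lock keeps the second writer out, which is exactly what you want from a shared file system and exactly why it does not turn two database instances into a cluster: deciding who writes what, and when, remains the database’s job.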

The challenges mentioned above have been solved by many great products. The important thing to keep in mind is that the solution has to be implemented by the user or by the product being deployed. Whether it’s a custom implementation of data persistence management or a full-fledged cluster manager (e.g. MySQL Cluster), you have to bring your own; don’t expect the platform to solve it by scale, by non-existent features or by features that are intended to serve other purposes (e.g. PVCs).
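As a rough illustration of why scale alone doesn’t help, the sketch below takes a trimmed StatefulSet fragment (not a complete manifest; the names mysql and data are hypothetical) and lists the claims that would be created for it, using the standard naming of template name, set name and ordinal. The helper claims_for is made up for this example and is not part of any Kubernetes client.

```python
def claims_for(statefulset: dict) -> list:
    """List the PVC names created for a StatefulSet: <template>-<set>-<ordinal>."""
    name = statefulset["metadata"]["name"]
    replicas = statefulset["spec"]["replicas"]
    templates = statefulset["spec"]["volumeClaimTemplates"]
    return [
        f'{template["metadata"]["name"]}-{name}-{ordinal}'
        for ordinal in range(replicas)
        for template in templates
    ]


# Trimmed fragment for illustration; a real manifest also needs a selector,
# a serviceName and a pod template.
mysql_set = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "mysql"},
    "spec": {
        "replicas": 3,
        "volumeClaimTemplates": [
            {
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "10Gi"}},
                },
            }
        ],
    },
}

if __name__ == "__main__":
    for claim in claims_for(mysql_set):
        print(claim)
```

Three replicas mean three separate claims and three separate volumes. Nothing in the platform copies data between them; whatever replication or sharding you need has to come from the database or from a cluster manager you deploy yourself.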

In future posts, I’ll address some of the methods for implementing storage management systems as pods on k8s, along with best practices for the implementation and deployment pipeline.
Follow and stay tuned!


My name is Omer, and I am an engineer at ProdOps, a global consultancy that delivers software in a Reliable, Secure and Simple way by adopting the DevOps culture. Let me know your thoughts in the comments below, or connect with me directly on Twitter @omergsr.


Originally published at www.prodops.io.