TL;DR: Performance-critical applications demand that the very notion of cloud native be revisited; there is a pressing need to move beyond microservices alone.
When it was formed some four years ago, the Cloud Native Computing Foundation (CNCF) was decidedly predisposed towards microservices. While this may have made sense initially, it is now evident that this exclusive orientation around microservices needs to be revisited.
Use cases employing AI offer the most compelling rationale for reconsideration. While many AI use cases map naturally onto a microservices-based architecture, a rapidly growing number of performance-critical workloads are ill suited to this style of deployment; so much so that refactoring these workloads into implementations better suited to a microservices-based architecture is impractical at best, and nonsensical at worst.
Compounding these requirements, there is increasing demand for applications that blend microservices capabilities with performance-critical services: use cases involving streamed workloads, real-time analysis, and data pipelining that feed into performance-critical processing. These inherently hybrid applications demand a converged infrastructure, one enabled by teaming microservices with performance-critical capabilities.
This is hardly science fiction, and it is literally demonstrable at the edge: we showcased a badge-detection demo at recent tradeshow events. By detecting objects in real time, this demo not only bridges the analog-digital divide, it resoundingly makes the case for a converged infrastructure. Rather than satisfying some outlier use case, this demo, and the requirement for converged infrastructure it encapsulates, is representative of a broader class of use cases.
Along with microservices, orchestrated containers have been central to the very notion of cloud native since the Foundation’s inception. For those applications amenable to decomposition into a microservices-based architecture, Kubernetes-based orchestration of containers emerged as the de facto standard. To enable native integration between Kubernetes and arbitrary container runtimes, the Container Runtime Interface (CRI) was introduced at the end of 2016. With the graduation of Kubernetes within the CNCF, and the emergence of the Open Container Initiative (OCI), standards were progressively factored into this CRI.
The need to support performance-critical workloads, along with emerging hybrid use cases that include microservices, established the case for incorporating trusted and mobile Singularity containers into cloud-native deployments orchestrated by Kubernetes. Enthusiastically bolstered by its commercial customers, as well as by the community of users, developers, and providers that embraces open source Singularity, Sylabs championed an integration with Kubernetes in two phases.
In the first phase, Sylabs released a CRI implementation tailored specifically to ensure native interoperability with Singularity, leveraging the compliance of Singularity’s core with the OCI Runtime Specification. In so doing, support for compute-driven workloads was introduced to Kubernetes via the new Singularity integration. Because Singularity has deep roots in performance-critical deployments at scale, this Singularity-Kubernetes integration enabled, for the first time, a performant and secure option for executing all of the AI-oriented use cases described above. As an added bonus, Singularity’s emphasis on integration over isolation eases access to GPUs and to distributed parallel processing, all within an unencumbered Kubernetes deployment.
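To make the runtime-selection mechanics concrete, the sketch below uses the generic Kubernetes RuntimeClass mechanism to steer a pod towards a Singularity-backed CRI handler. This is an illustrative assumption, not Sylabs’ documented setup: the handler and class name `singularity` and the image reference are hypothetical placeholders.

```yaml
# Illustrative only: RuntimeClass and pod selecting a Singularity-backed
# CRI handler. The name "singularity" and the image are assumptions.
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: singularity
handler: singularity        # must match the handler the node's CRI exposes
---
apiVersion: v1
kind: Pod
metadata:
  name: compute-workload
spec:
  runtimeClassName: singularity   # route this pod to the Singularity runtime
  containers:
  - name: worker
    image: example.registry/compute/worker:latest   # hypothetical image
```

In practice, a CRI implementation may instead be wired in cluster-wide via the kubelet’s container-runtime endpoint; the RuntimeClass form above simply illustrates how Kubernetes selects among runtimes per pod.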
Performance-critical AI use cases have much in common with traditional use cases from High Performance Computing (HPC). Therefore, in the second phase of enabling Kubernetes with performance-critical capabilities, Sylabs championed another open source project, one that incorporates the Slurm workload manager into the Singularity-Kubernetes integration. Briefly, this means that any performance-critical use case can leverage the ‘HPC affinities’ made available for Singularity containers via Slurm in tandem with Kubernetes; in fact, “kubectl” (the Kubernetes CLI) provides the locus of control for manipulating and monitoring compute ‘jobs’ in Slurm, whether they are containerized via Singularity or not.
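As a hedged sketch of what “kubectl as the locus of control” looks like, an operator of this kind typically represents a Slurm batch job as a Kubernetes custom resource. The API group, kind, and field names below are illustrative assumptions rather than verbatim from the project:

```yaml
# Illustrative custom resource wrapping a Slurm batch job;
# apiVersion, kind, and field names are assumptions for illustration.
apiVersion: slurm.example.io/v1alpha1
kind: SlurmJob
metadata:
  name: hello-hpc
spec:
  batch: |
    #!/bin/sh
    #SBATCH --nodes=1
    # Run a Singularity container under Slurm's control
    srun singularity run library://examples/demo/lolcow
```

With a resource like this applied, the familiar verbs carry over: `kubectl get`, `kubectl describe`, and `kubectl delete` would manipulate and monitor the underlying Slurm job, whether its payload is containerized via Singularity or not.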
Regarding this second-phase integration Bob Killen, Research Cloud Administrator with the Advanced Research Computing Technology Services (ARC-TS) group at the University of Michigan and CNCF Ambassador, states:
“I’m very excited to see all the effort Sylabs has put into creating a hybrid HPC/Kubernetes platform. By implementing native Kubernetes objects to represent SLURM jobs, it paves the way for much better integration with everything in the Kubernetes ecosystem.”
Next Steps …
The entire solution stack (Kubernetes, Singularity itself and its CRI, Slurm and its operator) is open source and currently available. Upon deployment, this solution delivers a converged infrastructure for performance-critical as well as hybrid use cases that include microservices. This solution also points to a compelling need: performance-critical use cases must be factored into the very notion of cloud native, a notion that needs to be refactored to reach well beyond the ‘traditional restrictions’ implied by a predisposition towards microservices-based architectures.
If you have performance-critical workloads to deploy via Kubernetes, we encourage you to take advantage of this purpose-built integration based around Singularity containers, especially if your use cases need access to special-purpose devices such as GPUs while processing in parallel across a distributed computing environment. With performance-critical use cases better enabled, we can all look forward to cloud native embracing requirements from applications beyond those that are microservices based.
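As a concrete illustration of the GPU case, Kubernetes exposes GPUs through its standard extended-resource mechanism. The sketch below assumes a cluster where the NVIDIA device plugin is deployed; the pod name and image reference are hypothetical:

```yaml
# Requests one GPU via the standard nvidia.com/gpu extended resource;
# assumes the NVIDIA device plugin is installed on the node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference
spec:
  containers:
  - name: detector
    image: example.registry/demo/gpu-detect:latest   # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: 1   # schedule onto a node with a free GPU
```

A runtime that, like Singularity, emphasizes integration over isolation then surfaces the allocated device to the containerized workload without additional plumbing.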