Building a robust GPU cloud in 60 mins. Part 4: A real multi-tenancy cloud use case

Cheng Zhang
4 min read · Oct 1, 2021


Background photo created by Vitalii Krasnoselskyi | Dreamstime.com

Two months ago, my company (Cognitive Quanta), together with Powerland and NVIDIA, hosted a workshop. The purpose was to create awareness of and drive excitement around the RAPIDS solution, including a hands-on demo to showcase its features and benefits.

As a certified trainer, I have built many lab environments in my previous business projects. It started with Microsoft Virtual Server and Cloudbox, and then everything moved to the cloud, such as Azure or AWS. This time the main challenge is the GPU part, as each guest needs at least one GPU-enabled lab.

So it is show time for our GPUaaS cloud. In the previous three blog posts, we built a robust K8S-based platform running on vSphere 7, so this time let us take a further step and expose our service platform to end users. Because K8S is only a resource abstraction and orchestration platform, it falls short of a business-level platform on its own. We need to bring in a professional service platform to consolidate and harden the quality of our GPUaaS service delivery.

Cognitive Quanta Corporation’s (CQC) Container Platform is a purpose-built extension of the HPE Ezmeral Container Platform featuring pre-built configurations, integrated modern Kubernetes, container tooling and extensions. These customizations enable customers to adopt a modern, enterprise-grade container platform without needing to invest weeks or months training their team on deployment or configuration, and without sacrificing access to the latest Kubernetes ecosystem tools.

If the latest release of Kubernetes, with the latest ecosystem of Kubernetes tools, is version N, and enterprise as-a-service and platform offerings such as HPE Ezmeral, Amazon, Google, Azure, OpenShift, and Tanzu are N-2 (i.e., two versions behind to guarantee stability, vet performance, and manage integrations), then the CQC Container Platform is N-1. This approach offers our customers performance, stability and security while still providing important updates and feature sets.

Built on the trusted foundation of the HPE Ezmeral Container Platform, the CQC Container Platform provides pre-built configuration templates for common use cases, integration with popular open-source tools, and automation to enable customers to adopt quickly and confidently.

The CQC Container Platform was designed from the ground up to specialize in artificial intelligence (AI) and other container-based workloads that demand tight GPU integration, support for the latest hardware, and full access to the features available in the current generation of GPUs. With full support for NVIDIA’s Multi-Instance GPU (MIG) on the NVIDIA A100 datacenter GPU, and ray tracing acceleration available from the NVIDIA A40 and workstation GPUs, the CQC Container Platform is ready for the next generation of containerized AI and ML applications.

Further, because the CQC Container Platform uses HPE Ezmeral as only one component of its architecture, it can enable hybrid container/VM workloads, such as dynamically allocating CPU cores and GPUs to VDI workstations during the day and to offline AI/ML and Media and Entertainment workloads in the off-hours.

This dynamic nature as a first-class feature of the CQC Container Platform is the first of its kind among enterprise-grade container platforms.

So, having chosen our service delivery platform, we will spend the final 15 mins creating an Ezmeral NVIDIA RAPIDS service and publishing it to all students.

We can use KubeDirector to build this RAPIDS service, and the complete step-by-step guide can be found here.

The RAPIDS image can be found on the NVIDIA NGC website as well.

Here are some key settings and screenshots from our CQC Container Platform:

Dashboard to show all resources

As a multi-tenancy service platform, it is always important to limit the resource quota for each user (imagine a student using the GPUs to mine cryptocurrency), which can be set from our service portal:

Resource Quota setting
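For readers without the portal, the same idea can be sketched at the plain Kubernetes level. The snippet below is only an illustration, not what the CQC portal generates: it builds a standard `ResourceQuota` manifest that caps GPU requests per namespace. The function name `build_gpu_quota`, the namespace `student-01`, and the numbers are hypothetical, and it assumes each student gets their own namespace and that GPUs are exposed as `nvidia.com/gpu` by the NVIDIA device plugin.

```python
import json


def build_gpu_quota(namespace: str, max_gpus: int, max_pods: int) -> dict:
    """Return a Kubernetes ResourceQuota manifest as a plain dict.

    Hypothetical helper for illustration; names and values are assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                # Extended resources such as GPUs are capped through the
                # "requests." prefix in a ResourceQuota.
                "requests.nvidia.com/gpu": str(max_gpus),
                # Also cap how many pods a student can run at once.
                "pods": str(max_pods),
            }
        },
    }


# One GPU and at most two pods for a single (hypothetical) student namespace.
quota = build_gpu_quota("student-01", max_gpus=1, max_pods=2)
print(json.dumps(quota, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML.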

Once students log in with their accounts, they can create a ‘Rapids App’, and the application (actually a GPU pod running the NGC RAPIDS image) is published publicly through the gateway:
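To make the "GPU pod" idea concrete, here is a minimal sketch of what such a pod could look like as a plain Kubernetes manifest. This is not the KubeDirector resource the platform actually creates. The helper `build_rapids_pod`, the pod name, the labels, and the `<tag>` placeholder are all assumptions, as is port 8888 for the notebook server; pick a real image tag from NGC before using it.

```python
import json


def build_rapids_pod(student: str, image: str, gpus: int = 1) -> dict:
    """Return a Pod manifest that requests `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration; names and the port are assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"rapids-{student}",
            "labels": {"app": "rapids", "student": student},
        },
        "spec": {
            "containers": [
                {
                    "name": "rapids",
                    "image": image,
                    # Assumed notebook port inside the container.
                    "ports": [{"containerPort": 8888}],
                    # GPUs are requested under "limits"; the scheduler then
                    # places the pod on a node with a free GPU.
                    "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
                }
            ]
        },
    }


# "<tag>" is a placeholder, not a real tag.
pod = build_rapids_pod("student-01", "nvcr.io/nvidia/rapidsai/rapidsai:<tag>")
print(json.dumps(pod, indent=2))
```

Because the GPU count sits in the container's resource limits, a per-namespace quota on `requests.nvidia.com/gpu` would count this pod against the student's allowance.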

Useful links:

https://www.cognitivequanta.com/

https://www.hpe.com/us/en/solutions/ezmeral-container-platform.html


Cheng Zhang

AI and MLOps Expert at Cognitive Quanta (cognitivequanta.com), Microsoft MCT and eight-time MVP award recipient.