Platform Multitenancy! Learn How We Adopt Capsule to Give Our Developers More Freedom 🚀

Oleksandr Stepanov
EPAM Delivery Platform
9 min read · Nov 27, 2023

In this post, we'll discuss how to use the Capsule tenancy engine to manage multiple instances of the EPAM Delivery Platform (EDP) on a single Kubernetes cluster. Deploying multiple EDP instances helps isolate workloads and enforce governance over resource consumption, which is particularly useful when several isolated teams work on different projects within the same cluster. Capsule is the tenancy engine that addresses this challenge while giving developers more freedom.

Introduction

We generally deploy one EDP instance for each project, stream, or sub-stream. While it is possible to deploy multiple instances of EDP per Kubernetes cluster, there are situations where it becomes necessary to isolate workloads or enforce governance over resource consumption. For example, if multiple teams work on different projects in the same cluster, deploying a separate EDP instance per team ensures that each team has access to the resources it needs without interfering with the others' work.

We use the Capsule tenancy engine to solve this problem.

Tenancy management in EDP with Capsule

Isolation of EDP Tenants

To implement resource isolation for the EPAM Delivery Platform with Capsule, we establish constraints through the Capsule Tenant resource. There are two tasks: restrict the resources of the EDP core components and restrict the resources consumed by each deployed environment. The following example demonstrates how to deploy EDP within 2 CPU cores and 2 GB of memory; any of these limitations can be adjusted. To control the environments that the platform can create, modify the Capsule Tenant specification within the cd-pipeline-operator.

EDP core component limits

In this example, the EDP Tenant has the following characteristics:

  • EDP tenant has 2 CPU and 2 GB Memory limits;
  • Services of the externalName, NodePort, or LoadBalancer type are not allowed;
  • EDP tenant can create a maximum of 15 pods running in parallel;
  • Each Container in the EDP namespace has default resource requests/limits.

Environment limits

The customer applies the following restrictions to the EDP Environments:

  • EDP developer can deploy a maximum of 2 environments;
  • The total number of pods should not exceed 5;
  • EDP developer cannot create PVCs (to meet stateless criteria);
  • Services of the externalName, NodePort, or LoadBalancer type are not allowed;
  • The total consumption of the deployed environment should not exceed 0.5 CPU and 512 MB Memory.

Platform Deployment

Note that to try out the example below, you need a ready-to-go cluster with Capsule installed.

1. Before deploying the EPAM Delivery Platform, establish a Capsule Tenant resource:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: edp-tenant
spec:
  ingressOptions:
    allowWildcardHostnames: false
    allowedHostnames:
      allowedRegex: ^.*example.com$ # DNS wildcard for the namespace
    hostnameCollisionScope: Tenant
  limitRanges:
    items:
      - limits:
          # The default limits apply to each container unless otherwise specified
          - default:
              cpu: 768m
              memory: 768Mi
            # The default requests apply to each container unless otherwise specified
            defaultRequest:
              cpu: 256m
              memory: 512Mi
            type: Container
      - limits:
          # In case Tekton pipelines need to use volume workspaces. If emptyDir is used, set this to 0.
          - max:
              storage: 3Gi
            min:
              storage: 3Gi
            type: PersistentVolumeClaim
  # Since EDP uses one namespace, the namespace quota is set to 1
  namespaceOptions:
    quota: 1
  networkPolicies:
    items:
      - ingress:
          - from:
              - namespaceSelector:
                  matchLabels:
                    capsule.clastix.io/tenant: edp-tenant
              - podSelector: {}
              - ipBlock:
                  cidr: 172.32.0.0/16
        podSelector: {}
        policyTypes:
          - Ingress
  # Default EDP admins group to make admin users tenant owners
  owners:
    - kind: Group
      name: edp-oidc-admins
  resourceQuotas:
    items:
      # The maximum CPU and Memory capacity for the EDP tenant
      - hard:
          limits.cpu: '2'
          limits.memory: 2Gi
      # The maximum number of pods that can be deployed within a namespace
      - hard:
          pods: '15'
    scope: Tenant
  serviceOptions:
    # Allow the creation of ClusterIP service types only
    allowedServices:
      externalName: false
      loadBalancer: false
      nodePort: false

In our case, we grant access via the `edp-oidc-admins` group. Ensure this group is present in the default `CapsuleConfiguration`; otherwise, declare it in the corresponding Capsule resource:

spec:
  userGroups:
    - edp-oidc-admins
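
If you prefer to patch the existing configuration from the command line, here is a minimal sketch (assuming the cluster-scoped CapsuleConfiguration object is named `default` and that the default `capsule.clastix.io` group should remain in the list, since a merge patch replaces the whole userGroups array):

# The patched list replaces the existing userGroups array entirely, so include every group you need
kubectl patch capsuleconfiguration default --type=merge \
  -p '{"spec":{"userGroups":["capsule.clastix.io","edp-oidc-admins"]}}'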

2. Authenticate as a member of the `edp-oidc-admins` group (or any other tenant owner) and create the `edp` namespace:

kubectl create ns edp

3. To check the applied quota, use the `kubectl get resourcequota` command:

kubectl get resourcequota -n edp
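
Capsule also replicates the LimitRange items from the Tenant spec into the namespace; as an optional sanity check, they can be inspected the same way:

kubectl get limitrange -n edp
kubectl describe limitrange -n edp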

4. Create the EDP values.yaml file with the resource restrictions specified:

awsRegion: eu-central-1

global:
  dnsWildCard: example.com
  dockerRegistry:
    # -- Define the Image Registry that will be used in Pipelines. It can be ecr (default), harbor
    type: "ecr"
    # -- Docker Registry endpoint
    url: "1234567890.dkr.ecr.eu-central-1.amazonaws.com"

edp-headlamp:
  resources:
    limits:
      cpu: 32m
      memory: 32Mi
    requests:
      cpu: 32m
      memory: 32Mi

cd-pipeline-operator:
  tenancyEngine: "capsule"
  capsuleTenant:
    # Enable Capsule Tenant creation as a part of the cd-pipeline-operator deployment.
    create: true
    spec:
      ingressOptions:
        allowWildcardHostnames: false
        allowedHostnames:
          # Enable restriction pattern for ingress hostname creation.
          allowedRegex: ^.*example.com$
        hostnameCollisionScope: Tenant
      limitRanges:
        items:
          - limits:
              # Default limits for the container if not specified in the upstream manifest
              - default:
                  cpu: 256m
                  memory: 512Mi
                # Default requests for the container if not specified in the upstream manifest
                defaultRequest:
                  cpu: 128m
                  memory: 128Mi
                type: Container
          # Manage PVC creation
          - limits:
              - max:
                  storage: 0Gi
                min:
                  storage: 0Gi
                type: PersistentVolumeClaim
      # Maximum count of namespaces to be created by cd-pipeline-operator
      namespaceOptions:
        quota: 3
      networkPolicies:
        items:
          - ingress:
              - from:
                  - namespaceSelector:
                      matchLabels:
                        # Please fill in the namespace for the match selector
                        capsule.clastix.io/tenant: <namespace>
                  - podSelector: {}
                  - ipBlock:
                      cidr: 172.16.0.0/16
            podSelector: {}
            policyTypes:
              - Ingress
      resourceQuotas:
        items:
          - hard:
              limits.cpu: 512m
              limits.memory: 512Mi
          - hard:
              # Maximum count of pods to be deployed
              pods: '5'
        scope: Tenant
      serviceOptions:
        allowedServices:
          # Restrict 'externalName', 'LoadBalancer', and 'NodePort' service type creation
          externalName: false
          loadBalancer: false
          nodePort: false
  resources:
    limits:
      memory: 32Mi
      cpu: 32m
    requests:
      cpu: 32m
      memory: 32Mi

codebase-operator:
  resources:
    limits:
      memory: 48Mi
      cpu: 32m
    requests:
      cpu: 32m
      memory: 48Mi

edp-tekton:
  dashboard:
    resources:
      limits:
        cpu: 32m
        memory: 32Mi
      requests:
        cpu: 32m
        memory: 32Mi

  tekton:
    resources:
      limits:
        cpu: "768m"
        memory: "768Mi"
      requests:
        cpu: "256m"
        memory: "512Mi"

  kaniko:
    roleArn: arn:aws:iam::1234567890:role/AWSIRSATESTEpmKaniko

  interceptor:
    resources:
      limits:
        memory: 32Mi
        cpu: 32m
      requests:
        cpu: 32m
        memory: 32Mi

  eventListener:
    resources:
      limits:
        cpu: "50m"
        memory: "64Mi"
      requests:
        cpu: "50m"
        memory: "64Mi"

Deploy the platform by installing the Helm chart with the prepared values file:

helm upgrade --install edp epamedp/edp-install --namespace edp --values values.yaml

As soon as EDP is deployed, check how many resources the platform currently consumes. In our case, it looks the following way:

It’s the typical consumption of an idling platform core.
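
If metrics-server is installed in the cluster, one way to inspect the live consumption against the tenant quota is the following sketch (using the `edp` namespace from the steps above):

# Live CPU/memory usage of the platform pods (requires metrics-server)
kubectl top pods -n edp
# Quota usage versus the hard limits defined by the Capsule Tenant
kubectl describe resourcequota -n edp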

5. Configure the necessary integrations with third-party tools such as Argo CD, GitServer, SonarQube, Nexus, Registry, and DefectDojo.

Important: The main goal of this task is to deploy the platform within shoestring resource limits. We do not recommend running multiple tasks/pipelines simultaneously, since it may cause instability in the environment.

Deploying Environments

So far, we have set limitations only for the EDP core tenant. Let's proceed with restrictions for the deployed environments.

As a prerequisite, declare the service accounts of the `edp` namespace in the `default` CapsuleConfiguration:

spec:
  userGroups:
    - system:serviceaccounts:edp

Under the hood, the cd-pipeline-operator creates environments that are namespaces.

Additionally, the Capsule tenant can be created automatically when deploying the cd-pipeline-operator. In this case, its name will follow the “edp-workload-{{ .Release.Namespace }}” pattern.
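
To confirm that the tenant exists under that name, you can list the cluster-scoped Capsule Tenant resources:

kubectl get tenants.capsule.clastix.io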

Alternatively, you can redefine the default resource usage in the values.yaml file or by manually updating the Tenant custom resource directly in the cluster:

apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: edp
spec:
  ingressOptions:
    allowWildcardHostnames: false
    allowedHostnames:
      allowedRegex: ^.*example.com$
    hostnameCollisionScope: Tenant
  limitRanges:
    items:
      - limits:
          - default:
              cpu: 256m
              memory: 512Mi
            defaultRequest:
              cpu: 128m
              memory: 128Mi
            type: Container
      - limits:
          - max:
              storage: 0Gi
            min:
              storage: 0Gi
            type: PersistentVolumeClaim
  namespaceOptions:
    quota: 3
  networkPolicies:
    items:
      - ingress:
          - from:
              - namespaceSelector:
                  matchLabels:
                    capsule.clastix.io/tenant: edp
              - podSelector: {}
              - ipBlock:
                  cidr: 172.32.0.0/16
        podSelector: {}
        policyTypes:
          - Ingress
  owners:
    - clusterRoles:
        - admin
        - capsule-namespace-deleter
      kind: ServiceAccount
      name: system:serviceaccount:edp:edp-cd-pipeline-operator
  resourceQuotas:
    items:
      - hard:
          limits.cpu: 512m
          limits.memory: 512Mi
      - hard:
          pods: '5'
    scope: Tenant
  serviceOptions:
    allowedServices:
      externalName: false
      loadBalancer: false
      nodePort: false

Isolation at the Argo CD level is implemented via Argo CD projects. More details about Argo CD projects can be found in the official documentation.

Add the new Argo CD project to enable isolation on the Argo CD side:

apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: edp
  namespace: argocd
spec:
  clusterResourceWhitelist:
    - group: ''
      kind: Namespace
  description: CD pipelines for edp
  destinations:
    - namespace: edp-*
      server: https://kubernetes.default.svc
  namespaceResourceBlacklist:
    - group: ''
      kind: ResourceQuota
    - group: ''
      kind: LimitRange
    - group: ''
      kind: NetworkPolicy
  namespaceResourceWhitelist:
    - group: '*'
      kind: '*'
  roles:
    - description: Users for edp tenant
      groups:
        - ArgoCD-edp-users
      name: developer
      policies:
        - p, proj:edp:developer, applications, create, edp/*, allow
        - p, proj:edp:developer, applications, delete, edp/*, allow
        - p, proj:edp:developer, applications, get, edp/*, allow
        - p, proj:edp:developer, applications, override, edp/*, allow
        - p, proj:edp:developer, applications, sync, edp/*, allow
        - p, proj:edp:developer, applications, update, edp/*, allow
        - p, proj:edp:developer, repositories, create, edp/*, allow
        - p, proj:edp:developer, repositories, delete, edp/*, allow
        - p, proj:edp:developer, repositories, update, edp/*, allow
        - p, proj:edp:developer, repositories, get, edp/*, allow
  sourceNamespaces:
    - edp
  sourceRepos:
    - '*'
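
Assuming the manifest above is saved as edp-appproject.yaml (the filename is only an example), apply it to the cluster:

kubectl apply -f edp-appproject.yaml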

How does it work?

That's it for the EDP deployment. As a result, we've deployed a ready-to-go platform with strict limitations imposed. The restrictions are implemented through Capsule, which guarantees that they take effect.

Now, EDP runs in an isolated namespace with restricted resources. Let’s try it out. As soon as you create an application and run the Tekton build pipeline, you will be able to follow the resource usage:

As expected, resource usage increases while building artifacts.

To see Capsule at work, let's create an Environment with several stages.

Environment j11-deploy

As a result, the namespace count for the tenant is updated, and quotas similar to those applied to the EDP core components take effect. Keep in mind that all the quotas are shared across the Environments:

The EDP Tenant

Note that in your case, the Tenant resource will be named according to the “edp-workload-{{ .Release.Namespace }}” pattern.

As expected, the resourceQuota has been applied to the created namespace:
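
One way to verify this from the CLI is to list the namespaces Capsule labels as tenant members and inspect the quota inside one of them (the stage namespace name below is illustrative):

# Namespaces owned by Capsule tenants carry the capsule.clastix.io/tenant label
kubectl get ns -l capsule.clastix.io/tenant
# Example: the quota replicated into a stage namespace
kubectl get resourcequota -n edp-j11-deploy-dev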

Let’s see what will change if we deploy the Java application:

Deploy the java1-edp-demo Java application on the Dev stage of the j11-deploy pipeline.

The resourceQuota changed accordingly:

Let's try to push the namespace usage to its limit. As we remember, the namespace quota is currently set to 3. So, let's add some more stages:

Add more stages to the deployment pipeline.

The error message shows that Capsule doesn't allow us to create the third stage since we've already hit the namespace quota. This is expected: the Tenant's namespace quota restricts the maximum number of namespaces for the platform to 3.
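
You can also confirm that the namespace quota is exhausted by checking the Tenant itself; Capsule keeps the current namespace count in the Tenant status (the tenant name below assumes the edp-workload-{{ .Release.Namespace }} pattern with the edp release namespace):

kubectl get tenant edp-workload-edp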

What did we get?

As a result, we got a working EDP with strict resource limits. With this approach, you can deploy several EDPs in separate Capsule tenants. Using the Cluster Add-Ons approach, we can quickly deploy dozens of platforms separately by following a simple template, with minimal code changes.

The experiment demonstrated that the EPAM Delivery Platform's resource consumption can be flexibly managed. This is achieved with Capsule, which allows us to enforce resource limitations and isolate platforms from each other.

What’s next?

Now that we can manage multiple tenants within a single cluster, we are set to try out the vcluster approach. We will dive 🤿 deeper into this topic to discover new horizons for working environments. We are looking forward to sharing the results with you. Subscribe to stay tuned.
