Scalable Deployment of Microservices With Cellery

Madusha Gunasekara · Published in wso2-cellery · Aug 14, 2019

When you deploy an application or service on a host machine, one of the major concerns is how to scale the infrastructure based on application demand. Kubernetes addresses this concern with autoscaling: pods are added dynamically if the user has enabled the Horizontal Pod Autoscaler, and nodes can be added with the Cluster Autoscaler.
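For comparison, in plain Kubernetes you would typically attach an HPA to a deployment with a one-liner like the following (my-app is just a placeholder deployment name):

$ kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10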

In Cellery, each component within a cell can be scaled up or down dynamically. Cellery supports autoscaling through the Horizontal Pod Autoscaler (HPA) and zero-scaling, and each component in a cell may use either of them. In this post, let's take a look at how to deploy an application with HPA in Cellery.

Enable HPA on Cellery runtime

Cellery uses the metrics server to obtain resource utilization metrics for pods and nodes, and HPA works based on the metrics it receives from the metrics server. By default, the metrics server is not deployed in Cellery's local and existing Kubernetes cluster environments, so HPA is disabled in those environments by default. You can check the status of HPA by using the following command.

$ cellery setup status

You should see output like the following, showing the status of the Cellery system components.

SYSTEM COMPONENT               STATUS
----------------------------   ----------
ApiManager                     Disabled
Observability                  Disabled
Scale to zero                  Disabled
Horizontal pod auto scalar     Disabled

If you are in a GCP environment, you might see Scale to zero and the Horizontal pod autoscaler already in the enabled state, since GCP deploys the metrics server by default. Otherwise, you can enable HPA by using the interactive Cellery CLI.

$ cellery setup           
✔ Modify
✔ Autoscaler
✔ Horizontal Pod Autoscaler
✔ Enable
✔ BACK
✔ DONE
Following modifications to the runtime will be applied
Enabling : Horizontal Pod Autoscaler
✔ Yes
✔ Updating cellery runtime
✔ Checking cluster status...
✔ Cluster status...OK
✔ Checking runtime status (Istio)...
✔ Runtime status (Istio)...OK
✔ Checking runtime status (Metrics server)...
✔ Runtime status (Metrics server)...OK
✔ Checking runtime status (Cellery)...
✔ Runtime status (Cellery)...OK

Now, check the status of the Cellery system components again using the $ cellery setup status command. Once HPA is enabled in the Cellery runtime, we are all set to deploy an application with an autoscaling policy applied.

Cellery Syntax for implementing HPA

Rather than specifying autoscaling configurations in a YAML file, Cellery gives developers a code-first experience even for deployments. The syntax for specifying the resources and the scaling policy is as follows.

Resources

CPU and memory are each a resource type, and each has a base unit: CPU is specified in units of cores, and memory is specified in units of bytes. In Cellery, we can specify the initial resource requirement of a component using the requests field; the Kubernetes scheduler uses these values when placing a new component. The maximum amount of resources a component is allowed to consume is configured under the limits field.
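As a sketch of how this looks inside a component definition (the field names follow the Cellery samples, while the specific values here are only illustrative):

resources: {
    // initial resources requested when the component is scheduled
    requests: {
        memory: "64Mi",
        cpu: "250m"
    },
    // hard ceiling the component is allowed to consume
    limits: {
        memory: "256Mi",
        cpu: "500m"
    }
}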

Scaling

Cellery allows two scaling policies: AutoScalingPolicy and ZeroScalingPolicy. Here we are using AutoScalingPolicy with a minimum of 1 replica and a maximum of 10 replicas. We can set thresholds for resources under the metrics field, and Cellery allows those values to be interpreted either as a percentage or as an absolute value. In this scenario, a new replica is started when memory utilization reaches 128Mi or CPU utilization reaches 250m.

Here, the component's scaling policy is defined with the override field set to false. Therefore, the autoscaling policy cannot be altered at runtime.
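Putting the scaling section together with the values described above, the policy would look roughly like the following. The type and field names follow the Cellery 0.3.0 samples, but treat this as a sketch and check the sample source for the exact syntax; note also that the pet-be sample deployed later in this post uses different numbers (a 40% CPU target and a maximum of 3 replicas), as the kubectl get hpa output shows.

scalingPolicy: <cellery:AutoScalingPolicy>{
    // scale between 1 and 10 replicas
    minReplicas: 1,
    maxReplicas: 10,
    metrics: {
        // absolute thresholds; cellery:Percentage can be used instead of cellery:Value
        cpu: <cellery:Value>{threshold: "250m"},
        memory: <cellery:Value>{threshold: "128Mi"}
    },
    // disallow overriding this policy at runtime
    overridable: false
}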

Export Policy

Once the above component is wrapped into a cell and deployed in the Cellery runtime, the build-time autoscaling policy is applied. Often the build-time autoscaling policy is not sufficient for a scalable deployment: based on the runtime environment and the available resources, DevOps may need to re-evaluate the autoscaling policies. Exporting the policy is therefore helpful for understanding the autoscaling policy currently applied to the component. This can be done with the following command.

$ cellery export-policy autoscale <instance-name> -f /location/to/store/the/file/<my-instance-policy>.yaml

If you skip the optional -f flag that points to the file location, the policy file will be created in the directory where you execute the command.

Apply Policy

Once the policy is exported, it can be evaluated and modified based on the requirements, and the modified policy can then be applied to the running cell instance. Note that in our scenario, the autoscaling policy cannot be overridden at runtime because we did not allow overriding it when defining the policy.

$ cellery apply-policy autoscale <my-instance-modified>.yaml <instance-name>

Cellery also allows you to apply the autoscaling policy selectively to chosen components. Suppose we have another two components named component1 and component2; we can apply the autoscaling policy only to component1 by using the following command.

$ cellery apply-policy autoscale <my-instance-modified>.yaml <instance-name> -c <component-name>

Apply autoscaling to a Cellery component

Now it is time to get our hands dirty with Cellery autoscaling. You can find various samples that can be deployed in the Cellery runtime in the Cellery samples repository. In this article, I will be using the pet-store sample to deploy the pet-store backend with autoscaling.

1. Check out the sample

► First, clone the wso2-cellery/samples repository (see the clone command below)
► Navigate to the advanced pet-store sample
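If you have not cloned the repository yet, the clone command would look like this (assuming the repository lives at github.com/wso2-cellery/samples):

$ git clone https://github.com/wso2-cellery/samples.git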

$ cd <SAMPLES_ROOT>/cells/pet-store/advanced/pet-be-auto-scale/

2. Build the pet-store backend cell

The petStoreBackendCell consists of four different components: catalog, customer, orders, and controller. As you can see in the code, here we are going to apply autoscaling to the controller component.

$ cellery build pet-be-auto-scale.bal wso2cellery/pet-be-cell:latest

3. Run the autoscaling-enabled pet-store backend cell

$ cellery run wso2cellery/pet-be-cell:latest -n pet-be

4. Let's run the $ kubectl get hpa command to see the current resource utilization. You should see something like the following.

NAME                                     REFERENCE                                   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
pet-be--controller-autoscalepolicy-hpa   Deployment/pet-be--controller-deployment   2%/40%    1         3         1          2m12s

Currently, the controller component runs on a single replica, since utilization has not yet reached the threshold.
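Since the metrics server is enabled, you can also check the pods' live resource usage with a plain kubectl command (not Cellery-specific):

$ kubectl top pods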

5. Execute the export-policy command and view the current autoscaling configuration.

$ cellery export-policy autoscale cell pet-be

6. Now run a load generator cell which invokes the pet-be's catalog component with high concurrency. There are optional environment variables that can be passed to the load-gen cell to configure the duration (default 5 minutes) and concurrency (default 40) of the load test, and the pet-store instance name (default pet-be).

Here we are using a pre-built load-gen cell pushed to Cellery Hub, but you can optionally build load-gen.bal just as we built the pet-be cell in the steps above.

$ cellery run wso2cellery/load-gen-cell:latest -n load-gen

OR

$ cellery run wso2cellery/load-gen-cell:latest -e DURATION=10m -e CONCURRENCY=20 -e PET_STORE_INST=pet-be

7. Let's check the resource utilization after running the load-gen cell by executing the $ kubectl get hpa command again.

NAME                                     REFERENCE                                   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
pet-be--controller-autoscalepolicy-hpa   Deployment/pet-be--controller-deployment   50%/40%   1         3         3          70m

Now we can see that the target utilization has been exceeded and there are 3 replicas of the controller component.

8. You can also check the running pods by executing $ kubectl get pods, and you should see that 3 replicas of the controller component have started.

NAME                                            READY   STATUS    RESTARTS   AGE
load-gen--load-gen-deployment-f7898b7bf-8d2cv   2/2     Running   0          112s
load-gen--sts-deployment-5555b4f58b-tqxqj       3/3     Running   0          112s
pet-be--catalog-deployment-69d85cff88-nj7hv     2/2     Running   0          86m
pet-be--controller-deployment-dc6d6df5b-2f2cd   1/2     Running   0          26s
pet-be--controller-deployment-dc6d6df5b-7s2st   1/2     Running   0          25s
pet-be--controller-deployment-dc6d6df5b-hbhrz   2/2     Running   0          86m
pet-be--customers-deployment-6995787f56-kdcqw   2/2     Running   0          86m
pet-be--gateway-deployment-554896b499-mm6c5     2/2     Running   0          86m
pet-be--orders-deployment-64cc56fb78-vhf55      2/2     Running   0          86m
pet-be--sts-deployment-57c89b77b-kr8p4          3/3     Running   0          86m

Yay!! We have just deployed a scalable component using Cellery.

You can terminate the load-gen cell by running $ cellery terminate load-gen. If you run $ kubectl get hpa again after terminating the load-gen cell, you should see that the system has returned to its initial state, and $ kubectl get pods will also confirm that the controller pods have been scaled down.
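If you want to clean up completely, you can terminate the pet-be instance in the same way:

$ cellery terminate pet-be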

Now you can see how easy it is to deploy a scalable application with Cellery, and that Cellery lets you apply HPA to only the components you choose.

Please note that this entire article is based on Cellery version 0.3.0.

Now it is time to try more of Cellery; visit the links below to get more ideas about it.
