Real-world stateless applications on Kubernetes are carefully designed to separate code from other components such as databases or queues, which makes it easier to scale the code component (the so-called stateless application) up or down. When an application needs to be scaled depends on the business requirements. One scenario is an application that processes messages from a RabbitMQ server outside the Kubernetes cluster: the queue may hold thousands or millions of messages that must be processed as quickly as possible, so one instance of the application will not be enough to handle the load. We need to tell Kubernetes that our application must be scaled out whenever more messages are waiting in the queue. This can be done via Horizontal Pod Autoscaling (HPA), one of the most brilliant features that Kubernetes comes with.
In this post, we are going to use the Kubernetes HPA feature with KEDA and RabbitMQ to design a stateless application that scales out/in based on the number of messages in the queue.
This technical tutorial assumes that you already have:
- Kubernetes >= 1.19
- Helm 3
- A RabbitMQ server: either your own, or a new one created with one of the free online RabbitMQ services.
The CPU and memory metrics of a pod (resource metrics) are collected by the Kubernetes Metrics Server and then used by HPA to perform scaling. However, when metrics must be collected from an external source (for example, a RabbitMQ server outside the cluster), we need a solution that fetches those metrics from the source and then creates the resources required to scale pods based on them. KEDA does this for us!
KEDA is a Kubernetes-based Event-Driven AutoScaler that has no dependencies and can be installed on the Kubernetes cluster to support HPA based on specific external metrics/events.
This blog assumes that KEDA is deployed properly on your Kubernetes cluster. Please refer to the KEDA deployment instructions, and make sure you target KEDA >= 2.1.
KEDA RabbitMQ Auto-Scaler
RabbitMQ is deployed somewhere outside our Kubernetes cluster and has one queue named “HPAQueue” that holds N messages. These messages must be processed by the application service: the more messages there are, the more capacity the service needs. To achieve this, we use KEDA to scale our application based on the number of messages still in the queue. With KEDA we can define scaler objects and configure them to watch “HPAQueue” for messages. For example, if there are 3 messages in “HPAQueue”, the KEDA auto-scaler will scale the application out from zero to one pod, while if there are 15 messages it will scale out from zero to 5 pods to handle the load.
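The numbers in this example follow the usual HPA rule for an average-value target: the desired replica count is the queue depth divided by the target number of messages per pod, rounded up. Here is a minimal sketch of that arithmetic, assuming a target of 3 messages per pod and a cap of 20 replicas; the function name desired_replicas is made up for illustration and is not part of KEDA.

```shell
#!/bin/sh
# Assumed values, chosen to match the 3 -> 1 pod and 15 -> 5 pods example.
queue_length=3    # target messages per pod
max_replicas=20   # upper bound on the replica count

# Hypothetical helper: ceil(messages / queue_length), capped at max_replicas.
desired_replicas() {
  messages=$1
  wanted=$(( (messages + queue_length - 1) / queue_length ))
  if [ "$wanted" -gt "$max_replicas" ]; then
    wanted=$max_replicas
  fi
  echo "$wanted"
}

desired_replicas 3    # prints 1
desired_replicas 15   # prints 5
```

With an empty queue the same formula gives 0, which is the scale-to-zero case.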
Deploying a simple busybox pod!
HPA scales a Deployment (or a StatefulSet or custom resource), so before creating any scaler objects we first need a deployment, which can be any application that should scale based on specific metrics. Here, our deployment contains one pod with one container, “busybox”, that prints the date every 5 minutes forever. In reality, you would probably process the messages/events waiting in your messaging queue, but for now let us keep it simple: printing a date instead of processing a message is fine!
We named this deployment “my-scaling-deployment”, and we labeled our pod with “my-app”. This will help us later to identify which deployment we are scaling.
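A deployment matching this description could look roughly like the sketch below, which writes the manifest to a local file. The deployment name, container name and namespace come from this post; the label key app, the busybox image tag and the file name are assumptions.

```shell
#!/bin/sh
# Write a minimal Deployment manifest to a local file (file name is arbitrary).
cat > my-scaling-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-scaling-deployment
  namespace: hpa-rabbitmq-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app            # label key "app" is an assumption
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: busybox
          image: busybox:stable   # image tag is an assumption
          # Print the date every 5 minutes (300 seconds), forever.
          command: ["/bin/sh", "-c", "while true; do date; sleep 300; done"]
EOF
```

The file can then be applied with kubectl apply -f my-scaling-deployment.yaml once the namespace exists.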
Deploy Scaler Object
To deploy the RabbitMQ scaler properly, we need to deploy three different types of resources, as mentioned on the KEDA website:
- Secret
- TriggerAuthentication
- ScaledObject
Those resources are going to be deployed in a namespace called “hpa-rabbitmq-example”, and the resulting HPA will be responsible for scaling our pods up or down according to the number of messages in the queue.
As mentioned in the KEDA RabbitMQ scaler docs, our RabbitMQ URL needs to be encoded in base64 format. To generate the encoded string, pipe the URL through “echo -n” into “base64 -w 0”. “echo -n” ensures that no newline character is included in the encoded string, and “base64 -w 0” disables line-wrapping of the output.
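As a concrete illustration of the encoding step, here is a small sketch. The URL is a made-up placeholder, so substitute your own user, password, host and virtual host; note that the -w option belongs to GNU coreutils base64 and may not exist on other platforms.

```shell
#!/bin/sh
# Hypothetical connection string; replace with your real RabbitMQ URL.
rabbitmq_url='amqp://user:password@rabbitmq.example.com:5672/vhost'

# echo -n: no trailing newline in the input.
# base64 -w 0: no line wrapping in the output (GNU coreutils).
encoded=$(echo -n "$rabbitmq_url" | base64 -w 0)
echo "$encoded"
```

Decoding the result with base64 -d should give back exactly the original URL, which is a quick way to check that no stray newline slipped in.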
Now, since we have our base64 encoded string, let’s create our RabbitMQ Secret object called “keda-rabbitmq-secret”:
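A Secret along these lines would hold the encoded URL. This is only a sketch: the Secret name and namespace come from this post, while the data key host, the placeholder URL and the file name are assumptions.

```shell
#!/bin/sh
# Hypothetical connection string; replace with your real RabbitMQ URL.
rabbitmq_url='amqp://user:password@rabbitmq.example.com:5672/vhost'
encoded_host=$(echo -n "$rabbitmq_url" | base64 -w 0)

# Unquoted EOF so that ${encoded_host} is expanded into the manifest.
cat > keda-rabbitmq-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: keda-rabbitmq-secret
  namespace: hpa-rabbitmq-example
data:
  host: ${encoded_host}   # key name "host" is an assumption
EOF
```

Apply it with kubectl apply -f keda-rabbitmq-secret.yaml, and avoid committing the generated file, since base64 is an encoding and not encryption.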
Then, we create our TriggerAuthentication called “keda-trigger-auth-rabbitmq-conn”:
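The TriggerAuthentication tells KEDA where to find the connection string. A minimal sketch follows, assuming the Secret stores the URL under a key named host and that the scaler reads it through the host parameter; the file name is arbitrary.

```shell
#!/bin/sh
cat > keda-trigger-auth-rabbitmq-conn.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-rabbitmq-conn
  namespace: hpa-rabbitmq-example
spec:
  secretTargetRef:
    - parameter: host             # scaler parameter filled from the Secret
      name: keda-rabbitmq-secret  # Secret to read from
      key: host                   # key inside the Secret (assumed)
EOF
```

The TriggerAuthentication must live in the same namespace as the ScaledObject that references it.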
And finally, our scaler object “rabbitmq-scaledobject” will look like this:
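As a hedged sketch, a ScaledObject for this setup might look like the following. The object, deployment, queue and TriggerAuthentication names come from this post. The polling interval, cooldown period and replica bounds are illustrative assumptions. The threshold is written with the mode and value fields used by recent KEDA releases, whereas older 2.x releases expressed it as queueLength, so check the scaler docs for your version.

```shell
#!/bin/sh
cat > rabbitmq-scaledobject.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-scaledobject
  namespace: hpa-rabbitmq-example
spec:
  scaleTargetRef:
    name: my-scaling-deployment   # the Deployment to scale
  pollingInterval: 10             # seconds between queue checks (assumed)
  cooldownPeriod: 30              # seconds before scaling back to zero (assumed)
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: HPAQueue
        mode: QueueLength         # older KEDA 2.x: queueLength: "3" instead
        value: "3"                # target messages per pod
      authenticationRef:
        name: keda-trigger-auth-rabbitmq-conn
EOF
```

After kubectl apply -f rabbitmq-scaledobject.yaml, KEDA creates and manages the HPA for the deployment on your behalf.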
Once these resources have been deployed to the Kubernetes cluster in the target namespace, verify that the scaler has been configured properly by listing the HPA objects in that namespace:
kubectl -n hpa-rabbitmq-example get hpa
If you cannot see your scaler here, check the KEDA operator pods for errors:
kubectl -n keda get pods
NAME                          READY   STATUS    RESTARTS   AGE
keda-operator-7998fdb6cd      1/1     Running   1          9d
keda-operator-metrics-722rw   1/1     Running   2          9d
Then you can check logs:
kubectl -n keda logs keda-operator-7998fdb6cd
kubectl -n keda logs keda-operator-metrics-722rw
These logs will help you debug your problems. Most of the time, the errors are related to the RabbitMQ URL, the RabbitMQ virtual host, or the queue name, so make sure that you have configured those items properly in your YAML files.
Once everything is okay, you will be able to see that your scaler is active and that your deployment is auto-scaled based on the number of messages waiting in your queue. The screenshot below, taken from Lens IDE, shows our HPA ready to scale the deployment:
In the previous deployment, we did not specify how much memory and CPU our application needs. This can be done by adding resource requests and limits, which is an essential step when deploying applications on Kubernetes, especially when you care about cost. Imagine you have different departments, each with its own namespace on Kubernetes, and each namespace is limited to specific CPU and memory resources. Say that our department has a namespace with the following quota:
- Namespace: hpa-rabbitmq-ns
  resources:
    requests: memory 32 GB
    limits: memory 64 GB
And let us say that we configured the container (app) in our Deployment's pod with the following resources:
- Container: app
  resources:
    requests: memory 2 GB
    limits: memory 4 GB
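Expressed as Kubernetes objects, these settings could be sketched as below. This assumes that the smaller figure in each pair is the request and the larger one the limit, and it writes the GB figures with the Gi suffix. The quota name and file names are made up for illustration.

```shell
#!/bin/sh
# Namespace-wide quota on the sum of memory requests and limits.
cat > namespace-quota.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: memory-quota          # hypothetical name
  namespace: hpa-rabbitmq-ns
spec:
  hard:
    requests.memory: 32Gi
    limits.memory: 64Gi
EOF

# Fragment only: this belongs under spec.template.spec.containers
# of the Deployment and is not a complete manifest on its own.
cat > app-resources-fragment.yaml <<'EOF'
- name: app
  resources:
    requests:
      memory: 2Gi
    limits:
      memory: 4Gi
EOF
```

With these figures the quota admits at most 16 such pods, since both 32 / 2 and 64 / 4 equal 16.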
Now, when HPA is triggered, our deployment is going to scale out, but of course not forever, because our namespace is limited to a certain threshold. The Kubernetes scheduler places the new pods on a node as long as there is enough room for them. As the Kubernetes documentation puts it:
The scheduler ensures that the sum of the resource requests of the scheduled Containers is less than the capacity of the node.
So two thresholds apply. Once a node is full, the scheduler cannot place any new pod on it, and the pod stays Pending until capacity is freed or added. Once the namespace quota is used up, new pods are rejected at creation time and you will get the famous error “forbidden: exceeded quota”.
Depending on your cloud service provider, a Cluster AutoScaler can be the solution when nodes run out of room. It dynamically scales the number of nodes in the cluster based on the current workload.
Say that we have the following scenario:
- Cluster node (worker 1): [ 16 GB memory, 8 CPU ]
- Namespace: requests [ 64 GB memory, 32 CPU ]
- App container: requests [ 2 GB memory, 2 CPU ]
- KEDA queue length: 3
- 100 messages in our RabbitMQ server
- KEDA max replica count: 20
Let us work through these numbers. The namespace quota matches the capacity of 4 nodes (16 × 4 = 64 GB and 8 × 4 = 32 CPU), so 4 is the maximum number of nodes our workload can fill if the Cluster AutoScaler is triggered. Each node can run only 3 of our containers. Why not 4? Because the pods are not the only consumers of CPU and memory on a node; other running services also need some resources. With 100 messages and a queue length of 3, KEDA asks for the maximum of 20 replicas, but only as many as fit on those nodes can actually run.
To sum up, we are going to have 12 pods in total scheduled in our cluster on 4 different nodes, as in the following:
- Node 1 runs 3 pods
- Node 2 runs 3 pods
- Node 3 runs 3 pods
- Node 4 runs 3 pods
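The arithmetic behind this layout can be replayed with a few lines of shell. This is a rough sketch that follows the reasoning in this post rather than the scheduler's real algorithm; in particular, reserving the equivalent of one pod slot per node for system services is a simplifying assumption, and all variable names are illustrative.

```shell
#!/bin/sh
messages=100; queue_length=3; max_replicas=20
node_mem=16;  node_cpu=8      # per node: GB, CPUs
quota_mem=64; quota_cpu=32    # namespace requests quota
pod_mem=2;    pod_cpu=2       # per-pod requests
reserved_slots=1              # assumption: one slot per node left for system services

min() { if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi; }

# Replicas KEDA/HPA would ask for: ceil(100 / 3) = 34, capped at 20.
wanted=$(( (messages + queue_length - 1) / queue_length ))
wanted=$(min "$wanted" "$max_replicas")

# Nodes covered by the quota: min(64 / 16, 32 / 8) = 4.
nodes=$(min $(( quota_mem / node_mem )) $(( quota_cpu / node_cpu )))

# Pods per node: min(16 / 2, 8 / 2) = 4, minus the reserved slot = 3.
per_node=$(( $(min $(( node_mem / pod_mem )) $(( node_cpu / pod_cpu ))) - reserved_slots ))

capacity=$(( nodes * per_node ))
running=$(min "$wanted" "$capacity")

echo "wanted=$wanted nodes=$nodes per_node=$per_node running=$running"
# prints: wanted=20 nodes=4 per_node=3 running=12
```

CPU is the binding resource here: by memory alone a node could hold 8 of these pods, but 8 CPUs at 2 CPUs per pod leaves room for only 4 before system overhead.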
Once the messages are processed successfully and no more messages are left in the queue, HPA will scale the number of pods back in to 0. The extra nodes are then underutilized, with no pods scheduled on them, so the Cluster AutoScaler removes them from the cluster and scales the number of nodes back down to 1.
In this post, we have learned how to implement a simple Kubernetes deployment and the KEDA objects that together achieve our HPA goal. We have seen how HPA and KEDA scale out the number of pods based on the number of messages in our RabbitMQ queue, and how to perform quick troubleshooting using the KEDA operator logs. We also learned how to configure and manage pod resources properly, and how the Kubernetes Cluster AutoScaler scales out the number of nodes based on current utilization.