We know that autoscaling is one of Kubernetes' favorite features. When we talk about Kubernetes autoscaling, the Horizontal Pod Autoscaler (HPA) immediately comes to mind. The HPA is responsible for scaling pods based on CPU or memory usage.
But complex, distributed applications often integrate with components outside the containers: a Kafka topic, RabbitMQ, SQS, a Pub/Sub message count, a Redis Stream, or custom metrics from Prometheus, New Relic, or other third-party monitoring tools. The HPA cannot scale pods based on metrics from these components.
To handle such situations, KEDA comes with extended functionality on top of the HPA. With KEDA you can integrate the third-party monitoring tools or messaging services you already use, and scale your services up and down accordingly 🎉.
How does KEDA work?
KEDA serves two key roles within Kubernetes.
- Operator: KEDA activates and deactivates Kubernetes Deployments to scale to and from zero when there are no events. The operator is installed when you install KEDA.
- Metrics: KEDA acts as a Kubernetes metrics server that exposes rich event data, such as queue size, RPM metrics, or stream lag, to the HPA to drive scale-out.
Architecture
KEDA uses three components to fulfill its tasks:
- Scaler: connects to an external component such as Prometheus, SQS, or Kafka and fetches its metrics.
- Operator/Controller: responsible for activating a Deployment and creating an HPA.
- Metrics Adapter: presents the metrics from external sources to the Horizontal Pod Autoscaler.
How can you use KEDA?
Let’s imagine a worker microservice that collects SQS messages and writes to the database.
When the SQS message count goes up, we need to scale our service out; when the message count returns to the baseline, we should scale back down to normal.
Configuration
Before installing KEDA, we need the IAM role that will be assigned to the KEDA service account.
First, create an IAM policy called KEDASQSPolicy that grants access to the test SQS queue.
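As a rough sketch, KEDASQSPolicy only needs permission to read the queue's attributes, which is how KEDA observes the message count. The queue ARN below is a placeholder you would replace with your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:GetQueueAttributes",
        "sqs:GetQueueUrl"
      ],
      "Resource": "arn:aws:sqs:<region>:<account-id>:<queue-name>"
    }
  ]
}
```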
Next, we need to create an IAM role that uses the policy we've just created. The important part is the Trusted Entities: you should add your cluster's Kubernetes OIDC provider to the role's trusted entities.
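The trust policy looks roughly like this. The account ID, region, and OIDC provider ID are placeholders, and the keda-operator service account name is an assumption based on a default KEDA installation in the keda namespace:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<account-id>:oidc-provider/oidc.eks.<region>.amazonaws.com/id/<oidc-id>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.<region>.amazonaws.com/id/<oidc-id>:sub": "system:serviceaccount:keda:keda-operator"
        }
      }
    }
  ]
}
```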
Note that this is one of three roles you need to create.
- Producer Role: it needs access to send messages to the SQS queue. ❌
- KEDA Operator Role: we've just created it. ✅
- Consumer Role: it needs access to send, receive, and delete messages on the queue. ❌
Deploying KEDA
To deploy KEDA with the role and the security context, we have to modify the default chart values. To obtain the default chart values, you can execute these commands:
$ helm repo add kedacore https://kedacore.github.io/charts
$ helm repo update
$ helm show values kedacore/keda > values.yaml
Now we just add
eks.amazonaws.com/role-arn: <KEDA_OPERATOR_ROLE_ARN>
as a serviceAccount annotation in values.yaml and install the chart with the modified values.
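In the chart's values.yaml, the annotation sits under the service account section, roughly like this (the exact layout can differ between chart versions, so treat it as a sketch):

```yaml
# values.yaml (excerpt) -- structure may vary by chart version
serviceAccount:
  create: true
  name: keda-operator
  annotations:
    eks.amazonaws.com/role-arn: <KEDA_OPERATOR_ROLE_ARN>
```

A standard Helm install then picks up the modified values:
$ helm install keda kedacore/keda --namespace keda --create-namespace -f values.yaml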
After installing KEDA, you should see its components in the keda namespace.
$ kubectl get pods --namespace keda
Let’s jump into real-life usage
For testing, we can use nginx, the good old deployment image. First of all, let's create a new namespace called keda-playground.
$ kubectl create ns keda-playground
$ kubectl config set-context --current --namespace=keda-playground
Let's deploy nginx.
$ kubectl create deployment nginx-deployment --image nginx
Yay! 🎉 We're ready to configure a ScaledObject and a TriggerAuthentication from KEDA.
ScaledObject: defines our new HPA rules. We'll use SQS as the scaler.
TriggerAuthentication: tells the ScaledObject how to authenticate to AWS.
After that, we need to create the manifests for both resources.
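The two manifests could look roughly like this. The resource names, queue URL, and region are placeholders; a pollingInterval of 10 seconds and a queueLength of 2 are the values this walkthrough uses, and the aws-eks pod identity assumes the IRSA setup from earlier:

```yaml
# trigger-auth.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-aws
  namespace: keda-playground
spec:
  podIdentity:
    provider: aws-eks        # use the IAM role bound to the service account
---
# example-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scaledobject
  namespace: keda-playground
spec:
  scaleTargetRef:
    name: nginx-deployment   # the Deployment we just created
  pollingInterval: 10        # check the queue every 10 seconds
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: keda-trigger-auth-aws
      metadata:
        queueURL: https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>
        queueLength: "2"     # target messages per replica
        awsRegion: <region>
```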
We’re ready to deploy these custom resources! 👊
$ kubectl apply -f trigger-auth.yaml
$ kubectl apply -f example-scaledobject.yaml
Now we are ready for testing!
On the AWS console, you can send and receive messages for the SQS queue.
You should send 3 messages to scale your service up, because we set the queue length value to 2 in the ScaledObject manifest file.
If you send 3 messages, then after at most 10 seconds (the pollingInterval value we set) KEDA will scale our nginx deployment to 2 replicas.
$ kubectl get pods -w
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-75675f5897-7ci7o   1/1     Running   0          15s
nginx-deployment-75675f5897-qqcnt   1/1     Running   0          3m
🎉 Our scaler works! But I want to mention the metric-type option in ScaledObjects. The metric type is an important setting: if we want to scale on the metric-to-pod-count ratio, we can use AverageValue, which is the default metric type. But if we need to scale on the exact value of the metric, we should use Value. Here are two examples from the KEDA documentation.
- With the AverageValue metric type, we can control how many messages, on average, each replica will handle. If our metric is the queue size, the threshold is 5 messages, and the current message count in the queue is 20, HPA will scale the deployment to 20 / 5 = 4 replicas, regardless of the current replica count.
- The Value metric type, on the other hand, can be used when we don't want to take the average of the given metric across all replicas. For example, with the Value type, we can control the average time messages spend in the queue. If our metric is the average time in the queue, the threshold is 5 milliseconds, the current average time is 20 milliseconds, and the deployment currently has 3 replicas, HPA will scale the deployment to 3 * 20 / 5 = 12 replicas.
That’s all, thank you!
Special thanks to Mutlu and Jose, who supported me in writing this article.
If you have any questions or feedback, please feel free to share them with me on Twitter or leave comments on my GitHub commits!