We know that autoscaling is one of Kubernetes' favorite features. When we talk about Kubernetes autoscaling, the Horizontal Pod Autoscaler (HPA) immediately comes to mind. The HPA is responsible for scaling pods based on CPU or memory usage.
But complex, distributed applications often integrate with components outside the containers: a Kafka topic, RabbitMQ, SQS, a Pub/Sub message count, a Redis Stream, or custom metrics from Prometheus, New Relic, or other third-party monitoring tools. The HPA cannot scale pods based on metrics from these components.
To handle such situations, KEDA comes with extended functionality on top of the HPA. With KEDA you can integrate the third-party monitoring tools or messaging services you already use, and scale your services up and down accordingly 🎉.
How does KEDA work?
KEDA serves two key roles within Kubernetes.
- Operator: KEDA activates and deactivates Kubernetes Deployments to scale to and from zero when there are no events. The operator is installed when you install KEDA.
- Metrics: KEDA acts as a Kubernetes metrics server that exposes rich event data, such as queue size, RPM metrics, or stream lag, to the HPA to drive scale-out.
Architecture
KEDA uses three components to fulfill its tasks:
- Scaler: connects to an external component such as Prometheus, SQS, or Kafka and fetches its metrics.
- Operator/Controller: responsible for activating a Deployment and creating an HPA.
- Metrics Adapter: presents the metrics from external sources to the Horizontal Pod Autoscaler.
How can you use KEDA?
Let’s imagine a worker microservice that collects SQS messages and writes to the database.
When the SQS message count goes up, we need to scale our service out; when the message count returns to the baseline, we should scale back down to normal.
Configuration
Before installing KEDA, we need the IAM role that will be assigned to the KEDA service account.
First, create an IAM policy called KEDASQSPolicy that grants access to the test SQS queue.
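As a rough sketch, KEDASQSPolicy only needs permission to read the queue's attributes, which is how KEDA observes the message count. The queue ARN below is a placeholder you would replace with your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:GetQueueAttributes",
        "sqs:GetQueueUrl"
      ],
      "Resource": "arn:aws:sqs:<region>:<account-id>:<queue-name>"
    }
  ]
}
```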
Next, we need to create an IAM role that uses the policy we've just created. The important part is the Trusted Entities: you should add your cluster's Kubernetes OIDC provider to the role's trusted entities.
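The trust policy looks roughly like this. The account ID, region, and OIDC provider ID are placeholders, and the keda-operator service account name is an assumption based on a default KEDA installation in the keda namespace:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<account-id>:oidc-provider/oidc.eks.<region>.amazonaws.com/id/<oidc-id>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.<region>.amazonaws.com/id/<oidc-id>:sub": "system:serviceaccount:keda:keda-operator"
        }
      }
    }
  ]
}
```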
Note that this is one of three roles you need to create.
- Producer Role: it needs access to send messages to the SQS queue. ❌
- KEDA Operator Role: we've just created it. ✅
- Consumer Role: it needs access to send, receive, and delete messages on the queue. ❌
Deploying KEDA
To deploy KEDA with the role and the security context, we have to modify the default chart values. To obtain the default chart values, you can execute these commands:
$ helm repo add kedacore https://kedacore.github.io/charts
$ helm repo update
$ helm show values kedacore/keda > values.yaml
Now we just add
eks.amazonaws.com/role-arn: <KEDA_OPERATOR_ROLE_ARN>
as a serviceAccount annotation in values.yaml and install the chart with the modified values.
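In the chart's values.yaml, the annotation sits under the service account section, roughly like this (the exact layout can differ between chart versions, so treat it as a sketch):

```yaml
# values.yaml (excerpt) -- structure may vary by chart version
serviceAccount:
  create: true
  name: keda-operator
  annotations:
    eks.amazonaws.com/role-arn: <KEDA_OPERATOR_ROLE_ARN>
```

A standard Helm install then picks up the modified values:
$ helm install keda kedacore/keda --namespace keda --create-namespace -f values.yaml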
After installing KEDA, you should see its components in the keda namespace.
$ kubectl get pods --namespace keda
Let’s jump into real-life usage
For testing, we can use nginx, the good old deployment image. First of all, let's create a new namespace called keda-playground.
$ kubectl create ns keda-playground
$ kubectl config set-context --current --namespace=keda-playground
Let's deploy nginx.
$ kubectl create deployment nginx-deployment --image nginx
Yay! 🎉 We're ready to configure a ScaledObject and a TriggerAuthentication from KEDA.
ScaledObject: defines our new HPA rules. We'll use SQS as the scaler.
TriggerAuthentication: tells the ScaledObject how to authenticate to AWS.
After that, we need to create the manifests for both resources.
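The two manifests could look roughly like this. The resource names, queue URL, and region are placeholders; a pollingInterval of 10 seconds and a queueLength of 2 are the values this walkthrough uses, and the aws-eks pod identity assumes the IRSA setup from earlier:

```yaml
# trigger-auth.yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-aws
  namespace: keda-playground
spec:
  podIdentity:
    provider: aws-eks        # use the IAM role bound to the service account
---
# example-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scaledobject
  namespace: keda-playground
spec:
  scaleTargetRef:
    name: nginx-deployment   # the Deployment we just created
  pollingInterval: 10        # check the queue every 10 seconds
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: keda-trigger-auth-aws
      metadata:
        queueURL: https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>
        queueLength: "2"     # target messages per replica
        awsRegion: <region>
```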
We’re ready to deploy these custom resources! 👊
$ kubectl apply -f trigger-auth.yaml
$ kubectl apply -f example-scaledobject.yaml
Now we are ready for testing!
On the AWS console, you can send and receive messages for the SQS queue.
You should send 3 messages to scale your service up, because we set the queue length value to 2 in the ScaledObject manifest file.
If you send 3 messages, then after at most 10 seconds (the pollingInterval value we set) KEDA will scale our nginx deployment to 2 replicas.
$ kubectl get pods -w
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-75675f5897-7ci7o   1/1     Running   0          15s
nginx-deployment-75675f5897-qqcnt   1/1     Running   0          3m
🎉 Our scaler works! But I want to mention the metric-type option in ScaledObjects. The metric type is an important setting: if we want to scale on the metric-to-pod-count ratio, we can use AverageValue, which is the default metric type. But if we need to scale on the exact value of the metric, we should use Value. Here are two examples from the KEDA documentation.
- With the AverageValue metric type, we can control how many messages, on average, each replica will handle. If our metric is the queue size, the threshold is 5 messages, and the current message count in the queue is 20, HPA will scale the deployment to 20 / 5 = 4 replicas, regardless of the current replica count.
- The Value metric type, on the other hand, can be used when we don't want to take the average of the given metric across all replicas. For example, with the Value type, we can control the average time messages spend in the queue. If our metric is the average time in the queue, the threshold is 5 milliseconds, the current average time is 20 milliseconds, and the deployment currently has 3 replicas, HPA will scale the deployment to 3 * 20 / 5 = 12 replicas.
That’s all, thank you!
Special thanks to Mutlu and Jose, who supported me in writing this article.
If you have any questions or feedback, please feel free to share them with me on Twitter or leave comments on my GitHub commits!