Unlocking Efficiency and Scale with KEDA Autoscaling

Clearwater Analytics Engineering
Feb 12, 2024

In the world of modern software architecture, the demand for scalability and optimal resource utilization continues to grow. Applications should scale up and down seamlessly, ensuring optimal performance and cost efficiency. This is where KEDA steps in, providing an elegant solution to the problem.

Imagine this scenario: you have an application running in Kubernetes that processes thousands of messages from AWS SQS, where each message takes some time to process before it is removed from the queue or acknowledged. A sudden spike in incoming messages from our ETL tool slows down processing for the consumer. Traditional scaling methods, such as manually scaling the pods with the kubectl scale command or changing the replica count in the deployment, might not suffice and can lead to over-provisioning and increased costs. KEDA, however, offers a dynamic, event-driven approach to scaling, allowing your applications to respond precisely to workload demands (in this case, scaling on the number of messages in SQS).

Kubernetes comes with the HPA (Horizontal Pod Autoscaler) by default, but out of the box it only scales on CPU or memory utilization. When we want to scale on an external metric, such as a CloudWatch alarm or a database table, we need KEDA.
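For reference, here is a minimal sketch of a vanilla HPA (the deployment name is a placeholder), which can only target resource metrics such as CPU:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cpu-based-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-deployment-name
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # scale when average CPU utilization crosses 70%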

Introduction

Kubernetes, being a leading container orchestration platform, offers robust scaling capabilities, but traditional scaling approaches often fall short when it comes to event-driven architectures. Enter KEDA, short for Kubernetes Event-Driven Autoscaling. KEDA is an open-source project designed to address the need for efficient and event-driven scaling within Kubernetes environments. It enables automatic scaling of your applications and workloads based on various event sources, allowing your applications to dynamically adjust resources to handle changes in event load.

1. Prerequisites

Before diving into the setup, ensure you have the following prerequisites in place:

· An AWS account with access to Amazon SQS.
· A Kubernetes cluster up and running, either remote or local.
· KEDA installed and configured within your Kubernetes cluster.

If you already have an AWS account and an SQS queue set up, install KEDA in your cluster by following https://keda.sh/docs/2.10/deploy/. We will focus on how to integrate KEDA with AWS SQS.
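For example, a common way to install KEDA is with Helm (a minimal sketch following the KEDA deploy docs; adjust the namespace to your setup):

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace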

2. Integrating KEDA with AWS SQS

· Define a KEDA ScaledObject, specifying the AWS SQS trigger and the desired scaling behavior.
· Configure the Kubernetes Deployment or Pod that you want to scale using KEDA.
· Deploy the ScaledObject and the application pod.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: aws-sqs-queue-scaledobject
spec:
  scaleTargetRef:
    name: your-deployment-name
    kind: Deployment
    apiVersion: apps/v1
  pollingInterval: 15  # KEDA will check the queueLength every 15 seconds, and scale the resource up or down accordingly
  minReplicaCount: 0
  maxReplicaCount: 5
  cooldownPeriod: 300
  advanced:
    restoreToOriginalReplicaCount: true
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: keda-trigger-auth-aws-credentials
      metadata:
        queueURL: https://sqs.us-west-2.amazonaws.com/XXXXX/your-queue-name
        queueLength: "300"
        awsRegion: "us-west-2"
        identityOwner: pod
        scaleOnInFlight: "false"

Let’s break down the YAML above. The triggers section defines the event source on which to scale the deployment, plus metadata about that source. In this case the type is aws-sqs-queue, i.e., the trigger is an SQS queue.

queueLength: "300"

Target value for the queue length passed to the scaler. For example, if one pod can handle 300 messages, set the queue length target to 300. If the SQS queue actually holds 600 messages, the scaler scales to 2 pods. In effect, the desired replica count is ceil(messages / queueLength), clamped between minReplicaCount and maxReplicaCount.

scaleOnInFlight: "false"

Indicates whether to count only queued (visible) messages or to include in-flight messages as well. false means the total message count excludes messages in flight; set it to true to include them (in SQS terms, this roughly corresponds to adding ApproximateNumberOfMessagesNotVisible to ApproximateNumberOfMessages).

identityOwner: pod

This indicates that we are using pod identity-based authentication: the pod authenticates to AWS using IRSA (IAM Roles for Service Accounts), i.e., the IAM role bound to the pod's service account must have a policy attached granting SQS access. When identityOwner is set to operator, the only requirement is that the KEDA operator itself has the correct IAM permissions on the SQS queue.
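As a sketch of the IRSA side (the role name is a placeholder), the pod's service account carries an annotation pointing at an IAM role whose attached policy grants the required SQS access:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: your-app-service-account
  annotations:
    # Hypothetical IAM role; its policy must allow sqs:GetQueueAttributes on the queue
    eks.amazonaws.com/role-arn: arn:aws:iam::XXXXX:role/your-sqs-access-role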

authenticationRef:
  name: keda-trigger-auth-aws-credentials

These parameters are relevant only when identityOwner is set to pod. You can use the TriggerAuthentication CRD to configure authentication by providing either a role ARN or a set of IAM credentials.

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-aws-credentials
spec:
  podIdentity:
    provider: aws-eks

The provider needs to be set to aws-eks on the TriggerAuthentication, and the pod/service account must be configured correctly for your pod identity provider.
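Putting it together, here is a minimal sketch of the Deployment that the ScaledObject above targets (the image name is illustrative, and the service account is the IRSA-enabled one sketched earlier):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-deployment-name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sqs-consumer
  template:
    metadata:
      labels:
        app: sqs-consumer
    spec:
      serviceAccountName: your-app-service-account  # IRSA-enabled service account
      containers:
        - name: consumer
          image: your-registry/sqs-consumer:latest  # hypothetical consumer image
          resources:
            requests:
              cpu: 100m
              memory: 128Mi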

3. How KEDA works

KEDA works alongside standard Kubernetes components like the Horizontal Pod Autoscaler and can extend functionality without overwriting or duplication.

At a very high level, KEDA itself only scales from 0 to 1 pods and from 1 down to 0. This feature is important because the HPA only scales between 1 and n, and scaling to 0 is vital for saving cost. Scaling between 1 and n is still performed by the HPA; KEDA feeds the metric information to the HPA, which performs that scaling.
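You can see this division of labor in the cluster: for each ScaledObject, KEDA creates and manages an HPA (named keda-hpa-<scaledobject-name> by convention):

kubectl get hpa
# expect an entry like keda-hpa-aws-sqs-queue-scaledobject referencing your deployment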

At Clearwater, we use Rancher, which uses Fleet to deploy its objects. When you describe the ScaledObject using kubectl describe scaledobjects <name>, you will see conditions like this:

- message: Scaling is not performed because triggers are not active
  reason: ScalerNotActive
  status: "False"
  type: Active
- status: Unknown
  type: Fallback

This is the correct state for the ScaledObject to be in. It says that your scaler is ready to increase the pods, but the trigger (SQS in this case) is not active, i.e., there are very few messages in the queue (fewer than 300). Rancher, however, treats this state as a problem.

To bring the bundles to a ready state in Rancher, we have to apply a JSON patch. In your fleet.yaml file, add the following:

targetCustomizations:
  - name: beta
    diff:
      comparePatches:
        - name: aws-sqs-queue-scaledobject
          namespace: amigos-dataintake
          kind: ScaledObject
          apiVersion: keda.sh/v1alpha1
          operations:
            - {"op": "remove", "path": "/status/conditions"}
          ......
          ignore:
            conditions:
              - type: Active
                status: "False"
                reason: ScalerNotActive

This makes the bundle ready, since Rancher will ignore the state of the ScaledObject.

4. Verifying Scaling

Once configured, monitor the scaling behavior by sending messages to the SQS queue and observing how KEDA scales the application based on the queue depth.
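One quick way to exercise this (using the placeholder queue URL from above) is to push test messages with the AWS CLI and watch the pods and the KEDA-managed HPA react:

# send a test message to the queue
aws sqs send-message \
  --queue-url https://sqs.us-west-2.amazonaws.com/XXXXX/your-queue-name \
  --message-body "test message"

# watch KEDA scale the consumer as the queue depth crosses the target
kubectl get pods -w
kubectl get hpa -w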

5. Conclusion

Integrating Amazon SQS with Kubernetes Event-Driven Autoscaling (KEDA) offers a powerful solution to dynamically scale your applications based on the queue depth and message count in your SQS queues. This dynamic scaling ensures that your pods efficiently handle varying workloads, optimizing performance and cost-effectiveness.

About the Author

Aditya Gupta is a Senior Software Development Engineer at Clearwater Analytics, bringing over 13 years of expertise in crafting distributed, scalable solutions, specializing in AWS, Kubernetes, and core Java development. Outside of coding, Aditya finds joy in exploring new destinations, experimenting in the kitchen, and hitting the gym.
