Scaling the Coherence cluster on OKE using KEDA and Horizontal Pod Autoscaler

Ali Mukadam
Oracle Developers
May 12, 2023

In the previous article in this series, we looked at how to monitor a stretched Coherence cluster deployed in multiple OCI regions. In this article, we’ll use KEDA and the Horizontal Pod Autoscaler (HPA) to turn those captured metrics into automatic scaling of the Coherence cluster.

But first, a brief recap as to where we are:

Deployment and monitoring of Coherence in multiple regions

What we want to see is that when there’s an increase in load, the entire system is able to respond and scale gracefully. The neat thing is that the Coherence Operator already supports HPA. Now, we want to see the operator use the captured metrics to adjust the size of the cluster when needed. Since I’ve been meaning to write about KEDA for a while, we’ll use it instead of the Prometheus adapter used in the documentation. The diagram below is what we’ll try to achieve:

Scaling Coherence with Prometheus and KEDA within a Kubernetes cluster

We’ll use JMeter to run a performance test to increase requests to the Coherence cluster. When the test is run, Prometheus will see an increase in some Coherence metrics, and we want to see those metrics used by KEDA to trigger an autoscaling event through the HPA. In turn, when the Coherence cluster size has been updated by the HPA, the Coherence Operator will scale the underlying StatefulSet, and we should see this captured in the metrics by Prometheus, which we’ll analyze in Grafana.

Whilst we are running a global Coherence cluster, the Coherence operator that will handle the scaling runs locally within a cluster. So, we’ll need to repeat this in all the connected clusters:

Global Pod Autoscaling

Installing KEDA

Since we already installed Prometheus before, we only need to install KEDA. Add the helm chart repo and download the values manifest for editing:

helm repo add kedacore https://kedacore.github.io/charts
helm show values kedacore/keda > keda.yaml

Edit the keda.yaml and change the following to allow us to also monitor KEDA:

prometheus:
  operator:
    enabled: true
    serviceMonitor:
      enabled: true
    podMonitor:
      enabled: true

Next, install KEDA:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  helm install keda --namespace keda kedacore/keda -f keda.yaml --create-namespace
done
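Once the install completes, it doesn’t hurt to run a quick sanity check in each cluster to confirm the KEDA operator and metrics API server pods are up (a minimal sketch, using the same contexts as above):

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  # the keda-operator and keda-operator-metrics-apiserver pods should be Running
  kubectl --namespace keda get pods
done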

Next, we’ll define a KEDA ScaledObject using the Prometheus Scaler:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scaledcoherence
  namespace: coherence-test
spec:
  scaleTargetRef:
    apiVersion: coherence.oracle.com/v1
    kind: Coherence
    name: storage-${CLUSTER}
  pollingInterval: 30   # Optional. Default: 30 seconds
  cooldownPeriod: 300   # Optional. Default: 300 seconds
  minReplicaCount: 2    # Optional. Default: 0
  maxReplicaCount: 20   # Optional. Default: 100
  advanced:                                 # Optional. Section to specify advanced options
    restoreToOriginalReplicaCount: true     # Optional. Default: false
    horizontalPodAutoscalerConfig:          # Optional. Section to specify HPA related options
      behavior:                             # Optional. Use to modify HPA's scaling behavior
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 50
            periodSeconds: 120
        scaleUp:
          stabilizationWindowSeconds: 180
          policies:
          - type: Percent
            value: 200
            periodSeconds: 60
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://kps-kube-prometheus-stack-prometheus.monitoring:9090/
      query: sum(vendor:coherence_memory_heap_memory_usage_used{cluster="storage"}) / sum(vendor:coherence_memory_heap_memory_usage_max{cluster="storage"}) * 100
      threshold: '10.0'
      activationThreshold: '20.0'
      namespace: monitoring

You’ll notice that the scaleTargetRef uses Coherence’s API version (coherence.oracle.com/v1) and the kind Coherence. We also provide the name of the Coherence cluster in each region; here we’ve parameterized it as storage-${CLUSTER} so the same manifest can be applied to Paris, Amsterdam and Frankfurt.

Under triggers, we set the type to prometheus and we use the cluster-local address of Prometheus. We also specify a Prometheus query in PromQL that returns a value to be evaluated against the activationThreshold and the target threshold values. What you use for your query, threshold and activationThreshold depends on your application, your use case and perhaps your non-functional requirements. The values used in this example are only meant to illustrate how you would go about doing it. Note that you can also have a list of triggers that will cause a scaling event based on different queries, as sketched below. But for now, let’s get our first query and trigger working.
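As a purely illustrative sketch, a second trigger could be appended to the list like this. The cache-size query below is hypothetical and not something we use in this article; substitute a metric that matters to your own workload:

  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://kps-kube-prometheus-stack-prometheus.monitoring:9090/
      query: sum(vendor:coherence_memory_heap_memory_usage_used{cluster="storage"}) / sum(vendor:coherence_memory_heap_memory_usage_max{cluster="storage"}) * 100
      threshold: '10.0'
      activationThreshold: '20.0'
      namespace: monitoring
  # hypothetical additional trigger, e.g. on the total number of cache entries
  - type: prometheus
    metadata:
      serverAddress: http://kps-kube-prometheus-stack-prometheus.monitoring:9090/
      query: sum(vendor:coherence_cache_size{cluster="storage"})
      threshold: '1000000'
      namespace: monitoring

When several triggers are defined, the HPA scales on whichever metric demands the most replicas.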

For testing, we’ll be loading a large amount of data into the Coherence grid and we want to ensure that it has enough memory to store all the data. To achieve this, we use the Prometheus Query to look at the heap usage and calculate a percentage:

sum(vendor:coherence_memory_heap_memory_usage_used{cluster="storage"}) / sum(vendor:coherence_memory_heap_memory_usage_max{cluster="storage"}) * 100
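If you want to sanity-check this query before handing it to KEDA, you can run it directly against the Prometheus HTTP API, for example by port-forwarding to the same service the ScaledObject points at (a quick sketch):

# port-forward the kube-prometheus-stack Prometheus service locally
kubectl --namespace monitoring port-forward svc/kps-kube-prometheus-stack-prometheus 9090:9090 &

# run the same PromQL expression via the query API; the result should be the heap usage percentage
curl -sG http://localhost:9090/api/v1/query \
  --data-urlencode 'query=sum(vendor:coherence_memory_heap_memory_usage_used{cluster="storage"}) / sum(vendor:coherence_memory_heap_memory_usage_max{cluster="storage"}) * 100'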

With the thresholds above, when this value reaches 20%, a scaling event is triggered and more pods are created to spread the data out. If we start clearing the data and the value drops below 10%, some of the pods are automatically removed. Let’s apply this and run the test:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  CLUSTER=$cluster envsubst < scaledcoherence.yaml | kubectl apply -f -
done

Verify that the scaler in each region has been correctly defined and that the values are appropriate, e.g. you would want your threshold to be less than your activationThreshold and your minReplicaCount to be less than your maxReplicaCount:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  kubectl -n coherence-test get scaledobject scaledcoherence
done

✔ Switched to context "paris".
NAME              SCALETARGETKIND                     SCALETARGETNAME     MIN   MAX   TRIGGERS     AUTHENTICATION   READY   ACTIVE   FALLBACK   AGE
scaledcoherence   coherence.oracle.com/v1.Coherence   storage-paris       2     20    prometheus                    True    False    False      2m58s
✔ Switched to context "amsterdam".
NAME              SCALETARGETKIND                     SCALETARGETNAME     MIN   MAX   TRIGGERS     AUTHENTICATION   READY   ACTIVE   FALLBACK   AGE
scaledcoherence   coherence.oracle.com/v1.Coherence   storage-amsterdam   2     20    prometheus                    True    False    False      3m2s
✔ Switched to context "frankfurt".
NAME              SCALETARGETKIND                     SCALETARGETNAME     MIN   MAX   TRIGGERS     AUTHENTICATION   READY   ACTIVE   FALLBACK   AGE
scaledcoherence   coherence.oracle.com/v1.Coherence   storage-frankfurt   2     20    prometheus                    True    False    False      3m7s

You can also look at ScaledObject in more detail:

kubectl describe scaledobject scaledcoherence

and you should see something like the following:

....
Status:
  Conditions:
    Message:  ScaledObject is defined correctly and is ready for scaling
    Reason:   ScaledObjectReady
    Status:   True
    Type:     Ready
    Message:  Scaling is not performed because triggers are not active
    Reason:   ScalerNotActive
    Status:   False
    Type:     Active
    Message:  No fallbacks are active on this scaled object
    Reason:   NoFallbackFound
    Status:   False
    Type:     Fallback
  External Metric Names:
    s0-prometheus-prometheus
  Health:
    s0-prometheus-prometheus:
      Number Of Failures:  0
      Status:              Happy
  Hpa Name:                keda-hpa-coherence-scaledobject
  Original Replica Count:  3
  Scale Target GVKR:
    Group:     coherence.oracle.com
    Kind:      Coherence
    Resource:  coherence
    Version:   v1
  Scale Target Kind:       coherence.oracle.com/v1.Coherence
Events:
  Type    Reason              Age                From           Message
  ----    ------              ----               ----           -------
  Normal  KEDAScalersStarted  36s                keda-operator  Started scalers watch
  Normal  ScaledObjectReady   21s (x2 over 36s)  keda-operator  ScaledObject is ready for scaling

We are now ready to test the scalability of Coherence within a Kubernetes cluster using metrics.

Triggering the scaling event

To trigger the scaling event, we’ll use JMeter to generate the load:

kubectl apply -f coherence-jmeter.yaml --namespace coherence-test

Let it run for a while and meanwhile, we can observe the proceedings in Grafana:

Observing Coherence Cluster size changes

We can see that once the heap size reaches 20% utilization, the cluster size changes. These events are marked on the graph above. Similarly, we can also look at the overall cluster heap as the cluster size increases, and we can see that the addition of a cluster member corresponds to an increase in the overall heap available:

Increase in heap size with addition of new Coherence cluster members

The addition of a member is also followed by a partition transfer as Coherence seeks to distribute the data:

Coherence Partition transfer

On the Coherence Summary dashboard, we can see an increase in the number of Coherence cluster members at various levels of heap utilization:

On the Cache Summary Dashboard, we can see a steady increase in Cache Data memory usage:

Finally, on the KEDA dashboard, we can also see the number of HPA replicas created by KEDA.
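Besides the Grafana dashboards, you can also follow the scaling from the command line by watching the HPA that KEDA created and the Coherence resource itself (a simple sketch, run in the region you’re observing):

# watch the HPA that KEDA manages on our behalf
kubectl --namespace coherence-test get hpa --watch

# in another terminal, watch the Coherence resource as its replica count changes
kubectl --namespace coherence-test get coherence --watch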

Scaling down

So, we’ve shown that Coherence can scale out based on the metrics captured by Prometheus and using KEDA. But can the cluster be scaled back in?

Let’s stop the JMeter Put performance test and run another one that will truncate the data instead:

kubectl delete -f coherence-jmeter.yaml --namespace coherence-test
kubectl apply -f coherence-jmeter-truncate.yaml --namespace coherence-test

After stopping the performance test, this is what the cluster looks like before it starts to scale down:

As KEDA starts scaling down the cluster, we see other members departing:

Coherence scaling down

Note that the metrics and Prometheus query used here are meant to illustrate using KEDA, Prometheus and Coherence together over 3 different regions. You should not use them as-is to determine how to autoscale Coherence when running on Kubernetes. As I mentioned before, your Prometheus query, threshold values, etc. depend on your use case.
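If you want to switch the autoscaling off once you’re done experimenting, deleting the ScaledObject in each region is enough; KEDA will also remove the HPA it created (a short sketch, mirroring the way we applied it):

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  # removes the scaler; the Coherence cluster itself is left running
  kubectl --namespace coherence-test delete scaledobject scaledcoherence
done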

Conclusion

In this article, we showed that even though the Coherence members are spread over 3 geographically separate regions, we are still able to scale each cluster based on load, using metrics from Prometheus and scaling events triggered by KEDA. As the load increases, the Coherence cluster is scaled out by adding new members, and as the load decreases, the cluster size is correspondingly reduced. We can also choose separate frequencies for scaling out and scaling back in.

I would like to conclude here and again thank my colleagues Jonathan Knight, Tim Middleton, Avi Miller, Julian Ortiz and Sherwood Zern for their contributions and ideas to this article.
