Monitoring and performance testing a regionally distributed Coherence Cluster on Kubernetes and OCI

Ali Mukadam · Published in Oracle Developers · May 5, 2023 · 15 min read

In a previous post, we used the Coherence Operator to deploy a geographically-distributed Coherence cluster on multiple Kubernetes (OKE) clusters in several regions on OCI. As Coherence uses StatefulSets and headless ClusterIP services, we also included Submariner to perform cross-cluster service discovery. Well, that was mostly to figure out the mechanics of getting OKE, Coherence and Submariner to play nicely with each other. In this article, we will run a few performance tests and monitor how well Coherence behaves in this architecture when experiencing sustained load.

Monitoring 1 Coherence cluster with Prometheus is easy, especially if you are using the Prometheus Operator. All you need to do is enable the metrics and the service monitor and Prometheus will start scraping the metrics:

apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: metrics-cluster
spec:
  coherence:
    metrics:
      enabled: true
  ports:
    - name: metrics
      serviceMonitor:
        enabled: true

It’s slightly more challenging when you’ve got 3 clusters in 3 regions. However, we’ve done something similar before by monitoring 3 workloads in different clusters and regions with Thanos. Since we are using Submariner, we also want to capture its metrics alongside Coherence’s:

Monitoring a distributed Coherence cluster over several OCI regions

Thus, we have 4 components here:

  • Coherence which is our actual workload
  • Submariner for cross-cluster service discovery of Coherence
  • Prometheus for capturing metrics in each region
  • Thanos for aggregating and providing a global view of our clusters

Since Coherence is quite sensitive to latency, we also want to avoid 1 tool affecting another. So, when creating our clusters using the Terraform OKE module, we’ll create 4 node pools of varying sizes in 3 different regions:

nodepools = {
  gateway = {
    shape            = "VM.Standard.E4.Flex",
    ocpus            = 2,
    memory           = 32,
    node_pool_size   = 1,
    boot_volume_size = 150,
    label            = { "submariner.io/gateway" = "true" }
  }
  coherence = {
    shape            = "VM.Standard.E4.Flex",
    ocpus            = 2,
    memory           = 32,
    node_pool_size   = 3,
    boot_volume_size = 150,
    label            = { "coherence" = "true" }
  }
  prometheus = {
    shape              = "VM.Standard.E4.Flex",
    ocpus              = 2,
    memory             = 32,
    autoscale          = true,
    node_pool_size     = 1,
    max_node_pool_size = 3,
    boot_volume_size   = 150,
    label              = { app = "prometheus", pool = "prometheus" },
    node_defined_tags  = { "cn.role" = "prometheus" }
  }
  thanos = {
    shape              = "VM.Standard.E4.Flex",
    ocpus              = 2,
    memory             = 32,
    autoscale          = true,
    node_pool_size     = 1,
    max_node_pool_size = 3,
    boot_volume_size   = 150,
    label              = { app = "thanos", pool = "thanos", "submariner.io/gateway" = "false" },
    node_defined_tags  = { "cn.role" = "thanos" }
  }
}

As you can see from above, each node pool is configured with initial node labels so that each is responsible for 1 component. Let’s deploy them.

Infrastructure

Follow the instructions as in the previous post to create 3 OKE clusters in 3 different regions. Note that you can now also get a configurable number of Remote Peering Connections (RPCs) created by the OKE module instead of creating them manually. Once the clusters and the RPCs are created, you can peer them. For example, from Paris, you should be able to see the following in the OCI Network Visualizer:

Network Map of Connected VCNs

Pick a region that will function as a command center. In this region (mine is Paris), you'll enable the operator host and download the kubeconfigs of each cluster, renaming their contexts to paris, frankfurt and amsterdam so they are easy to switch between.
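Here is a minimal sketch of fetching and renaming one context from the operator host. The cluster OCID is a placeholder and the exact create-kubeconfig command for each cluster is printed in the OCI console:

# Hypothetical cluster OCID and region; repeat for eu-frankfurt-1 and eu-amsterdam-1.
oci ce cluster create-kubeconfig --cluster-id ocid1.cluster.oc1.eu-paris-1.aaaa... \
  --file $HOME/.kube/config --region eu-paris-1 --token-version 2.0.0 \
  --kube-endpoint PRIVATE_ENDPOINT
# Assumes create-kubeconfig leaves the new context as current; otherwise rename the generated context by name.
kubectl config rename-context "$(kubectl config current-context)" paris

Verify that you have connectivity to all of the clusters: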

for cluster in paris frankfurt amsterdam; do 
kubectx $cluster
kubectl get nodes
done

✔ Switched to context "paris".
NAME STATUS ROLES AGE VERSION
10.0.106.32 Ready node 14d v1.25.4
10.0.117.165 Ready node 14d v1.25.4
10.0.121.123 Ready node 14d v1.25.4
10.0.121.31 Ready node 14d v1.25.4
10.0.75.157 Ready node 14d v1.25.4
10.0.99.166 Ready node 14d v1.25.4
✔ Switched to context "frankfurt".
NAME STATUS ROLES AGE VERSION
10.10.112.246 Ready node 14d v1.25.4
10.10.73.77 Ready node 14d v1.25.4
10.10.79.214 Ready node 14d v1.25.4
10.10.80.208 Ready node 14d v1.25.4
10.10.83.174 Ready node 14d v1.25.4
10.10.88.155 Ready node 14d v1.25.4
✔ Switched to context "amsterdam".
NAME STATUS ROLES AGE VERSION
10.9.101.19 Ready node 14d v1.25.4
10.9.108.96 Ready node 14d v1.25.4
10.9.111.236 Ready node 14d v1.25.4
10.9.88.245 Ready node 14d v1.25.4
10.9.93.79 Ready node 14d v1.25.4
10.9.99.121 Ready node 14d v1.25.4
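You can also confirm that the node pool labels were applied, since the rest of the setup relies on them for scheduling:

for cluster in paris frankfurt amsterdam; do
  kubectx $cluster
  # -L prints the listed labels as extra columns
  kubectl get nodes -L app,pool,coherence,submariner.io/gateway
done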

We are now ready to deploy.

Deploying Thanos

In each region, create an OCI Object Storage bucket called ‘thanos’. Ensure you have also followed the steps in the previous article to set up a dynamic group and the corresponding policy for the dynamic group to write to OCI Object Storage. This ensures that the nodes where Thanos runs the Receive component can use instance_principal for authentication.
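If you prefer the CLI to the console for creating the buckets, here is a quick sketch; the compartment OCID is a placeholder and the region names assume Paris, Frankfurt and Amsterdam:

for region in eu-paris-1 eu-frankfurt-1 eu-amsterdam-1; do
  # The Object Storage namespace is resolved automatically from the tenancy.
  oci os bucket create --name thanos --compartment-id ocid1.compartment.oc1.. --region $region
done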

Next, create a file called storage.yaml:

type: OCI
config:
  provider: "instance-principal"
  bucket: "thanos"
  compartment_ocid: "ocid1.compartment.oc1.."

For each region, create a Secret to hold this file:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
kubectl create ns monitoring
kubectl -n monitoring create secret generic thanos-objstore-config --from-file=objstore.yml=storage.yaml
done

Add the bitnami Thanos helm chart repo and generate the values file for the Frankfurt cluster:

helm repo add bitnami https://charts.bitnami.com/bitnami

helm show values bitnami/thanos > thanos-frankfurt.yaml

Edit the thanos-frankfurt.yaml file and change the following:

existingObjstoreSecret: "thanos-objstore-config"
query:
  enabled: true
  serviceGrpc:
    type: LoadBalancer
    annotations:
      # NSG of the private load balancer
      oci.oraclecloud.com/oci-network-security-groups: "ocid1.networksecuritygroup..."
      service.beta.kubernetes.io/oci-load-balancer-shape: "flexible"
      service.beta.kubernetes.io/oci-load-balancer-shape-flex-min: "20"
      service.beta.kubernetes.io/oci-load-balancer-shape-flex-max: "30"
      # subnet of the private load balancer
      service.beta.kubernetes.io/oci-load-balancer-subnet1: "ocid1.subnet..."
      service.beta.kubernetes.io/oci-load-balancer-internal: "true"
      service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"
queryFrontend:
  enabled: true
bucketweb:
  enabled: true
compactor:
  enabled: true
storegateway:
  enabled: true
receive:
  enabled: true

Locate every nodeSelector and set the following:

nodeSelector:
  app: thanos

Copy the thanos-frankfurt.yaml file to thanos-amsterdam.yaml:

cp thanos-frankfurt.yaml thanos-amsterdam.yaml

and change the OCIDs of the NSG and the load balancer subnet to use those in Amsterdam. For the command region (Paris), the Query does not need to be exposed outside its own cluster, so its values file (thanos-paris.yaml) keeps the gRPC service as a plain ClusterIP:

existingObjstoreSecret: "thanos-objstore-config"
query:
  enabled: true
  serviceGrpc:
    type: ClusterIP
    annotations: {}
queryFrontend:
  enabled: true
bucketweb:
  enabled: true
compactor:
  enabled: true
storegateway:
  enabled: true
receive:
  enabled: true

This is to expose the Thanos Queries in Amsterdam and Frankfurt through an internal Load Balancer that the Thanos Query in Paris can then reach. You can now deploy Thanos in each region:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
helm install thanos bitnami/thanos --namespace monitoring -f thanos-$cluster.yaml --create-namespace
done

Verify that the Thanos pods are all running:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
kubectl -n monitoring get pods
done

Finally, you need to update Thanos in your designated ‘command’ region (Paris in this example) and add the other regions’ query stores to its values file (thanos-paris.yaml). The query stores in this case are the private IP addresses of the load balancers created when you set query.serviceGrpc.type to LoadBalancer:

query:
  stores:
    - 10.9.2.18:10901
    - 10.10.2.12:10901

In both observed regions’ private load balancer NSGs, add an ingress rule to accept TCP traffic over port 10901 from the command VCN’s CIDR.
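Here is a hedged OCI CLI sketch of such a rule, assuming the Paris VCN CIDR of 10.0.0.0/16 used in this setup; the NSG OCID is a placeholder:

# Allow the command (Paris) VCN to reach Thanos Query gRPC on the observed regions' private load balancers.
oci network nsg rules add --nsg-id ocid1.networksecuritygroup... --security-rules '[{
  "direction": "INGRESS",
  "protocol": "6",
  "source": "10.0.0.0/16",
  "sourceType": "CIDR_BLOCK",
  "isStateless": false,
  "tcpOptions": {"destinationPortRange": {"min": 10901, "max": 10901}}
}]'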

Update Thanos for the command region again:

kubectx paris
helm upgrade thanos bitnami/thanos --namespace monitoring -f thanos-paris.yaml

If everything works well, when you port-forward into Thanos Query, you should see all stores:
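A quick way to do that, assuming the Helm release is named thanos as above so the Query service is thanos-query:

kubectl -n monitoring port-forward svc/thanos-query 9090:9090
# then open http://localhost:9090/stores in your browser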

Thanos Stores

In the above screenshot, you can see the Thanos Query able to reach the other Thanos Queries over Remote Peering Connections and private Load Balancers. The Receive and Store that you see are for its own Thanos Receive and Store instance in Paris.

Deploying Prometheus

Now that Thanos is deployed, we can start shipping metrics to it. Follow the steps in the previous article to deploy kube-prometheus-stack:

1. Generate the values file for each cluster:

helm repo add kps https://prometheus-community.github.io/helm-charts
helm show values kps/kube-prometheus-stack > kps.yaml

2. Edit the kps.yaml file and set the following:

prometheus:
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
    remoteWrite:
      - url: http://thanos-receive.monitoring.svc.cluster.local:19291/api/v1/receive

3. Locate every nodeSelector and set the following:

nodeSelector:
  app: prometheus

4. Make a values file for each cluster:

mv kps.yaml kps-paris.yaml
cp kps-paris.yaml kps-frankfurt.yaml
cp kps-paris.yaml kps-amsterdam.yaml

5. Edit each kps-{cluster}.yaml and set the externalLabels (under prometheusSpec) to match the cluster name:

externalLabels:
  cluster: "paris"

6. Deploy Prometheus:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
helm install kps kps/kube-prometheus-stack --namespace monitoring -f kps-$cluster.yaml
done

After some time, you should start seeing your metrics.
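With the same thanos-query port-forward as before, you can also check from the command region that each cluster’s metrics are arriving, grouped by the cluster external label we set:

kubectl -n monitoring port-forward svc/thanos-query 9090:9090 &
# Thanos Query speaks the Prometheus HTTP API; each region should appear under its "cluster" label.
curl -s 'http://localhost:9090/api/v1/query?query=count(up)%20by%20(cluster)'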

Deploying Submariner

Follow the steps in the previous article to deploy Submariner.

Install the Submariner CLI:

curl -Ls https://get.submariner.io | bash
export PATH=$PATH:~/.local/bin
echo export PATH=\$PATH:~/.local/bin >> ~/.profile

Deploy the Submariner broker in 1 of the clusters:

subctl --context paris deploy-broker

Join the clusters:

for cluster in paris amsterdam frankfurt; do
subctl join --context $cluster broker-info.subm --cluster $cluster --clusterid $cluster --air-gapped
done

Verify connectivity between them:

subctl show connections
Cluster "frankfurt"
✓ Showing Connections
GATEWAY CLUSTER REMOTE IP NAT CABLE DRIVER SUBNETS STATUS RTT avg.
oke-cbsjojgl6ca-n5duxz3q4uq-sc amsterdam 10.9.99.121 no libreswan 10.109.0.0/16, 10.209.0.0/16 connected 14.1032ms
oke-chh5fe32dca-nxq7ggsi7aq-sh paris 10.0.99.166 no libreswan 10.100.0.0/16, 10.200.0.0/16 connected 8.193527ms

Cluster "paris"
✓ Showing Connections
GATEWAY CLUSTER REMOTE IP NAT CABLE DRIVER SUBNETS STATUS RTT avg.
oke-cbsjojgl6ca-n5duxz3q4uq-sc amsterdam 10.9.99.121 no libreswan 10.109.0.0/16, 10.209.0.0/16 connected 8.214783ms
oke-cc5mqdaua2q-n5bo623ouna-sl frankfurt 10.10.73.77 no libreswan 10.110.0.0/16, 10.210.0.0/16 connected 8.209237ms

Cluster "amsterdam"
✓ Showing Connections
GATEWAY CLUSTER REMOTE IP NAT CABLE DRIVER SUBNETS STATUS RTT avg.
oke-cc5mqdaua2q-n5bo623ouna-sl frankfurt 10.10.73.77 no libreswan 10.110.0.0/16, 10.210.0.0/16 connected 14.171729ms
oke-chh5fe32dca-nxq7ggsi7aq-sh paris 10.0.99.166 no libreswan 10.100.0.0/16, 10.200.0.0/16 connected 8.282868ms
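If you want more than the connection summary, subctl can also run its broader set of built-in checks (Kubernetes version, CNI, connections, firewall, and more) against each cluster:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  # Runs Submariner's diagnostics against the current context.
  subctl diagnose all
done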

Now that we’ve got connectivity between the clusters, we can deploy Coherence.

Deploying Coherence

  1. Install the Coherence Operator as before:
helm repo add coherence https://oracle.github.io/coherence-operator/charts
helm repo update

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
helm install coherence-operator --namespace coherence-operator coherence/coherence-operator --create-namespace
done

Once the Coherence Operator is successfully deployed, create a namespace for our Coherence cluster deployment in each cluster:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
kubectl create ns coherence-test
done

Create the WKA services for each cluster and export them. First, define a template for the service (submariner-wka-svc.yaml):

apiVersion: v1
kind: Service
metadata:
  name: submariner-wka-${CLUSTER}
  namespace: coherence-test
  labels:
    coherenceCluster: storage
    coherenceComponent: coherenceWkaService
    coherenceDeployment: storage
    coherenceRole: storage
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: tcp-coherence
      port: 7
      protocol: TCP
      targetPort: 7
  publishNotReadyAddresses: true
  selector:
    coherenceCluster: storage
    coherenceComponent: coherencePod
    coherenceWKAMember: "true"

Create the WKA service in each cluster:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
CLUSTER=$cluster envsubst < submariner-wka-svc.yaml | kubectl apply -f -
done

And export them to make the Coherence service discoverable:

for cluster in paris amsterdam frankfurt; do
subctl --context $cluster export service --namespace coherence-test submariner-wka-$cluster
done
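To confirm the exports were picked up, you can list the Multi-Cluster Services resources that Submariner’s Lighthouse manages; depending on the Submariner version, ServiceImports may appear in the service’s namespace or in submariner-operator, hence the -A below:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  kubectl -n coherence-test get serviceexports.multicluster.x-k8s.io
  kubectl get serviceimports.multicluster.x-k8s.io -A
done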

Let’s now deploy an updated version of our Coherence cluster with metrics enabled:

apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: storage-${CLUSTER}
  namespace: coherence-test
spec:
  nodeSelector:
    coherence: "true"
  cluster: storage
  coherence:
    logLevel: 5
    metrics:
      enabled: true
    management:
      enabled: true
  ports:
    - name: metrics
      serviceMonitor:
        enabled: true
    - name: management
  replicas: 3
  readinessProbe:
    initialDelaySeconds: 30
  jvm:
    args:
      - "-Dcoherence.wka=submariner-wka-paris.coherence-test.svc.clusterset.local,submariner-wka-frankfurt.coherence-test.svc.clusterset.local,submariner-wka-amsterdam.coherence-test.svc.clusterset.local"
    memory:
      initialHeapSize: 4g
      maxHeapSize: 4g

Create the first cluster in Paris:

kubectx paris
CLUSTER=paris envsubst < coherence-cluster.yaml | kubectl apply -f -

Wait for the pods to be ready and repeat for Frankfurt and Amsterdam:

for cluster in frankfurt amsterdam; do
kubectx $cluster
CLUSTER=$cluster envsubst < coherence-cluster.yaml | kubectl apply -f -
done

You can also tail the logs of any of the pods and ensure that they have joined to form a global cluster, as we did previously:

kubectl -n coherence-test logs -f storage-paris-0

We can now move on to complete the monitoring setup.

Importing Grafana Dashboards

The Coherence team have created a fantastic set of Grafana dashboards, so let’s import them. Log in to Grafana using port-forward:

kubectl -n monitoring port-forward svc/kps-grafana 3000:80

On another terminal, download the dashboards:

curl https://oracle.github.io/coherence-operator/dashboards/latest/coherence-dashboards.tar.gz \
-o coherence-dashboards.tar.gz
tar -zxvf coherence-dashboards.tar.gz

curl -Lo grafana-import.sh https://github.com/oracle/coherence-operator/raw/main/hack/grafana-import.sh
chmod +x grafana-import.sh
./grafana-import.sh -u <GRAFANA-USER> -w <GRAFANA_PWD> -d dashboards/grafana -t localhost:3000

Access Grafana in your browser and you should be able to see the Coherence Dashboards:

Coherence Dashboards in Grafana after import

Verify your metrics are coming in by accessing some of these dashboards:

Coherence Main Dashboard

From the above, you can see that my cluster has 9 members and offers a total Cluster Heap of 38.7 GB, which you can use to keep data in memory for fast access.

Click on the Available Dashboards on the right and you’ll be able to navigate to other Coherence dashboards e.g. the Members Summary Dashboard:

Members Summary Dashboards

In the above, you can see the Coherence members distributed across 3 sites in Amsterdam, Frankfurt and Paris. If you click on a member name, you can view the performance of that Coherence member:

Let’s check how Submariner is doing:

Latency to other connected clusters via Submariner

We are now ready to run performance tests.

Running a load test

We are going to do 3 types of performance testing:

  • smoke test: usually performed between iterations of a project to ensure a basic level of sanity of the system.
  • load test: usually performed to prove that a system can sustain full operation.
  • soak test: a load test performed over a longer duration to uncover potential issues that are harder to spot during short load or stress tests.

By running these tests, we want to find the following about the multi-regional deployment:

  • Is it stable when faced with significant requests?
  • Can the solution scale? What kind of scalability are we getting?
  • Are there any long term issues that might affect operational stability that are not usually visible during short performance tests?
  • Is it able to handle failures within a region, e.g. when 1 or more Coherence members in a region fail?
  • Is it able to handle failure when an entire region goes dark?

Naturally, you may have your own questions to answer depending on your application, architecture, constraints, etc., but I hope this shows you a method for finding the answers.

For the purpose of this article, we are going to combine the smoke and load tests but you can get more systematic depending on your workload. My colleagues Tim and Jonathan are using the venerable JMeter to define some Coherence samplers:

Coherence Long Running Test with JMeter

In this test, we have defined 10 threads doing gets, puts and removes randomly over 10 million keys.

Save your test locally. We then need to create a ConfigMap to store the JMeter files:

kubectl -n coherence-test create configmap jmeter-tests --from-file jmeter-tests

The ConfigMap will then be mounted by the JMeter runners to run the test:

apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: jmeter-runners
spec:
  cluster: jmeter-cluster
  replicas: 1
  image: "coherence-jmeter:1.0.0-SNAPSHOT"
  application:
    main: "com.oracle.coherence.jmeter.JMeterRunner"
  coherence:
    storageEnabled: false
    metrics:
      enabled: true
  jvm:
    classpath:
      - "/app/libs/*"
      - "/apache-jmeter/lib/*"
      - "/apache-jmeter/lib/ext/*"
  env:
    - name: "COHERENCE_JMETER_TEST"
      value: "/jmeter-tests/LongRunningLoadTest.jmx"
    - name: "COHERENCE_JMETER_HOST"
      valueFrom:
        fieldRef:
          apiVersion: "v1"
          fieldPath: "metadata.name"
  ports:
    - name: metrics
      serviceMonitor:
        enabled: true
  volumeMounts:
    - name: test-files
      mountPath: /jmeter-tests
  volumes:
    - name: test-files
      configMap:
        name: jmeter-tests
        optional: true

and run it:

kubectl apply -f coherence-jmeter.yaml

Let it run for about 15–30 mins:

Running Performance test

You can see that at 17:56, the marker indicates the cluster size changed. That’s when we started the performance test and JMeter created an instance (with storage disabled) to join the cluster. On the Members Summary page, we can see the success rate of exchanging data between the cluster members is at 100%:

On the Performance Metrics Dashboard, we can see things are quite stable too, albeit a bit slow for my liking.

The minimum and mean latencies stand at 8 ms and 16 ms respectively, while there’s a bit of variance in the maximum latency.

What happens when we scale the number of cluster members in each region? Let’s edit the coherence-cluster.yaml and increase the number of Coherence members per region to 10:

apiVersion: coherence.oracle.com/v1
kind: Coherence
metadata:
  name: storage-${CLUSTER}
  namespace: coherence-test
spec:
  nodeSelector:
    coherence: "true"
  cluster: storage
  coherence:
    logLevel: 5
    metrics:
      enabled: true
    management:
      enabled: true
  ports:
    - name: metrics
      serviceMonitor:
        enabled: true
    - name: management
  replicas: 10
  readinessProbe:
    initialDelaySeconds: 30
  jvm:
    args:
      - "-Dcoherence.wka=submariner-wka-paris.coherence-test.svc.clusterset.local,submariner-wka-frankfurt.coherence-test.svc.clusterset.local,submariner-wka-amsterdam.coherence-test.svc.clusterset.local"
    memory:
      initialHeapSize: 4g
      maxHeapSize: 4g

and apply it:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
CLUSTER=$cluster envsubst < coherence-cluster.yaml | kubectl apply -f -
done
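While the new members start, you can watch the pods per region; the operator labels the storage pods with coherenceCluster=storage, the same labels used by the WKA service selector above:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  kubectl -n coherence-test get pods -l coherenceCluster=storage
done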

We can see the global cluster members seamlessly scale to 30:

as does the global heap size. So, we have a basic confidence that by adding more Coherence members, the global cluster scales quite smoothly.

But in order to measure any other benefits besides regional availability, we must run a test against a real workload. Before doing that, we must ensure there are no stability or hidden issues over the longer run.

Running a soak test

So let’s do a soak test by running JMeter for a longer duration to uncover potential issues that are usually hard to spot during relatively short load or stress tests. In this case, we let the soak test run for a couple of days, as you can see in the graph below:

Running a soak test

It turns out there is an issue: after some time, CoreDNS resyncs and the Submariner configuration that was added when we connected the clusters is gone. Because of this, Coherence members in each region are suddenly unable to find the imported WKA services, and therefore the members can only see those running within their own OKE clusters.

In the graph above, the solid part on the left is when we had just configured Submariner and started the performance test. Things were running fine until we started seeing the regular gaps in the middle. If you tail any of the Coherence pods, you’ll see them complaining about being unable to resolve the imported WKA services.
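You can confirm the root cause by checking whether the clusterset.local forwarding stanza that Submariner added to CoreDNS is still there; if the grep returns nothing, it has been wiped:

kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' | grep -A 3 clusterset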

We need a way to store the CoreDNS configuration required by Submariner permanently so that the Coherence pods can look up their peers in other regions. Fortunately, OKE provides one such mechanism: it allows you to add your own custom DNS configuration.

Configuring CoreDNS

For each region, create a file coredns-custom-{region}.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  clusterset.server: | # All custom server files must have a ".server" file extension.
    # Forward the clusterset.local domain to the submariner-lighthouse-coredns service.
    clusterset.local:53 {
        forward . 10.100.148.98
    }

The 10.100.148.98 IP address you see above is the ClusterIP address of the submariner-lighthouse-coredns service:

kubectx paris
✔ Switched to context "paris".

kubectl -n submariner-operator get svc -o custom-columns="NAME":.metadata.name,"IP":.spec.clusterIP
NAME IP
submariner-gateway-metrics 10.100.163.49
submariner-lighthouse-agent-metrics 10.100.133.179
submariner-lighthouse-coredns 10.100.148.98
submariner-lighthouse-coredns-metrics 10.100.142.206
submariner-operator-metrics 10.100.210.103
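If you want to fetch this per region without eyeballing the output, here is a quick sketch:

for cluster in paris amsterdam frankfurt; do
  kubectx $cluster
  # Print the lighthouse CoreDNS ClusterIP to use in that region's coredns-custom file.
  kubectl -n submariner-operator get svc submariner-lighthouse-coredns \
    -o jsonpath='{.spec.clusterIP}{"\n"}'
done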

Also, note that you must:

  • name the ConfigMap “coredns-custom”
  • declare it in the kube-system namespace

Repeat the same exercise and change the ClusterIP address in the coredns-custom-{region}.yaml for each region. Then, apply them:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
kubectl apply -f coredns-custom-$cluster.yaml
done

Finally, we need to restart the CoreDNS pods in each cluster for the changes to take effect:

for cluster in paris amsterdam frankfurt; do
kubectx $cluster
kubectl delete pod --namespace kube-system -l k8s-app=kube-dns
done
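Once the CoreDNS pods are back, you can confirm that cross-cluster names resolve again with a throwaway pod (busybox is just a convenient image for nslookup):

kubectl -n coherence-test run dns-check --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup submariner-wka-frankfurt.coherence-test.svc.clusterset.local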

And we can see the effect (ignore the big gap in the middle as it was due to a cosmetic and unrelated change during our experiment):

You can see after the gap on 04/20 at 18:00 (which is when we applied the CoreDNS configuration), the graph is solid again and there are no gaps, indicating no Coherence operation failed due to DNS lookup failures. We can confirm this by looking into the Coherence logs:

kubectl -n coherence-test logs -f storage-paris-0

Conclusion

In this article, we set up monitoring using Prometheus and Thanos so we can observe the behaviour of Coherence and Submariner when deployed across 3 different OKE clusters in 3 OCI regions. We then ran different performance tests (smoke, load and soak) to determine any stability or latency issues as well as any improvement.

What we can conclude so far is that there are no stability issues. Similarly, the current intra-cluster latency does not seem to be affecting normal Coherence cluster operations but then we are not exactly hammering Coherence enough.

The one issue that could have potentially derailed the system was due to CoreDNS, and we discovered it using a soak test. However, the OKE team had thoughtfully documented this potential scenario and we were able to apply the fix. This also shows that it’s not enough to show that your system can meet certain performance targets; different types of testing, even for infrastructure, are important to uncover potential issues.

Stay tuned for Part 3 of this series when we’ll look at scalability and reliability.

I would like to conclude here and thank my colleagues Jonathan Knight, Tim Middleton, Avi Miller, Julian Ortiz and Sherwood Zern for their contributions and ideas to this article.
