Clustering Akka in Kubernetes with Statefulset and Deployment

I spoke at JFokus 2017 and continued to see a lot of interests in Akka. I wanted to learn more about it too. During JFokus speakers unconference, I had a chance to meet Johan Andrén from the Akka team, and Ola Petersson who’s using Akka and Lagom. I took the opportunity to learn Akka from both of them.

Aside from the basics, I was very interested in clustering multiple Akka nodes and seeing how that will work in Kubernetes. Johan, Ola, and I sat down after dinner one evening, and started to work on that. In the end, I was able to setup an Akka cluster in Kubernetes. You can find the sample code and configuration on GitHub: https://github.com/saturnism/akka-kubernetes-example

This article captures what I learned.

Akka Clustering Basics

Akka cluster nodes need to register themselves with the seed nodes in order to discover other nodes. Seed nodes need to be a stable ordered set of nodes (at least 1, but 2 or more for redundancy). Each new Akka node will attempt to register with the first seed node in the list. If the seed node order changes frequently, then there is an off chance of creating a split-brained cluster. Once a stable set of seed nodes are clustered, and as more Akka nodes joined the cluster, any additional Akka nodes can join via any of the Akka nodes programmatically, using Cluster(System).join().

Secondly, Akka node addresses (e.g., akka.tcp://cluster@seed-node-1:2551) must use an externally addressable host name/IP. For example, if an Akka node has an IP address of 10.100.0.23, and a host name of akka-23, and it registers itself as akka.tcp://cluster@10.100.0.23:2551, then you won’t be able to address it as akka.tcp://cluster@akka-23:2551. Likewise, if the Akka node registers itself as akka.tcp://cluster@akka-23:2551, then you cannot address it as akka.tcp://cluster@10.100.0.23:2551 either.

This creates some difficulties when you deploy in a container system and expect to expose Akka on the host machine because the container IP is different from the host IP. In this case, you’ll need to use hostname and port configuration to specify the combination addressable from other nodes. Then, use bind-hostname and bind-port to specify the local/internal hostname and ports. For example:

akka {
actor {
provider = "cluster"
}
remote {
netty.tcp {
hostname = 192.168.0.1 # machine IP
port = 32551 # machine port
bind-hostname = 10.100.0.23 # container IP
bind-port = 2551 # container port
}
}
}

Fortunately it’s much simpler with Kubernetes’ network architecture. Each Akka node will have its own IP address that’s reachable by any other Akka nodes even when Akka node containers are running on different machines. In Kubernetes, you won’t need to deal wtih host/container IP/port mapping.

How to Create an Akka Cluster in Kubernetes?

I initially thought I could take the same approach I used to cluster Infinispan (JGroup) and Hazelcast. But given the above constraints, creating a scalable Akka cluster in Kubernetes can be a bit more challenging. I’ll examine a few options and my solution.

Headless Service and DNS Discovery

There are a couple of Akka clustering examples in Kubernetes using Headless Service:

These examples use headless services to create a DNS name entry, e.g., akka-peers. All of the Akka nodes will be able to use this DNS name entry (or the Kubernetes API) to get a list of IP addresses of all of the other Akka nodes in the cluster. All of the discovered IP addresses can be used as seed nodes. Got to watch out though. Kubernetes headless service will by default return round-robin DNS entry. Whenever you refer to the akka-peers DNS name, you will receive the same set of IP addresses in different orders:

root@akka-peer-ki9f:/# dig +search akka-peer
...
;; ANSWER SECTION:
akka-peer.default.svc.cluster.local. 24 IN A 10.44.3.16
akka-peer.default.svc.cluster.local. 24 IN A 10.44.2.7
akka-peer.default.svc.cluster.local. 24 IN A 10.44.0.13
akka-peer.default.svc.cluster.local. 24 IN A 10.44.3.11
...
root@akka-peer-ki9f:/# dig +search akka-peer
...
;; ANSWER SECTION:
akka-peer.default.svc.cluster.local. 26 IN A 10.44.2.7
akka-peer.default.svc.cluster.local. 26 IN A 10.44.0.13
akka-peer.default.svc.cluster.local. 26 IN A 10.44.3.11
akka-peer.default.svc.cluster.local. 26 IN A 10.44.2.8
akka-peer.default.svc.cluster.local. 26 IN A 10.44.1.18

This is might not work for determining Akka seed nodes. Furthermore, if there are 1,000 Akka nodes in an Akka cluster, the initial seed node list may be composed of all 1,000 entries when you really only need a few of them.

Klusterd works by first sorting the list of IPs, and then taking only the first 5 (see Klusterd’s startup script). But, as the IP addresses come and go, the sorted set may slowly change over time. This is potentially fine as long as the original bootstrapped instances are still there.

Statefulset

Given the need for a stable and ordered list of seed nodes, my initial thought was to deploy an entire Akka cluster with Statefulset. This will allow each Akka node to have a sequential and stable DNS name. For example, the first Akka node will have a stable DNS name of akka-0, and the second Akka node will be akka-1, and so on. Then, I can always use the first two or three nodes as seed nodes. I can also scale out by increasing the number of replicas.

However, Statefulsets are scaled sequentially rather than in parallel. When I scale a Statefulset from one Akka node to five Akka nodes, rather than starting four more Akka nodes in parallel, it’ll start akka-1, wait for it to start successfully, then start akka-2, wait, start akka-3, … and repeat until it’s done. This is not ideal as it may take a while to start hundreds of Akka nodes.

Hybrid Solution

My final thought was to use both Kubernetes Statefulset and Deployment. I can use Statefulset for the seed nodes, and Deployment for scaling out worker nodes. This way, I can maintain stable and ordered DNS names for the seed nodes, while I retain the ability to start multiple Akka nodes in parallel when I scale with Deployment.

The full example is on my GitHub:

To use Statefulset for the seed nodes, I first create a headless service. The resource definition looks like a regular Service, but with clusterIP set to None:

apiVersion: v1
kind: Service
metadata:
name: akka-seed
spec:
ports:
- port: 2551
protocol: TCP
targetPort: 2551
selector:
run: akka-seed
clusterIP: None

Every Akka seed node instance in the Statefulset can be addressed via DNS as ${SEED_NODE_NAME}.akka-seed.

Then, I create a Statefulset of seed nodes:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
labels:
run: akka-seed
name: akka-seed
spec:
serviceName: akka-seed
replicas: 2
selector:
matchLabels:
run: akka-seed
template:
metadata:
labels:
run: akka-seed
spec:
containers:
- name: akka-seed
image: saturnism/akka-cluster-example
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: SEED_NODES
value: akka-seed-0.akka-seed,akka-seed-1.akka-seed
command: ["/bin/sh", "-c", "HOST_NAME=${POD_NAME}.akka-seed java -jar /app/app.jar"]
livenessProbe:
tcpSocket:
port: 2551
ports:
- containerPort: 2551
protocol: TCP

First of all, every seed node will have a stable name (e.g., akka-seed-0, akka-seed-1, etc). This name is known as the Kubernetes pod name. I used the Kubernetes Downward API to expose the Pod Name as an environment variable, and then use it to construct the addressable DNS name, such as akka-seed-0.akka-seed.

For SEED_NODE, I can simply give it the stable DNS names of each of the seed nodes.

To deploy the Akka seed nodes:

$ kubectl apply -f akka-seeds.yaml

And see theAkka seeds that were created sequentially:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
akka-seed-0 1/1 Running 0 15h
akka-seed-1 1/1 Running 0 15h

Next, I can deploy the worker nodes using Deployment. It also uses the Downward API to assign the Pod IP address as the Akka node’s hostname.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
...
spec:
...
template:
...
spec:
containers:
- name: akka-worker
image: saturnism/akka-cluster-example
env:
- name: SEED_NODES
value: akka-seed-0.akka-seed,akka-seed-1.akka-seed
- name: HOST_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
image: saturnism/akka-cluster-example
...

Once deployed, you can validate that all nodes are up and running:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
akka-seed-0 1/1 Running 0 8s
akka-seed-1 1/1 Running 0 6s
akka-worker-2263404214-8c266 1/1 Running 0 8s
akka-worker-2263404214-9ws3k 1/1 Running 0 8s
akka-worker-2263404214-f2tp3 1/1 Running 0 8s
akka-worker-2263404214-lkvz3 1/1 Running 0 8s

You can validate that the nodes have joined the cluster by inspecting the logs of any of the Akka nodes. For example, to tail the first seed node’s log:

$ kubect logs -f akka-seed-0
[INFO] [02/12/2017 15:36:53.568] [main] [akka.remote.Remoting] Starting remoting
[INFO] [02/12/2017 15:36:53.707] [main] [akka.remote.Remoting] Remoting started; listening on addresses :[akka.tcp://ClusterSystem@akka-seed-0.akka-seed:2551]
...
[INFO] [02/12/2017 15:37:05.101] [ClusterSystem-akka.actor.default-dispatcher-16] [akka.cluster.Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://ClusterSystem@akka-seed-0.akka-seed:2551] - Node [akka.tcp://ClusterSystem@akka-seed-1.akka-seed:2551] is JOINING, roles []
[INFO] [02/12/2017 15:37:06.854] [ClusterSystem-akka.actor.default-dispatcher-16] [akka://ClusterSystem/user/$a] Member is Up: Member(address = akka.tcp://ClusterSystem@10.44.2.10:2551, status = Up)
...

To un-deploy everything, simply delete the Services, StatefulSet, and Deployment. Or, in one shot:

$ kubectl delete -f kubernetes/

Give It a try!

You can find the code and configuration on GitHub. You can try this in any Kubernetes 1.5 cluster. One of easiest ways to start a multi-node cluster is on Google Container Engine, but starting from one node with Minikube is great for local development.

Special thanks to Johan and Ola for helping to learn Akka and reviewing this post.