Distributed Cache with Akka cluster-sharding and Akka HTTP on Kubernetes

Rohit Sharma
Congruence Labs
Published in
3 min readFeb 25, 2018

This article captures the implementation of an application serving data over HTTP which is stored in cluster-sharded actors and deployed on Kubernetes.

Use case:

An application, serving data over HTTP and with a high request rate and the latency of order of 10ms with limited database IOPS available.

My initial idea was to cache it in memory, which worked pretty fine for some time. But, this meant larger instances due to duplication of cached data in the instances behind the load balancer. As an alternative I wanted to use Kubernetes for this problem and do a PoC of a distributed cache with Akka cluster-sharding and Akka-HTTP on Kubernetes.

This article is by no means a complete tutorial to Akka cluster-sharding or Kubernetes. It is my knowledge sharing which I gained while doing this PoC.

The code for this PoC can be found at here.

Let’s dig in the details of this implementation.

To form an Akka Cluster, there needs to a pre-defined ordered set of contact points often called as seed nodes. Each Akka node will try to register itself with the first node from the list of seed nodes. Once, all the seed nodes have joined the cluster, any new node can join the cluster programmatically.

The ordered part is important here, because if the first seed node changes frequently then the chances of split-brain increases. More info about Akka Clustering can be found here.

So, the challenge here with Kubernetes was the ordered set of predefined nodes and here comes StatefulSet(s) and Headless Services to the rescue.

StatefulSet guarantees stable and ordered pod creation which satisfies the requirement of our seed nodes and Headless service is responsible for their deterministic discovery in the network. So, the first node will be “<application>-0” and second will be “<application>-1” and so on.

  • <application> is replaced by the actual name of the application

The DNS for the seed nodes will be of the form:

<application-name>-<ordinal>.<service-name>.<namespace>.svc.cluster.local

Steps:

  1. Start with creating the Kubernetes resources. First, the Headless Service, which is responsible for deterministic discovery of seed nodes(Pods), can be created using below manifest
kind: Service
apiVersion: v1
metadata:
name:
distributed-cache
labels:
app:
distributed-cache
spec:
clusterIP:
None
selector:
app:
distributed-cache
ports:
- port: 2551
targetPort: 2551
protocol: TCP

Note, that the “clusterIP” is set to None. Which indicates it’s a headless service.

2. Create a StatefulSet. Which is a manifest for ordered pod creation

apiVersion: "apps/v1beta2"
kind:
StatefulSet
metadata:
name:
distributed-cache
spec:
selector:
matchLabels:
app:
distributed-cache
serviceName: distributed-cache
replicas: 3
template:
metadata:
labels:
app:
distributed-cache
spec:
containers:
- name: distributed-cache
image: "localhost:5000/distributed-cache-on-k8s-poc:1.0"
env:
- name: AKKA_ACTOR_SYSTEM_NAME
value: "distributed-cache-system"
- name: AKKA_REMOTING_BIND_PORT
value: "2551"
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath:
metadata.name
- name: AKKA_REMOTING_BIND_DOMAIN
value: "distributed-cache.default.svc.cluster.local"
- name: AKKA_SEED_NODES
value: "distributed-cache-0.distributed-cache.default.svc.cluster.local:2551,distributed-cache-1.distributed-cache.default.svc.cluster.local:2551,distributed-cache-2.distributed-cache.default.svc.cluster.local:2551"
ports:
- containerPort: 2551
readinessProbe:
httpGet:
port:
9000
path: /health

3. Create a service. Which will be responsible for redirecting outside internet traffic to pods

apiVersion: v1
kind: Service
metadata:
labels:
app:
distributed-cache
name: distributed-cache-service
spec:
selector:
app:
distributed-cache
type: ClusterIP
ports:
- port: 80
protocol: TCP
# this needs to match your container port
targetPort: 9000

4. Create an Ingress. Which is responsible for defining a set of rules to route traffic from outside internet to services

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name:
distributed-cache-ingress
spec:
rules:
# DNS name your application should be exposed on
- host: "distributed-cache.com"
http:
paths:
- backend:
serviceName:
distributed-cache-service
servicePort: 80

Give it a try (all steps to run in locally are mentioned in the Readme)

And the distributed cache is ready to use:

--

--

Rohit Sharma
Congruence Labs

Scala | Kubernetes | AWS | Distributed Systems | Microservices | Cassandra | Cryptocurrency