Notes on Clustering Elixir Applications


Several years ago I was experimenting with clustering Elixir/Erlang nodes. I remember it didn’t go smoothly, either because of a lack of good libraries and documentation or simply because of my limited experience in the area. Last week I gave it a second chance and was surprised at how simple it is.

The article contains short notes on cluster formation in Docker and Kubernetes with the help of the libcluster library. You’ll find short descriptions of different strategies I’ve tried, K8s and docker-compose files, an example application, and other useful information.

Example application

Here is a simple cluster application. There are two simple apps (app_a and app_b), the cluster_tests app that I used for testing (it can be ignored for now), the docker-compose.yaml file, and a bunch of K8s folders.

Run locally and connect manually.

Run apps from the corresponding directories:

cd apps/app_a
START_SERVER=true iex --name app_a@127.0.0.1 --cookie secret -S mix

cd apps/app_b
START_SERVER=true iex --name app_b@127.0.0.1 --cookie secret -S mix

And connect them with Node.connect(:"app_b@127.0.0.1") from the app_a console.

The nodes are now connected, and it’s possible to do an :rpc.call from one node to another:

:rpc.call(:"app_b@127.0.0.1", AppB, :hello, [])
# :world

The START_SERVER=true env variable tells the apps to start a simple web server. So if you go to localhost:4001, you’ll see:

Hey from AppA :"app_a@127.0.0.1"! Node.list: [:"app_b@127.0.0.1"]

That shows that the app_a node is connected to the app_b one.
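For context, the handler behind that page can be tiny. Here is a sketch using Plug.Router (the module name and route are assumptions, not necessarily the repo’s actual code):

defmodule AppA.Router do
  use Plug.Router

  plug :match
  plug :dispatch

  get "/" do
    # report this node's name and the nodes it is connected to
    send_resp(conn, 200,
      "Hey from AppA #{inspect(Node.self())}! Node.list: #{inspect(Node.list())}")
  end
end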

Run locally with LocalEpmd

Now it’s time for the libcluster library. It does just two simple things: it discovers available nodes and automatically connects them to form a cluster.

Check the config/runtime.exs file. There are several topology configurations; we are starting with LocalEpmd.
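For reference, the LocalEpmd topology needs nothing besides the strategy module. A minimal sketch (the repo’s actual file switches between strategies based on the LIBCLUSTER_STRATEGY env variable):

import Config

config :libcluster,
  topologies: [
    local_epmd: [
      # discover all nodes registered with the local epmd daemon
      strategy: Cluster.Strategy.LocalEpmd
    ]
  ]

The topologies are then handed to Cluster.Supervisor in the application’s supervision tree (the supervisor name here is an assumption):

children = [
  {Cluster.Supervisor,
   [Application.get_env(:libcluster, :topologies), [name: AppA.ClusterSupervisor]]}
]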

After starting the apps:

LIBCLUSTER_STRATEGY=local_epmd iex --name app_a@127.0.0.1 --cookie secret -S mix
LIBCLUSTER_STRATEGY=local_epmd iex --name app_b@127.0.0.1 --cookie secret -S mix

The nodes are automatically connected:

iex(app_a@127.0.0.1)1> Node.list()
[:"app_b@127.0.0.1"]

Since everything works locally, it’s time to scale the apps and put them into containers.

Docker-compose setup

Here I’m using the simplest libcluster strategy, Gossip. It doesn’t require any additional configuration. Details are in the libcluster documentation.
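The corresponding topology entry is as small as it gets (a sketch; Gossip discovers peers via UDP multicast using libcluster’s defaults):

config :libcluster,
  topologies: [
    gossip: [
      # multicast-based discovery; no per-node configuration required
      strategy: Cluster.Strategy.Gossip
    ]
  ]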

Check the docker-compose.yaml file. There are three copies of the app_a app, three copies of the app_b app, and also the livebook app. A few things are worth mentioning.

First, to see each other, all the containers must be on the same network (cluster-network in my example).

Second, check, for example, the app-a1 configuration:

  app-a1:
    build: ./apps/app_a/
    command: elixir --name "app-a1@app-a1.docker" --erl '-kernel inet_dist_listen_min 9000 inet_dist_listen_max 9000' --cookie secret --no-halt -S mix run
    hostname: app-a1.docker
    ports:
      - "9000:9000"
      - "4001:4001"
    environment:
      - LIBCLUSTER_STRATEGY=gossip
      - PORT=4001
      - START_SERVER=true
    networks:
      - cluster-network

I use an explicit hostname for each container (app-a1.docker for app-a1), and this hostname matches the host part of the --name argument.

There is also the --erl '-kernel inet_dist_listen_min 9000 inet_dist_listen_max 9000' argument. It explicitly tells Erlang to use port 9000 for distribution inside the container. We will use that fact later for the “remote shell” connection. Port 9000 is exposed.

Livebook is a great alternative to a remote shell. The configuration is straightforward:

  livebook:
    image: livebook/livebook:edge
    networks:
      - cluster-network
    environment:
      - LIVEBOOK_PASSWORD=securesecret
      - LIVEBOOK_DISTRIBUTION=name
      - LIVEBOOK_PORT=8080
      - LIVEBOOK_NODE=livebook@livebook.docker
      - LIVEBOOK_COOKIE=secret
      - LIVEBOOK_DEFAULT_RUNTIME=attached:livebook@livebook.docker:secret
    hostname: livebook.docker
    ports:
      - "8080:8080"
      - "8081:8081"

I use the “attached” mode, so the notebook is connected directly to the livebook node.

docker-compose up --build

Check http://localhost:4001/ and see all 6 nodes connected.

Hey from AppA :"app-a1@app-a1.docker"!
Node.list: [
:"app-a3@app-a3.docker",
:"app-b1@app-b1.docker",
:"app-b3@app-b3.docker",
:"app-a2@app-a2.docker",
:"app-b2@app-b2.docker"
]

Check Livebook at http://localhost:8080/ (the password is securesecret). From there you can connect to any node in the cluster.

To connect the local node to the cluster, one needs to modify the /etc/hosts file. E.g., to connect to app-a1, make sure the 127.0.0.1 app-a1.docker alias exists in the file.

Then start the local node:

iex --name local@app-a1.docker --cookie secret --erl '-dist_listen false -erl_epmd_port 9000'

The --erl options tell Erlang to assume the remote node listens on port 9000 (-erl_epmd_port 9000) and keep the local node from binding to a distribution port itself (-dist_listen false). Read this post for more details.

Now you can connect to the app-a1 node:

iex(local@app-a1.docker)1> Node.connect(:"app-a1@app-a1.docker")
true
iex(local@app-a1.docker)2> :rpc.call(:"app-a1@app-a1.docker", AppA, :hello, [])
:world

One can also run :observer.start and connect to the app-a1@app-a1.docker node.

And, finally, you can connect the remote shell to the node (with --remsh app-a1@app-a1.docker):

iex --name local@app-a1.docker --cookie secret --erl '-dist_listen false -erl_epmd_port 9000' --remsh app-a1@app-a1.docker

Now it’s time to use Kubernetes.

Kubernetes setup

I used minikube for my experiments. I tried three libcluster strategies: Gossip, Kubernetes, and Kubernetes.DNS. I used both Kubernetes Deployments and Kubernetes StatefulSets (the latter allows specifying an explicit hostname for each node). See the k8s_* directories in the repo.

Gossip strategy with K8s Deployments

The simplest approach is using the Gossip strategy. Check the k8s_gossip_deployment folder. There are two deployments (with corresponding LoadBalancer services to expose the web servers) and the “livebook” pod (and service).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-a
spec:
  replicas: 3
  selector:
    matchLabels:
      component: app-a
  template:
    metadata:
      labels:
        component: app-a
        cluster: beam
    spec:
      containers:
        - name: app-a
          image: app-a
          imagePullPolicy: Never
          ports:
            - containerPort: 4001
            - containerPort: 9000
          env:
            - name: LIBCLUSTER_STRATEGY
              value: "gossip"
            - name: PORT
              value: "4001"
            - name: START_SERVER
              value: "true"
            - name: ERLANG_COOKIE
              value: "secret"
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          command: ["elixir"]
          args:
            [
              "--name", "app-a@$(POD_IP)",
              "--cookie", "$(ERLANG_COOKIE)",
              "--erl", "-kernel inet_dist_listen_min 9000 inet_dist_listen_max 9000",
              "--no-halt",
              "-S", "mix", "run"
            ]

---
apiVersion: v1
kind: Service
metadata:
  name: app-a
spec:
  type: LoadBalancer
  ports:
    - port: 4001
      targetPort: 4001
  selector:
    component: app-a

Nothing special. Note that app-a@$(POD_IP) is used as a node name.

kubectl apply -f k8s_gossip_deployment

Check the app-a service:

minikube service app-a

And see all the nodes connected:

Hey from AppA :"app-a@10.244.7.65"!
Node.list: [
:"app-a@10.244.7.61",
:"app-b@10.244.7.63",
:"app-b@10.244.7.62",
:"app-b@10.244.7.66",
:"app-a@10.244.7.64"
]

Run the Livebook service:

minikube service livebook

And attach to any node from the list above.

The main problem with the “deployment” setup is that it’s not easy to connect to the cluster from your local machine, since the pods don’t have stable hostnames.

So, if having Livebook is not enough and you want to connect to the nodes locally, you can use the StatefulSet setup.

Gossip strategy with K8s StatefulSets

Check the k8s_gossip_statefulset folder.

The only difference is that I use a StatefulSet instead of a Deployment:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-a
spec:
  serviceName: "beam-cluster"
  replicas: 3
  selector:
    matchLabels:
      component: app-a
  template:
    metadata:
      labels:
        component: app-a
        cluster: beam
    spec:
      containers:
        - name: app-a
          image: app-a
          imagePullPolicy: Never
          ports:
            - containerPort: 4001
            - containerPort: 9000
          env:
            - name: LIBCLUSTER_STRATEGY
              value: "gossip"
            - name: PORT
              value: "4001"
            - name: START_SERVER
              value: "true"
            - name: ERLANG_COOKIE
              value: "secret"
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          command: ["elixir"]
          args:
            [
              "--name", "app-a@$(POD_NAME).beam-cluster.default.svc.cluster.local",
              "--cookie", "$(ERLANG_COOKIE)",
              "--erl", "-kernel inet_dist_listen_min 9000 inet_dist_listen_max 9000",
              "--no-halt",
              "-S", "mix", "run"
            ]

Note how the node name is configured: app-a@$(POD_NAME).beam-cluster.default.svc.cluster.local.

The important part of the setup is the headless service:

apiVersion: v1
kind: Service
metadata:
  name: beam-cluster
  labels:
    cluster: beam
spec:
  clusterIP: None
  selector:
    cluster: beam

The service controls the “beam-cluster” DNS domain for the pods matched by the cluster: beam selector.
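A headless service publishes a DNS record per matched pod. From any node inside the cluster you can verify that with Erlang’s built-in resolver (a sketch; the returned IPs are illustrative):

# resolve the headless service to the IPs of all matched pods
:inet_res.lookup(~c"beam-cluster.default.svc.cluster.local", :in, :a)
# => [{10, 244, 7, 61}, {10, 244, 7, 62}, ...]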

kubectl apply -f k8s_gossip_statefulset
minikube service app-a

One can see that each node has a predefined hostname:

Hey from AppA :"app-a@app-a-0.beam-cluster.default.svc.cluster.local"! 
Node.list: [
:"app-b@app-b-0.beam-cluster.default.svc.cluster.local",
:"app-b@app-b-1.beam-cluster.default.svc.cluster.local",
:"app-a@app-a-1.beam-cluster.default.svc.cluster.local",
:"app-b@app-b-2.beam-cluster.default.svc.cluster.local",
:"app-a@app-a-2.beam-cluster.default.svc.cluster.local"
]

After adding proper aliases to /etc/hosts, for example,

127.0.0.1       app-a-0.beam-cluster.default.svc.cluster.local

one can connect to the node from the local host. First, you need to forward port 9000:

kubectl port-forward app-a-0 9000
iex --cookie secret --name local@app-a-0.beam-cluster.default.svc.cluster.local --erl '-dist_listen false -erl_epmd_port 9000'
iex(local@app-a-0.beam-cluster.default.svc.cluster.local)1> Node.connect(:"app-a@app-a-0.beam-cluster.default.svc.cluster.local")
true
iex(local@app-a-0.beam-cluster.default.svc.cluster.local)2> :rpc.call(:"app-a@app-a-0.beam-cluster.default.svc.cluster.local", AppA, :hello, [])
:world

More granular clusters with Kubernetes strategies

The Gossip strategy is very simple and doesn’t require any extra configuration: the libcluster library just finds all the available nodes and connects them. However, sometimes you might need more granular control over what should be inside the cluster and what should not. Or, for example, one might need to run a couple of separate clusters inside the same Kubernetes cluster.

The Kubernetes and Kubernetes.DNS strategies allow more detailed configuration. I’ll describe both briefly.

Kubernetes strategy with K8s StatefulSets

If you check the config/runtime.exs file, you’ll see:

config :libcluster,
  topologies: [
    k8s: [
      strategy: Cluster.Strategy.Kubernetes,
      config: [
        mode: :hostname,
        kubernetes_ip_lookup_mode: :pods,
        kubernetes_service_name: "beam-cluster",
        kubernetes_node_basename: "cluster-app",
        kubernetes_selector: "cluster=beam",
        kubernetes_namespace: "default"
      ]
    ]
  ]

So one must explicitly specify kubernetes_service_name, kubernetes_node_basename, and kubernetes_selector. There are also a couple of naming conventions that must be followed (see the k8s_statefulset folder):

  • There must be a headless service named after kubernetes_service_name. See the service definition.
  • The service must use the cluster: beam selector.
  • All the nodes’ base names must be the same as kubernetes_node_basename, e.g. cluster-app@$(POD_NAME).beam-cluster.default.svc.cluster.local.

Also, the strategy requires access to the Kubernetes API to list the pods under the service. See the roles.yaml file.

Kubernetes.DNS strategy with K8s Deployments

It is a simplified version of the “Kubernetes” strategy mentioned above.

In config/runtime.exs:

config :libcluster,
  topologies: [
    k8s_dns: [
      strategy: Cluster.Strategy.Kubernetes.DNS,
      config: [
        service: "beam-cluster",
        application_name: "cluster-app"
      ]
    ]
  ]

So, again, one must have a headless “beam-cluster” service and the same “cluster-app” basename for all the Elixir nodes. Note that, unlike the hostname-based setup above, Kubernetes.DNS resolves the service’s DNS records to pod IPs, so the node names use the pod IP as the host part (e.g. cluster-app@$(POD_IP)).

See the k8s_dns_deployment folder.

Explicitly register nodes with the NodeRegistry library

With your Elixir nodes connected, you can do :rpc calls using the nodes’ names. It might be useful to separate the “low-level” node names used in the Kubernetes configuration from the “logical” names that can be set explicitly in the applications themselves.

I’ve created the simple NodeRegistry library, which uses Erlang’s :global module for registering nodes explicitly. It is available on hex.pm.

The usage is simple. One adds {NodeRegistry, :my_service_name} to the supervisor’s children list. When the application starts, it explicitly registers the name in the registry. The :global module takes care of removing the registered process from the registry when the node is disconnected.
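A minimal sketch of wiring it into an application’s supervision tree (the module and registry names are assumptions, not the library’s actual examples):

defmodule AppA.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # registers this node in :global under the given logical name
      {NodeRegistry, "app_a_1"}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: AppA.Supervisor)
  end
end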

In the example application mentioned above one can find:

{NodeRegistry, "app_a_#{postfix()}"} # in app_a
{NodeRegistry, "app_b_#{postfix()}"} # in app_b

So all the nodes that run the app_a (or app_b) code will be registered with the same prefix.

Then one can get all the app_a nodes with NodeRegistry.nodes_with_prefix("app_a"), or any random app_a node with NodeRegistry.random_node_with_prefix("app_a").
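Combined with :rpc, this allows calling into a service by its logical name (a sketch, assuming random_node_with_prefix/1 returns a node name):

node = NodeRegistry.random_node_with_prefix("app_a")
:rpc.call(node, AppA, :hello, [])
# :world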

Thanks for reading! I hope these notes will be useful not only for me.
