How-to deploy a Highly Available JBoss cluster on Kubernetes with dynamic node discovery — part 1

There are payload which are really easy to scale up on Kubernetes, like those very simple stateless web servers which only connect to replicated databases or external services and add-up a little processing over it.

At times, you get some more difficult payloads to deploy and today, we are going to explore how to deploy a HA JBoss cluster (multiple nodes for high availability) on Kubernetes. We will configure our deployment so it benefits from both the JBoss clustering functionalities (shared state, EJB distribution accross nodes…) and Kubernetes advantages (cluster elasticity, auto-healing, monitoring and so on).

As a result we will get a fully elastic HA JBoss cluster on Kubernetes. This is perfect for lifting your Java EE application on the cloud (whether it be private or public, and even hybrid !).

Note : this article focuses on deploying a JBoss cluster, the same goes for deploying a Keycloak cluster (which was the original target of my work).

How does a JBoss cluster work ?

A cluster is a group of computers (nodes) serving the same application. The software running on those nodes needs to coordinate the payloads and data state. To acheive scalability benefit, the application should run faster or support more users by just adding new nodes in the cluster. That’s where difficulty arises : as more nodes are added to the cluster, coordination between nodes requires more and more operations, thus lowering the overall benefit of adding new servers. Usually one can expect between a linear and an asymptotical performance progression as a function of the number of nodes in the cluster.

Add to that that coordinating state in a cluster is very difficult, it is no surprise that more and more developers go to use managed services (which are developped to handle the state sharing and coordination problems) and implement their applications as a thin stateless layer over. This allows them to completely abstract from coordination and state sharing.

JBoss clusters are kind of intermediate in the sense that they provide high level distributed services out-of-the-box (sessions shared accross the cluster, distributed db cache, EJB distribution, HA singleton, distributed transactions, etc).

JBoss Application Server can work in three modes (regarding the clustering). Those are :

  • Standalone mode : the JBoss AS works as the only node serving the application. No clustering at all in this mode.
  • Standalone HA mode : AS nodes are connected and share the application workload. Each node has got its own configuration, which should be consistent with other nodes of the cluster.
  • Domain mode : AS nodes are connected and a central instance manages configuration for all other nodes which are then slaves.

You can read detailed information on how a JBoss cluster works in the (offical documentation)[https://docs.jboss.org/author/display/WFLY10/High+Availability+Guide].

Simply speaking, JBoss AS provides high level cluster functionalities by relying on the (Infinispan)[http://infinispan.org/] distributed in-memory key-value cache.

In turn, Infinispan (relies)[https://docs.jboss.org/jbossclustering/cluster_guide/5.1/html/jgroups.chapt.html] on (JGroups)[http://www.jgroups.org/] for node discovery and (message transport)[https://docs.jboss.org/jbossclustering/cluster_guide/5.1/html/clustering-blocks.chapt.html#clustering-blocks-jgroups] in theh cluster.

JGroups needs to :

  • discover nodes : meaning it should known at any time what are the up and running nodes in the cluster.
  • route and transport messages : meaning it has to choose an underlying communication protocol.

JGroups is very configurable and uses a default protocol stack, which can be customized according to specific requirements (that’s what we are going to do !).

By default, the JGroups protocol stack is composed of :

  • multicast UDP for message transport. When the network supports this protocol, messages are broadcasted on all nodes at the same time, speeding up communication.
  • multicast PING for node discovery. The default configuration of JGroups assumes that nodes can broadcast ping requests for discovering running nodes.
  • other protocols used to reconciliate splitted clusters (due to network partitions) and other stuff like that. We won’t go into much details on this stack part because it is the first two which need configuration when running a JBoss AS cluster on Kubernetes on GCP.

So, to sum up what we said up to now :

  • JBoss AS has a clustering functionality which allows you to scale your application by adding new nodes in your cluster,
  • It uses Infinispan to provide cluster functionalities, which in turn uses JGroups to establish cluster membership and message transport,
  • The standard stack uses multicast UDP for transport and node discovery,
  • The problem is that this kind of networking protocol (multicast UDP) is not always available on public cloud platform (as an example it is not available on GCP).
  • This forces us to customize the JGroups network stack if we want a JBoss AS cluster to work on a K8S GCP platform.

The JBoss AS cluster is able to provide a frontal load-balancer to divide the application load evenly accross nodes. We will disable this functionality as it is natively provided by Kubernetes in the form of Services.

A first solution

In this first approach, we will use (JDBC_PING)[https://developer.jboss.org/wiki/JDBCPING] for nodes discovery, and use TCP instead of multicast UDP for message transport between the cluster nodes.

This is a very simple solution, not the optimal one but easy to start with and it works ! We will in a next article explore two other (and better) ways to do that.

Pros : JDBC_PING is included by default in all JBoss distribution (so configuration is really easy).

Cons : you need to deploy a SQL database (if you already have a JEE datasource, you can reuse it).

The idea is that instead of using multicast UDP for node discovery and membership (that is, answering to the question : “what are the IPs of the nodes we want in the cluster ?”), JDBC_PING will use a SQL table in which each node will publish its IP and fetch the other nodes’ IPs.

Deploying a SQL server

Any SQL server with a JDBC connection can do. In this example, we will use a Postgres one, but this would work with MySQL, MariaDB, SQL Server, …

Deploy a SQL (we take a Postgres one in this example) server which should be accessible from your cluster’s nodes, be it a managed service or not.

Create a user on this server for the JBoss AS nodes running on k8s to publish and access the nodes information table.

JBoss AS configuration

The file specifying the cluster configuration is standalone-ha.xml, in your JBoss server.

Let’s start with the standalone-ha.xml file that you find out-of-the box. Here are the changes that need to be made :

Ensure that infinispan and jgroups extensions are loaded

Be sure to have those lines at the beginning :

https://gist.github.com/59b68f204b553fe62f9f47710da40d97

And be sure that this line is commented (or removed), because we don’t use modcluster (which uses Apache web server for load balancing)

https://gist.github.com/0f8496b3ab73506b7a9d23b665d97e30

Also if you have a <subsystem xmlns="urn:jboss:domain:modcluster:3.0"> node in your configuration, you can simply delete it.

Enable log

If you feel like seeing a lot of logs, you can see everything that jgroups, infinispan and clustering functionalities are doing :

https://gist.github.com/6407b5a4727d89f95bfe07c267779e14

Add a JDBC Datasource to access the SQL database

In my example, we use a Postgres database, so here is the datasource configuration :

https://gist.github.com/facdb18d745c95a31d7b56bef28f9ee9

As you can see, there is really nothing to it : the connection url, the used driver and credentials for connecting to the database.

As you probably guessed, jbossclustering is the name of the database that will be used for sharing node informations.

Configure the JGroups subsytem

https://gist.github.com/3f15eb8f9e48f0b8bf3a3b63987e0893

The HOST_IP environment variable contains the IP of the pod on which the JBoss AS instance is deployed. It is filled by kuberenetes itself when we configure the deployment like this :

https://gist.github.com/43fd46a76fce375ab1747299d1a3817b

Then you need to configure the jgroups-binding socket for nodes to communicate (you may have noticed that we refered to it in the JGroups configuration). I will here use the public network interface because we are in a kubernetes cluster and our nodes are protected by the Kubernetes infrastructure anyway.

https://gist.github.com/2711818b33d2422ba6bd480c8d0ada19

Deploy yous JBoss AS instances

Then you just have to pack JBoss AS, its configuration file (standalone-ha.xml) and your application binaries in a Docker image and deploy it with kubernetes (don't forget to launch JBoss in standalone HA mode by using standalone-ha.xml instead of standalone.xml).

Here are the main parts of my deployment :

https://gist.github.com/bd679c586382d619d9e9586d63072be9

Now you have an elastic HA JBoss working cluster !

You can scale it easily with for example :

kubectl scale deployment my-jboss-application --replicas 5

Nodes will be added to your deployment, they will then fill the SQL table and find their friends on the network, effectively discovering up and running nodes (don’t forget to configure health and liveness probes, it’s really really important).

You can even configure (pod)[https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/] and (cluster autoscaler)[https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler] to deploy new JBoss AS instances when traffic is higher and destroy some of them when they are not required anymore. That’s really enjoyable and something you don’t find out-of-the-box in the JBoss distribution !

It is a win-win between JBoss and K8s !

Conclusion and follow up

We have configured JGroups, the underlying communication library used by the clustering functionalities of JBoss AS :

  • use of JDBC PING for node discovery (uses a SQL table to maintain cluster nodes information).
  • use TCP instead of multicast UDP (because multicast UDP is unavailable on GCP)
  • disable LoadBalancer (we will access the JBoss cluster through the K8S service, so we really don’t need it)

This wasn’t really complicated, just so many software stacks to cross ! For this to work I had to read a lot of docs and source code, so I offer it to you in the expectation that it will be easier for you than it was for me. Please if you encounter difficulties, send me a comment, so that I can improve the article ;)

In the next part of this series, we will introduce two other solutions :

  • using the K8s API to get the cluster nodes (by using KUBE_PING)
  • using an ad-hoc internal service (or etcd ?) to implement the node discovery service

Your comments (positive, negative and neutral) are very much welcome !

Arnaud Tournier, at your service.


Originally published at gist.github.com.