Auto Scaling Microservice and Its Custom Cluster Scheduler Singleton

Problem: When microservices are designed to auto-scale, the same application runs on multiple nodes inside a cluster. Ensuring that a business component runs on only one of these nodes, within or across multiple clusters, is the problem to solve.

Multiple Clusters of the Same Application — A Reactive Microservice Handling 50/50 Traffic (ALBs omitted)

Ground Breakers: Kafka/Reactive Kafka, Spring Boot REST/Jersey REST

Solution: Publisher and Consumer

Let's talk about the publisher first: what does it publish, and where?

A simple custom message that is published to a single partition of a Kafka topic. Let's call this message a heartbeat.

Create an object that carries a message with the following fields:

  1. Node Identifier [Machine/HostName, Application Name, Node UUID, Application Version]
  2. Current Scheduler Status [Running, Not Running]
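As a sketch (the class and field names here are my own, not from any published library), the heartbeat message could look like:

```java
import java.util.UUID;

// A heartbeat message carrying the node identifier and scheduler status.
// Names are illustrative; adapt them to your own serialization format.
class Heartbeat {
    enum SchedulerStatus { RUNNING, NOT_RUNNING }

    final String hostName;           // Machine/HostName
    final String applicationName;    // Application Name
    final UUID nodeId;               // Node UUID
    final String appVersion;         // Application Version
    final SchedulerStatus status;    // Current Scheduler Status

    Heartbeat(String hostName, String applicationName,
              UUID nodeId, String appVersion, SchedulerStatus status) {
        this.hostName = hostName;
        this.applicationName = applicationName;
        this.nodeId = nodeId;
        this.appVersion = appVersion;
        this.status = status;
    }
}
```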

The key point: a Kafka topic has partitions, and we always publish the message to a single partition. This lets us use Kafka's built-in ordering mechanism, the offset (a monotonically increasing number: 0, 1, 2, etc.), even while multiple nodes race to publish.

A scheduler inside every node of the same application, running at a configurable interval, now starts publishing heartbeats.
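One possible sketch of that per-node scheduler (class and method names are my own): it is built on a plain ScheduledExecutorService, and the sink callback stands in for the real producer call that sends to the one fixed partition of the heartbeat topic.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Publishes this node's heartbeat at a fixed, configurable interval on
// its own thread. The sink stands in for the real producer call
// (e.g. a Kafka client sending to the single fixed partition).
class HeartbeatPublisher {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // One tick of the scheduler: publish a single heartbeat.
    void publishOnce(Consumer<String> sink, String nodeId) {
        sink.accept(nodeId);
    }

    void start(Consumer<String> sink, String nodeId, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> publishOnce(sink, nodeId),
                0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

In a Spring Boot application the same tick could instead be a method annotated with @Scheduled, as the article notes later.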

Almost there: with the above mechanism in place, we are left with only one piece, the consumer of these heartbeats. YaY!

Key note: the consumer consumes the heartbeats. So what? Along with each heartbeat, every consumer also knows the offset of that heartbeat inside the partition, including the offset of its own. This offset becomes the key later.
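A sketch of that bookkeeping (all names assumed): on every consumed record, remember the partition offset at which each node's latest heartbeat arrived.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Records, per node, the partition offset of its latest heartbeat.
// ConcurrentHashMap stands in here for the expirable map introduced below.
class HeartbeatTracker {
    private final Map<String, Long> latestOffsetByNode = new ConcurrentHashMap<>();

    // Called from the consumer with the record's key (node id) and offset.
    void onHeartbeat(String nodeId, long offset) {
        latestOffsetByNode.put(nodeId, offset);
    }

    Map<String, Long> snapshot() {
        return Map.copyOf(latestOffsetByNode);
    }
}
```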

So let's talk about consumer internals now:

The consumer must maintain a thread-safe, expirable map to hold the incoming heartbeats. For that you can use PassiveExpiringMap from Apache Commons Collections 4.

From this store of heartbeats, whose entries keep expiring from the map at a configurable interval, the scheduler running inside the consumer finds the originator with the lowest offset. If that originator is itself, that's it: that node runs the show.
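The election step can be sketched in a few lines, assuming the heartbeats currently alive in the map are keyed by node id with the offset at which each arrived (names are mine, not the patented library's):

```java
import java.util.Map;

// Leader election: among the heartbeats still alive in the expirable
// map (node id -> offset), the originator of the lowest offset wins.
class LeaderElection {
    // Returns true when selfNodeId published the lowest-offset heartbeat.
    static boolean amILeader(Map<String, Long> aliveHeartbeats, String selfNodeId) {
        return aliveHeartbeats.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(e -> e.getKey().equals(selfNodeId))
                .orElse(false);
    }
}
```

Because every node runs the same deterministic rule over the same partition-ordered data, they all independently agree on the same leader.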

What we have built is a self-intelligence mechanism that lets each node take the decision independently. That intelligence comes from the expirable data model each node builds as it consumes the heartbeats.

Ting, Tings:

  1. Heartbeats can be overwritten when the expiry time is much higher than the publish interval. That might be okay, but don't forget to test with your configured numbers.
  2. You need to make the expirable map operations thread safe, because by default they are not. (In Java, wrap the map with Collections.synchronizedMap.)
  3. Depending on the requirement, decreasing the heartbeat frequency reduces the performance impact on the application. Good!
  4. There is no direct performance impact on the application, as the scheduler runs on its own thread, e.g. @Scheduled(fixedRate = 5000).
  5. The design and implementation are patented. If you want to use the coded library, obtain a license.
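Points 1 and 2 can be made concrete with a tiny stand-in (all names here are illustrative; with Commons Collections 4 you would instead wrap the real class, roughly `Collections.synchronizedMap(new PassiveExpiringMap<>(ttlMillis))`):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for a thread-safe expirable heartbeat map.
// Entries older than ttlMillis are treated as expired on read.
class ExpiringHeartbeatMap {
    private final long ttlMillis;
    private final Map<String, Long> insertedAt = new HashMap<>();
    private final Map<String, Long> offsets = new HashMap<>();

    ExpiringHeartbeatMap(long ttlMillis) { this.ttlMillis = ttlMillis; }

    // synchronized because, as noted above, map operations are not
    // thread safe by default.
    synchronized void put(String nodeId, long offset, long nowMillis) {
        insertedAt.put(nodeId, nowMillis);
        offsets.put(nodeId, offset);
    }

    // Returns only the entries that have not expired as of nowMillis.
    synchronized Map<String, Long> alive(long nowMillis) {
        Map<String, Long> result = new HashMap<>();
        offsets.forEach((node, offset) -> {
            if (nowMillis - insertedAt.get(node) < ttlMillis) {
                result.put(node, offset);
            }
        });
        return result;
    }
}
```

Passing the clock in explicitly makes the expiry-versus-publish-interval interaction easy to unit test with your own configured numbers.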

Thanks to my partner David Gilliam for capturing the content and keeping it simple.

Diagrammatic representation of the above text:

A basic flow diagram explaining how it works: http://allibilli.com/
