Dynamic node discovery using Consul

Having many teams working across different environments is a challenge when it comes to networking and node discovery. At VRT Digital Products, we use Consul as our main service networking tool to manage this complexity.

Bogdan Katishev
VRT Digital Products
3 min read · Jul 20, 2022

--

Intro

We have four large environments where most of our development activity takes place. New nodes are constantly being brought up while old nodes are being taken down.

Some of our tech stacks also depend on each other. For example, “Node A” can only be provisioned once “Node B” is fully functional. The same goes for our deployments: some services cannot be deployed unless specific nodes are present and fully functional in the current environment.

Problem

As mentioned in the intro, there is a lot of movement across these environments, so we need to know when a node enters or leaves an environment. Security also has to be kept in mind: a random node must not be able to join our environment.

We also want to keep an inventory of how many nodes are present, how many are healthy, how many are unhealthy, and so on.

The same goes for our services: we want to keep track of new services joining the environment, whether they are healthy or unhealthy, and so on.

It can be pretty overwhelming to manually keep track of all of this.

Solution

After a lot of research, we concluded that Consul was the right tool for us. We use Consul to:

  • discover new nodes/services
  • access/monitor services
  • secure networking
  • automate networking

Discovering new nodes in Consul is pretty simple: once they have joined a datacenter, they are listed in the Consul members list. We can view this list with the consul members command in the CLI, which also shows the state of each node (alive/left/failed).
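The same member list is also exposed over the agent's HTTP API, which makes it easy to script. The snippet below is a minimal sketch, not our production tooling: it assumes a local agent reachable at http://127.0.0.1:8500 without ACL tokens, and the function names and sample payload are made up for illustration.

```python
import json
import urllib.request
from collections import Counter

# Serf member status codes as found in the "Status" field of each member.
SERF_STATUS = {0: "none", 1: "alive", 2: "leaving", 3: "left", 4: "failed"}


def fetch_members(base_url="http://127.0.0.1:8500"):
    """Return the member list known to the agent (address is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/v1/agent/members", timeout=5) as resp:
        return json.load(resp)


def summarize_members(members):
    """Count members per state, mapping unknown codes to 'unknown'."""
    return Counter(SERF_STATUS.get(m["Status"], "unknown") for m in members)


# Hand-written sample shaped like the API response (hypothetical nodes).
sample = [
    {"Name": "node-a", "Addr": "10.0.0.1", "Status": 1},
    {"Name": "node-b", "Addr": "10.0.0.2", "Status": 1},
    {"Name": "node-c", "Addr": "10.0.0.3", "Status": 4},
]
print(dict(summarize_members(sample)))  # {'alive': 2, 'failed': 1}
```

Against a running agent, summarize_members(fetch_members()) would give the same kind of summary for the real datacenter.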

Accessing and monitoring services in Consul is also easy. We only need to write a check definition that the Consul agent registers. To learn more, see the Consul documentation page on service checks: https://www.consul.io/docs/discovery/checks
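To give an idea of what such a definition looks like, here is a minimal sketch that builds a service definition with one HTTP check and sends it to the local agent. The service name "web", the port, the /health path, and the agent address are all assumptions for illustration; with ACLs enabled, the request would also need a token.

```python
import json
import urllib.request


def build_service_definition(name, port, health_path="/health", interval="10s"):
    """Build a service registration payload with a single HTTP check."""
    return {
        "Name": name,
        "Port": port,
        "Check": {
            "Name": f"{name} HTTP health",
            # The agent polls this URL every `interval`; 2xx means passing.
            "HTTP": f"http://127.0.0.1:{port}{health_path}",
            "Interval": interval,
            "Timeout": "2s",
        },
    }


def register_service(definition, base_url="http://127.0.0.1:8500"):
    """PUT the definition to the agent's service registration endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/agent/service/register",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status == 200


# Hypothetical service used only to show the payload shape.
definition = build_service_definition("web", 8080)
print(json.dumps(definition, indent=2))
```

Calling register_service(definition) against a running agent would register the service together with its check.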

To secure our networking, we use Transport Layer Security (TLS) to encrypt the communication between the Consul nodes.
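As a rough illustration, the sketch below generates a JSON agent configuration fragment that enables TLS verification. It is not our actual configuration: it assumes the tls stanza introduced in Consul 1.12 (older versions use top-level keys instead), and the certificate paths are placeholders.

```python
import json


def build_tls_config(ca_file, cert_file, key_file):
    """Return an agent config fragment that enforces TLS (Consul 1.12+ layout)."""
    return {
        "tls": {
            "defaults": {
                "ca_file": ca_file,
                "cert_file": cert_file,
                "key_file": key_file,
                # Require valid certificates on incoming and outgoing connections.
                "verify_incoming": True,
                "verify_outgoing": True,
            },
            # Also check that a server certificate matches the server's name.
            "internal_rpc": {"verify_server_hostname": True},
        }
    }


# Placeholder paths; real deployments would point at their own CA and certs.
config = build_tls_config(
    "/etc/consul.d/certs/ca.pem",
    "/etc/consul.d/certs/agent.pem",
    "/etc/consul.d/certs/agent-key.pem",
)
print(json.dumps(config, indent=2))
```

The printed JSON could be saved as a file in the agent's configuration directory, since Consul accepts JSON as well as HCL.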

Consul's network infrastructure automation is also a big help. The two key features we rely on are:

  • Health Checks Visibility: Consul enables operators to gain real-time insights into the service definitions, health, and location of applications supported by the network.
  • Flexible Architecture: Consul can be deployed in any environment, across any cloud or runtime.
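The health visibility point can be turned into a simple inventory. The sketch below assumes a list of checks shaped like the response of the /v1/health/state/any endpoint and rolls it up into a per-node status; the helper names and the sample data are hypothetical.

```python
from collections import Counter

# Higher number means worse health; unknown statuses count as critical.
SEVERITY = {"passing": 0, "warning": 1, "critical": 2}


def node_health(checks):
    """Map each node name to the worst status among its checks."""
    worst = {}
    for check in checks:
        node, status = check["Node"], check["Status"]
        if node not in worst or SEVERITY.get(status, 2) > SEVERITY.get(worst[node], 2):
            worst[node] = status
    return worst


def inventory(checks):
    """Count nodes per overall status."""
    return Counter(node_health(checks).values())


# Hand-written sample: node-a has one failing service check.
sample_checks = [
    {"Node": "node-a", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "node-a", "CheckID": "service:web", "Status": "critical"},
    {"Node": "node-b", "CheckID": "serfHealth", "Status": "passing"},
]
print(node_health(sample_checks))  # {'node-a': 'critical', 'node-b': 'passing'}
```

Treating a node as unhealthy as soon as one of its checks fails is a design choice made for this sketch; a stricter or looser rule may fit other setups better.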

Below is the resulting architecture of our current Consul setup, which contains 4 datacenters.

Architectural overview

[Figure: Consul Datacenter setup]

Conclusion

All in all, Consul helps us in many ways. Once Consul is bootstrapped and set up correctly, it manages itself. We spend little to no time on Consul networking issues, so Dev and Ops can focus on more important work.
