Consul, Watches, and Brokers, Oh My!

Rob Jackson
HashiCorp Solutions Engineering Blog
8 min read · Feb 10, 2021

Not long ago, I had an interesting conversation with a partner about HashiCorp’s Consul and service discovery. We both knew that Consul provides a great solution (well, I believe it’s great) for automatically discovering services deployed across various datacenters and sharing the status and location of those services with consumers. However, this particular implementation required that service information be distributed over a message broker (RabbitMQ in particular). I thought this was an interesting use case, and as I sometimes do, I headed down a path to see if it would work. Please note that although I’m using ActiveMQ in this example, I suspect a similar solution could apply to RabbitMQ or any other broker that accepts input via HTTP.

System Overview

All of the files and scripts necessary are available in this public repository. So if you want to fork it and build your own to help follow along, please do so!

Consul and ActiveMQ System Overview

Using Terraform, I created an arrangement of three Consul servers and four Consul clients. Note that Consul has been baked into the image using Packer (details are in the Image-Creation directory within the repository). On a separate machine, I’ve deployed Apache ActiveMQ with a few minor configuration changes. I’m sure those more familiar with ActiveMQ can point out all of the areas where this can be improved, but all I was trying to do was get an AMQ server running that could process requests. If you already have intimate knowledge of ActiveMQ, please skip ahead to the Consul section so as not to think less of my intellect and hackery.

ActiveMQ Processes

Of course I wanted to make this entire process as repeatable as possible, so I’m provisioning the machines and applications using the Terraform remote-exec provisioner. I’m doing this because it’s easy for me, although in a production environment you may prefer a dedicated configuration management tool (Ansible, Chef, Puppet, etc.).

The files directory of the repository includes a pre-downloaded ActiveMQ tarball (apache-activemq-5.16.0-bin.tar.gz), which is transferred to the ActiveMQ machine as part of the null_resource provisioning-activemq. Within that same resource, I’m extracting the tarball; however, a few specific file changes need to be made in order to get ActiveMQ running for this test. These configuration changes overcome the following challenges I found along the way:

  • ActiveMQ Broker listening on the IPv6 interface
  • ActiveMQ Web Server (Jetty) only listening on localhost

ActiveMQ Broker Listening on the IPv6 Interface

The parameter that controls which interface the broker listens on is found in /etc/apache-activemq-5.16.0/bin/env, specifically ACTIVEMQ_QUEUEMANAGERURL, which sits on line 85 within this system. Using sed and remote-exec, I’m able to modify this parameter to match the private IP address of my EC2 instance running ActiveMQ. The resulting section appears as follows:

# Specify the queue manager URL for using "browse" option of sysv initscript
if [ -z "$ACTIVEMQ_QUEUEMANAGERURL" ]; then
    ACTIVEMQ_QUEUEMANAGERURL="--amqurl tcp://192.168.100.138:61616"
fi
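
For reference, that edit boils down to a single sed substitution run through the remote-exec provisioner. Here is a rough sketch, assuming the stock file still points at tcp://localhost:61616 and that the private IP is available to the script (in the repository it comes from Terraform):

# Hypothetical sketch: point the queue manager URL at the instance's private IP.
PRIVATE_IP="192.168.100.138"
sudo sed -i "s|tcp://localhost:61616|tcp://${PRIVATE_IP}:61616|" \
  /etc/apache-activemq-5.16.0/bin/env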

ActiveMQ Web Server (Jetty) Only Listening on localhost

I suppose for security reasons, the Jetty process within ActiveMQ, the process that receives HTTP posts for topics and queues, only listens on localhost. The Jetty configuration file in this system is /etc/apache-activemq-5.16.0/conf/jetty.xml. But be forewarned, there is no fishing from this Jetty. The jettyPort bean is the element we need to adjust, setting the host value to the private IP address of your ActiveMQ EC2 instance. It is defined on line 119 in this environment. The completed setting looks like this:

<bean id="jettyPort" class="org.apache.activemq.web.WebConsolePort" init-method="start">
    <!-- the default port number for the web console -->
    <property name="host" value="192.168.100.138"/>
    <property name="port" value="8161"/>
</bean>
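
This one can also be scripted rather than edited by hand. A sketch of the substitution, assuming the stock jetty.xml ships with the host property bound to 127.0.0.1:

# Hypothetical sketch: rebind the web console from loopback to the private IP.
PRIVATE_IP="192.168.100.138"
sudo sed -i "s|name=\"host\" value=\"127.0.0.1\"|name=\"host\" value=\"${PRIVATE_IP}\"|" \
  /etc/apache-activemq-5.16.0/conf/jetty.xml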

With those two edits completed, the Terraform provisioner starts ActiveMQ:

sudo /etc/apache-activemq-5.16.0/bin/activemq console &

If you want to validate that the ActiveMQ server came up in a happy and healthy state (usually a good idea), you can visit the web interface for ActiveMQ using the ActiveMQ_Server URL output with the default credentials admin/admin.
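
If you’d rather check from the command line, a quick curl against the web console port (with the same default credentials) tells you whether Jetty is answering. Substitute the ActiveMQ_Server address from your Terraform output; the exact path may vary with your Jetty configuration:

# Expect an HTTP 200 once the broker and its web console are up.
curl -u admin:admin -s -o /dev/null -w "%{http_code}\n" http://<ActiveMQ_Server>:8161/admin/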

The Apache ActiveMQ package also includes several example configurations, including a simple AMQ consumer. After all, if Consul is going to produce something, it would be nice to have something consuming it. Otherwise Consul might get sad and run off with its friend Nomad. More on usage of the example consumer later.

Working with Consul

As part of the Terraform script, the entire Consul cluster has been deployed and configured. We have three Consul servers, and although Manly P. Hall stated that “three is the equilibrium of the unities,” it is also the minimum recommended for Raft consensus. For greater fault tolerance, five to seven server nodes are recommended. We’re also deploying four Consul client nodes, one for each Platonic element. Using your favorite web browser (sorry, Lynx and Netscape aren’t officially supported), the Consul UI is available at the URL provided in the Terraform output for Consul_Server.

Within the UI “Nodes” section, you can see the three server nodes and the four client nodes.

Consul Node Listing

By looking at the “Services” section, we can see that both consul and httpd are registered as available services.

Consul Service Listing
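
Both views have command-line equivalents if you’d rather stay in a terminal; from any node in the cluster:

# CLI equivalents of the UI views above
consul members            # nodes: servers and clients, with status and role
consul catalog services   # registered services (consul and httpd)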

On each of those four clients, we’ve set up a service called httpd. Registering a service with Consul can be quite simple, as demonstrated in the JSON file we are using for the httpd service. There are, however, plenty of other options that can be used, as detailed in the documentation.

{
  "service": {
    "name": "httpd",
    "port": 8080,
    "check": {
      "args": [
        "curl",
        "localhost:8080"
      ],
      "interval": "10s"
    }
  }
}

For this service, we have a name, a port on which that service runs, an accompanying check to make sure our service is alive, and an interval for that check. That’s it…pretty simple (just like me). Of course, if we don’t have anything running on those clients that correspond with this service, those checks will consistently fail, and we’ll all be sad.
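
Once the service is registered and its check is passing, consumers can find it through Consul’s standard interfaces. Assuming the default ports (8600 for DNS, 8500 for the HTTP API), either of these, run on a cluster member, will show the healthy httpd instances:

# DNS interface: addresses of healthy httpd instances
dig @127.0.0.1 -p 8600 httpd.service.consul +short

# HTTP API: full health and location detail for the same service
curl -s 'http://127.0.0.1:8500/v1/health/service/httpd?passing'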

Now, the point of this exercise isn’t to create some giant and fanciful web server, so we’re just going to initiate a very simple Python HTTP server…ironically using the http.server module provided as part of the Python3 package. Our Terraform scripts also run this process as part of the remote-exec provisioner on the clients.

python3 -m http.server 8080

Note that I’ve added a ‘sleep’ command within the remote-exec just to make sure our HTTP server is running before the remote-exec script bails.
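
In practice that means backgrounding the server and pausing briefly before the provisioner returns. The repository’s exact invocation may differ, but it amounts to something like this:

# Start a throwaway HTTP server on the port the httpd check expects,
# detach it from the SSH session, and give it a moment to come up.
nohup python3 -m http.server 8080 > /dev/null 2>&1 &
sleep 5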

The Marriage of Consul and ActiveMQ

Alright Rob, how are we going to get these two pieces communicating? If you recall nothing else from this ridiculous diatribe, take note of the use of the Consul Watch functionality. If you do a search for Consul Watches, you may get some very expensive watches from Hermes showing up in the results, which I always found distracting. I still don’t understand why a Greek God would need a watch, but here I am chasing another squirrel.

Regardless, with a Consul watch, the Consul agent monitors a specified view of data within the Consul system and takes some action when it changes. For our purposes we are watching checks, specifically the health check for the httpd service we defined previously. In addition to checks, Consul can watch keys, services, nodes, events, and a couple of other items detailed in the documentation. The construction of our watch is quite simple, identifying the watch type (checks), the service associated with those checks (httpd), and the handler arguments. One of the cool things here, at least in my biased opinion, is the flexibility of those arguments: based on the results of the watch, you can do just about anything that can be scripted.

watches = [
  {
    type    = "checks"
    service = "httpd"
    args    = ["/etc/consul/consul.d/check-handler.py"]
  }
]

The watch is defined as part of the Consul server configuration, which we are generating through a Terraform template file (files/server_template.tpl). Our simple watch has merely one argument: a Python script that is called any time the check associated with the httpd service reports in. However, as I am neither a zoologist nor a member of house Slytherin, the Python transformed into an albatross around my neck. Thankfully my colleague Yash Khemani came to my rescue with some magic (and water) to help. This resulted in the check-handler Python script that does all of the real work here.

#!/usr/bin/env python3
import json, sys, requests
from requests.auth import HTTPBasicAuth

def main():
    amq_url = "http://amq_public_address:8161/api/message/consul.checks?type=topic"
    # The watch pipes its results (a JSON array of check objects) to stdin
    consul_watch = json.load(sys.stdin)
    for i in consul_watch:
        check_info = {
            "Node": i["Node"],
            "Check": i["CheckID"],
            "Status": i["Status"]
        }
        check_json = json.dumps(check_info)
        print(check_json)
        # Publish each check result to the consul.checks topic on ActiveMQ
        requests.post(amq_url, json=check_json, auth=HTTPBasicAuth("admin", "admin"))

if __name__ == "__main__":
    main()

During the Terraform initialization of the infrastructure, this Python script is copied to the Consul servers and made executable. When the watch fires, the script reads the watch output from stdin, pulls a few fields out into a relatively pretty format, and pushes the resulting JSON to the ActiveMQ server. Note that I’ve hardcoded the default credentials for ActiveMQ; using Vault to manage these credentials would certainly be wiser, but I’m not writing about Vault (today).
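
If you want to exercise the handler without waiting for a watch to fire, you can pipe it a hand-rolled payload shaped like the checks watch output (a JSON array of check objects). The node name below is just an example value:

# Feed the handler a fake check result and confirm a message lands on the topic.
echo '[{"Node": "ip-192-168-100-103", "CheckID": "service:httpd", "Status": "passing"}]' | /etc/consul/consul.d/check-handler.py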

In the immortal words of Inigo Montoya, “Let me explain… No, there is too much. Let me sum up.” We now have an ActiveMQ server running, a Consul Server cluster consisting of three nodes, and four Consul clients monitoring an HTTP server on each client. So let’s see this thing in action!

By now, you should already see messages enqueued within your ActiveMQ system. To view this, select “Topics” in the menu. As shown below, we can see that there is a topic called “consul.checks” with 8 messages enqueued, but nobody is reading them! Kind of like my blog posts….

ActiveMQ Server

So, what we are going to do is run an ActiveMQ consumer to, well, consume the messages. If you recall from earlier in the document, I mentioned an ActiveMQ consumer that ships in the installed tarball. It does have to be run from the ActiveMQ server, however, so once you connect to the server (ActiveMQ_Server_IP) via SSH, execute the following command:

sudo /etc/apache-activemq-5.16.0/bin/activemq consumer --destination topic://consul.checks --bytesAsText true

Within a minute or two, you should start to see the Consul service check information consumed from the ActiveMQ system.

INFO | Connecting to URL: failover://tcp://localhost:61616 as user: null
INFO | Consuming topic://consul.checks
INFO | Sleeping between receives 0 ms
INFO | Running 1 parallel threads
INFO | Successfully connected to tcp://localhost:61616
INFO | consumer-1 wait until 1000 messages are consumed
INFO | consumer-1 Received "{\"Node\": \"ip-192-168-100-103\", \"Check\": \"service:httpd\", \"Status\": \"passing\"}"
INFO | consumer-1 Received "{\"Node\": \"ip-192-168-100-71\", \"Check\": \"service:httpd\", \"Status\": \"passing\"}"
INFO | consumer-1 Received "{\"Node\": \"ip-192-168-100-85\", \"Check\": \"service:httpd\", \"Status\": \"passing\"}"
INFO | consumer-1 Received "{\"Node\": \"ip-192-168-100-88\", \"Check\": \"service:httpd\", \"Status\": \"passing\"}"

In Summary…

Running through this exercise was pretty fun for me (with the exception of the Python stranglehold), and I’m hoping the model can be extended to multiple use cases. I’ve always thought of Consul as a very simple solution for service discovery, providing a simple DNS interface for consumers to locate the services they need. Before this discussion, I didn’t really think about how that service lookup could be extended to other environments and other methods of communicating service location and availability. I will leave you with a simple question: what needs to know about the availability of the services on your network?
