Unfolding distributed messaging architecture, Message store, Topic & Queue

Kamal Maiti

Overview

Distributed messaging architecture plays a crucial role in today's computing world. Multiple services interact with each other by exchanging messages, either synchronously or asynchronously. Whether it's real-time trading events or placing orders on e-commerce sites, all of them utilize a messaging platform. The messages are later processed by the consumers. The primary goal of the messaging platform is to prevent message loss and to buffer messages when processors or consumers cannot keep up.

This article’s purpose is not to offer information on operating and deploying the platform, but rather to provide an in-depth understanding from an architectural and programming perspective.

Motivation Behind This Article

I was curious about the ways to tamper with a message. If someone modifies or changes the attributes of a message, what security measures should be taken? You might wonder why I would think about changing the message attributes or introducing new ones. Here’s the answer: once, while building a query to fetch streaming data based on windowing concepts, I noticed that events generated from IO devices arrived late due to network lag. This raised the question of how the KPI dashboard would process those late-arriving messages at the consumer level, considering that the window had already slipped away. This naturally led me to understand the message’s structure, schema, and attributes, as well as adding additional attributes such as a timestamp as a watermark in GCP Dataflow.

What is distributed messaging all about

In distributed computing, messages are queued asynchronously in messaging systems, providing the benefits of reliability, scalability, and persistence. This messaging system is called a broker.

In the past, client-server communication happened synchronously, which caused issues when the processing speeds of either the client or server varied. This often led to dropped messages and jeopardized message order.

Messaging for tightly coupled communication

Nowadays, messaging platforms are used as a mediator between client systems. If the rate of incoming messages changes and the consumer systems are unable to process at the same speed, the messaging platform buffers the messages for a short period of time, ensuring that messages are not dropped.

Messaging for loosely coupled communication

Various Messaging Protocols

The lack of a standardized way to interact with message brokers has been a known issue in messaging technology for many years. The AMQP protocol was designed by the main messaging actors, companies, and software producers to overcome this limitation. This section provides an overview of the most common standard protocols supported today by the main messaging systems. Protocol choice is a crucial design decision for a message-oriented architecture because the protocol is tightly coupled to the application.

AMQP: It is the result of a standardization effort by the major contributors in the messaging scene (e.g. Cisco, Microsoft, Red Hat, banks). It is designed for interoperability between different messaging systems. It provides the definition of a binary wire protocol and complete delivery semantics, theoretically allowing an AMQP messaging client to interact seamlessly with any AMQP-compliant broker implementation.

STOMP: A text-based protocol meant to be simple and widely interoperable. It is mainly a wire protocol; it comes with only very basic messaging semantics built in, requiring appropriate configuration at the messaging-system level.

MQTT: A lightweight protocol originally designed by IBM. It is meant for low-bandwidth, high-latency networks. It defines a compact binary format with very limited communication overhead (a few tens of bytes), which makes it suitable for Internet of Things style applications in a simple produce-and-forget scenario.

Various Messaging Technologies

Message-oriented middleware has been developed for more than a decade into what today is a rich and solid ecosystem of services and libraries. Message brokers, intermediate standalone services that offer messaging capabilities to distributed applications, are the most common type of messaging system. Message brokers have been used extensively over the years to implement communication and integration in distributed systems, with the exception of data-intensive and high-performance use cases, where the existence of an intermediate entity is not a suitable option. In recent years, a new generation of messaging systems has appeared, with a focus on low-latency and high-performance use cases, pushing the boundaries of messaging applications. That said, these systems are not intended to keep large messages in the pipe; rather, smaller messages should be held temporarily. Every system has its own limitations, so there is a trade-off between performance and latency: small messages are generally delivered with lower latency.

Messaging Broker

A message broker is a standalone intermediate entity that offers messaging functionality via standard or custom protocols. Many message brokers exist, differing in capabilities, protocols, implementation languages, and platform support. The focus of this review is on open-source solutions, but many exist as part of enterprise commercial software too. In general, brokers manage flow control, buffering, monitoring, error and failure management, etc.

There are two messaging concepts: one-to-one and one-to-many. In one-to-one, a single producer produces a message, which is then stored in a queue-based data structure residing either in memory or on disk, and consumed by a single consumer. In the topic-based concept, a producer produces a message and delivers it to the broker; multiple consumers subscribe to the topic and each consumes the message from that topic. A minimal JMS sketch of both models is shown below.

Queue: One-to-One or Point-to-Point (P2P)

Point-to-point

Topic: One-to-Many (Pub/Sub)

pub-sub
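
To make the two models concrete, here is a minimal JMS sketch against a local ActiveMQ broker, assuming the ActiveMQ 5.x client library is on the classpath; the destination names "orders" and "news.feed" are just examples:

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

// Minimal sketch: the same producer API is used for both models; only the destination type differs.
public class QueueVsTopicSketch {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // Point-to-point: each message placed on the queue is consumed by at most one consumer.
        Queue ordersQueue = session.createQueue("orders");
        session.createProducer(ordersQueue).send(session.createTextMessage("order-42"));

        // Pub/sub: every active subscription on the topic receives its own copy of the message.
        Topic newsTopic = session.createTopic("news.feed");
        MessageConsumer subA = session.createConsumer(newsTopic);
        MessageConsumer subB = session.createConsumer(newsTopic);
        session.createProducer(newsTopic).send(session.createTextMessage("breaking news"));

        System.out.println(((TextMessage) subA.receive(1000)).getText()); // breaking news
        System.out.println(((TextMessage) subB.receive(1000)).getText()); // breaking news
        connection.close();
    }
}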

Messaging Technology by cloud providers

Amazon SQS: Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. SQS eliminates the complexity and overhead associated with managing and operating message-oriented middleware, and empowers developers to focus on differentiating work. Using SQS, you can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.

Amazon MQ: A managed message broker service for Apache ActiveMQ and RabbitMQ that makes it easy to set up and operate message brokers on AWS. Amazon MQ reduces your operational responsibilities by managing the provisioning, setup, and maintenance of message brokers for you. Because Amazon MQ connects to your current applications with industry-standard APIs and protocols, you can easily migrate to AWS without having to rewrite code.

Google Pub/Sub: Google's GCP platform has a messaging service called Pub/Sub. Google prefers calling it "an asynchronous messaging service."

Google Pub/Sub is an excellent solution for enterprises with a complex ecosystem and hundreds of microservices and apps. Each of the microservices and apps can work independently and still be a part of the complex system via Google Pub/Sub, where they can exchange data through a dynamic data pipeline. Pub/Sub, Dataflow, and BigQuery play an important role in the data analytics domain in GCP.

Azure's messaging services: Azure has multiple such services, like Service Bus, Event Grid, Event Hubs, and Queue Storage; you can choose the one that fits your requirements. For further details, refer to the Azure documentation.

Messaging Technology by Enterprise-Level Companies

JMS (Java Message Service) is a Java messaging API that is used to send messages from one application to another. It supports both the point-to-point and publish-subscribe models and provides assurances around message delivery.

TIBCO (uses JMS): It is a top enterprise platform in the distributed messaging niche. It's a reputable, world-class software company that specializes in delivering solutions for monitoring, managing, and integrating enterprise apps and information delivery. While they offer many different solutions, the one that interests us is the platform for managing distributed systems. It helps businesses identify and leverage opportunities hidden in their real-time data streams.

TIBCO's platform for managing distributed systems is flexible, scalable, and reliable. TIBCO Messaging resolves the issue of integrating incompatible distributed systems, thus unifying data streams into several easy-to-manage data pipelines.

I had a chance to work on the TIBCO platform at a large telco organization. All inbound calls to their digital platform were queued in this messaging platform, and the messages were later consumed by different consumers.

Confluent — Confluent is a completely managed Kafka service. In fact, it was created by the developers who originally built Apache Kafka. It's an event stream processing platform aimed at catering to the needs of large enterprises. It supports real-time data streaming on AWS, GCP, and Azure.

IBM Streams — IBM Streams is IBM's take on data streams. Streams is a robust software platform enabling developers to build apps that use information in data streams. IBM Streams is built to support different types of data in the data stream, ranging from text and video to geospatial and sensor data.

Amazon Kinesis — Amazon created Kinesis to help businesses collect, manage, and process big chunks of data. Amazon Kinesis is equipped to help developers store huge amounts of data regardless of its type, including IoT telemetry data. This data can then be used for analytics, ML, or by other apps.

Through Kinesis, Amazon offers companies infrastructure and software to reduce the workload and cut down expenses. The data can be hosted on Amazon servers, and Kinesis also integrates with other AWS storage and analytics services, such as S3, Redshift, and DynamoDB.

Messaging Technologies by OSS Community

ActiveMQ — ActiveMQ is one of the most widely adopted open-source message brokers. It is an Apache project, written in Java, and commercially supported by Red Hat. ActiveMQ has extensive protocol support (e.g. AMQP, STOMP, MQTT, OpenWire, HTTP, and many others), provides many cross-language clients, and is fully JMS compliant. ActiveMQ offers many advanced capabilities, such as rich delivery semantics (e.g. virtual queues, composite destinations, wildcards), a JDBC message store (to persist messages in any JDBC-compliant database), and advanced clustering configurations (e.g. master-slave, network of brokers). ActiveMQ is a feature-complete messaging solution that can be used to implement many communication and integration patterns. I'll go into more depth on this technology later in this article.

Apache Kafka — Apache Kafka is an open-source project originally from LinkedIn, now part of the Apache foundation. It was developed for real-time activity stream analytics, to solve the need for an effective way to move large amounts of data (e.g. user metrics, computer farm monitoring) from producers to many potential consumers. The scale, the data size (billions of messages and hundreds of gigabytes per day), and the time constraints make the use case unsuitable for standard brokers. The innovative idea of Kafka is to be a stateless broker, i.e. to not retain any information about consumers. A consumer has to retain its own state (e.g. the information about the last data read) and poll Kafka for new data when needed. This allows Kafka to persist a single copy of each message independently of the number of consumers (messages are not removed on consumption, but by retention period or other policy), resulting in high throughput for read and write operations. Kafka persistence is implemented as a distributed commit log, designed as a distributed system that is easy to scale out (coordinated via ZooKeeper), which allows for automatic balancing of consumers, producers, and brokers.

In contrast with standard message brokers, Kafka provides limited messaging capabilities (e.g. mainly topic semantics, the file system as its only persistent storage, strict ordering guarantees). Although many client libraries are available, it only supports its custom binary format over TCP. Kafka is an optimal solution for data movement, frequently adopted as a pipe to different processing systems (e.g. Hadoop, Storm). I have used this technology in multiple projects (for notifications, Spark job queues, query queuing, etc.).

Apache Pulsar — Apache Pulsar is a cloud-native, multi-tenant, high-performance solution for server-to-server messaging and queuing built on the publish-subscribe (pub/sub) pattern. Pulsar combines the best features of a traditional messaging system like RabbitMQ with those of a pub/sub system like Apache Kafka, scaling up or down dynamically without downtime. It's used by thousands of companies for high-performance data pipelines, microservices, instant messaging, data integration, and more.

RabbitMQ — RabbitMQ is a lightweight open-source message broker written in Erlang, which benefits from the message-passing capabilities of the underlying language. The RabbitMQ architecture is deeply modular; it mainly supports AMQP and STOMP, but additional protocols can be loaded as plug-ins (e.g. MQTT, HTTP). It supports the main messaging capabilities, such as persistence, clustering, high availability, and federation. RabbitMQ remains a lightweight messaging solution that can be found embedded in several projects (e.g. Logstash) for its simplicity and reliability.

ZeroMQ — ZeroMQ is an asynchronous messaging system that is used in distributed and concurrent applications. It can run without a broker.

ActiveMQ Architecture

ActiveMQ has two flavors: a) Artemis and b) Classic. We'll describe the architecture in general terms. The message flow is fairly straightforward, so I'm not going to describe it in detail.

ActiveMQ Architecture

One important point that has to be mentioned here is that in point-to-point messaging, the broker sends a message to an address configured with the anycast routing type, and the message is placed into a queue where it will be retrieved by a single consumer. In the case of pub/sub messaging, the address contains a queue for each topic subscription, and the broker uses the multicast routing type to send a copy of each message to each subscription queue.

With point-to-point messaging, there can be many consumers on the queue but a particular message will only ever be consumed by a maximum of one of them. Senders (also known as producers) to the queue are completely decoupled from receivers (also known as consumers) of the queue — they do not know of each other’s existence. Example: order processing.

In pub-sub, each subscription receives a copy of each message sent to the topic. Subscriptions can optionally be durable which means they retain a copy of each message sent to the topic until the subscriber consumes them — even if the server crashes or is restarted in between. Non-durable subscriptions only last a maximum of the lifetime of the connection that created them. Example: News Feed, Video, Audio subscribed notification feed.
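
In ActiveMQ Artemis, these routing types are declared per address in broker.xml (under the core element); a rough sketch with example address and queue names:

<addresses>
   <!-- anycast = point-to-point: each message goes to exactly one consumer of the queue -->
   <address name="orders">
      <anycast>
         <queue name="orders"/>
      </anycast>
   </address>
   <!-- multicast = pub/sub: every subscription queue under this address receives a copy -->
   <address name="news.feed">
      <multicast/>
   </address>
</addresses>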

ActiveMQ Memory & Storage

ActiveMQ uses memory to store messages awaiting dispatch to consumers. Each message occupies some memory (how much depends on the size of the message) until it is dequeued and delivered to a consumer. At that point, ActiveMQ frees up the memory that had been used for that message. When producers are faster than consumers — there’s more enqueuing than dequeuing over a given time period — ActiveMQ’s memory use increases.

ActiveMQ also writes messages to disk for storage. Classic and Artemis both use paging to move messages to disk when memory is exhausted. When ActiveMQ needs to send those messages, they’re paged from disk back into memory. Paging messages to and from disk adds latency, but it allows ActiveMQ to process a large volume of messages without requiring enough memory to hold them all. Paging is enabled by default, but is optional — you can configure an address to discard messages when there is no memory available to store them.
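
In Artemis, this behavior is controlled per address through address-settings in broker.xml; a rough sketch with illustrative limits (not recommendations):

<address-settings>
   <address-setting match="#">
      <!-- start paging to disk once roughly 100 MB of messages are held in memory -->
      <max-size-bytes>104857600</max-size-bytes>
      <page-size-bytes>10485760</page-size-bytes>
      <!-- PAGE is the default; DROP would discard new messages instead of paging -->
      <address-full-policy>PAGE</address-full-policy>
   </address-setting>
</address-settings>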

Storing messages in Memory & on Disk

We'll discuss this from ActiveMQ's point of view. Since it is an open-source technology, we have the chance to unfold the source code and see how messages are stored in data structures, which embedded DB is used to keep the metadata, the actual data, and other transactional information, and so on.

The AMQ Message Store is an embeddable transactional message storage solution that is extremely fast and reliable. The message commands are written to a transactional journal — which consists of rolling data logs — which means writing is extremely fast and the state of the store is easily recoverable.

Messages themselves are persisted in the data logs of the journal — with references to their location being held by a reference store (by default Kaha) for fast retrieval.

Reference Store, Cache & Data logs

References to messages are held in memory, and periodically inserted into the reference store to improve performance.

The messages are stored in data logs, which are individual files, typically 32 MB in size (though this is configurable; they can be larger if a single message is larger than the file size). When all the messages in a data log have been successfully consumed, the data log file is marked as ready to be deleted, or archived, which will happen at the next clean-up period.

In the data directory defined for the AMQ Store there is the following directory structure:

Journal, Datalog etc

Top level

The message broker's name is used to distinguish its directory of message data. By default, the broker name is localhost.
Below this top-level directory are the following subdirectories:

archive: Message data logs are moved here when they are discarded. Note that this directory only exists when the archiveDataLogs property is enabled.

journal: Holds the message data logs.

kr-store: The directory structure of the Kaha reference store (if used):

  • data: the indexes used to reference the message data logs in the journal, for fast retrieval

  • state: the state of the store, i.e. the names of durable subscribers

tmp-storage: Holds data files for transient messages that may be stored on disk to alleviate memory consumption, e.g. non-persistent topic messages awaiting delivery to an active but slow subscriber.

Recovery from Failure

If the message broker does not shut down properly, the reference store indexes are cleaned and the message data files (which contain messages, acknowledgements, and transaction boundaries) are replayed to rebuild the message store state. It is possible to force automatic recovery when using the Kaha reference store (the default) by deleting the kr-store/state/ directory.

KahaDB Message Store in ActiveMQ

The recommended message store for general-purpose messaging since ActiveMQ version 5.3 is KahaDB. This is a file-based message store that combines a transactional journal for very reliable message storage and recovery with good performance and scalability.

The KahaDB store is a file-based transactional store that has been tuned and designed for very fast message storage. The aim of the KahaDB store is to be easy to use and as fast as possible. Its use of a file-based message database means there is no prerequisite for a third-party database. This message store enables ActiveMQ to be downloaded and running in literally minutes. In addition, the structure of the KahaDB store has been streamlined especially for the requirements of a message broker. The KahaDB message store uses a transactional log for its indexes and only uses one index file for all its destinations. It has been used in production environments with 10,000 active connections, each connection having a separate queue. The configurability of the KahaDB store means that it can be tuned for most usage scenarios, from high-throughput applications (trading platforms) to storing very large amounts of messages (GPS tracking). To enable the KahaDB store for ActiveMQ, you need to configure the <persistenceAdapter> element in the activemq.xml configuration file. Below is a minimal configuration for the KahaDB message store, as it would appear in broker-config.xml or activemq.xml:
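
A minimal sketch of such a configuration (the data directory path shown is just the conventional default):

<broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" dataDirectory="${activemq.data}">
    <persistenceAdapter>
        <kahaDB directory="${activemq.data}/kahadb"/>
    </persistenceAdapter>
</broker>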

The KahaDB Message Store Internals

Kaha DB internals

The diagram above provides a view of the three distinct parts of the KahaDB message store including:

The Cache: holds messages in memory for fast retrieval after they have been written to the data logs. Because messages are always persisted in the data logs first, ActiveMQ does not suffer from message loss on a sudden machine or broker failure. The cache periodically updates the reference store with its current message IDs and their locations in the data logs; this process is known as performing a checkpoint. Once the reference store has been updated, messages can be safely removed from the cache. The period of time between cache updates to the reference store is configurable and can be set by the checkpointInterval property. A checkpoint will also occur if the ActiveMQ message broker is deemed to be reaching the ActiveMQ system usage memory limit, which can be set in the ActiveMQ broker configuration; when the amount of memory used for messages passes 70% of this limit, a checkpoint is triggered.

The Data Logs: act as a message journal that consists of a rolling log of messages and commands (such as transactional boundaries and message deletions) stored in data files of a certain length. When the maximum length of the currently used data file has been reached, a new data file is created. All the messages in a data file are reference counted so that, once every message in that data file is no longer required, the data file can be removed or archived. In the data logs, messages are only appended to the end of the current data file, so storage is very fast.

The BTree indexes: hold references to the messages in the data logs, indexed by their message IDs. It is actually the indexes that maintain the FIFO data structure for queues and the durable subscriber pointers to their topic messages. The redo log is used only if the ActiveMQ broker has not shut down cleanly, to ensure the BTree index's integrity is maintained.

KahaDB uses different files on disk for its data logs and indexes so we will show what a typical KahaDB directory structure should look like.

The KahaDB Message Store Directory Structure:

Kaha DB Directory Structure

Inside the KahaDB directory, the following directories and files can be found:

db log files — KahaDB stores messages into data log files named db-<Number>.log of a predefined size. When a data log is full, a new one will be created, the log number being incremented. When there are no more references to any of the messages in the data log file, it will be deleted or archived.

The archive directory — Exists only if archiving is enabled. The archive is used to store data logs that are no longer needed by KahaDB, making it possible to replay messages from the archived data logs at a later point. If archiving is not enabled (the default), data logs that are no longer in use are deleted from the file system.

db.data — This file contains the persistent BTree indexes to the messages held in the message data logs.

db.redo — This is the redo file used for recovering the BTree indexes if the KahaDB message store starts after a hard stop. Now that the basics of the KahaDB store have been covered, the next step is to review its configuration.

The KahaDB message store can be configured in the ActiveMQ broker configuration file. Its configuration options control the different tuning parameters, as sketched below.
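
A sketch of a tuned persistence adapter with a few commonly adjusted KahaDB attributes; the values shown are the commonly documented defaults, for illustration only:

<persistenceAdapter>
    <!-- illustrative values only -->
    <kahaDB directory="${activemq.data}/kahadb"
            journalMaxFileLength="32mb"
            indexWriteBatchSize="1000"
            indexCacheSize="10000"
            checkpointInterval="5000"
            cleanupInterval="30000"
            enableJournalDiskSyncs="true"
            archiveDataLogs="false"/>
</persistenceAdapter>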

Scaling the Depth of a Queue

Nearly all messaging systems (certainly open-source ones) hold either a copy of a persistent message or a reference to a persisted message in memory. This is primarily to try to improve performance, but it can also significantly decrease the complexity of implementation. In fact, ActiveMQ version 4 and below worked this way, by holding references to persisted messages in memory.

However, there is a limitation to this approach: no matter how much memory you have at your disposal, you will hit a limit on the number of persistent messages a broker can handle at any particular time.

To get around this limitation, ActiveMQ introduced a paging cache for all message stores (except the memory store) to get the best of both worlds: great performance and the ability to hold hundreds of millions of messages in the persistent store. ActiveMQ is not limited by memory availability, but by the size of the disk available to hold the persistent messages.

Unfolding a few parts of the architecture from a programming mindset

We got to know the topic & queue concepts above. Now we'll try to see how they are coded in the ActiveMQ framework and get some in-depth understanding. The purpose is not to provide extensive knowledge of the framework here, but rather to walk you through various important pointers so that you can quickly satisfy your curiosity.

Queue:

A queue is essentially a linked-list data structure. In ActiveMQ it is a Java class that extends "BaseDestination" and also implements a few interfaces. The parent class holds many of the message attributes.

BaseDestination class

The Queue is a list of MessageEntry objects that are dispatched to matching subscriptions. The Queue class looks like this:

Queue class

One important class you can see highlighted is LinkedHashMap, which stores the messages.
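
As a greatly simplified illustration (this is not the actual ActiveMQ source), a queue destination that tracks pending messages by message ID in an insertion-ordered LinkedHashMap might look like this:

import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Greatly simplified sketch only; the real ActiveMQ Queue also handles locking,
// cursors, subscriptions, and message-store integration.
public class SimpleQueueDestination {
    // Insertion-ordered map of pending messages, keyed by message ID (FIFO order preserved).
    private final Map<String, String> pending = new LinkedHashMap<>();

    public synchronized void enqueue(String messageId, String body) {
        pending.put(messageId, body);
    }

    // Dispatch the oldest pending message and remove it from the map.
    public synchronized String dispatchNext() {
        Iterator<Map.Entry<String, String>> it = pending.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, String> oldest = it.next();
        it.remove();
        return oldest.getValue();
    }

    public static void main(String[] args) {
        SimpleQueueDestination q = new SimpleQueueDestination();
        q.enqueue("ID:1", "first");
        q.enqueue("ID:2", "second");
        System.out.println(q.dispatchNext()); // first
        System.out.println(q.dispatchNext()); // second
    }
}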

Topic: This is also a Java class that extends "BaseDestination" and implements the "Task" interface. The Topic is a destination that sends a copy of a message to every active Subscription registered. It handles subscriptions and message delivery (send, cleanup, etc.). For storing messages, it uses "TopicMessageStore", which extends "MessageStore".

topic class

"TopicMessageStore" above extends "MessageStore", which provides the blueprint of the various methods used to handle messages.

Message in Memory Store

Memory: messages are kept in a RAM buffer; if RAM fills up, messages are spooled to disk.

Buffering: both ArrayList and LinkedList are used.

The buffer stores messages in a LinkedList, a doubly linked list that works well as a queue.

Access to the in-memory messages is synchronized when adding to or removing from the queue.

An ArrayList is used when retrieving all messages at once, which is faster, whereas a LinkedList is better and faster for adding, removing, or updating individual messages.

There are two other mechanisms for buffering messages in memory, i.e. "OrderBasedMessageBuffer.java" and "SizeBasedMessageBuffer.java". In the order-based buffer, the oldest message is evicted first, whereas in the size-based buffer, the largest message is evicted first.

MemoryMessageStore: an in-memory implementation of MessageStore.

MemoryTransactionStore: provides a TransactionStore implementation that can create transaction-aware MessageStore objects from non-transaction-aware MessageStore objects.

LinkedHashMap is used as an LRU cache to keep information in memory for temporary storage and faster retrieval.

For the topic memory store, a simple Map data structure is used, and references are then cached.

The amount of memory and storage to allocate can be set through configuration:
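
A sketch of the systemUsage section of activemq.xml that sets these limits (the values are examples only):

<systemUsage>
    <systemUsage>
        <memoryUsage>
            <memoryUsage limit="512 mb"/>
        </memoryUsage>
        <storeUsage>
            <storeUsage limit="100 gb"/>
        </storeUsage>
        <tempUsage>
            <tempUsage limit="50 gb"/>
        </tempUsage>
    </systemUsage>
</systemUsage>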

Message Store on Disk (code/class)

Store and temporary disk limits are set for the Broker on startup based on configuration and available space. Sometimes other processes (such as logs) can grow and reduce the available disk space enough that the limits detected at start up no longer have any effect. Since ActiveMQ version 5.12.0, it’s possible to configure the Broker to periodically check disk space and reconfigure the limits accordingly using the broker attribute schedulePeriodForDiskUsageCheck > 0.

For example, the configuration looks like this:

<broker xmlns="http://activemq.apache.org/schema/core" schedulePeriodForDiskUsageCheck="60000">
...
</broker>

How clients communicate with the broker

Clients use a transport connector, over the default port 61616, to connect to the broker.

The following shows transport connectors for different protocols:
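
This is a simplified sketch of the default transportConnectors section of activemq.xml; the shipped configuration also appends connection and frame-size limits to each URI:

<transportConnectors>
    <transportConnector name="openwire" uri="tcp://0.0.0.0:61616"/>
    <transportConnector name="amqp" uri="amqp://0.0.0.0:5672"/>
    <transportConnector name="stomp" uri="stomp://0.0.0.0:61613"/>
    <transportConnector name="mqtt" uri="mqtt://0.0.0.0:1883"/>
    <transportConnector name="ws" uri="ws://0.0.0.0:61614"/>
</transportConnectors>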

Delivery Guarantees

A key feature of most messaging systems is reliable messaging. With reliable messaging the server gives a guarantee that the message will be delivered once and only once to each consumer of a queue or each durable subscription of a topic, even in the event of system failure. This is crucial for many businesses; e.g. you don’t want your orders fulfilled more than once or any of your orders to be lost.

Durability

Messages are either durable or non durable. Durable messages will be persisted in permanent storage and will survive server failure or restart. Non durable messages will not survive server failure or restart. Examples of durable messages might be orders or trades, where they cannot be lost. An example of a non durable message might be a stock price update which is transitory and doesn’t need to survive a restart.

Clusters

Many messaging systems allow you to create groups of messaging servers called clusters. Clusters allow the load of sending and consuming messages to be spread over many servers. This allows your system to scale horizontally by adding new servers to the cluster.

Degrees of support for clusters varies between messaging systems, with some systems having fairly basic clusters with the cluster members being hardly aware of each other.

Apache ActiveMQ Artemis provides a very configurable, state-of-the-art clustering model where messages can be intelligently load-balanced between the servers in the cluster, according to the number of consumers on each node and whether they are ready for messages.

Apache ActiveMQ Artemis also has the ability to automatically redistribute messages between nodes of a cluster to prevent starvation on any particular node.

Message

Each message ActiveMQ sends is based on the JMS specification, and is made up of headers, optional properties, and a body.

Message Headers

JMS message headers contain metadata about the message. Headers are defined in the JMS specification, and their values are set either when the producer creates the message, or when ActiveMQ sends it.

Headers convey qualities of the message that affect how the broker and clients behave. Let’s take a look at two key characteristics that ActiveMQ takes into account when delivering messages: expiration and persistence.

Message Expiration

Depending on its content and purpose, a message may lose its value after a certain amount of time. When a producer creates a message, it can set an expiration value in the message header. If it does not, the header value remains empty and the message never expires.

ActiveMQ discards any expired messages from its queues and topics rather than delivering them, and consumer code is expected to disregard any message that remains unprocessed after its expiration.

Message Persistence

ActiveMQ messages are persistent by default, but you can configure persistence on a per-message or per-producer basis. When you send a persistent message, the broker saves the message to disk before attempting delivery. If the broker were to crash at that point, a copy of the message would remain and the process of sending the message could recover when the broker restarted. A non-persistent message, on the other hand, usually exists only in the broker’s memory and would be lost in an event that caused the broker to restart.

Sending non-persistent messages is usually faster because it doesn’t require the broker to execute expensive write operations. Non-persistent messaging is appropriate for short-lived data that gets replaced at frequent intervals, such as a once-a-minute update of an item’s location.
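
A hedged JMS sketch showing both per-producer defaults and a per-message override (the queue name and time-to-live values are just examples):

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

// Sketch: expiration and persistence can be set per producer or per message.
public class ExpirationAndPersistenceSketch {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("price.updates"));

        // Producer defaults: non-persistent (lost on broker restart) and expiring after 60 seconds.
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
        producer.setTimeToLive(60_000);
        producer.send(session.createTextMessage("XYZ=101.5"));

        // Per-message override: persistent, default priority, never expires (timeToLive = 0).
        producer.send(session.createTextMessage("order-42"),
                DeliveryMode.PERSISTENT, Message.DEFAULT_PRIORITY, 0);

        connection.close();
    }
}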

Message Properties

Properties provide a way of adding optional metadata to a message. ActiveMQ supports some properties that are defined in the JMS spec, and also implements some properties that aren’t part of the spec.

Producers can also define properties — arbitrarily and outside the JMS spec — and apply them to each message. Consumers can implement selectors to filter messages based on values present in the message properties. For example, you can configure an ActiveMQ producer to attach a coin property to each message, with a value of either heads or tails, and send them all to the same topic. You can write two consumers—a heads consumer and a tails consumer—that subscribe to that topic but that only receive messages with their selected value of the coin property.
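
A hedged sketch of the coin example using the JMS API (the topic name and property values are illustrative):

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

// Sketch: a custom 'coin' property on each message, plus a consumer-side selector.
public class SelectorSketch {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("coin.flips");

        // This consumer only receives messages whose 'coin' property equals 'heads'.
        MessageConsumer headsConsumer = session.createConsumer(topic, "coin = 'heads'");

        MessageProducer producer = session.createProducer(topic);
        TextMessage heads = session.createTextMessage("flip #1");
        heads.setStringProperty("coin", "heads"); // custom property, outside the JMS spec
        producer.send(heads);

        TextMessage tails = session.createTextMessage("flip #2");
        tails.setStringProperty("coin", "tails");
        producer.send(tails);

        System.out.println(headsConsumer.receive(1000)); // only the 'heads' message is delivered
        connection.close();
    }
}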

To understand this, we need to explore the message attributes available by default in the framework, and also check whether the framework allows custom attributes.

There are multiple attributes available as message properties in ActiveMQ. Some of them are mentioned below:

JMS Defined:

ActiveMQ Defined:

Message Body

The content of an ActiveMQ message is the body. The body of a message can be text or binary data. (It's also acceptable for a message's body to be empty.) The message type, set explicitly by the producer when the message is created, determines what can be carried in the body: a file, a byte stream, a Java object, a stream of Java primitives, a set of name-value pairs, or a string of text.

Security

Authentication

Simple Authentication Plugin

If you have modest authentication requirements (or just want to quickly set up your testing environment) you can use SimpleAuthenticationPlugin. With this plugin you can define users and groups directly in the broker’s XML configuration. Take a look at the following snippet for example:
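
A sketch along the lines of the ActiveMQ documentation example (usernames, passwords, and groups here are placeholders):

<plugins>
    <simpleAuthenticationPlugin>
        <users>
            <authenticationUser username="admin" password="adminPassword" groups="admins,users"/>
            <authenticationUser username="user" password="userPassword" groups="users"/>
            <authenticationUser username="guest" password="guestPassword" groups="guests"/>
        </users>
    </simpleAuthenticationPlugin>
</plugins>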

LDAP Authentication Using the JAAS Plugin

A newer and better LDAP authorization module has been available since version 5.6. See the Cached LDAP Authorization Module for more info.

Authorization

Encryption: ActiveMQ Artemis does not support data encryption at rest. You should either encrypt your data end-to-end or use a file-system-based mechanism.

Communication between the ActiveMQ broker and the Data Aggregator or Data Collector java process is not encrypted or authenticated.

In conclusion

This is a lengthy article, and you may choose to skip some sections and only focus on what is relevant to you. Alternatively, you can review the highlighted sections. I have attempted to explain a few critical aspects of ActiveMQ, but you may need to explore the code base to gain a deeper understanding of a particular component.

References

https://activemq.apache.org/amq-message-store

https://activemq.apache.org/

https://www.datadoghq.com/blog/activemq-architecture-and-metrics/

https://iopscience.iop.org/article/10.1088/1742-6596/608/1/012038/pdf

http://www.idevnews.com/images/emailers/110127_ProgressFUSE/WhitePapers/ActiveMQinActionCH05.pdf

https://www.sciencedirect.com/topics/computer-science/buffer-management
