Nifty tool-chain for CQRS application development with read-model projection

Simon Dietschi (ELCA) · ELCA IT · Jan 4, 2022

Event-driven applications are constantly gaining popularity across industries, especially where large legacy systems provide the core domain of a business. Such systems often cannot be replaced, yet they still need to interact with modern architectures. A well-known pattern for achieving this is Command Query Responsibility Segregation (CQRS). Because this pattern adds extra complexity due to its asynchronous behavior, in a recent project here at ELCA we were looking for a good way to support local development, testing, and debugging through a nifty tool-chain.

Photo by Tom Conway on Unsplash

We selected Spring Cloud Stream for the event processing, Kafka as the streaming platform, and MongoDB for storing the read models. In this post, I would like to outline some interesting implementation details based on a condensed scenario, focusing on the read-model projection part of CQRS.

Illustrative architecture (Image by Author)

A closer look at the topology

Condensed from the real project, there is one Kafka topic that receives a change event whenever a person record is modified in the core system; the event itself is generated through change data capture. Because the core system stores one or more addresses separately, and there is no guarantee that a person has at least one address created at the same time, we need to join persons with their addresses to create a root aggregate for further read-model projections. Also worth mentioning: each event is expected to contain not only the latest state change but also the full history of previous changes (maybe not exactly by the book).

For those who are interested, the join method may look very similar to the one below:
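A minimal sketch of what such a join might look like, assuming Avro-generated Person, Address, and PersonAggregate classes and illustrative binding names (the actual project code differs in detail):

```java
import java.util.function.BiFunction;

import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class PersonAddressJoinConfig {

    // Joins the person and address change streams into one root aggregate.
    // Both topics are compacted and co-partitioned by person id, so the
    // incoming KStreams can simply be converted to in-memory KTables.
    @Bean
    public BiFunction<KStream<String, Person>,
                      KStream<String, Address>,
                      KStream<String, PersonAggregate>> joinPersonWithAddress() {
        return (persons, addresses) -> {
            KTable<String, Person> personTable = persons.toTable();
            KTable<String, Address> addressTable = addresses.toTable();

            // Left join: a person may exist before any address has been created.
            return personTable
                    .leftJoin(addressTable,
                              (person, address) -> new PersonAggregate(person, address))
                    .toStream();
        };
    }
}
```

With the Spring Cloud Stream Kafka Streams binder, the two inputs of this BiFunction are bound to the person and address topics through the binding configuration shown further below.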

As mentioned, every event also contains the history, hence we can optimize by configuring the involved Kafka topics to be compacted (no event sourcing needed). This allows us to consume KStreams instead of KTables in our join method and convert them in-memory to internal KTables, so that Spring Cloud Stream no longer needs to create physical helper topics and duplicate data unnecessarily.

To boost performance through lightweight scaling, and to compensate for the latency of validating message schemas against the registry and writing the models into MongoDB within each worker thread, co-partitioning is used so that multiple threads can process the person-address joins in parallel. Here is a short look at the Spring configuration:
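A condensed sketch of the relevant application.yml, with illustrative topic and binding names; the schema registry URL and the thread count are assumptions, and the extra threads only pay off because the input topics are co-partitioned (same key and partition count):

```yaml
spring:
  cloud:
    stream:
      function:
        definition: joinPersonWithAddress
      bindings:
        joinPersonWithAddress-in-0:
          destination: person-changed
        joinPersonWithAddress-in-1:
          destination: address-changed
        joinPersonWithAddress-out-0:
          destination: person-aggregate
      kafka:
        streams:
          binder:
            brokers: localhost:9092
            configuration:
              # One Kafka Streams worker thread per subset of co-partitioned partitions
              num.stream.threads: 4
              schema.registry.url: http://localhost:8081
```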

The tool-chain

For the impatient 😃, an example declaration of the tool-chain can be found here: docker-compose.yml
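As a rough sketch of what such a compose file declares for this tool-chain (image versions, ports, and service names here are illustrative; the linked file is authoritative):

```yaml
version: "3.8"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.0.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:7.0.1
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  schema-registry:
    image: confluentinc/cp-schema-registry:7.0.1
    depends_on: [kafka]
    ports: ["8081:8081"]
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka:29092

  kafdrop:
    image: obsidiandynamics/kafdrop:latest
    depends_on: [kafka]
    ports: ["9000:9000"]
    environment:
      KAFKA_BROKERCONNECT: kafka:29092

  mongodb:
    image: mongo:5.0
    ports: ["27017:27017"]

  mongo-express:
    image: mongo-express:latest
    depends_on: [mongodb]
    ports: ["8082:8081"]
    environment:
      ME_CONFIG_MONGODB_SERVER: mongodb
```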

Coming back to the illustrative architecture view from above: as the change data capture part heavily depends on a large legacy system and is considered too expensive to set up locally, a custom message producer tool has been implemented. Using the same local Avro schema registry, this tool integrates the Avro client library and allows producing new events locally, as well as flexibly testing changes to event schemas that should not yet be published to the real registry. The Avro schemas themselves are curated in a separate code repository, and for local development a Maven plugin is used to upload the schema files into the local registry.
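A tiny sketch of how such a producer can be wired against the local registry using Confluent's KafkaAvroSerializer; PersonChangedEvent stands for one of the Avro-generated event classes and is purely illustrative, as are the topic name and field values:

```java
import java.util.Properties;

import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class LocalEventProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", KafkaAvroSerializer.class.getName());
        // Points to the local registry, so schema changes can be tried out
        // without publishing them to the real registry.
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaProducer<String, PersonChangedEvent> producer = new KafkaProducer<>(props)) {
            PersonChangedEvent event = PersonChangedEvent.newBuilder()
                    .setPersonId("42")
                    .setLastName("Doe")
                    .build();
            producer.send(new ProducerRecord<>("person-changed", "42", event));
        }
    }
}
```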

For easy exploration of the Kafka topics without having to query them via the CLI, Kafdrop has been added to the tool-chain. It also provides useful insights into topic partitions, offsets, events, and much more.

Kafdrop with topics (Image by Author)

To verify the read models being projected and stored in MongoDB, the official Mongo Express tool is used. It offers document and query statistics as well as a wide range of other useful features that make it very easy to analyze and debug the stored information.

Mongo Express (Image by Author)

Another small but useful detail worth mentioning is the creation of the initial topics in Kafka: once the local infrastructure is up and running, a simple text file containing the needed topic configurations can be used to declare these topics as compacted:
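One possible shape of such a file (topic names and partition counts are illustrative; the format only needs to match the Makefile target shown below):

```text
# topic                partitions  configuration
person-changed         6           cleanup.policy=compact
address-changed        6           cleanup.policy=compact
person-aggregate       6           cleanup.policy=compact
```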

And last but not least, the Makefile specifies the helper command needed for topic creation:
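A minimal sketch of such a target, assuming the topic file above and a broker container named kafka (recipe lines must be indented with tabs):

```makefile
TOPICS_FILE := topics.txt
KAFKA_CONTAINER := kafka

.PHONY: create-topics
create-topics:
	@grep -vE '^(#|$$)' $(TOPICS_FILE) | while read topic partitions config; do \
		docker exec $(KAFKA_CONTAINER) kafka-topics --bootstrap-server localhost:9092 \
			--create --if-not-exists --topic $$topic \
			--partitions $$partitions --replication-factor 1 \
			--config $$config; \
	done
```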

Conclusion

Even though this blog post contains project-specific aspects and adaptations, it may still provide a good starting point for your own streaming application development. When it comes to testing and debugging, unit and integration tests are often not enough: due to the complexity of asynchronous messaging, a supporting tool-chain is essential. Furthermore, such a tool-chain could also be deployed onto one of the available container orchestration platforms, like OpenShift, which is widely used at ELCA.

For those looking for a working demonstrator, and who do not care too much whether it is Java or Kotlin, please have a look at my read-model-projections example.

Software Architect, Tech Enthusiast, Roadbiker, Drone Pilot. Find out more about me on https://ch.linkedin.com/in/sdietschi