Building Scalable Architecture

Ratikesh Mishra
Published in Koinex Crunch
Feb 5, 2019 · 6 min read

“Architectural refactoring is hard, and we’re still ignorant of its full costs, but it isn’t impossible.” — Martin Fowler

I recently read about the LMAX Architecture in an article by the prolific author Martin Fowler. It was one of the most delightful and eye-opening reads of my life as a programmer. There were many key takeaways from his work, and I highly recommend it to anyone who wants to sharpen their thinking on architectural design.

The kick-start

In fast-growing financial systems, a clean and scalable architecture is essential for scaling smoothly and executing fast. The LMAX architecture helped me design multiple real-time applications at Koinex that were not heavily transactional.

Though it is not possible to summarise the entire architecture in depth here, I am listing some key learnings that can improve the quality of the scalable architectures we design for our products. Let's get started!

The Architecture

Components of the Architecture

  • Input Disruptors: responsible for I/O operations from the network, un-marshalling, replication and journaling.
  • Output Disruptors: marshal the messages and responses received after processing.
  • Business Logic Processor: handles all the business logic in the application (see the sketch just below).
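
To make this concrete, here is a minimal sketch of how these three components might be wired together using the open-source LMAX Disruptor library. The `TradeEvent` type, the handler bodies and the buffer size are illustrative assumptions of mine, not details from the original LMAX system:

```java
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

// Illustrative event type -- not from the article.
class TradeEvent {
    long orderId;
    double amount;
}

public class PipelineSketch {
    public static void main(String[] args) {
        // Ring buffer size must be a power of two.
        Disruptor<TradeEvent> disruptor = new Disruptor<>(
                TradeEvent::new, 1024, DaemonThreadFactory.INSTANCE);

        // Journaling and replication consume each event first...
        EventHandler<TradeEvent> journaler =
                (event, sequence, endOfBatch) -> { /* write event to disk */ };
        EventHandler<TradeEvent> replicator =
                (event, sequence, endOfBatch) -> { /* send event to standby nodes */ };
        // ...and only then does the business logic processor see it.
        EventHandler<TradeEvent> businessLogic =
                (event, sequence, endOfBatch) ->
                        System.out.println("processing order " + event.orderId);

        disruptor.handleEventsWith(journaler, replicator).then(businessLogic);
        disruptor.start();

        // The producer (the unmarshaller) publishes into the ring buffer.
        RingBuffer<TradeEvent> ringBuffer = disruptor.getRingBuffer();
        ringBuffer.publishEvent((event, seq) -> {
            event.orderId = 42;
            event.amount = 1.5;
        });
    }
}
```

The `handleEventsWith(...).then(...)` chain encodes the dependency the article describes: the business logic processor only consumes an event after the journaler and replicator have both seen it.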

How to Design a Business Logic Processor?

The Business Logic Processor is a state machine driven by the events coming from the input disruptors: it processes each input event and produces output.

In-Memory Logic Processing: data read from the input disruptors should be processed and held in in-memory data structures. This cuts the latency of the I/O operations performed by the output disruptors; no database or persistent store is required on the processing path.
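
As an illustration, a business logic processor can be as simple as a plain object mutating in-memory state. The balance-transfer domain below is a hypothetical example of mine, not Koinex's actual engine:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal, single-threaded in-memory processor: all state lives in plain
// data structures, and every event mutates it directly -- no database calls.
public class BalanceProcessor {
    private final Map<String, Long> balances = new HashMap<>();

    // Called sequentially for every input event; the result goes to the
    // output disruptor rather than to a persistent store.
    public boolean onTransfer(String from, String to, long amount) {
        long fromBalance = balances.getOrDefault(from, 0L);
        if (fromBalance < amount) {
            return false; // rejected; no partial state change
        }
        balances.put(from, fromBalance - amount);
        balances.merge(to, amount, Long::sum);
        return true;
    }
}
```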

Sequential Input Stream: input read from the input disruptors must follow a single, well-defined order, such as FIFO, LIFO or some other deterministic sequence, so that every processor sees the same stream of events.

Clustered Business Processors: since the system is heavily in-memory, the input data stream should be fed into multiple business processors, so that if any processor fails or becomes inconsistent, traffic can be switched to a working one. At any given point in time, however, only a single processor is linked to the output disruptor.

Input Event Sourcing: the input stream events should be stored in an event system such as Kafka or ActiveMQ, which helps in recovering a malfunctioning processor and mitigates the risk of system downtime.
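
For example, a failed processor could rebuild its state by replaying the event stream from the beginning. This sketch assumes a Kafka topic named `input-events` with a single partition; both names and the single poll (a real replay would loop until caught up) are simplifications of mine:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayRecovery {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manually assign the partition and rewind to offset zero,
            // so the whole recorded event stream is re-read.
            TopicPartition partition = new TopicPartition("input-events", 0);
            consumer.assign(List.of(partition));
            consumer.seekToBeginning(List.of(partition));

            ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                // Feed each recorded event back into a fresh business processor.
                System.out.println("replaying offset "
                        + record.offset() + ": " + record.value());
            }
        }
    }
}
```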

Better Programming Model: robust error handling and well-defined asynchronous interaction with external services form the programming model one should follow when designing such a real-time architecture.

Key Points in Designing Input and Output Disruptors

Input and output disruptors work on wire messages, so these messages must be stored durably and replicated across the clustered business processors. Because the data is stored, replicated and consumed by the business processors independently, this creates a producer-consumer problem in the distributed system: a business processor must not read data that is still being replicated.

For instance, if data replication fails but a business processor has already processed that data, the clustered business processors fall out of sync, causing the inconsistency described above.

A Concurrent Data Structure to the Rescue

Martin proposes the 'ring buffer', a type of concurrent data structure. The ring buffer is designed on the principle of shared sequence counters, which producers and consumers use to handle concurrency. For your own architecture, you can use or design any such concurrent data structure to deal with the producer-consumer problem.

What is a Concurrent Ring Buffer and How Does it Work?

A ring buffer is a multicast network of queues. It is called a multicast network because whenever the producer puts an object into it, all the consumers are notified and can consume it in parallel through their downstream queues.

Producers and consumers both handle concurrency with the help of sequence counters. Each producer writes its own sequence counter, but before writing it reads the sequence counters of all other producers and consumers, in order to avoid conflicting writes and preserve consistency. A sequence counter can be thought of as a locking variable. Similarly, consumers watch the producers' sequence counters before consuming, to ensure that they only ever read fully written, consistent data.
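
To make the sequence-counter idea concrete, here is a stripped-down single-producer, single-consumer ring buffer. It is a teaching sketch of mine, not the actual LMAX implementation (which adds batching, cache-line padding and multiple coordinated consumers):

```java
import java.util.concurrent.atomic.AtomicLong;

// Single-producer, single-consumer ring buffer coordinated purely by
// two shared sequence counters, as described in the text above.
public class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;                       // capacity must be a power of two
    private final AtomicLong producerSeq = new AtomicLong(-1); // last slot written
    private final AtomicLong consumerSeq = new AtomicLong(-1); // last slot read

    public SpscRingBuffer(int capacity) {
        slots = new Object[capacity];
        mask = capacity - 1;
    }

    // The producer reads the consumer's counter before writing, so it
    // never overwrites a slot the consumer has not yet consumed.
    public boolean offer(T value) {
        long next = producerSeq.get() + 1;
        if (next - consumerSeq.get() > slots.length) {
            return false;                         // buffer full; producer must wait
        }
        slots[(int) (next & mask)] = value;
        producerSeq.set(next);                    // publish: now visible to the consumer
        return true;
    }

    // The consumer watches the producer's counter before reading, so it
    // only ever sees slots that have been fully written.
    @SuppressWarnings("unchecked")
    public T poll() {
        long next = consumerSeq.get() + 1;
        if (next > producerSeq.get()) {
            return null;                          // nothing new to consume
        }
        T value = (T) slots[(int) (next & mask)];
        consumerSeq.set(next);
        return value;
    }
}
```

The volatile write to `producerSeq` happens after the slot is filled, so when the consumer observes the new counter value, the data in the slot is guaranteed to be visible; this is exactly the "read the counter before touching the data" discipline described above.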

In the scenario above, the journaler, the replicators and the business logic processor act as consumers, while the unmarshaller acts as the producer.

Journaling: recovering a faulty business processor requires the stored event history. Journaling writes the live in-memory event stream to disk while preserving the sequence of the data.
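
A journaler can be as simple as an append-only file that records each event together with its sequence number. The tab-separated line format below is an illustrative assumption:

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Append-only journal: events land on disk in the same order the business
// processor sees them, so replaying the file reproduces the same state.
public class Journaler implements AutoCloseable {
    private final Writer out;

    public Journaler(Path file) throws IOException {
        out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public void append(long sequence, String event) throws IOException {
        // One line per event, prefixed with its ring-buffer sequence number.
        out.write(sequence + "\t" + event + "\n");
        out.flush(); // a production journal would batch writes and fsync instead
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```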

Disaster Recovery

When assessing any architecture for full production use, we must also analyse how easily and quickly the system can recover from a disaster. The LMAX architecture helps mitigate these risks because its components are independent and clustered.

What are the risks?

  • The data on which the business processors run may become corrupted due to communication issues with the input disruptors.
  • The journaling of the streamed data may fail, affecting the replication of data.

Risk Mitigation using LMAX

As mentioned above, the ring buffer is a concurrent data structure that mitigates the risk of data corruption across business processors by solving the producer-consumer problem.

The running business processors are clustered, so multiple processors are kept in sync with the input data; if one business processor goes down, another can take over as the master with zero downtime.

The journal data helps recover a corrupted processor by replaying the recorded events; the replay also reproduces the situation that caused the processor to fail.
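
Rebuilding a processor from the journal is then a matter of reading the file back in order and feeding each event into a fresh instance. This sketch pairs with the hypothetical Journaler shown earlier and assumes its file name and line format:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Recovery: replay the journal, in sequence order, into a fresh processor.
public class JournalReplay {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Path.of("events.journal"));
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            long sequence = Long.parseLong(parts[0]);
            String event = parts[1];
            // Feed the event into a rebuilt business processor here; because
            // the order is preserved, the resulting state matches the original.
            System.out.println("replayed " + sequence + ": " + event);
        }
    }
}
```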

When should you use this architecture?

To conclude, you can apply the LMAX pattern in your architecture to design a scalable and fast system if:

  • The traditional model of concurrent sessions surrounding a transactional database is becoming difficult to manage.
  • The business problem you are solving is not heavily transactional. With transactions that are more independent of each other, there’s less need to coordinate, so using separate processors running in parallel becomes more attractive.
  • Your backend systems do not interact directly with users and can use the asynchronous communication and programming model mentioned above.
  • You require a good separation of concerns, allowing people to focus on Domain-Driven Design while keeping much of the platform complexity well separated.

PS: The Trade Engine at Koinex is designed using many of the principles mentioned above, which makes it capable of handling significant load at peak times.

Trade Engine at Koinex & LMAX

As mentioned earlier, the LMAX architecture helped us design a stable, high-throughput Trade Engine for Koinex. We implemented some LMAX principles by developing our business processors in-memory and keeping the design concerns of the input and output streams separate. However, we did not follow the principle of clustered business processors, which kept us from achieving zero downtime during recovery. The event-streaming nature of the engine, though, helped us recover the system quickly.

That’ll be it!

To be honest, it is going to take a lot of practice and many hours of coding to drill these principles into your subconscious. The more you read, the better you will learn. And yes, keep practising.

Until next time, Happy Coding!
