Building Blocks Of Scalable Enterprise Integration Engines

Piyush Shrivastava
Walmart Global Tech Blog
5 min read · Nov 3, 2021

Service to the Customer is one of the core values that drives global supply chains, and we put our heart and soul into it. Many people and systems located across the world come together every day, in harmony between humans and machines, to provide the best services possible. We also communicate with many partners and service providers to procure and deliver products across the world.

Image Source Licensed Under CC BY 2.0

Integration engines play a major role in industries such as e-commerce, where many entities both inside and outside the company work together. They range from very small, integrating just a couple of systems, to very large, handling the load of almost all the major processes. In this article, let us discuss how an enterprise-level integration engine can be designed to manage and scale to serve important use cases.

The Problem

Any e-commerce service provider has to integrate multiple systems, each handling a specific part of the e-commerce use cases. Let’s take order fulfillment as an example. From the outside, it might seem a simple process to select and place an order and wait for it to get delivered, but many systems and teams work together to make it happen. The order is placed on a website, which is itself one of the systems. The checkout system is responsible for verifying and generating the actual customer order, which is propagated to other systems. There is a system that reserves the customer’s items and decides on a delivery date based on the nearest available shipping locations. Another system takes care of orders that are to be sent to the respective partners, and yet another integrates all of these and talks to partners outside the intranet. There are many more systems that directly and indirectly contribute to fulfillment, such as monitoring, compliance, rule processors, etc.

What do we need?

We base our integration engine on a message-based architecture: the smallest unit of work in our system is a message. Messages can be passed from one component to another, and can be monitored, tracked, and reprocessed.
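As a minimal sketch of what such a unit of work could look like (the field names and statuses here are illustrative assumptions, not the actual schema), a message carries its payload plus the metadata that makes tracking and reprocessing possible:

```python
from dataclasses import dataclass, field
from enum import Enum
import time
import uuid

class Status(Enum):
    RECEIVED = "received"
    PROCESSED = "processed"
    FAILED = "failed"       # failed messages are candidates for reprocessing

@dataclass
class Message:
    """Smallest unit of work: a payload plus tracking metadata."""
    payload: dict
    source: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: Status = Status.RECEIVED
    received_at: float = field(default_factory=time.time)

    def mark(self, status: Status) -> None:
        # Status transitions are what monitoring and reprocessing hook into
        self.status = status
```

Because every message has a stable ID and a status, downstream components can track it through the pipeline without knowing anything about its payload.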

The recipe for the perfect integration engine would need the following categories of ingredients:

  1. Integration mediums
  2. Processing Engine
  3. Event processors
  4. Communication mediums
  5. Data storage mediums
  6. Databases
  7. Search Engine
  8. Monitoring systems

Since our integration engine has to work at scale, we need to account for the distribution of all these components. There are many tools available; our state-of-the-art cloud management application, Oneops, is perfect for this job. With its built-in support for load balancers, Oneops distributes load, manages deployments, creates and manages virtual machines, and much more. Let’s take a look at the tech stack requirements we can consider:

  1. File-based integration, HTTP based integration, Messaging queue-based integration
  2. A distributed application to process messages
  3. High-velocity event processing engine
  4. Distributed messaging queues as a communication medium for both inside the engine and outside integrations
  5. An Object store for files and media storage
  6. No-SQL DB to store historical data and configurations, relational DB to store transactional data
  7. A search engine to index and search data for monitoring
  8. Monitoring system with UI
Integration Engine Overview

There are several technologies available to fill these requirements. Any of them can fit in, and our integration engine is ready to serve.

Distributed Architecture

Distribution can happen at multiple places. Firstly, we use distributed databases and messaging queues. What remains is to distribute the computation. To do this, we have components deployed on various virtual machines across data centers. We can use any load-balancing method to distribute the incoming load across these VMs. Once a message is received on one virtual machine, the computation can be further divided and distributed inside the processing engine. The distributed messaging queue plays a major role here, as each node can process a different message and send it to its nearest database node to be stored. This gives us fast and reliable scalability.
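One simple way to divide the work inside the engine is to route each message to a node by a stable hash of its ID, so the same message always lands on the same node. This is a sketch under assumptions (the node names are hypothetical, and a real deployment would use the load balancer or queue partitioning instead):

```python
import hashlib

NODES = ["vm-dc1-a", "vm-dc1-b", "vm-dc2-a"]  # hypothetical VM names

def route(message_id: str, nodes: list = NODES) -> str:
    """Pick a processing node for a message via a stable hash of its ID."""
    digest = int(hashlib.sha256(message_id.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]
```

Deterministic routing keeps retries and reprocessing of the same message on the same node, which simplifies tracking and local state.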

High Availability

Distribution in itself ensures high availability. Oneops enables us to monitor the health of the virtual machines so that we can repair or replace them in time. We also distribute our application across multiple data centers to account for disaster recoverability. Fault tolerance is further ensured by the monitoring system. All of these factors enable us to create an enterprise-level integration system that is ready to serve the huge volume of transactions a typical e-commerce service handles every day, with minimal chances of failure.

Primary Use Cases and How to Solve Them

Below are some of the use cases that our Integration Engine can solve.

Transformation

We get data from various systems, merge it, convert it into a format supported by external partners, and relay it to them. The major challenge here is that each partner may expect data in a different format, so our processing engine should support arbitrary formats. To implement this, we use a rule-based transformation framework, which makes it easy to plug in a new rule for any new format that needs to be supported.
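A rule-based transformation framework can be as simple as a registry mapping a partner format to a transformation function, so supporting a new format means registering one new rule. This is a minimal sketch; the format name and field mappings are invented for illustration:

```python
from typing import Callable, Dict

# Registry mapping a partner format name to its transformation rule
TRANSFORM_RULES: Dict[str, Callable[[dict], dict]] = {}

def rule(fmt: str):
    """Decorator that plugs a new transformation rule into the registry."""
    def register(fn):
        TRANSFORM_RULES[fmt] = fn
        return fn
    return register

@rule("partner_a_json")  # hypothetical partner format
def to_partner_a(order: dict) -> dict:
    # Rename internal fields to the shape this partner expects
    return {"orderId": order["id"], "lineItems": order["items"]}

def transform(fmt: str, order: dict) -> dict:
    if fmt not in TRANSFORM_RULES:
        raise ValueError(f"No rule registered for format {fmt!r}")
    return TRANSFORM_RULES[fmt](order)
```

The engine itself never changes when a new partner is onboarded; only a new rule is registered.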

Aggregation

We aggregate data before sending it to partners, depending on a configurable batch size and duration. This is achieved by accumulating data in our transactional database until our event processor generates events to process it. The events can be based on the amount of data or on time.
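The "flush on size or age" logic can be sketched as below. This is a simplified in-memory stand-in (the real system accumulates in the transactional database and flushes via generated events), with illustrative defaults:

```python
import time

class Aggregator:
    """Buffers records and flushes when a batch size or age limit is hit."""

    def __init__(self, max_size: int = 100, max_age_s: float = 60.0):
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.buffer = []
        self.started = None  # when the current batch began

    def add(self, record):
        if self.started is None:
            self.started = time.monotonic()
        self.buffer.append(record)
        too_big = len(self.buffer) >= self.max_size
        too_old = time.monotonic() - self.started >= self.max_age_s
        if too_big or too_old:
            return self.flush()  # return the completed batch to send out
        return None

    def flush(self):
        batch, self.buffer, self.started = self.buffer, [], None
        return batch
```

A batch is emitted as soon as either threshold is crossed, which matches the article's "amount of data or time" trigger.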

Unified Message Processing

We receive messages from many systems. Instead of coding consumers for each of these systems, we use an integrated, configuration-based listener that supports many popular messaging queues. Details of a queue can be added in the configuration, and the system is ready to consume from it.
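A configuration-based listener typically boils down to a small factory: the config names the queue technology and its connection details, and the factory instantiates the matching consumer. The listener classes below are stand-ins (a real implementation would wrap actual Kafka or RabbitMQ clients):

```python
class KafkaListener:
    """Stand-in for a Kafka client wrapper."""
    def __init__(self, topic: str, brokers: str):
        self.topic, self.brokers = topic, brokers

class RabbitListener:
    """Stand-in for a RabbitMQ client wrapper."""
    def __init__(self, queue: str, host: str):
        self.queue, self.host = queue, host

# One entry per supported queue technology
LISTENERS = {"kafka": KafkaListener, "rabbitmq": RabbitListener}

def build_listener(config: dict):
    """Instantiate the right listener purely from configuration."""
    kind = config["type"]
    params = {k: v for k, v in config.items() if k != "type"}
    return LISTENERS[kind](**params)
```

Adding support for a new queue means adding one wrapper class and one registry entry; integrating with a new system is then purely a configuration change.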

Monitoring

Monitoring is a crucial part of the system. The monitoring system is an application that displays data stored in our databases and object store through the search engine for various stakeholders to view the status of messages. It has options to help support teams troubleshoot problems and reprocess failed messages. It also has graphs for monitoring the health of all the components of the tech stack and an alerting system for letting the stakeholders know if anything is not working.

Custom Use Cases

Since this is a generic integration engine, we support plug-and-play use cases. Any functionality that requires integration between systems can be coded into it as required. The integration engine provides the infrastructure and resources for any use case, while the core logic can be coded and added into it.
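One way to picture the plug-and-play contract is an interface the engine defines and each use case implements; the engine owns registration and dispatch, the plugin owns the core logic. The `InvoiceSync` use case here is hypothetical:

```python
from abc import ABC, abstractmethod

class UseCasePlugin(ABC):
    """Contract every plug-and-play use case implements.
    The engine supplies infrastructure; the plugin supplies core logic."""
    name: str

    @abstractmethod
    def handle(self, message: dict) -> dict: ...

class InvoiceSync(UseCasePlugin):  # hypothetical use case
    name = "invoice_sync"

    def handle(self, message: dict) -> dict:
        # Core business logic would live here
        return {"status": "done", "id": message["id"]}

class Engine:
    def __init__(self):
        self.plugins = {}

    def register(self, plugin: UseCasePlugin) -> None:
        self.plugins[plugin.name] = plugin

    def process(self, name: str, message: dict) -> dict:
        # Engine routes the message; plugin does the work
        return self.plugins[name].handle(message)
```

New use cases are added by registering a new plugin; the engine's routing, storage, and monitoring stay untouched.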

Integrate Away…

With these components in place, we can start integrating systems. Our integration engine should be able to scale, take care of multiple formats, aggregate, process messages, be fault-tolerant and resilient to new requirements.


Senior Software Engineer at Walmart Global Tech, Marketplace