6 Enterprise Application Integration Styles

File Transfer, Shared Database, RPC, Messaging, Data Warehouse & Managed Finite-State Machine

Abhinav Kapoor
CodeX
8 min read · Jan 12, 2023


An enterprise has multiple applications owned by different departments, business domains, teams and projects.

On the technical front, the applications are built independently, with different languages and platforms. Moreover, these applications can be from different eras of software development (for example, a 1970s mainframe integrated with a modern mobile application).

Often the applications need to cross boundaries either to share information or to invoke functionality. The goal of such integrations is to collaborate in a timely & consistent manner to support business workflows while keeping the applications decoupled during the execution & development as much as possible.

Let's see the 6 main integration styles and how they compare:

Application Integration Styles

1. File Exchange/Transfer

Integration by exporting and importing files.

File Transfer allows the applications to be loosely coupled, but at the cost of timeliness.

a) Use Case

Data Sharing — File transfers are mostly used for exchanging periodic updates or large volumes of data.

This is often the case when communicating with mainframe systems, or when data has to be published to a wide range of consumers with varying technical capabilities within an enterprise — for example, publishing exchange rates or indexation values.

b) Interoperability

High — Even the most basic/primitive systems support file reading and writing. Consumer applications may have to transform the data into a usable form.

c) Coupling

Low — The applications only need to agree on the file name, the location, the format of the data in the file, and the archiving policies.

d) Asynchronicity

Asynchronous — The producer application does not wait for the file to be processed.

e) Consistency of data across the system

Eventual Consistency — Usually, files are produced by scheduled (monthly, daily) batch/cron jobs, therefore the data could be stale. Data entered in the producer system is expected to be available in other systems only after the scheduled cycle.

f) Considerations

File transfers keep the applications decoupled, but as they are mostly scheduled, they are not ideal for transmitting notifications or updates as they happen.
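A minimal sketch of the style, using the exchange-rates example from above. The file name pattern and CSV layout are assumptions; in practice they are exactly the contract the two sides have to agree on:

```python
import csv
import pathlib
import tempfile
from datetime import date

# A temporary directory stands in for the shared drop location.
export_dir = pathlib.Path(tempfile.mkdtemp())

# Producer side: a scheduled batch job writes a dated CSV export.
rates = {"EUR": 1.0, "USD": 1.08, "GBP": 0.86}
file_path = export_dir / f"fx_rates_{date.today():%Y%m%d}.csv"
with file_path.open("w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["currency", "rate"])
    writer.writerows(rates.items())

# Consumer side (possibly hours later): pick up whatever files appeared.
loaded = {}
for path in sorted(export_dir.glob("fx_rates_*.csv")):
    with path.open(newline="") as f:
        for row in csv.DictReader(f):
            loaded[row["currency"]] = float(row["rate"])
```

Note that the consumer runs on its own schedule — the producer never waits, which is where both the loose coupling and the staleness come from.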

2. Shared Database

Transactional data is exchanged by writing and reading from a shared database.

a) Use Case

Data Sharing — Immediate data transfer between applications with strong consistency (an eventual consistency model is also fairly fast).

b) Interoperability

High — SQL & NoSQL databases are supported by most programming languages & platforms.

c) Coupling

High — It provides an unencapsulated data store which is not owned by any single application. Over time, if unregulated, it can easily grow into a big ball of mud, making it expensive to change or extend (a negative impact on maintainability & extensibility).

Strong consistency and transaction management can lead to contention among the applications, which causes deadlocks and performance issues, negatively affecting scalability as the contention increases.

d) Asynchronicity

Asynchronous — Writing and reading applications operate independently of each other, though the database's transaction management can still block other applications.

e) Consistency of data across the system

Strong Consistency — Applications immediately see updated data in the shared database. An eventual consistency model has a lag (typically on the order of seconds, depending upon the product and data size).

f) Considerations

Shared Database keeps data together in a responsive way, but at the cost of coupling everything to the database.
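A small sketch of the style. An in-memory SQLite database stands in for the shared database, and the table name and columns are invented for the example — the point is that "Application B" reads a committed row immediately, and that both applications are now coupled to the same schema:

```python
import sqlite3

# One database shared by two "applications" (a single in-memory DB here).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")

# "Application A" inserts an order and commits the transaction.
db.execute("INSERT INTO orders (id, status) VALUES (1, 'PLACED')")
db.commit()

# "Application B" sees the committed row right away: strong consistency,
# but any schema change now affects both applications.
status = db.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
```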

3. Remote Procedure Calls

Functionality is invoked by calling either web services or APIs.

a) Use Case

Invoking behaviour/functionality & data sharing — It's a mechanism for one application to invoke a function in another application, passing the data that needs to be shared.

b) Interoperability

High — SOAP services & REST APIs have long been popular for publicly exposing functionality, and gRPC is gaining popularity for service-to-service communication (a consideration for gRPC is support by the API gateway product in use).

c) Coupling

Moderate-High — While the data is encapsulated by the service exposing the functionality, RPC still brings sequential/temporal coupling: certain things have to be done in a particular order. Often the remote procedure calls are coupled with business processes, which makes it difficult to change the systems independently (low cohesion, high coupling).

d) Asynchronicity (Affects Performance and Availability)

Synchronous — The HTTP protocol works in a synchronous request/response manner. An asynchronous result may still be delivered to the client (link at the end).

e) Consistency of data across the system

Strong Consistency — Attributed to synchronous communication.

f) Considerations

It's the fastest style for invoking functionality in a remote system & getting back results. However, it needs the communicating applications to be available & responsive, thus coupling their availability, performance & scalability.

Another point is to avoid RPC chains and knots, which tie different systems together.
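The synchronous request/response nature of the style can be sketched with a toy HTTP "service" and a blocking client. The endpoint path and payload shape are invented for the example; the client line is the important part — it blocks until the service responds, which is exactly the availability coupling described above:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A toy service exposing one procedure (it ignores the path for brevity).
class RateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"currency": "EUR", "rate": 1.0}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), RateHandler)  # ephemeral port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client blocks here until the remote call returns (or fails),
# coupling its latency and availability to the service's.
url = f"http://127.0.0.1:{server.server_port}/rates/EUR"
with urllib.request.urlopen(url) as resp:
    result = json.load(resp)

server.shutdown()
```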

4. Messaging

Frequent & asynchronous one-way transfer of small data packets, either to invoke remote functionality (commands) or to send updates (events).

The producer application puts the message into an event stream or message broker (an infrastructure component), from which the message is delivered to a consumer application. Unlike RPC, the broker stores the message, making the communication asynchronous and removing the need for the communicating applications to be available at the same time.

a) Use Case

Invoking behaviour & Sharing data — Used as one-way, non-blocking communication. The message can either represent a command, where the producer application invokes a behaviour in the consumer application, or an event, where the producer is unaware of the consumers (there can also be a document-type message, for example the result of a command execution).

b) Interoperability

Moderate — While modern platforms support messaging libraries, it can be challenging in legacy applications built on older platforms.

c) Coupling

Low-Moderate — The schema of the message has to be understood by all applications. The event style of messaging decouples the applications more, as producers may not be aware of consumers. Commands, on the other hand, keep the applications entangled in the business flow.

d) Asynchronicity

Asynchronous — The producer application does not wait for the message to be processed from a technical point of view. From a business point of view, a command may still need the result of execution, but the result is also delivered asynchronously.

e) Consistency of data across the system

Eventual Consistency — Attributed to asynchronous communication.

f) Considerations

Messaging brings fault tolerance and a high degree of decoupling, but interaction over events has a learning curve in design, implementation & maintenance. Also, correlation IDs and central logging become important.

Command messages can be used to get fast responses, though not as fast as RPC.

Event choreography (an interaction pattern built over loose events) removes ordering and central control, so it shines in a low-cohesion environment spread across business domains (link at the end for more details).
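A minimal sketch of the command vs. event distinction, with an in-memory queue standing in for a real broker. The message field names are assumptions; the key behaviour is that the producer enqueues and moves on, while the consumer drains the queue whenever it runs:

```python
import queue

# An in-memory FIFO queue stands in for the message broker.
broker = queue.Queue()

# Producer: fire-and-forget. A command targets known behaviour in a
# specific consumer; an event just announces a fact, with no knowledge
# of who (if anyone) will consume it.
broker.put({"type": "command", "name": "ShipOrder", "order_id": 42})
broker.put({"type": "event", "name": "OrderShipped", "order_id": 42})

# Consumer (later, independently): drain and handle stored messages.
handled = []
while not broker.empty():
    msg = broker.get()
    handled.append(msg["name"])
```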

5. Data warehouse

It's a central data repository where data is consolidated from several applications periodically, to get aggregated views and maintain historical data points that develop a holistic picture (for example, all tax liabilities of a person).

a) Use Case

Data Aggregation, Sharing & Analytics — To get a holistic picture, all relevant systems/applications put their data in a data warehouse. A data warehouse provides an efficient data access & normalisation mechanism for reporting, aggregation, analysis, and data mining.

b) Interoperability

High — SQL clients can be used. Additionally, there can be product-specific Data APIs.

c) Coupling

Low.

d) Asynchronicity

Asynchronous — The producer applications don't wait for data processing.

e) Consistency of data across the system

Eventual Consistency — Scheduled data ingestion from several applications.

f) Considerations

It's a data repository used for consolidation and analytics; speed is not a criterion here.
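The tax-liability example can be sketched with an in-memory SQLite database as the warehouse. The table, source-system names and figures are all invented; the point is the aggregate query at the end, which no single source system could answer alone:

```python
import sqlite3

# The warehouse consolidates rows ingested periodically from sources.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE tax_liabilities (person TEXT, source TEXT, amount REAL)")

# Nightly ingestion from each source application (normally an ETL job).
dw.executemany(
    "INSERT INTO tax_liabilities VALUES (?, ?, ?)",
    [("alice", "payroll_app", 1200.0),
     ("alice", "property_app", 300.0),
     ("bob", "payroll_app", 900.0)],
)

# Holistic view: total liability per person across all source systems.
totals = dict(dw.execute(
    "SELECT person, SUM(amount) FROM tax_liabilities GROUP BY person"
))
```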

6. Managed Finite-State Machine (Orchestrator)

Finite-State Machines are helpful in modelling workflows.

A managed Finite-State Machine acts like a controller/orchestrator to stitch together applications & services as steps within a business flow. These steps may further be implementing any of the already mentioned integration styles.

Example: an AWS-managed finite-state machine (AWS Step Functions) calls serverless functions to check the name and address, requires human approval before approving or rejecting the application, & triggers messages with Amazon SNS. Further reading: https://aws.amazon.com/step-functions/use-cases/
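The approval flow just described can be sketched as a plain finite-state machine. The step names mirror the AWS example, but the checks and transitions are invented for illustration — the orchestrator holds the workflow, while the steps stay decoupled from each other:

```python
# Each step does its work on a shared context and names the next state.
def check_name(ctx):
    ctx["name_ok"] = bool(ctx.get("name"))
    return "CheckAddress"

def check_address(ctx):
    ctx["address_ok"] = bool(ctx.get("address"))
    return "Decide"

def decide(ctx):
    ctx["decision"] = "APPROVED" if ctx["name_ok"] and ctx["address_ok"] else "REJECTED"
    return None  # terminal state

# The workflow definition lives in the orchestrator, not in the steps.
WORKFLOW = {"CheckName": check_name, "CheckAddress": check_address, "Decide": decide}

def run(ctx, state="CheckName"):
    while state is not None:
        state = WORKFLOW[state](ctx)
    return ctx

application = run({"name": "Ada", "address": "10 Main St"})
```

In a managed service the same idea applies, except the workflow definition, retries and state persistence are handled by the platform rather than by this loop.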

a) Use Case

Stitching together services in a workflow — In the context of application integration, it's used for defining a workflow such as microservice orchestration or the orchestration of a data processing pipeline.

b) Interoperability

Low-High — Depending upon the state machine & infrastructure, it's possible to stitch together cloud-native serverless functions, on-premise RPC calls, cloud-native event streams and message brokers all within a single workflow.

c) Coupling

Low — The applications themselves are decoupled; the coupling happens at the orchestrator.

d) Asynchronicity

Synchronous / Asynchronous / Hybrid — Depending on steps.

e) Consistency of data across the system

Strong / Eventual Consistency / Hybrid — Depending on steps.

f) Considerations

Considerations include the know-how required and the availability of a state machine. Cloud-managed state machines are not the cheapest services, and a state machine can be a single point of failure.

Summary

File Transfer — Allows the applications to be loosely coupled, but at the cost of timeliness. Data Sharing across departments/business domains.

Shared Database — Shared Database keeps data together in a responsive way, but at the cost of coupling everything to the database. Judicious use within a high cohesion environment.

Remote Procedure Calls — Popular request-response paradigm for invoking functionality within & across business domains. Its synchronous nature couples clients with the availability, performance & scalability of the service.

Messaging — Frequent one-way data sharing allows the applications to be loosely coupled within & across business domains.

Data Warehouse — Central data repository for aggregated results, analytics & data mining, within & across business domains.

Orchestrator / Finite-State Machine — High-cohesion workflow for orchestrating services. Such high cohesion may exist more often within a business domain than across business domains.
