ABN AMRO’s Streaming Architecture

ABN AMRO Developer Blog
Aug 1, 2018

By Piethein Strengholt

When I published ABN AMRO's Service Oriented Architecture 2.0, I received a lot of questions about stream processing, pub-sub and event-driven architectures. I'll try to answer those with my view on Stream Processing at the enterprise level. To me, Service Oriented Architecture (SOA) puts a lot of emphasis on the request-response pattern and, compared to Stream Processing, offers lower throughput when it comes to the publish-subscribe pattern. Although from a distance Stream Processing and publish-subscribe might look the same, I see some clear differences:

  • Stream Processing is typically used for high scalability, real-time requirements and much higher volumes.
  • Stream Processing is stateless/asynchronous at a high level, but data can be persisted in message queues (key-value store concepts). Topics can be split across partitions for parallelism and high-throughput execution. These key-value stores can also be accessed with REST patterns and can provide 'reset' functionality, forcing all messages in the queue to be streamed to consumers again (a consumer sketch follows below this list).
  • Stream Processing typically supports other protocols and formats as well (Avro, gRPC, MQTT, Thrift, etc.), while most SOA implementations with ESBs and API Gateways are traditionally more focused on JSON and/or SOAP message formats. (SOA itself is protocol-agnostic.)
  • Stream Processing allows new streams to be generated (filtered, combined) from existing streams.
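
To make that 'reset' functionality concrete, here is a minimal sketch of a plain Kafka consumer that rewinds to the beginning of its partitions and streams every retained message again. It is illustrative only: the broker address and the payments topic are assumptions, not part of our actual setup.

```java
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "replay-demo");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("payments"), new ConsumerRebalanceListener() {
                @Override public void onPartitionsRevoked(Collection<TopicPartition> partitions) {}
                @Override public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    consumer.seekToBeginning(partitions); // the 'reset': rewind to the first retained offset
                }
            });
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("partition=%d offset=%d key=%s%n",
                            record.partition(), record.offset(), record.key());
                }
            }
        }
    }
}
```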

The stream processing pattern has already been described at a very high level in 'open sourcing' ABN AMRO's Digital Integration & Access Layer Architecture. Stream processing is about creating reactive architectures: whenever a change of state happens, an event or trigger is created and forwarded to a streaming platform, from where distribution towards data consumers takes place. Platforms like Kafka introduce a number of interesting patterns. Let me highlight some of them and explain how things can be combined.

Command-Query Responsibility Segregation (CQRS)

The first pattern is CQRS. With this pattern data is stored in different data stores: one store processes the commands (changes of state) and maintains the true state; the other store serves the queries (retrieval of data). Whenever a change of state happens (a record is created, updated or deleted), an event is generated and sourced to the other store.

The main benefit is that this pattern delivers much higher performance for the overall application. Data is decoupled: read queries against the read store (e.g. an API Gateway can route the reads) don't add load to the original application store. Another important benefit is that the command and query stores don't necessarily have to be the same technology; you have a choice here. You can even have multiple query data stores, which allows you to serve many consumers with different needs.
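
To make the flow tangible, here is a minimal in-memory CQRS sketch in Java. The event queue stands in for a streaming platform such as Kafka, and all names and values are purely illustrative.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Minimal CQRS sketch: the command store holds the true state, while a
// separate query store is kept up to date by consuming events.
public class CqrsSketch {

    record AccountEvent(String accountId, long newBalance) {}

    static final Map<String, Long> commandStore = new HashMap<>();  // true state
    static final Map<String, Long> queryStore = new HashMap<>();    // read-optimized copy
    static final Queue<AccountEvent> eventBus = new ArrayDeque<>(); // stand-in for a Kafka topic

    // Command side: change the state and emit an event describing the change.
    static void handleDeposit(String accountId, long amount) {
        long newBalance = commandStore.merge(accountId, amount, Long::sum);
        eventBus.add(new AccountEvent(accountId, newBalance));
    }

    // Query side: consume events and project them into the read store.
    static void project() {
        AccountEvent event;
        while ((event = eventBus.poll()) != null) {
            queryStore.put(event.accountId(), event.newBalance());
        }
    }

    public static void main(String[] args) {
        handleDeposit("account-1", 100);
        project();
        System.out.println(queryStore.get("account-1")); // reads never touch the command store
    }
}
```

Note that the query store could just as well be a document store or a search index: the event is the only contract between the two sides.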

Event Sourcing

Another cool pattern is Event Sourcing. Event Sourcing ensures that all changes to the application state are stored as a sequence of events in a data store (an immutable log). This allows reconstructing past states, performing analysis, keeping audit logs and making back-ups. With Event Sourcing it is possible to restore the state from any point in time in the life cycle, or to completely rebuild it if needed. Replaying, reversing and correcting events into new events is part of this logic.
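
In code, the essence is an append-only log plus replay. A minimal sketch, with an in-memory list standing in for the immutable log and a single illustrative event type:

```java
import java.util.ArrayList;
import java.util.List;

// Event Sourcing sketch: state is never updated in place; it is derived
// by replaying an immutable, append-only log of events.
public class EventSourcingSketch {

    record Deposited(long amount) {}

    static final List<Deposited> eventLog = new ArrayList<>(); // append-only

    static void deposit(long amount) {
        eventLog.add(new Deposited(amount)); // append, never mutate
    }

    // Rebuild the balance as of any point in the log (here: the first n events).
    static long replay(int upToEvent) {
        return eventLog.subList(0, upToEvent).stream()
                .mapToLong(Deposited::amount)
                .sum();
    }

    public static void main(String[] args) {
        deposit(100);
        deposit(-30);
        deposit(50);
        System.out.println(replay(eventLog.size())); // current state: 120
        System.out.println(replay(2));               // past state: 70
    }
}
```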

A major benefit is that this pattern can be combined with CQRS: data can be restored from any point in time, can be fed into new or multiple data stores, data is decoupled, and transformations can be applied while the data is in transit.

A cool example is restructuring data based on a unique identifier in the message body. The unique identifier is used to build up a new message queue or data store; in this example it is used as the key in a key/value database. Patterns like REST allow consumers to interface with this store immediately, instead of using the original source.
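
One way to sketch this with the Kafka Streams DSL is shown below; the topic names, store name and "id|payload" message layout are assumptions for illustration. The identifier is pulled out of the message body, promoted to the key, and the result is materialized as a key/value store and published as a new queue.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;

public class RekeyingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "rekeying-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("raw-events")
                // assume the body looks like "customerId|payload"; promote the id to the key
                .selectKey((oldKey, body) -> body.split("\\|")[0])
                // keep the latest value per id in a queryable key/value store
                .toTable(Materialized.as("events-by-customer"))
                // and publish the re-keyed data as a new stream
                .toStream()
                .to("events-by-customer-topic");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

Kafka Streams can expose a materialized store like this through its interactive queries feature, which is one way to offer the REST-style access mentioned above.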

Another cool streaming pattern is to combine streams with other streams. When another message with the same unique identifier arrives, the data is combined, stored or streamed again. This makes the streaming platform, with all these patterns, extremely powerful.
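
A sketch of such a stream-stream join with Kafka Streams, assuming two topics keyed on the same identifier; the topic names and the five-minute window are illustrative:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class StreamJoinSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "join-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> payments = builder.stream("payments");
        KStream<String, String> confirmations = builder.stream("confirmations");

        // Messages with the same key arriving within five minutes of each
        // other are combined and streamed again as an enriched stream.
        payments.join(confirmations,
                        (payment, confirmation) -> payment + " | " + confirmation,
                        JoinWindows.of(Duration.ofMinutes(5)))
                .to("payments-confirmed");

        new KafkaStreams(builder.build(), props).start();
    }
}
```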

Decoupling usage and distribution

Within ABN AMRO we use our DIAL Architecture to decouple applications from one another, which means no business logic is allowed to be applied in a stream from one application towards another. Consumer logic always sits at the consuming side, while our DIAL Architecture decouples applications and takes care of the distribution of data in a controlled manner.

We are currently building an enterprise streaming platform based on Kafka. Events are either captured and stored in Kafka's message queue, persisted in a Raw Data Store (e.g. Hadoop), or directly passed to Data Consuming Applications based on routing and subscriptions. Data owners, at the providing side, take responsibility for their data and its quality. Data that is stored in a queue or Raw Data Store always remains the data owner's responsibility. The context of the data is not expected to change; however, providers can restructure or filter data to make life easier for consumers. Protocol transformations are also allowed. If message compatibility breaks, providers and consumers should agree on the way forward.

Business logic is always applied at the consuming side. This means consumers have their own Kafka instances and use Apache Flink, Storm, Spark or whatever they want in order to do complex transformations, complex event processing or anything else.

Most streaming platforms allow you to do analytics by supporting several windowing functions, such as Tumbling, Hopping, Sliding or Session windows. If you want to know more, I highly encourage you to have a look at the following two pages:

https://kafka.apache.org/11/documentation/streams/developer-guide/dsl-api.html#windowing

https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions
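
For a feel of what such windowed analytics look like, here is a minimal Kafka Streams sketch that counts events per key in one-minute tumbling windows; the topic name and window size are assumptions:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.TimeWindows;

public class WindowingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowing-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("clicks")
                .groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofMinutes(1))) // tumbling; add advanceBy(...) for hopping
                .count()
                .toStream()
                .foreach((windowedKey, count) -> System.out.printf(
                        "%s @ %s -> %d%n", windowedKey.key(), windowedKey.window(), count));

        new KafkaStreams(builder.build(), props).start();
    }
}
```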

The same decoupling principle applies to Microservices. Kafka can be used for intra-application communication, but when crossing the boundaries of the application scope, additional decoupling is required: decoupling at a higher level by using the DIAL Architecture. This also applies when combining the SOA and Streaming Architectures. APIs can be put on Raw Data Stores or queues, but the data owner is always in control and remains responsible.

When data is transformed at the consuming side and distributed again, the DIAL Architecture is expected to be re-used. By decoupling this way we can keep track of all messages and have lineage in our architecture. With this model, domains and applications can evolve at their own speed, use streaming, and not hold up other domains.

What about Cloud?

At ABN AMRO we have a hybrid multi-cloud strategy, so we currently use both AWS and Azure. Within the next few months we will deliver a multi-data-center replicated Kafka instance for the first set of use cases. Since we have selected Kafka for enterprise distribution, an AWS Kinesis endpoint is currently seen as part of the consuming application. In the future we might promote AWS Kinesis to act as enterprise distribution, but this requires tight integration with our schema (metadata) repository. The main objective of our DIAL Architecture is that we want to be in control of ownership from a Data Management perspective.

Wrap-up

By having clear responsibilities for streaming, we believe we have a more flexible and scalable enterprise architecture. This reference architecture is used by our architects, developers, engineers and solution designers. It enables them to deliver the highest value for the business and our customers.

About the author:
Piethein Strengholt, Technology Architect
Piethein is a Technology Architect at ABN AMRO. He is part of a high-performing team of technology enthusiasts with a passion for the latest developments and trends. Are you interested in joining us on this journey? Please let us know!
