Hyper-Modular systems in banking. Post 1 of ?: The Problem

Jesus Freire
11 min read · Apr 17, 2022


Lessons learned and experiences from transforming banking legacy monoliths into modular, cloud-oriented architectural models: what I discovered on the way, what we already know and what we do not know yet.

This first post states the problem and outlines the general approach to tackling it.

All opinions are my own.

Why this series

After many years transforming banking legacy systems from traditional monolithic models to new modular, cloud-oriented architectures, I have decided to share my experience and views on this journey in a series of articles covering what I have found and discovered along the way and the solutions we have applied.

You should not think of this as a cookbook with perfect solutions: we are learning as we go. We have found good approaches for some challenges, but others are not resolved yet.

This is about my own personal experience, but it is really the result of working with many clients and colleagues at IBM, architecting complex transformation deals together.

The Problem: Hyper-Integrated monolith(s)

When working on banks’ digitization it becomes clear that transforming them is much more than improving the channels, user interfaces and apps; it is about transforming the inner parts of the banks’ information systems: product origination, customer management, customer onboarding, product management and, in some cases, even the core product processors and engines. The key problem with these internal capabilities is that, for good historical reasons and because of technical restrictions, they are monolithic structures without proper modularity. They are very complex, risky and expensive to transform, and they are the main blocker for information systems modernization in banking (I would extend this problem to other industries, but my experience is in banking, with some experience in insurance).

The reality of these legacy systems looks like the one represented in the following figure (image credit: I could not find the source of this image): a system that is hyper-integrated, with unmanageable and, what is worse, unknown connections among all the components.

(Unidentified source)

This lack of modularity causes uncontrolled hyper-integration among the banking applications at every one of the main application layers: data logic, business logic, process logic and presentation logic. This produces extremely complex systems that are very difficult to maintain, run and evolve.

The hyper-integration problem has several dimensions:

  • Uncontrolled data integration: The Data Monolith. A single data model and repository for the whole system’s business logic. It happens for both operational and analytical data. In the worst cases, data is accessed by any functionality, without any data encapsulation controlling the access.
  • Uncontrolled functionality integration: The Service Mess. Lack of clear and strict interfaces and no functional encapsulation, with too many integrations between components. It causes the famous spaghetti code and monolithic processes coupling domains.
  • Uncontrolled transactional propagation: The Massive Glue. The abuse of transactionality (when it is not strictly required) creates a strong glue among the participating parts. Transactionality support on legacy monoliths was so easy and cheap that it became pervasive, even when not needed.

Additionally, the traditional trend of designing IT systems in layers paves the way for monoliths at different levels as well, for instance a “processes layer”. This tends to produce monoliths at the process level, with workflows that “invade” different business domains, creating huge processes that couple business logic that should be independent (we will discuss process modularization later in the series).

Hyper-integration is also a root cause of unstable runtime systems: as everything is connected to everything (and interfaces are neither formalized nor identified), runtime problems can produce cascading effects across the whole system, leading to full outages or even a system meltdown. Even planned maintenance is much riskier on systems that lack modularity, like many banking systems.

All these factors, combined in a monolithic architecture, make changes and evolution of these systems extremely risky, complex and expensive.

So, what?

The Hyper-Modular approach

In any engineering discipline, systems are usually built out of components.

https://wallpaperaccess.com/car-parts#2947967 #wallpaper

In IT systems, for good historical reasons, this has not happened, even though “componentization” and “modularization” are concepts that have been considered since the very beginning of computer science. Programming methods and styles like modular programming or object-oriented design rely to a large extent on the module concept.

The “hyper-modular” concept may sound like a bit of a buzzword or a commercial term. The intention of using it is to define it in contrast to (as the opposite of) the hyper-integrated systems discussed above, highlighting that the key design principle of the new banking information systems should be Modularity: the view of systems as built out of components with diverse sourcing, bought (COTS products or components), consumed (SaaS), reused (from a legacy capability) or custom-made (bespoke) when a capability needs to be differentiating.

A Hyper-Integrated vs a Hyper-Modular system could be represented in this way:

Each and every module supports a business capability as a self-contained component¹ that offers a clear and strict interface (API-oriented, event-oriented) and contains its own data, programmatic business logic (in any programming language), processes, business rules and even AI algorithms: everything the self-contained module needs to provide a business capability.
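As a minimal sketch of what such a contract could look like (the module name, operations and events below are illustrative assumptions, not a BIAN or product API), a module exposes a small, strict interface plus the events it publishes, while its data and internal logic stay hidden:

```java
// Illustrative public contract of a self-contained "Customer Onboarding" module.
// Only this interface and these events are visible to other modules; the data model,
// processes, rules and algorithms behind them remain internal to the module.
public interface CustomerOnboardingApi {

    // Synchronous, API-oriented entry point (could be exposed via REST/OpenAPI).
    OnboardingResult startOnboarding(OnboardingRequest request);

    // Query limited to what consumers legitimately need; no access to internal tables.
    OnboardingStatus getStatus(String onboardingId);
}

// Event-oriented side of the contract: facts the module publishes for others to react to.
record OnboardingCompleted(String customerId, java.time.Instant at) {}
record OnboardingRejected(String onboardingId, String reason, java.time.Instant at) {}

// Request and response types are part of the contract, deliberately coarse-grained.
record OnboardingRequest(String fullName, String documentId, String countryCode) {}
record OnboardingResult(String onboardingId) {}
enum OnboardingStatus { IN_PROGRESS, COMPLETED, REJECTED }
```

Other modules program against this contract only; how the onboarding module stores its data or runs its internal processes is invisible to them.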

Self-contained modules

As will be discussed in future posts, BIAN (Banking Industry Architecture Network)², together with other techniques like DDD (Domain-Driven Design)³, is a good reference or canonical model for identifying (at least initially) the components of the system and their interfaces. This approach, where the implementation is encapsulated and isolated from the consumers through the interfaces and where cross-module transactionality is (in general) not allowed, prevents some of the problems of the monolithic, hyper-integrated style.

Why a hyper-modular approach?

Traditionally, core systems for Tier 1 and Tier 2 banks have been built in-house using traditional programming languages like Cobol, PL/I or even Assembler, supported on transaction monitors like CICS, IMS, etc. For the integration and channel domains other languages have been broadly used, like Java, C/C++, C# or JavaScript. The quality of the architecture of these systems varies a lot from case to case, with some cases of really well structured functionality following software engineering best practices, and other installations being a real mess of many overlapping Systems of Record with very high internal coupling. In any case, we should take into account that 30 or 40 years ago software engineers had to cope with technology limitations (processing power, memory, etc.) that we no longer have today, which led to suboptimal designs. But the main difference with the current situation is that now:

  • The systems will no longer be deployed in just one location (the data center). Resources (computing, memory, storage, algorithms, …) will be available from a number of IaaS, PaaS or SaaS providers. The system will be distributed.
  • The business capabilities (functionality) will be sourced from different providers with different consumption models: some will still be built in house (those where the business needs real differentiation), while others will be provided by commercial packages, bought or consumed as a service. Many could be supported by third parties as managed services. This is the typical situation in many banks that, in addition to their legacy systems in Cobol/CICS or Java, use, for instance, Salesforce for CRM, Snowflake for analytics, Zafin for product management or IBM OpenPages for GRC.

In this context (distributed cloud deployment models and hybrid sourcing models), creating monolithic systems is no longer an option. Systems should be built out of self-contained components that can jointly support the required business capabilities, using solutions that fit the purpose. The foundational components should be “composable”, so that complex business scenarios can be supported just by assembling the blocks. For that, some level of standardization, both functional and technical, is important, following business reference models like BIAN or technical ones like OpenAPI.

What are the modularity dimensions?

Modularity is a concept with different perspectives:

Structural Modularization. The system is composed of modules, each one with a clear responsibility and interface. The system is structured as a set of collaborating modules in a loosely coupled way. Each module knows no detail about the other modules’ internal implementation style or deployment. This allows polyglot systems, as each module is implemented with the architectural style, programming language and middleware that best fit its purpose.

Development Modularity. The strict encapsulation and interface allow very loosely coupled design and development lifecycles, which is extremely useful for agile development techniques. Different development teams only need to agree on the public interfaces, which allows independent development and deployment cycles. The intra-module architectural style can allow for much more granular development lifecycles, as in the microservices architecture.

Operational Modularity. Each module is operated independently of other modules (independent operational lifecycle), running on different and isolated runtime environments (depending on the implementation technology). This allows independent maintenance activities (limited only by interface dependencies, which can in turn be contained depending on the runtime style, for instance using patterns like circuit breakers in microservices) and avoids cascading failures, outages and, ultimately, system meltdown.
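To illustrate that containment idea, here is a minimal hand-rolled circuit breaker (a simplified sketch; in practice a library such as Resilience4j or a service mesh would typically provide this) that wraps calls from one module to another module’s interface, so a failing dependency is isolated instead of dragging the caller down:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Minimal circuit breaker sketch: after too many consecutive failures the call
// is short-circuited for a cool-down period, protecting the calling module.
public class SimpleCircuitBreaker {

    private final int failureThreshold;
    private final Duration openInterval;
    private int consecutiveFailures = 0;
    private Instant openedAt = null;

    public SimpleCircuitBreaker(int failureThreshold, Duration openInterval) {
        this.failureThreshold = failureThreshold;
        this.openInterval = openInterval;
    }

    public synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (openedAt != null && Instant.now().isBefore(openedAt.plus(openInterval))) {
            return fallback.get();               // circuit open: fail fast with a degraded answer
        }
        try {
            T result = remoteCall.get();         // e.g. an HTTP call to another module's API
            consecutiveFailures = 0;
            openedAt = null;                     // close the circuit again after a success
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = Instant.now();        // trip the circuit
            }
            return fallback.get();
        }
    }
}
```

The point is not this particular implementation but the effect: a runtime problem in one module degrades a single interaction with a fallback answer instead of cascading into a full-system outage.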

Governance Modularity. The module is self-governed, in alignment with the general IT governance. Until now, some development and runtime activities have tended to be concentrated in specific subsystems and organizations, like SOA and API management on shared middleware managed by cross-organizational units, leading to organizational and operational bottlenecks (e.g. a unified services team centralizing service deployments). Modularity helps to decouple the governance of application lifecycle management and operations.

Note: Modularity does not prevent resource and platform sharing (several independent modules can be deployed in the same Kubernetes cluster, in different namespaces).

Implications of a Hyper-Modular approach

Projects to move monoliths to a cloud delivery model (either on or off premises) fail, in many cases, for a number of reasons, but probably the first one is using an inappropriate design style: trying to apply old design patterns on a new, distributed, cloud-oriented platform. New platforms require new architectural styles. This is not just a technical redesign but (probably mainly) a functional redesign, which in turn may require changing the business capability itself, improving it by taking advantage of the new platform’s technical capabilities.

For instance, consider traditional batch processing running on large boxes like mainframes: its design fits very well with the architecture and operating systems of those machines. Usually it is a scheduled job, triggered by a time event, that executes a number of steps ingesting a massive amount of data and processing it to generate an output. This approach, which works extremely well on centralized machines, tends to perform poorly on distributed systems if it is ported as-is. But the answer is not just to use newly available technologies, like in-memory databases or stream and event processing; it is to change the functional design and create a new process that is leaner and more real-time oriented. Some traditional batches are by nature time-triggered (e.g. monthly billing), but others were designed as batch processes due to technical limitations even though the business need was real time (e.g. calculating a customer’s linkage with the bank based on the contracted products and services, which should be real time).
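As a sketch of that kind of redesign (the event types and the scoring rule below are invented for illustration), the customer-linkage calculation could move from a nightly batch to a small event handler that updates the score incrementally whenever a product is contracted or cancelled:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative event-driven replacement for a nightly "customer linkage" batch:
// the score is updated incrementally as product events arrive (e.g. from a broker),
// instead of being recomputed for every customer once a day.
public class CustomerLinkageProjector {

    // In a real module this state would live in the module's own datastore.
    private final Map<String, Integer> linkageScoreByCustomer = new ConcurrentHashMap<>();

    // Invented event types for the sketch.
    record ProductContracted(String customerId, String productType) {}
    record ProductCancelled(String customerId, String productType) {}

    public void on(ProductContracted event) {
        linkageScoreByCustomer.merge(event.customerId(), weightOf(event.productType()), Integer::sum);
    }

    public void on(ProductCancelled event) {
        linkageScoreByCustomer.merge(event.customerId(), -weightOf(event.productType()), Integer::sum);
    }

    public int currentLinkage(String customerId) {
        return linkageScoreByCustomer.getOrDefault(customerId, 0);
    }

    // Toy scoring rule, purely illustrative.
    private int weightOf(String productType) {
        return switch (productType) {
            case "MORTGAGE" -> 5;
            case "CARD" -> 2;
            default -> 1;
        };
    }
}
```

The linkage figure is then always up to date and the massive end-of-day recalculation disappears; that is the functional change, not merely a technical port of the old job.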

A hyper-modular approach has to tackle a number of challenges, among them:

  • Transactionality. Traditional monolithic systems were excellent at managing transactionality, supporting business events that need to be transactional by their very nature (e.g. a transfer, which means a debit and a credit). Now the new information systems are being built out of commercial components (on premises or SaaS), reused components or new bespoke ones when justified, with a hybrid deployment model including private and public clouds: an environment where traditional strict transactionality does not work. Even distributed protocols like RMI/IIOP do not work properly on the new Kubernetes platforms. This problem needs to be tackled mainly at the design level, both functional and technical, avoiding the need for transactionality as much as possible and identifying the proper granularity of each module with techniques like DDD (domain analysis and design is an extremely important element for managing transactionality, constraining it to the module internals). Additionally, it is possible to use eventual-consistency mechanisms and protocols like SAGAs or TCC (see the sketch after this list).
  • Data. In the traditional architectural style, data was stored in monolithic repositories with unique data models, or even in several data monoliths that were very complex to manage, as the same information was duplicated across repositories, requiring complex governance, replication tools and MDM solutions. Data decoupling is probably the most complex task when decoupling a monolith. Many transformations have tried to “work around” the problem by creating pseudo-modules, where the business logic is modular and self-contained (e.g. built with microservices) but the data is still stored under a unique common data model and database. This can be a tactical solution, but it preserves the problem and makes a real transformation impossible. Data should be modular as well, as part of the self-contained module, using the aforementioned DDD techniques to find the proper level of granularity. This rationale applies to both operational and analytical data. There is a new architectural style called Data Mesh⁴ (by Zhamak Dehghani) that supports this approach very well.
  • Processes. The traditional approach to process management considers processes as just another layer of the banking architecture, on top of the business logic, in many cases coupled to the channels when there is no proper design. This approach tends to end up with large processes cutting across different business domains as a kind of large orchestrator, creating couplings among those domains that make them very difficult to maintain and evolve. Processes, like any other business logic and data, should be modularized, running only activities from the specific business capability (module) they belong to and being accessed only through the module’s public interfaces. A process can invoke services exposed by other modules’ public interfaces, which in turn can launch another process (e.g. an offer process in a product origination module invoking a customer onboarding service, supported in turn by a process in the customer management module). With this approach, the support for a business scenario (e.g. selling a product) becomes a choreography of collaborating modular processes, each one belonging to its own domain, instead of a monolithic orchestration.
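To make the transactionality and choreography points more concrete, here is a minimal sketch of a choreography-style saga between two modules (the module names, events and compensation rule are illustrative assumptions, not a reference implementation): each module commits locally and reacts to the other’s events, and a failure triggers a compensating action instead of a distributed transaction:

```java
// Minimal choreography-style saga sketch across two modules (illustrative names).
// There is no cross-module ACID transaction: each module commits locally and
// publishes an event; failures are handled with a compensating action.

// Events exchanged between the modules.
record OfferAccepted(String offerId, String prospectId) {}
sealed interface OnboardingOutcome permits CustomerOnboarded, OnboardingFailed {}
record CustomerOnboarded(String offerId, String customerId) implements OnboardingOutcome {}
record OnboardingFailed(String offerId, String reason) implements OnboardingOutcome {}

// Product Origination module: starts the scenario and compensates on failure.
class ProductOriginationModule {
    void on(CustomerOnboarded event) {
        // Local transaction only: activate the product for the new customer.
        System.out.println("Activating product for offer " + event.offerId());
    }
    void on(OnboardingFailed event) {
        // Compensation instead of rollback: cancel the previously accepted offer.
        System.out.println("Cancelling offer " + event.offerId() + ": " + event.reason());
    }
}

// Customer Management module: reacts to the accepted offer and emits the outcome.
class CustomerManagementModule {
    OnboardingOutcome on(OfferAccepted event) {
        boolean kycPassed = runKycChecks(event.prospectId());   // internal process of this module
        return kycPassed
                ? new CustomerOnboarded(event.offerId(), "cust-" + event.prospectId())
                : new OnboardingFailed(event.offerId(), "KYC checks failed");
    }
    private boolean runKycChecks(String prospectId) {
        return !prospectId.isBlank();                            // placeholder check for the sketch
    }
}
```

In a real system the events would travel over a broker and each handler would update its module’s own datastore in a local transaction; consistency across modules is eventual and event-driven, not enforced by a shared commit.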

In addition to those above, other challenges for the journey to cloud are how to:

  • Transform batch processing (trying to remove it where possible)
  • Model the system, discovering the modules with the proper granularity, supported by techniques like DDD and frameworks like BIAN
  • Identify non-functional requirements
  • Deploy the new system in containers under a cluster manager’s control, either on private or public clouds
  • Address the organizational impact (driven by a software development lifecycle that is domain-oriented)
  • Choose programming languages and frameworks
  • Plan the transformation
  • … among others

What next?

In future posts I will share views and experiences on how to face some of these challenges, starting with a more detailed view of the anatomy of a module.

Note: A good design should make the system agnostic of the underlying runtime platform and infrastructure (a guarantee of being future-proof), able to run either on or off premises, on any type of box (x86 or mainframe), with minimal changes. The infrastructure selection should be driven by the non-functional requirements, economics, IT strategy or regulations, finding the solution that best fits the purpose in each specific case. Analyzing infrastructure, with its pros and cons, is out of the scope of this series.

Footnotes

[1] Self-contained components in this approach are not exactly the same as in the Self-Contained Systems (SCS, https://scs-architecture.org/) architectural style. A key difference is that we do not include the UI in the component, as SCS proposes.

[2] BIAN Banking Industry Architecture Network (https://bian.org/)

[3] If you are interested in BIAN and DDD modeling approaches for banking, follow Alfredo Muñoz on Medium. He is an expert business architect with excellent posts on this topic.

[4] What is Data Mesh: https://martinfowler.com/articles/data-mesh-principles.html



Jesus Freire

Jesús is an IBM Distinguished Engineer and Senior Architect with more than twenty-five years of experience in technology and architecture in the banking industry.