Our journey from Microservices towards Self Contained Systems

Marc-Olivier Fleury
Swissquote Tech Blog
Mar 11, 2020


Hello, my name is Marc-Olivier Fleury, and I am a software architect at Swissquote. I work in a team of three people, and our job is to keep software development in the company highly efficient. This is not an easy task; we do our best to reach that goal by providing sets of recommended technologies and best practices, focusing primarily on the conventions that all applications need to follow in order to integrate with each other.

In this article, I want to discuss a change in this area that we are currently undertaking at Swissquote: evolving our integration model from simple HTTP microservices into Self Contained Systems (SCS). You will learn about the way microservices were implemented, the opportunities that they offered, and the limitations that eventually became apparent. Finally, you will learn about Self Contained Systems, why we believe that they should enable us to overcome these limitations, and how we are implementing them at Swissquote.

Let’s start at the beginning: the adoption of microservices. Until some point in 2014, our development was mainly centered around a single database. We were not deploying a monolith, but the applications developed by the different teams were mainly integrated with each other through the database, by means of shared Java libraries. Such a library would for instance fetch financial information, or record a trade directly in a globally shared transactions table. This model was very efficient in the early days, but eventually caused different kinds of difficulties, the most apparent one being dependency hell: integration required embedding libraries, which could depend on each other, and on technical libraries and frameworks such as Spring and Hibernate. This quickly led to deep dependency trees that contained many conflicts. In the end, we were observing cases where most of the development time was spent resolving conflicts and making adjustments, such as upgrading to a newer version of some library, that had nothing to do with the business feature to implement.

When the situation became too critical to make any progress in a reasonable time, we decided to adopt a communication mechanism that was very trendy at the time: REST (more accurately, JSON over HTTP, real REST being quite difficult to achieve). In order to bootstrap this change, a small task force was created, and had about three weeks to provide some fundamental building blocks that would let applications integrate with each other as seamlessly as possible. This is how we adopted JAX-RS, a powerful Java standard that makes it possible to describe HTTP endpoints by annotating Java classes or interfaces. A crucial addition that we implemented was the ability to use the same annotated interfaces from the client side: by scanning the annotations, it is possible to generate a proxy that effectively implements the interface by issuing the corresponding HTTP calls. This key feature made it possible to expose REST APIs as Java API modules that contain the interfaces defining the endpoints, and the POJOs that correspond to the JSON data exchanged. These key elements (API modules and proxy generation) make it very convenient to use a service from the client point of view: after importing the interface defined by the service provider, you can access the remote HTTP service in the same way as you would access any other service local to your application. If you have a creeping intuition that this is a bit too easy and can be dangerous, you are probably right…
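To make the proxy generation idea more concrete, here is a minimal sketch based on the JDK's dynamic proxies. It is not our actual implementation: the Get and Path annotations are hypothetical stand-ins for the JAX-RS ones (declared locally so that the example has no dependencies), QuoteService and the quotes.example URL are invented for illustration, and the handler only returns the request line it would send instead of issuing a real HTTP call.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.regex.Matcher;

class ProxySketch {

    // Hypothetical stand-ins for the JAX-RS @GET and @Path annotations.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Get {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    @interface Path { String value(); }

    // Hypothetical API module interface, shared by the service and its clients.
    @Path("/quotes")
    interface QuoteService {
        @Get
        @Path("/{symbol}")
        String getQuote(String symbol);
    }

    // Generates a client by reading the annotations found on the interface.
    @SuppressWarnings("unchecked")
    static <T> T createClient(Class<T> api, String baseUrl) {
        String root = api.getAnnotation(Path.class).value();
        return (T) Proxy.newProxyInstance(
                api.getClassLoader(),
                new Class<?>[] {api},
                (proxy, method, args) -> describeRequest(baseUrl + root, method, args));
    }

    // Simplification: only handles annotated methods with a single path parameter,
    // and returns the request line rather than performing the HTTP call.
    static String describeRequest(String prefix, Method method, Object[] args) {
        String verb = method.isAnnotationPresent(Get.class) ? "GET" : "POST";
        String template = method.getAnnotation(Path.class).value();
        String value = Matcher.quoteReplacement(String.valueOf(args[0]));
        return verb + " " + prefix + template.replaceAll("\\{[^}]+\\}", value);
    }

    public static void main(String[] args) {
        QuoteService quotes = createClient(QuoteService.class, "http://quotes.example");
        // The call looks local, but a real handler would go over the network here.
        System.out.println(quotes.getQuote("SQN")); // GET http://quotes.example/quotes/SQN
    }
}
```

The call site in main shows both the convenience and the danger: nothing in quotes.getQuote("SQN") reveals that a remote call is involved.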

For about 5 years, this technique proved to be very efficient. We were able to scale from about 80 developers to more than 200, working in different locations. Our use of Java API modules protected us nearly completely from some of the usual issues when integrating with REST, such as wrong endpoint paths or methods, or typos in names of JSON fields. But today, some issues are becoming more and more apparent; the most critical ones are that some operations take too much time, and that implementing new features is getting very hard. A common cause for both of these issues is the high level of interconnection among services. Indeed, services are freely calling each other, and often need to call other services before being able to respond. This creates chains of calls that can go as deep as ten hops, leading to response times that can become very long, up to seconds in the worst cases. And when it comes to implementing something new, there tend to be impacts on multiple services, under the control of different teams, making it difficult to identify how to perform the change, and requiring the synchronization of a lot of people in order to finally ship the change to production.
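The effect of deep call chains on response time is simple arithmetic: since each service waits for the next one before responding, the delays add up. The sketch below illustrates this with made-up numbers (80 ms of processing and 20 ms of network overhead per hop are assumptions chosen for the example, not measurements from our systems).

```java
import java.util.Collections;
import java.util.List;

class CallChainLatency {

    // In a synchronous chain every hop blocks on the next one,
    // so the caller sees the sum of all processing and network delays.
    static long totalMillis(List<Long> processingMillisPerHop, long networkMillisPerHop) {
        long total = 0;
        for (long processing : processingMillisPerHop) {
            total += processing + networkMillisPerHop;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed figures: ten hops, 80 ms of work each, 20 ms of network overhead per hop.
        List<Long> tenHops = Collections.nCopies(10, 80L);
        System.out.println(totalMillis(tenHops, 20) + " ms"); // 1000 ms
    }
}
```

With these assumed figures, ten hops that are each reasonably fast already add up to a full second for the end user.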

A bird’s eye view of a part of the services that are tightly interconnected, built by examining API usage.

This is where Self Contained Systems come into play. About one year ago, I joined a small working group whose mission was to improve the way REST APIs are defined, because there was a feeling that they were not really “looking nice”. We took a step back and realized that our problem was not so much with specific API signatures, but really with the ecosystem as a whole, and that we needed to act at system level. This is how we discovered Self Contained Systems (SCS), an architectural style that puts the emphasis on a coarser level of granularity than what is usually the case with microservices: in an ecosystem that could contain hundreds of microservices, there would probably be tens of Self Contained Systems.

The fundamental principle of SCS is the following: an SCS is a fully functional web application that is able to provide value to the end user on its own (for example a Trading SCS, or an Accounting SCS). An SCS can be rather big, the rule being that it can only be owned by a single development team. This requirement on ownership comes directly from Conway’s Law, and plays well with Domain Driven Design: when implemented correctly, a strong cohesion between team, domain, and implementation is achieved. This becomes particularly interesting for companies that follow a product organization: the SCS style provides a framework that makes it possible to align teams and systems around products, in a way that enables them to evolve autonomously.

Our objective is to redefine systems and teams around products, to reduce friction and enable autonomy

Of course, there has to be some sort of integration between Self Contained Systems, otherwise the end result would look to end users like a patchwork of different small websites. The preferred way of integrating is through the web user interface, which is sometimes referred to as transclusion. This property of SCS actually makes them a flavor of microfrontends. The reason to prefer integrating at UI level rather than at API level is to maximize autonomy: teams can work on their systems independently without the need to coordinate with backend teams to adjust the APIs that are used by the frontend, as long as the changes remain in the portion of the website that is served by their SCS.

Integration among SCS is done at the topmost layer: the User Interface (source: https://scs-architecture.org/)

Transclusion is a powerful concept, but it has limitations: some of the user interactions really require SCS to communicate with each other. For instance, if we have a Trading SCS and an Accounting SCS, it is likely that the Trading SCS needs to check with the Accounting SCS that the trade is possible, and notify it once the trade has been placed. In order to make such interactions possible, an SCS can expose a “public” API: an API that is intended to be used by other SCS. In our case, the Accounting API could for example provide operations to reserve cash for trading. The SCS manifesto contains a critical guideline regarding inter-SCS communication: it should be asynchronous wherever possible. This rule is key for us, because it is the one that can solve one of our major problems: long chains of synchronous calls. Making APIs asynchronous comes at a cost though, such as buffering of requests, or data replication. These asynchronous interactions can also have a big impact on the overall user experience, because they will likely result in a pause in the flow of operations. Some tricks can be applied to mitigate these issues, for instance by using an optimistic UI, or by performing some of the validations in advance. But the most significant way of addressing this risk is to take great care when defining the different SCS, so that a minimum of end user interactions cross the boundaries between them.
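As an illustration of what an asynchronous "reserve cash" operation could look like, here is a minimal in-memory sketch. Everything in it is hypothetical: the class and method names are invented, the in-memory queue stands in for whatever messaging infrastructure would buffer the requests, and the business rule is reduced to a simple balance check. It is only meant to show the shape of the interaction, not our actual design.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class AsyncReservationSketch {

    enum Status { PENDING, RESERVED, REJECTED }

    // Hypothetical message sent by the Trading SCS to the Accounting SCS.
    static final class ReserveCash {
        final String tradeId;
        final long amount;

        ReserveCash(String tradeId, long amount) {
            this.tradeId = tradeId;
            this.amount = amount;
        }
    }

    // Hypothetical Accounting SCS: owns the cash balance and decides at its own pace.
    static final class Accounting {
        private final Queue<ReserveCash> inbox = new ArrayDeque<>(); // stand-in for a message broker
        private final Map<String, Status> outcomes = new HashMap<>();
        private long availableCash;

        Accounting(long availableCash) {
            this.availableCash = availableCash;
        }

        // "Public" API: buffers the request and returns immediately, without a decision.
        void submit(ReserveCash request) {
            inbox.add(request);
            outcomes.put(request.tradeId, Status.PENDING);
        }

        // Runs later, independently of the caller.
        void processInbox() {
            ReserveCash request;
            while ((request = inbox.poll()) != null) {
                if (request.amount <= availableCash) {
                    availableCash -= request.amount;
                    outcomes.put(request.tradeId, Status.RESERVED);
                } else {
                    outcomes.put(request.tradeId, Status.REJECTED);
                }
            }
        }

        Status statusOf(String tradeId) {
            return outcomes.get(tradeId);
        }
    }

    public static void main(String[] args) {
        Accounting accounting = new Accounting(1_000);
        accounting.submit(new ReserveCash("trade-1", 700));
        accounting.submit(new ReserveCash("trade-2", 500));
        System.out.println(accounting.statusOf("trade-1")); // PENDING
        accounting.processInbox();
        System.out.println(accounting.statusOf("trade-1")); // RESERVED
        System.out.println(accounting.statusOf("trade-2")); // REJECTED
    }
}
```

The PENDING state is the pause in the flow of operations mentioned above: it is what an optimistic UI would have to present to the user while the Accounting side has not decided yet.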

When UI integration is not possible, “public” APIs can be used to integrate at logical level (source: https://scs-architecture.org/)

This leads us to the main challenge that we are facing today. Now that we know that we should move towards SCS, we need to decide how to disentangle our Distributed Big Ball of Mud. This is really difficult for us, because we have always evolved our systems and organization without strong architectural governance, which means that we do not have many people with the necessary authority to make such a change happen. Luckily, we had a great opportunity around the middle of 2019, exactly when we became confident that we should adopt SCS: our organization changed from two distinct development departments (some sort of bimodal IT) into a single department, targeting a product-oriented setup. This made it possible to attempt to refactor teams and redistribute systems following DDD principles, by aligning with Bounded Contexts, effectively performing an inverse Conway maneuver, to lay the foundation for migrating systems into SCS. This is no easy task, and we are still making adjustments today, but we are confident that the rationale behind the SCS architectural style is understood by our development teams, and that they are doing their best to follow it.

There is still much to be said about the way we are implementing Self Contained Systems, in particular concerning the more precise architectural rules that we defined, and the technological stack that we chose in order to achieve transclusion and asynchronous communication. Since it would take quite a lot of time to cover these topics, I am not going to detail them now, but will rather dedicate a full article to the subject in the future.

We are finally reaching the end of this story. The key element to remember is that microservices are a powerful tool, and using simple JSON over HTTP can get a company quite far, but if not enough governance is applied, they may eventually lead to a sluggish distributed ball of mud. Self Contained Systems are a very promising set of principles that make such governance possible, with the promise of making teams more autonomous, and therefore able to deliver faster. We are still at the beginning of the road, and it will probably take us years before we can consider that we have successfully migrated to an SCS architecture, but we are fully committed, and we are confident in the talent and enthusiasm of our engineers to reach this goal.

Did you like what you read? Join me at Swissquote:

https://careers.smartrecruiters.com/Swissquote/tech-jobs
