Scaling the bank

Published in The Qonto Way, Jul 31, 2020

TL;DR: The Qonto Ledger is responsible for keeping track of all transactions and balances. We achieved a great deal of scalability through careful performance optimizations. We’re planning to get to the next level by improving the separation of concerns between the Ledger and payment systems.

At Qonto, we create a product that is easy to use for our internal clients (other cross-functional teams, operations and back-office team, finance team, …). To that end, we need the Ledger to be implemented in the simplest way possible, following the KISS principle.

Introduction

Where do we come from? At Qonto, we have always wanted to deliver a quality product and to deliver it fast. To that end, we first created our banking product based on an external core banking system (CBS). This lasted for roughly our first two years of activity.

This allowed us to take some time to build our own CBS, somewhat in a “fast to market” way. To be able to ship fast, without losing on the quality of the existing product, we decided that this CBS should mimic the API of our external partner: that way, only the CBS would have to change. It was a great idea and helped us deliver it on time.

Why did we want our own CBS though? Third-party solutions either target big banks (with a price tag that follows) or develop features on their end, making them the bottleneck of our ambition. As soon as our first clients started to be onboarded on our CBS, we were able to provide them more targeted features: more payment systems like SWIFT and checks, international deployment in Germany, Italy, and Spain with local IBANs, etc. Also, by having our own CBS, we keep control over the quality, which is at the core of our value proposition.

However, with each new feature comes a cost: our CBS has always been kind of monolithic. There are dedicated services for dedicated tasks like transaction authorization, accounting, SEPA, and more. But the coupling of those services is high. We lack a clear framework to easily add new payment systems and features.

That is why we decided to take a step back and look into how we could improve the existing system to provide this framework. How could we make the bank we all love continue to grow without increasing the product and tech debt? That’s when we decided to make the Ledger agnostic.

Definitions

Before we go any further, let’s define some important terms that we’ll use throughout this article:

What is the Ledger? From a technical point of view, the Ledger is a big storage unit that keeps each state of each transaction for each of our clients, to provide accounting and authorization. It is the set of services built around that store that permits financial, back-office, and accounting operations. Those services are responsible for ensuring that our regulatory constraints are met regarding statements, funds protection, exports, etc. The authority that enforces those constraints is the French Prudential Supervision and Resolution Authority (ACPR).

What do we mean by agnosticism? In the context of this article, let’s define agnosticism as being able to operate without any internal knowledge of the origin of an operation. For instance, a new transaction in the Ledger does not need to hold any information about the network it comes from.

What is a payment system? A payment system is an abstraction of anything that originates operations (transaction statuses) in the Ledger: SEPA Direct Debit (SDD) is a payment system, and so is SEPA Credit Transfer. Card is another, as are checks and SWIFT. To become agnostic, we must find the lowest common denominator between all of those services. This is the payment system.

Technical summary — to date

Let’s dive a little bit into the inner workings of the current CBS, shall we?

How does it work?

Core principles

The CBS is built from a few blocks:

Payment systems like SWIFT, card, etc. handle connections to external networks and all transactions transiting through one of those.
The Back office block handles the actions of the operations team.
And all of those blocks are connected to the Ledger.

This last block is responsible for:

Handling the authorization of all transactions and keeping track of balances;
Keeping all client account statuses;
Making the financial accounting as well as enforcing the regulatory constraints;
Interoperating with partner banks.

The Ledger block must be fast (for authorizations), accurate and auditable.

A case study

This is all very abstract, so let’s dive into an example flow of a transaction so that you can see how all of this works. Let’s, for instance, imagine that you just went to your local bakery to buy an amazing pie. It costs €10. What happens at Qonto then?

When you enter your card into the terminal (POS), it sends an authorization request to MasterCard.
MasterCard then determines that you are a Qonto client: they need our approval before allowing you to pay. They transfer the authorization request to the Card payment system service.
From there, the Card service checks if the request is technically valid, then sends it to the Ledger.
The Ledger then checks your balance, the limits of your card, your options, and decides on whether or not you can pay. For the sake of the example, let’s say you have the funds, and the Ledger authorizes your payment.
The authorization response then comes back to Card, then to MasterCard, and eventually, the POS tells you “payment accepted”.

That’s it, you’ve got your pie, everything is fine and done. Or is it? Actually, at this point, not a single cent has left your account, and not a single cent has reached the baker’s account either. All that has happened is that we, Qonto, have promised MasterCard on your behalf that when the time comes, we will pay them the €10 requested. And then we will debit you.

Let’s take some time to analyze the consequences of this: you have a limit on your card and you are not allowed to let your account go negative. How do we handle this when not a single cent has left your account? How do we prevent you from exceeding those limits before the funds move?

For that, we use a concept known as the authorized balance. This is a secondary balance (not the financial one), computed and stored by the Ledger, that takes into account all your pending transactions. This is the main balance we show you in the user interface. At any given point in time, these are the liquid funds that you can use.

But this authorized balance has no meaning from an accounting (and accountant) point of view: as said before, the funds reserved for pending transactions have not left your account yet. The actual balance of your account is what we call the settled balance. What happens next then? Let’s continue with our previous example, shall we?
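
A minimal sketch of the relationship between the two balances, assuming a hypothetical `Account` class with amounts stored as integer cents. None of these names come from the Qonto codebase; the point is only that an authorization reserves funds without touching the settled balance.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical account model; all amounts are integer cents."""

    settled_cents: int  # funds actually on the account, from an accounting view
    pending_cents: dict[str, int] = field(default_factory=dict)  # authorized, not yet cleared

    @property
    def authorized_cents(self) -> int:
        # Settled funds minus everything already promised to a payment network.
        return self.settled_cents - sum(self.pending_cents.values())

    def authorize(self, transaction_id: str, amount_cents: int) -> bool:
        """Reserve funds for a debit; no money moves at this point."""
        if amount_cents <= 0 or amount_cents > self.authorized_cents:
            return False
        self.pending_cents[transaction_id] = amount_cents
        return True


account = Account(settled_cents=5_000)          # EUR 50.00 on the account
accepted = account.authorize("pie-001", 1_000)  # the EUR 10.00 pie
print(accepted, account.authorized_cents, account.settled_cents)
# prints: True 4000 5000
```

After the authorization, the authorized balance has dropped by €10 while the settled balance is unchanged, which is the situation described above.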

At the end of the day, the baker wants to get paid: he goes to his POS and asks for clearing of all his transactions of the day.
The POS sends the request to MasterCard, which dispatches it to all the clients’ banks (including us).
We receive a big file with all the operations that you and our other clients have made. This is a transfer order to pay the vendors’ banks, like our baker’s. Those banks then pass the funds on to the individual vendors.
Card parses the file and notifies the Ledger of the new status of the transaction: it is now final and cannot be changed anymore. The file is the definitive source of truth about the amount and status of the transaction.
The Ledger then orders your funds to be moved:
a) Out of the virtual vault holding your funds to a transit account in our partner bank that holds the funds by ACPR regulation,
b) Out of the transit account (aggregated with all card transaction funds for the day) to MasterCard,
c) The next day it checks with the external bank that the payment has been successfully done and marks the transaction as completed. This process is called reconciliation.
In the meantime, the Ledger maintains a mirror image of all the movements on our partner bank accounts in its own storage.

And voilà, the funds have finally left your account and your pie is long eaten — it was delicious, wasn’t it?

Your settled balance is now debited as well (you can see it in the interface by clicking on the authorized balance). The transaction is no longer pending in your history, and it will show up on your next account statement. (By the way, this shows why your settled balance sometimes differs from your authorized balance.)

Adding a new actor

We’ve seen how everything works currently. Let’s now see what we would have to do to handle a new payment system, Instant SEPA for instance, with our current architecture:

We need to connect to the SEPA network (we already have a service for that). Then we need to update the SEPA service to take into account this whole new payment system: how to orchestrate it, when to authorize and settle it, etc.

Then in the Ledger, we have to add the rules for the authorization of this specific kind of operation. For Instant SEPA this would be:

Check that the limits for Instant SEPA are not reached (and update them).
Check that the account has the right options.

Similarly, we have to add a whole set of rules:

Which internal accounts are impacted by the payment system?
How to route transactions to those accounts?
Which operations have to be performed on external banks? (Which banks, how?)
How do we check that everything worked as intended (for accountability)?
etc.

As you can see, the change is not limited to the Transfers block: we, the Ledger team, also have to modify our services to account for the new payment system. And we must do so without impacting any existing payment system or degrading performance.
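
To make the coupling concrete, here is a deliberately simplified sketch of what a Ledger-side authorization looks like when it has to know every payment system. All names (`Operation`, `AccountState`, `authorize_coupled`, the payment system labels) are hypothetical and only illustrate the shape of the problem: each new payment system adds a branch inside the Ledger.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    payment_system: str  # illustrative labels: "card", "sepa_instant"
    amount_cents: int


@dataclass
class AccountState:
    authorized_cents: int
    card_limit_left_cents: int
    instant_sepa_limit_left_cents: int
    instant_sepa_enabled: bool  # stands in for "the account has the right options"


def authorize_coupled(op: Operation, acc: AccountState) -> bool:
    """Authorization that embeds payment-system-specific rules in the Ledger."""
    if op.amount_cents > acc.authorized_cents:
        return False
    if op.payment_system == "card":
        return op.amount_cents <= acc.card_limit_left_cents
    if op.payment_system == "sepa_instant":
        # Branch added only because a new payment system was introduced.
        return (
            acc.instant_sepa_enabled
            and op.amount_cents <= acc.instant_sepa_limit_left_cents
        )
    raise ValueError(f"unknown payment system: {op.payment_system}")


state = AccountState(
    authorized_cents=5_000,
    card_limit_left_cents=2_000,
    instant_sepa_limit_left_cents=1_500,
    instant_sepa_enabled=True,
)
print(authorize_coupled(Operation("sepa_instant", 1_000), state))
```

The sketch leaves out limit updates and account routing, which in the real system would add yet more payment-specific code to the same place.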

Are we happy with it?

So, what are the advantages and disadvantages of this way of working?

Pros

For a start, the current CBS was fast to implement and, at least at the beginning, adding new payment systems was also quite fast.
Building from scratch taught us a lot about the specifics of a banking system aimed at SMEs, things that you won’t find in any book. We learned a lot and continue to learn!
Also, having one big fat ledger storage makes it easy to monitor and debug: with all information and all logic centralized, we only have one place to look to analyze any misbehavior of the system. This was crucially important at the very beginning of the CBS, as we needed to make sure that it was reliable for you to use.

Cons

On the other hand, there are a few issues with this way of working:

Performance- and scalability-wise, we needed to spend a lot of time optimizing code and queries to make sure that this would scale: when you put all the logic in one place, you are more likely to end up with suboptimal code.
Adding new payment systems (or even new operation types on existing ones) is proving more and more difficult as time goes by, as the complexity of the system increases with each new one.
Onboarding new developers on the team takes longer and longer, as the amount of information needed to perform even the simplest change keeps growing.
Developers and product managers from other teams also have to have some knowledge about how the Ledger works, making their work more and more difficult.
Most bug fixes require cooperation between two teams, leading to slower fixes.

Why is it not agnostic yet?

You may then ask “Why is it not agnostic yet, then?” The truth is, being agnostic is not that easy. Payment systems are inherently different:

They follow several mutually incompatible protocols (you can look online for the ISO 8583 card protocol and the ISO 20022 standard used by SEPA, for instance).
Every payment system follows a different calendar for the various operations of a transaction (rejection, cancelation, settlement, etc.), with its own set of rules to apply them.
Every payment system has its own set of internal accounts (transit, abnormal handling, etc.): we need to be able to operate them.
Not every transaction carries the same level of risk: some payment systems are safer (for transactions inside the eurozone, for instance). We need to acknowledge that with different accounting rules.

This is why the Ledger is full of payment-specific rules.

Scalability via performance

Nevertheless, we already attained a pretty high level of performance and stability via continuous improvement of the existing codebase.

We did it in multiple phases. We knew from the start that optimizing database queries would be critical, so we organized the database in a way that would minimize the reads necessary to perform any operation. We also spent long sessions analyzing query performance to optimize the use of indexes, cache, and memory.

We also decided early on to minimize the time spent on the network: using big prepared queries, and putting the logic in them to avoid too many exchanges between the Ledger and the database cluster.
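
As an illustration of pushing logic into the query to save round trips, the sketch below checks the balance and reserves the funds in one parameterized statement instead of a read followed by a write. The article does not name the database or show its queries, so SQLite (from the Python standard library), the `accounts` table, and the `authorize` function are all stand-ins chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, authorized_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES ('acc-1', 5000)")
conn.commit()


def authorize(conn: sqlite3.Connection, account_id: str, amount_cents: int) -> bool:
    """Check and reserve funds in a single statement (one exchange with the database)."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE accounts SET authorized_cents = authorized_cents - ? "
            "WHERE id = ? AND authorized_cents >= ?",
            (amount_cents, account_id, amount_cents),
        )
    # One row updated means the funds were available and are now reserved.
    return cursor.rowcount == 1


ok = authorize(conn, "acc-1", 1_000)
balance = conn.execute(
    "SELECT authorized_cents FROM accounts WHERE id = 'acc-1'"
).fetchone()[0]
print(ok, balance)
```

Because the condition lives in the `WHERE` clause, the check and the update cannot be separated by a concurrent request, and the application only needs to look at the number of affected rows.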

However, we still have some inherent scalability issues. Due to the fast growth of Qonto (new customers, new countries, new features), the transaction rate grows exponentially. This makes scalability a primary concern. Having more and more clients at Qonto also means that we need to add new features and new payment systems to suit your needs.

The needs of our internal clients such as ops and finance are also growing. We need tooling to assess the risks, to handle abnormal transactions (for instance transactions on closed or lost cards — they won’t impact you, but we need to deal with them on a daily basis). We need to be able to report to the ACPR, to ensure that we follow their guidelines, etc.

Consequently, we had to bring some DevOps into our scalability strategy. We took into consideration not only the resources allocated to our Kubernetes pods and databases but also the location of our servers, to ensure fast network exchanges.

And while we scale, we must remain as stable as possible: we built extensive non-regression test suites and added monitoring at every level of the CBS. We built procedures to handle unexpected events. We stress-tested every aspect of the software and database.

That is the story of how we dealt with scalability issues up until now. This has proved useful: the Ledger is still very stable, and we have a lot of headroom to handle more and more clients.

We could have continued on this track, and we would still be fine in one year. What about after? What about the long term? The system is becoming more and more complex, and adding new features and tooling becomes more and more costly. We need to take a step back, digest all the experience we’ve gained since we wrote the first line of code on the CBS, and imagine a truly scalable solution: We need to put back some simplicity in the system. We need to be able to plug in any payment system painlessly. We need agnosticism.

Scalability via agnosticism

Why?

Indeed, agnosticism in architecture design brings a lot of benefits:

If the system is agnostic, then we can delete all payment system specifics from the code: with less code and fewer special cases, there are fewer places for bugs to hide. We also get a better separation of concerns, which eases the maintenance of the codebase.

A truly agnostic system also allows us to ship faster: only one team needs to work to add a payment system, and the Ledger stays the same. We don’t lose time in meetings across various teams. We don’t need to follow the rhythm of feature releases and can continue our main work: keeping the Ledger stable, auditable, and accurate.

An agnostic system also provides a lot of benefits for our internal teams: the ops team only needs to learn one pattern to analyze the transaction sequence of any payment system. The finance controls are easier. Bug resolution can now be performed by one team at a time, etc.

Finally, adding new payment systems, launching Qonto in other countries, adding new partner banks, all those operations become a lot easier and faster. We can dream big!

Case study

Let’s review our previous case study, but this time with the new agnostic system, shall we? This way we will be able to show the principles and technical choices behind our new agnostic architecture.

You decided to buy another one of those delicious pies. You went to the bakery, chose it, and put your card into the POS.

Then, the POS contacts MasterCard, which sends an authorization request to the Card block.
Card parses it, and checks that the card is active, the limits are respected, the options are respected. If it weren’t the case, we would immediately refuse the transaction.
It calls the Ledger to request authorization of the transaction. The Ledger doesn’t need to know that this is a card operation. All it needs to know is that a debit of €10 has been requested for your account. It checks the account status and balance, and answers: yes, you have the funds.
Card answers MasterCard with an authorization response. You have your pie.
In the meantime, Card pushes the pending transaction to your history so that you can see it, edit the VAT rate, add an invoice, etc., and the Ledger pushes your new authorized balance so that you have an up-to-date view of it.

As you can see here, this already allows some profound optimizations: by putting the card-specific logic back in the Card domain, we can implement early return paths. We can also have a much simpler authorization service that can be blazingly fast at answering your request.
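
The split of responsibilities can be sketched as follows. This is an illustration under assumptions, not Qonto's implementation: `Card`, `LedgerAccount`, `card_authorize`, and `ledger_authorize_debit` are hypothetical names, and the real Ledger is reached over HTTP rather than through a direct function call.

```python
from dataclasses import dataclass


@dataclass
class Card:
    active: bool
    limit_left_cents: int


@dataclass
class LedgerAccount:
    is_open: bool
    authorized_cents: int


def ledger_authorize_debit(account: LedgerAccount, amount_cents: int) -> bool:
    """Payment-agnostic check: only the account status and balance matter."""
    if not account.is_open or amount_cents > account.authorized_cents:
        return False
    account.authorized_cents -= amount_cents
    return True


def card_authorize(card: Card, account: LedgerAccount, amount_cents: int) -> bool:
    """Card-specific checks run first, so invalid requests never reach the Ledger."""
    if not card.active:
        return False  # early return: closed or blocked card
    if amount_cents > card.limit_left_cents:
        return False  # early return: card limit exceeded
    if not ledger_authorize_debit(account, amount_cents):
        return False
    card.limit_left_cents -= amount_cents
    return True


card = Card(active=True, limit_left_cents=2_000)
ledger_account = LedgerAccount(is_open=True, authorized_cents=5_000)
print(card_authorize(card, ledger_account, 1_000), ledger_account.authorized_cents)
# prints: True 4000
```

Note that `ledger_authorize_debit` takes no card-related argument at all: adding another payment system would mean writing another caller, not changing this function.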

Then comes the second part of the payment: the actual movement of the funds. Let’s look into it:

The baker collects all his payments from the day and asks the POS to be paid. MasterCard dispatches the request to the client banks, and Card receives the clearing file.
Card parses it and finds your transaction. It compares it with the status it has stored, and checks with the Ledger that your account is still open. It is!
Card then creates an asynchronous message with all the information about your payment and pushes it to a payment-agnostic topic: your account needs to be debited €10, to be paid to MasterCard with a specific id (MasterCard will then pay the baker’s bank).
The Ledger consumes the message, stores the transaction on your account, and moves the funds out of the virtual vault to the transit account on the partner bank.
In the meantime, Card prepares another asynchronous message to indicate that it expects a total movement for the whole clearing file to be sent to MasterCard.
The Ledger consumes this message, orders the partner bank to move the funds to MasterCard and stores it as an expected movement.
The day after, when the Ledger checks the actual fund movements on our partner banks (reconciliation), it compares them with the expected movement to ensure that everything worked well.
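
The clearing steps above can be sketched with an in-memory queue standing in for the payment-agnostic topic. The message fields, the `Ledger` class, and the `mc-clearing-42` reference are assumptions made for the example; the article does not specify the message broker or the message schema.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Ledger:
    vault_cents: dict[str, int]  # customer funds, per account
    transit_cents: int = 0       # transit account at the partner bank
    expected: dict[str, int] = field(default_factory=dict)  # reference -> amount

    def consume(self, message: dict) -> None:
        """Handle a message without knowing which payment system sent it."""
        if message["type"] == "debit":
            self.vault_cents[message["account"]] -= message["amount_cents"]
            self.transit_cents += message["amount_cents"]
        elif message["type"] == "expected_remittance":
            # Funds leave the transit account; remember what the bank should report.
            self.transit_cents -= message["amount_cents"]
            self.expected[message["reference"]] = message["amount_cents"]
        else:
            raise ValueError(f"unknown message type: {message['type']}")

    def reconcile(self, bank_statement: dict[str, int]) -> list[str]:
        """Return references whose actual movement is missing or differs."""
        return [
            reference
            for reference, amount_cents in self.expected.items()
            if bank_statement.get(reference) != amount_cents
        ]


topic: deque = deque()  # stand-in for the asynchronous, payment-agnostic topic
topic.append({"type": "debit", "account": "acc-1", "amount_cents": 1_000})
topic.append(
    {"type": "expected_remittance", "reference": "mc-clearing-42", "amount_cents": 1_000}
)

ledger = Ledger(vault_cents={"acc-1": 5_000})
while topic:
    ledger.consume(topic.popleft())

unmatched = ledger.reconcile({"mc-clearing-42": 1_000})  # next-day bank statement
print(ledger.vault_cents["acc-1"], ledger.transit_cents, unmatched)
# prints: 4000 0 []
```

Reconciliation reduces to comparing an expected entry with the actual one, whatever payment system produced the expectation.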

How?

As you can see, we are now able to operate the Ledger without the need to know who calls it. To achieve that, we had to compare all of our payment systems’ transactions to find their common denominator. This became what we call a ledger movement.

A ledger movement consists of a few pieces of information:

A transaction identifier, that links movements together;
A movement identifier;
An amount (with currency and direction);
A few status fields that are used to set the accounting rules used for the movement;
A few introspection fields used to facilitate the debugging.
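
One possible shape for such a record is sketched below. The field names, the `Direction` enum, and the status values are hypothetical; only the list of ingredients (transaction id, movement id, amount with currency and direction, status fields, introspection fields) comes from the article.

```python
from dataclasses import dataclass, field
from enum import Enum
from uuid import uuid4


class Direction(Enum):
    DEBIT = "debit"
    CREDIT = "credit"


@dataclass(frozen=True)
class LedgerMovement:
    transaction_id: str   # links the movements of one transaction together
    amount_cents: int     # always positive; the sign comes from the direction
    currency: str         # ISO 4217 code, e.g. "EUR"
    direction: Direction
    status: str           # selects the accounting rules (values are illustrative)
    movement_id: str = field(default_factory=lambda: str(uuid4()))
    debug_context: dict = field(default_factory=dict)  # introspection only

    def signed_amount_cents(self) -> int:
        if self.direction is Direction.DEBIT:
            return -self.amount_cents
        return self.amount_cents


# Two movements of the same transaction: the authorization, then the settlement.
authorization = LedgerMovement("tx-pie-001", 1_000, "EUR", Direction.DEBIT, "pending")
settlement = LedgerMovement(
    "tx-pie-001", 1_000, "EUR", Direction.DEBIT, "settled",
    debug_context={"source": "clearing file"},
)
print(authorization.transaction_id == settlement.transaction_id)
# prints: True
```

Nothing in the record says "card" or "SEPA": the originating payment system, if recorded at all, would only appear in the introspection fields.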

To achieve agnosticism, we also need to unify exchanges: before the changes, almost all payment systems used different endpoints to communicate with the Ledger. Now we only need a limited set of HTTP endpoints for the authorization process (which has to be fast), and asynchronous messaging consumers for everything else (which does not need an immediate answer).

With that model and interface, we can now focus on the inner workings of the Ledger and strip all payment system specifics from the code.

In the Ledger, we are moving from a distinct set of rules for each payment system toward a ruleless way of working:

All internal and external accounts are now tied to real accounts (with IBANs, etc.).
The payment systems are responsible for the funds that transit through them, so they are able to notify the Ledger when funds need to be moved.
For reconciliation, as we receive an expected remittance from the payment systems, we can simply compare the expected entry with the actual one.
All operations on the Ledger are simply new rows in our double-entry bookkeeping, which mirrors our partner banks.

This means transferring responsibility back to the relevant teams. This is a complex yet gratifying task that is still ongoing. Once it is done, we will be able to greatly improve the data model and database performance, as well as greatly reduce the complexity of the authorization algorithm.

To show it, let’s look at the entries for our case study:

Figure: Schematic Bookkeeping

Having all of our operations recorded in that double-entry bookkeeping way allows us to easily integrate with our finance team tooling. We can easily generate exports, provide metrics, etc.
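
As a rough reconstruction of what such entries could look like for the €10 pie, the sketch below records each operation as one row moving an amount from one account to another. The account names and the `Entry` structure are assumptions; they are not taken from the original figure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    from_account: str
    to_account: str
    amount_cents: int
    label: str


# Hypothetical journal for the case study.
journal = [
    Entry("customer_vault", "card_transit", 1_000, "clearing of the pie payment"),
    Entry("card_transit", "mastercard", 1_000, "remittance for the clearing file"),
]


def balances(entries: list[Entry]) -> dict[str, int]:
    """Net effect per account; every row has two legs that cancel out."""
    totals: defaultdict[str, int] = defaultdict(int)
    for entry in entries:
        totals[entry.from_account] -= entry.amount_cents
        totals[entry.to_account] += entry.amount_cents
    return dict(totals)


result = balances(journal)
print(result)
# prints: {'customer_vault': -1000, 'card_transit': 0, 'mastercard': 1000}
```

The transit account nets to zero once the remittance is recorded, and the sum over all accounts is always zero, which is the property that makes exports and controls straightforward.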

We have multiple internal accounts like the transit account. Notably, we have accounts to hold abnormal transactions or fraudulent ones. This allows us to easily show them to the operations and back-office team, and to permit them to easily operate on those.

With this, you see how easy it is to plug in a new payment system or partner bank to the Ledger:

Add rules in the Ledger on how to contact the partner bank (on which accounts, with which payment system, etc.), if they don’t already exist.
Create the payment system service to target the agnostic API of the Ledger.

And we’re done. We no longer have a tight coupling between the payment system-specific logic and the Ledger logic. We only have to make sure that the payment system knows how to communicate with the Ledger.

Of course, this architecture change, which we are currently deploying for all our existing and new payment systems, is not the end of the game. A lot of the communication to and from the payment systems is redundant (the SEPA payment system makes calls similar to those of the check payment system, for instance).

Having a long-term vision is what permitted us to be the neobank you all love. We might change it with experience, but the ambitious goal drives us to always create the product you love.

Towards credit license

And now for the icing on the cake, let’s see what wonderful change being agnostic will allow us to make to Qonto: a credit license! To get it, we need ACPR approval, but we also need a lot of changes in the Ledger, which agnosticism makes possible:

We will be able to operate with a large number of partner banks.
We will be able to add multiple new payment systems. We can even invent our own!
We will be able to provide you many features we’re sure you’ll love.

Because all of this can now be done with far less code than before and a better separation of concerns, it means that we can focus on the innovation. Less is more.