Low-budget Banking service using Amazon Cloud: architecture, performance & cost

The success of any banking application depends on how efficiently its backend can handle the user traffic. With the rise of cloud technology, businesses can leverage the benefits of scalability, flexibility, and cost-effectiveness. In this article, I will explore the performance of a banking application backend hosted in the AWS cloud.

My tests revealed that the system, with a low-budget cloud environment setup, was able to handle 200 heavy create-transaction requests per second (rps), and around 890 rps at peak, while operating on a set of 30M records. Moreover, I will delve into the infrastructure cost of hosting such a system in the cloud. My findings suggest that the cost is surprisingly low, at just $673 per month (as of 15 Mar 2023). Join me as we dive deeper into the performance and cost aspects of running a banking application backend in the AWS cloud.

Article bonus: benchmark of the test solution implementation using Java and Node.js with a public codebase.

In the article below you will find answers to the following questions:

  • What components does a good banking architecture consist of?

Table of contents:

What is a low-budget setup?
Measurement criteria
How to do it wrong, a straightforward/monolithic setup
Banking application challenges
How to do it right
Benefits of such architecture
Tech stack for the performance tests
Results for Node.js
Results for Java
The cost of the production system
What income could such a system generate?

What is a low-budget setup?

For my experiment I used only low-budget components from the AWS price list. The goal was to get the cheapest components that do the job. Example prices (on 15 Mar 2023 for US East region, Ohio):

  • PostgreSQL as a database; setup: db.r5.large (the 2nd cheapest instance with 2 CPUs; 16 GB memory) costs $171 per month (US East Ohio);

This is the minimum set of technological components you need for a banking system. Why do we need all these components and what do they actually do? I will cover this throughout the article.

Measurement criteria

To measure the performance of the system, I focused on the most critical component, which is the transaction creation process. This means that the requests per second (rps) figures that I obtained only apply to create/write operations. In reality, a banking application involves more read/GET operations, with a write/read ratio of about 20/80%.

A real banking application involves a range of other functions besides transaction creation and updates. Although I tested only transaction creation, the infrastructure I used has the potential to support account management, KYC, and other features.

My main focus was to determine how many transactions a banking application can create per second in a cloud setup. To achieve this, I utilized Artillery (https://www.artillery.io/), a tool for load testing. The test scenario was simple, calling a transaction-create endpoint. I conducted a warm-up session of 10 seconds and then a high-load session of 60 seconds, and for the final setups, I used longer sessions of 5 minutes.

It’s worth noting that the test scenario involved only 500 accounts. This approach allowed me to simulate high activity among a small group of users and a situation where there are many transactions per account in a short amount of time, such as a partner system communicating with ours via a REST API.

You can read here about the PostgreSQL records insertion speed using the cheapest AWS DB.

How to do it wrong, a straightforward/monolithic setup

One may think that a simple monolithic setup would be a good solution to serve all user requests for a banking application. By the monolithic approach I mean here a single service scaled horizontally, as in the picture below:

Classic straightforward architecture approach

In this diagram a single-purpose API service does everything:

  • user and operator authentication;

This setup can indeed have its place in the real world, but only for demonstration purposes, as a Proof of Concept.

Though the cost of this infrastructure is very attractive, it has a lot of critical flaws: lack of security, low processing speed, conceptual mistakes while processing several transactions in parallel for 1 banking account, instability, lack of compliance.

Banking application challenges

Before “doing it right” let’s have a look at the basic requirements a banking application should meet:

  • high availability (HA): always be able to tell the customer what is happening with their bank account and payments, 24/7, even under high load;

In terms of transaction creation, let’s have a look at the basic set of steps needed to successfully process a transaction:

What should be done to create and process a transaction

So to create a transaction, the following list of actions should be done:

  • check account balance;

Some of these actions need to be done within 1 database session, which takes a lot of time. Some of the operations are of an asynchronous nature — we cannot block the customer for a long time, making her wait for an answer.

And evidently this cannot be done in a monolithic app:

  • What happens to the customer if the partner system does not respond now?

How to do it right

The answer to the majority of these issues is moving to promises. The system does not perform all actions at once; it performs only the critical checks quickly and promises to accomplish the rest later. This is where asynchronicity comes in.

First, the application performs the most critical operations, such as checking the available balance and daily/monthly limits, calculating a fee, etc. Then the system leaves the user with a promise that the transaction will not be forgotten and will definitely be processed at some point in the future. So we need queues.
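The "check fast, promise the rest" idea can be sketched as follows. This is a minimal illustration and not the code from the experiment: the `Account` fields, the 1% fee and the in-process `queue.Queue` (standing in for a message broker) are all assumptions made for the example.

```python
import queue
import uuid
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical account model; amounts are integer cents to avoid float rounding.
    balance_cents: int
    daily_limit_cents: int
    spent_today_cents: int = 0


FEE_BASIS_POINTS = 100  # assumed flat 1% fee, not taken from the article


def accept_transaction(account: Account, amount_cents: int, outbox: queue.Queue) -> dict:
    """Run only the fast, critical checks, then enqueue the rest for later processing."""
    fee_cents = amount_cents * FEE_BASIS_POINTS // 10_000
    total_cents = amount_cents + fee_cents

    if total_cents > account.balance_cents:
        return {"status": "rejected", "reason": "insufficient funds"}
    if account.spent_today_cents + total_cents > account.daily_limit_cents:
        return {"status": "rejected", "reason": "daily limit exceeded"}

    # The "promise": the transaction is queued and will be processed asynchronously.
    tx_id = str(uuid.uuid4())
    outbox.put({"id": tx_id, "amount_cents": amount_cents, "fee_cents": fee_cents})
    return {"status": "accepted", "id": tx_id}
```

The caller gets an answer as soon as the checks pass; everything slow (writing to the database, talking to partner systems) happens on the consumer side of the queue.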

Second, concurrency over a single banking account should be eliminated. That means that for a single account we need to process transactions one by one. If we make a transaction processing service work this way for all accounts, we will be limited by the performance of each update/create operation. If a transaction takes 50ms to complete, then we have a throughput of only 20 transactions per second.

To scale the system, we should route each request to a dedicated service responsible for the account in question, by hashing the requests. This approach results in a set of worker services, each processing the transactions of its own group of accounts one by one, which can multiply the system’s throughput. More account groups lead to more services and greater throughput.
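A minimal sketch of this routing, assuming string account IDs and a fixed number of workers (both hypothetical; the article does not specify the hashing scheme). The second function reproduces the 50ms arithmetic above and gives only an upper bound, since in practice the workers share one database.

```python
import hashlib


def worker_for_account(account_id: str, worker_count: int) -> int:
    """Map an account to a worker index; the same account always lands on the same worker."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % worker_count


def max_throughput_tps(tx_duration_ms: float, worker_count: int = 1) -> float:
    """Upper bound on transactions per second when each worker processes sequentially."""
    return worker_count * 1000.0 / tx_duration_ms
```

Because every transaction of one account goes to the same worker, that worker can process them strictly in order without locking against the others.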

Finally the design looks like this:

“The right” approach

Warning: this is the minimal low-budget setup that can guarantee scaling, correct transaction processing, correct communication with banking partners and HA. But it still lacks:

  • separate authentication service;

The list can be continued according to the budget and the wish to reach higher marks in security/availability/resilience/observability fields.

Benefits of such architecture

Let’s overview a sample transaction lifecycle path:

  1. Browser sends transaction details to the API service;

Benefits here:

  • user does not wait, we promise to process her transaction;

Tech stack for the performance tests

I tried two technical approaches in my experiment to reveal how fast such a banking system can be. The backend was created using Node.js and Java. Cloud components with the same characteristics (CPU, memory) were used, of course. AWS PostgreSQL RDS (db.r5.large, 2 CPUs, 16 GB memory) was used as the database.

If interested, you can find sources of the experiment in the repository.

Node.js stack — straightforward/monolithic

For the first, monolithic architecture, where one service does all the operations at once:

Here the REST API service was run in a cluster using PM2 capabilities, so in total there was one Node.js process per core on the 2 CPUs. Component versions used:

  • Node.js version: 16.15;

In the experiment the following list of actions was implemented before the transaction creation:

  • generating transaction ID (UUID);

Node.js stack — the right one

The setup to test the “right” approach looks like this:

RabbitMQ version 3.8 was used here.

In this setup REST service was doing only these operations:

  • signature creation (for idempotency check);

And the Processing service (qwriter) was doing this:

  • generating transaction ID (UUID) (optional);

So the REST service in this setup was doing far fewer operations.
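The article does not say how the signature for the idempotency check is computed. One common approach, sketched here purely as an assumption, is to hash a canonical form of the request payload and reject any payload whose hash has been seen before. The field names are hypothetical, and the in-memory set stands in for whatever shared store a real deployment would use.

```python
import hashlib
import json


def transaction_signature(payload: dict) -> str:
    """Hash a canonical JSON form so that key order does not affect the signature."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyFilter:
    """Remembers signatures already seen (in memory, for illustration only)."""

    def __init__(self) -> None:
        self._seen = set()

    def is_duplicate(self, payload: dict) -> bool:
        signature = transaction_signature(payload)
        if signature in self._seen:
            return True
        self._seen.add(signature)
        return False
```

With such a filter a client can safely retry a request after a timeout: the second attempt is recognized as a duplicate instead of creating a second transaction.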

Java stack — straightforward/monolithic

As for Java, these are the tech specifications:

  • Java JDK 11, default heap size: 512 MB;

Java stack — the right one

The “right” setup was checked as well, with a single Java service for the REST API and 8 Node.js services acting as the Processing service (qwriter).

Alas, the setup with 8 dedicated Java threads for the queues was not checked. Instead I focused on how fast Node.js can be as a consumer service.

It’s worth mentioning that for storing transactions in PostgreSQL, table partitioning was applied on the “created” column of type “timestamp”. This allows you to decrease the load on the main table using a monthly granularity (this frequency can be changed to suit your needs).
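As an illustration of monthly partitioning, the helper below generates PostgreSQL DDL for one month's partition. It is a sketch under assumptions: the parent table name `transactions` is hypothetical, and the parent is assumed to have been created with `PARTITION BY RANGE (created)`.

```python
from datetime import date


def monthly_partition_ddl(parent: str, year: int, month: int) -> str:
    """Build the DDL for one monthly range partition of a table partitioned by date."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )
```

For example, `monthly_partition_ddl("transactions", 2023, 3)` yields a statement creating `transactions_2023_03` for rows with `created` from 2023-03-01 up to, but not including, 2023-04-01.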

Results for Node.js

Around 100 test sessions were run to make sure the results are stable and do not deviate. But first, it was necessary to understand the maximum number of POST requests that could be transferred to and served by the server if it does nothing. This setup was used:

The number is: 650 dummy rps for a single Node.js process.

The first result: numbers for a straightforward/monolithic setup for Node.js when working with 1/15/30 million transaction records in the database:

RPS for different volumes of data for a classic approach

Explanation and details:

  • 75 rps means that it took around 13ms to execute 1 transaction for 1Mln data set;

The numbers for the “right” setup:

RPS for different volumes of data for “the right” approach

Explanation and details:

  • exploratory analysis has shown that, with this infrastructure setup, 6 is the ideal number of Node.js processing services; a higher number did not give any significant performance boost;

The significant difference from the monolithic approach is that the system is now much more performant and much more stable. Even beyond 400 rps the test stand works fine, but response latencies start to grow up to 1 sec. Of course this does not count as good UX, but the system is still responsive.

Here is how P99 latency (ms) grows with increasing load using the optimal setup:

P99 latency per rps

An important note here: boosts over 200 rps come at a price. The higher the rate of incoming requests, the longer it takes to process them all, because the processing units can still handle only 200 create-transactions per second. So if the system experiences 400 rps for 1 minute, the last user transactions will be processed only 1 minute later. That is not acceptable. Users should get a response with a success message within a maximum of 3 seconds. This gives us approximately allowed spike bursts of 400 rps for 3 seconds, not more.
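The arithmetic behind this note can be written down as a simple backlog model. It is a sketch under simplifying assumptions that are mine, not measured: the queue is empty when the burst starts, and both the incoming rate and the processing capacity are constant.

```python
def backlog_after_burst(incoming_rps: float, capacity_rps: float, burst_seconds: float) -> float:
    """Messages left in the queue at the end of the burst."""
    return max(0.0, (incoming_rps - capacity_rps) * burst_seconds)


def last_message_wait_seconds(incoming_rps: float, capacity_rps: float, burst_seconds: float) -> float:
    """How long the last request of the burst waits until it is processed."""
    return backlog_after_burst(incoming_rps, capacity_rps, burst_seconds) / capacity_rps


def max_burst_seconds(incoming_rps: float, capacity_rps: float, max_wait_seconds: float) -> float:
    """Longest burst that keeps the wait of the last request within the limit."""
    if incoming_rps <= capacity_rps:
        return float("inf")  # no backlog builds up at all
    return max_wait_seconds * capacity_rps / (incoming_rps - capacity_rps)
```

With 400 rps incoming and 200 rps of capacity, a 60-second burst leaves 12,000 queued messages and a 60-second wait for the last one, while a 3-second wait limit allows a burst of 3 seconds.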

Example of increasing queue load for a big incoming request rate of 400 rps:

Message queue state over time for a burst load

Example of good and bad incoming request rate spikes:

Different spikes for good and bad UX

Here, users in the blue spike wait for a transaction success status message (a promise) for a maximum of 3 seconds. Users from the green diagram may wait up to 10 seconds.

Results for Java

In my experience, pure CPU-bound tasks are completed about 3 times faster using Java compared to Node.js. I attribute this to a fundamental benefit: bytecode compilation.

A banking/payment application contains a mixture of operations, with a large share of I/O tasks. The most important point in our case: the faster the code, the less time a database connection is blocked. And this is crucial.

For a straightforward/monolithic setup, the Java service managed to work stably up to 100 rps before the deadlock. P99 latency was pretty attractive: 96.6 ms on average.

But for the “right” setup Java was used only for the REST API in my tests. On the processing side I managed to run 8 Node.js processing services, which gave a stable 200 operations per second even with a large number of transactions in the DB. The Java REST API service demonstrated an incredible boost, up to 890 rps! And it seems that this is not the limit.

Latencies during an increasing load:

P99 latency per rps for Java setup

As you see, latency is almost always the same. It practically does not increase with increasing load. During the tests I could not get a higher rps, but evidently the system can handle more.

Probably the traffic capacity of my dedicated internal AWS network was hit. Anyway, the Java REST API is at least 2.2 times faster than the Node.js solution for this sample banking functionality. I assume that after some tuning Java could hit the 1000 rps mark easily.

The cost of the production system

Here is the cost calculation for the production system. Banking regulations affect the system architecture, software complexity and the final cost of the infrastructure. The most crucial factor is the high availability (HA) compliance requirement. Basically, it means that the system should be replicated and should survive point failures or burst load. The compute nodes for services are the same ones that were used for the tests. They have enough resources to handle much more complicated logic than the logic used for the performance measurements.

Components and their cost in AWS cloud:

  • routing, Route53, 1 zone, 50Mln queries per month, $20

The total monthly infrastructure cost: $673.

The database is the most expensive component. Since the system needs to have 2 instances (1 active, 1 passive failover replica), we need to double the price of a single database instance, $171. Could a cheaper DB be used? Yes, db.m6g.large (2 vCPU, 8 GiB memory, $120) could be used. But with less memory, a big set of transactions will cause rps to drop to 100, if not lower: the database will not be able to flush data from memory to SSD disks fast enough. So a cheaper DB is possible when there are fewer transactions. But we want to earn more on more transactions, right? 😀

A note about Kubernetes. A setup where nodes are managed by Kubernetes is a very good option. It can speed up infrastructure operations and monitoring, and it can decrease maintenance costs. What would be the cost of a production system based on Kubernetes? It’s right here:

  • Routing, Route53, 1 zone, 50Mln queries per month, $20

The total monthly K8s-based infrastructure cost: $789.


In this article you discovered a banking application stand that is very close to the real one:

  • a more or less exact list of operations performed for each transaction creation;

Such a system is capable of creating 200 new transactions per second. It can hold load spikes up to 890 create requests per second.

When keeping read/write requests ratio of 80%/20% in mind, we can expect this performance:

  • ~650 rps of mixed read/write operations in normal mode;

Such a performant system, compliant and HA ready, costs $673 per month in AWS.

And Java still proves that it’s a highly performant language for a backend.

What income could such a system generate?

Some speculation on what income such a system could generate.

Let’s assume that users are mainly active during the daytime, 16 hours out of 24. The average rate of new transactions per second is 16. This rate will generate roughly 30Mln new records per month. The average transaction fee is $0.50.

So it’s about $15Mln per month 💵. Hm, not bad for a $673 cost system!
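The estimate can be reproduced with a few lines of arithmetic. The 30-day month is my assumption; with it, the stated rate gives about 27.6Mln transactions, which the text rounds up to 30Mln.

```python
def monthly_transactions(tps: int, active_hours_per_day: int, days: int = 30) -> int:
    """Transactions created per month at a constant rate during the active hours."""
    return tps * 3600 * active_hours_per_day * days


def monthly_fee_income(transactions: int, fee_usd: float) -> float:
    """Total fee income in USD for the given number of transactions."""
    return transactions * fee_usd
```

Here `monthly_transactions(16, 16)` gives 27,648,000 transactions, or $13,824,000 in fees at $0.50 each; the rounded 30Mln transactions give the $15Mln quoted above.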

If you have any questions, comments or improvements, do not hesitate to contact me.

Link to the experiment codebase.

Icons by: https://olkeen.com/portfolio/vector-illustrations/.

Photo by Ferran Fusalba Roselló on Unsplash.



IT expert with 20+ years of experience. https://tashlikovich.info
