Checklist for High-Load Software Architects. Part 1 — Architecture Design

Dmytro Nasyrov
Published in Pharos Production
2 min read · Aug 18, 2024

This article will be useful if you are starting a project that may grow into a high-load system, or if you already run one. Each item on this checklist helps you avoid specific problems that arise when operating such systems. Some items may seem obvious, and others may turn out to be unnecessary for your project.

Microservices and functional separation

[✔] my system is segmented and its parts are independent

Microservices are not a panacea. Still, a monolith eventually needs to be cut into separate functional applications that work independently of each other. This lets you scale individual nodes and increases the reliability of the system as a whole. The nodes must be completely independent: each has its own storage and its own domain name, and each scales independently of the others.

Fat client

[✔] I thought about using a “fat client” and client-side balancing

Application-level balancing may be worth considering. If you run something like a social network or an online store, the client can receive a list of your servers' DNS names from the main server and then access them in turn, or switch between them as needed, for example when one server responds slowly or fails completely. To distribute the load further, you can implement a “fat client”: fetch raw data with requests and render it on the client side. Of course, this point depends heavily on the business logic.
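As a rough illustration of client-side balancing, here is a minimal Python sketch. It assumes the client already holds the server list; the class name `ClientSideBalancer`, the `send` transport callback, and the `example.com` host names are hypothetical and only stand in for a real HTTP client and real servers.

```python
from typing import Callable, Sequence


class ClientSideBalancer:
    """Rotates through a server list and skips hosts that fail or time out."""

    def __init__(self, hosts: Sequence[str]) -> None:
        if not hosts:
            raise ValueError("at least one host is required")
        self.hosts = list(hosts)
        self._next = 0  # round-robin position

    def call(self, send: Callable[[str], str]) -> str:
        """Try each host at most once, starting from the round-robin position.

        `send(host)` is a hypothetical transport function: it returns the
        response body, or raises ConnectionError / TimeoutError.
        """
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next]
            self._next = (self._next + 1) % len(self.hosts)
            try:
                return send(host)
            except (ConnectionError, TimeoutError):
                continue  # slow or dead server: switch to the next one
        raise RuntimeError("no server in the list answered")


# Demo with a fake transport in which the first server is down.
def fake_send(host: str) -> str:
    if host == "a.example.com":
        raise ConnectionError("server is down")
    return f"ok from {host}"


balancer = ClientSideBalancer(["a.example.com", "b.example.com", "c.example.com"])
first = balancer.call(fake_send)   # a fails, so b answers
second = balancer.call(fake_send)  # rotation continues with c
```

In a real client the list would be refreshed from the main server from time to time, and a host that keeps failing would probably be removed from rotation for a while instead of being retried on every pass.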

Lazy evaluation and asynchronous processing

[✔] what can be deferred is executed asynchronously

Results that are not required in real time and can wait a second, a minute, or an hour or two can easily move to background processes. That is, you can hand complex calculations, or calculations over huge amounts of data, to asynchronous processing: a worker processes the event queue while the client shows a stub such as “Wait, data is being processed …”. This applies even to calculations that take only a couple of hundred milliseconds. The important thing is that the client does not hold an open connection to your service while waiting, which frees up the data channel.
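The worker-plus-stub idea can be sketched in a few lines of Python. This is an in-process illustration only: a production system would more likely use a message broker and separate worker processes. The names `BackgroundJobs`, `submit`, `status`, and `heavy_report` are hypothetical.

```python
import queue
import threading
import uuid

PENDING = "Wait, data is being processed ..."


class BackgroundJobs:
    """Runs submitted jobs on one worker thread and keeps their results."""

    def __init__(self) -> None:
        self._tasks = queue.Queue()  # holds (job_id, callable) pairs
        self._results = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, job) -> str:
        """Queue a job and return immediately with an id the client can poll."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._results[job_id] = PENDING
        self._tasks.put((job_id, job))
        return job_id

    def status(self, job_id: str):
        """Return the stub while the job is pending, then the real result."""
        with self._lock:
            return self._results[job_id]

    def wait_idle(self) -> None:
        """Block until every queued job has finished (used by the demo)."""
        self._tasks.join()

    def _worker(self) -> None:
        while True:
            job_id, job = self._tasks.get()
            try:
                result = job()
            except Exception as exc:  # keep the worker alive if a job fails
                result = f"failed: {exc}"
            with self._lock:
                self._results[job_id] = result
            self._tasks.task_done()


# Demo: the job is held back by an event so the stub is observable.
jobs = BackgroundJobs()
gate = threading.Event()


def heavy_report() -> int:
    gate.wait()  # stands in for a long calculation
    return sum(range(1_000_000))


job_id = jobs.submit(heavy_report)
early = jobs.status(job_id)  # the stub text: the job has not finished yet
gate.set()
jobs.wait_idle()
done = jobs.status(job_id)   # the computed sum
```

The client only makes short calls to `submit` and `status`, so no connection stays open while the calculation runs.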

Applying fault-tolerance patterns

[✔] I have applied fault tolerance patterns in my system

A distributed system whose code does not implement fault-tolerance patterns is vulnerable to fairly common situations such as timeouts of external system calls, restarts or failures of downstream services, and other “pests”. Patterns worth applying include:

  • Circuit Breaker
  • Bulkhead
  • Default behavior
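The three patterns listed above can be sketched together in Python. This is a simplified, single-threaded illustration under stated assumptions, not a production library: the class names, the `max_failures` and `reset_after` parameters, and the injectable `clock` are all hypothetical choices made for the example.

```python
import threading
import time


class CircuitBreaker:
    """Stops calling a failing dependency for `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, default):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return default  # open: fail fast with the default behavior
            # Half-open: allow one trial call; one more failure reopens.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return default
        self._failures = 0
        return result


class Bulkhead:
    """Caps concurrent calls to one dependency so it cannot use every worker."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, default):
        if not self._slots.acquire(blocking=False):
            return default  # compartment is full: reject instead of queueing
        try:
            return func()
        finally:
            self._slots.release()


# Demo with a fake clock so the timing is deterministic.
now = [0.0]
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])
attempts = []


def flaky():
    attempts.append(1)
    raise TimeoutError("downstream timed out")


results = [breaker.call(flaky, default="cached") for _ in range(3)]
now[0] = 11.0  # the reset window has passed
recovered = breaker.call(lambda: "fresh", default="cached")

bulkhead = Bulkhead(max_concurrent=1)
nested = bulkhead.call(lambda: bulkhead.call(lambda: "inner", "rejected"), "unused")
```

In the demo the third call never reaches `flaky`, because the breaker has already opened after two failures, and the nested bulkhead call is rejected because the single slot is taken. The breaker as written is not thread-safe; a shared instance would need a lock around its counters.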

You can say Hi to us at Pharos Production — a software development company

https://pharosproduction.com

Follow our product Ludo — the reputational system of the Web3 world

https://ludo.com

We build high-load software. Pharos Production founder and CTO.