Software Architect — II — Distributed Architecture
A monolith is a single deployment unit that contains all of the code.
This is the traditional approach, in which an organization mirrors the vertical separation of its internal teams in the layers of its architecture. Consider an organization with a UI developer team that coordinates with the backend service team, which in turn depends on the DB team that manages the database.
Such an organization then chooses an architectural style similar to the image below.
A domain object traverses every layer of the architecture. Any change to the domain code of a particular module, even a simple cosmetic one, requires the entire application to be rebuilt, tested, and deployed as a whole. Some changes also require the whole application to be system tested, at least minimally, before shipping, which increases the time and resource cost.
Below are some monolithic architecture styles that many organizations use because they save the time and cost of architecture and design work, although they are difficult to maintain.
- Layered architecture (N-Tier)
- Pipeline architecture (pipes and filters)
- Microkernel architecture (plugin)
Distributed means multiple deployment units connected through remote access protocols.
Domain-driven design is a very common way to explain distributed architecture. When an organization builds products that prioritize architectural characteristics such as scalability, availability, reliability, security, performance, deployability, and testability, so that end customers are served without interruption, it chooses a distributed system. Unlike the traditional monolith, which is separated along technical lines, a distributed system is separated horizontally by domain. The layer separation described above is replicated within every domain.
Several architecture styles achieve this:
- Service-based architecture
- Event-driven architecture
- Space-based architecture
- Service-oriented architecture
- Microservices architecture
There are some fallacies in distributed computing.
#1 Network is reliable
Every distributed architectural style puts the network between its separated parts. Consider a service hosted on a separate machine that has to be accessed by the user interface (a web browser or native app). This can only be achieved by sending messages in an agreed format (REST, SOAP, RPC, etc.) between the two layers, so the network plays a primary role in this architecture. Unlike a monolith, where the layers reside in a single system, distributed layers must communicate reliably to exchange information. Frequent network disconnections can make the entire system fail and become unusable.
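One common way to cope with an unreliable network is to retry a failed remote call with a growing delay. The sketch below is only an illustration under stated assumptions: `TransientNetworkError`, `call_with_retry`, and `make_flaky_service` are hypothetical names, and the remote service is simulated in memory rather than called over a real protocol.

```python
import time


class TransientNetworkError(Exception):
    """Hypothetical error raised when a remote call fails temporarily."""


def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientNetworkError, wait and try again.

    The delay doubles after each failed attempt (exponential backoff).
    The last failure is re-raised so the caller can decide how to degrade.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_service(failures_before_success):
    """Simulate a remote service that fails a fixed number of times first."""
    state = {"calls": 0}

    def service():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientNetworkError("connection dropped")
        return "order-accepted"

    return service


if __name__ == "__main__":
    # Record the delays instead of sleeping so the demo runs instantly.
    recorded_delays = []
    result = call_with_retry(make_flaky_service(2), sleep=recorded_delays.append)
    print(result, recorded_delays)
```

Retries only help with short interruptions. If the service stays unreachable, the error still surfaces after the last attempt, so the calling layer needs its own fallback behavior.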
#2 Latency is 0
We have the network, but what about its speed? Suppose your organization is setting up a distributed system in a third-party client environment whose network has high latency. You are developing a service-based application that responds to a browser application, and you have implemented an RPC protocol with a ping mechanism that checks whether the service is still connected. If latency is high, pings may not arrive within the allowed time, so the connection repeatedly times out and is eventually dropped.
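The effect of latency on a ping mechanism can be illustrated with a small simulation. This is a sketch rather than the behavior of any particular RPC library: `connection_state`, the timeout values, and the sample round-trip times are all assumptions chosen for illustration.

```python
def connection_state(round_trip_ms, timeout_ms, max_missed):
    """Replay a series of ping round-trip times and report the link state.

    A ping counts as missed when its round trip exceeds timeout_ms.
    The link is declared disconnected after max_missed consecutive misses;
    a ping that arrives in time resets the miss counter.
    """
    missed = 0
    for rtt in round_trip_ms:
        if rtt > timeout_ms:
            missed += 1
            if missed >= max_missed:
                return "disconnected"
        else:
            missed = 0
    return "connected"


if __name__ == "__main__":
    # Hypothetical measurements from a high-latency client network.
    samples = [40, 45, 260, 270, 300, 50]
    print(connection_state(samples, timeout_ms=200, max_missed=3))
    print(connection_state(samples, timeout_ms=500, max_missed=3))
```

The same measurements lead to a disconnect with a 200 ms timeout and to a stable connection with a 500 ms timeout, which is why timeouts should be tuned against the latency actually observed in the target environment.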
#3 Bandwidth is infinite
Continuing the example above, suppose about 1000 employees use the browser application and interact with the hosted service simultaneously. If a single browser app sends and retrieves 1 MB of data, then 1000 browsers exchange 1000 MB of data at a time. If the organization recruits another 1000 employees, they add a further 1000 MB to the network, which increases contention. The system should be designed to limit the amount of data sent by returning only the data that is actually needed instead of the entire set.
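The arithmetic above, and the idea of sending only the needed fields, can be sketched as follows. The record layout (`id`, `name`, `notes`) and the helper names are hypothetical and exist only for this illustration.

```python
import json


def total_traffic_mb(clients, mb_per_client):
    """Total data on the network when every client transfers at once."""
    return clients * mb_per_client


def project(records, fields):
    """Keep only the requested fields of each record."""
    return [
        {name: record[name] for name in fields if name in record}
        for record in records
    ]


def payload_bytes(records):
    """Size of the records when serialized as UTF-8 encoded JSON."""
    return len(json.dumps(records).encode("utf-8"))


# Hypothetical data set: each record carries a large field the UI never shows.
employees = [
    {"id": i, "name": f"employee-{i}", "notes": "x" * 1000}
    for i in range(100)
]
summary = project(employees, ["id", "name"])

if __name__ == "__main__":
    print(total_traffic_mb(1000, 1), "MB for 1000 clients")
    print(total_traffic_mb(2000, 1), "MB for 2000 clients")
    print(payload_bytes(employees), "bytes unfiltered")
    print(payload_bytes(summary), "bytes filtered")
```

Filtering on the service side reduces the payload per client, so growth in the number of clients increases network contention more slowly.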
#4 Network is secure
Once an application communicates over a network, it is no longer safe by default. Endpoints that are exposed internally, as well as to outside networks through a VPN so that employees can work from home, must be adequately secured. Architects tend to forget about security when firewalls and circuit breakers are already in place and managed by the internal IT team. The architect should communicate frequently with the IT team to learn which types of attacks they have already gone through and how those attacks were resolved, and incorporate those lessons into the system, for example by employing a fitness function that validates the security principle during continuous integration.
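A security fitness function can be as simple as an automated check that fails the build when an endpoint breaks an agreed rule. The sketch below assumes two rules (HTTPS only, authentication required) and a hypothetical endpoint inventory; the rules, field names, and URLs are illustrative and are not taken from any specific tool.

```python
from urllib.parse import urlparse


def insecure_endpoints(endpoints):
    """Return the names of endpoints that violate the assumed security rules.

    Assumed rules: the URL must use HTTPS and the endpoint must require
    authentication.
    """
    violations = []
    for endpoint in endpoints:
        scheme = urlparse(endpoint["url"]).scheme
        if scheme != "https" or not endpoint.get("requires_auth", False):
            violations.append(endpoint["name"])
    return violations


# Hypothetical endpoint inventory that a CI job might load from configuration.
ENDPOINTS = [
    {"name": "orders", "url": "https://internal.example/orders", "requires_auth": True},
    {"name": "reports", "url": "http://internal.example/reports", "requires_auth": True},
    {"name": "health", "url": "https://internal.example/health", "requires_auth": False},
]

if __name__ == "__main__":
    failed = insecure_endpoints(ENDPOINTS)
    # A CI job could exit with a non-zero status when this list is not empty.
    print("violations:", failed)
```

New rules learned from the IT team, such as a blocked port or a required header, could be added to the same check so that they are validated on every build.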
#5 Network is homogeneous
“Assumptions cause more issues than introspection and interrogation”
Developers tend to assume that all network hardware vendors are the same because they support the same network topology. That is not true in all cases. This fallacy ties back to all of the fallacies above.
We have seen the fallacies, but what about handling common concerns such as logging and database transactions in a distributed world?
An architect should consider the distributed nature of the system and how the maintenance team can pinpoint an issue and resolve it quickly by going through the logs available at different endpoints in the system. Several log consolidation tools, such as Splunk (www.splunk.com), aggregate the log data and present the analysis in a neat user interface.
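Log aggregation works best when every service tags its log lines with a shared identifier for the request. The sketch below shows that idea with plain JSON lines; the function names, the service names, and the log fields are assumptions for illustration and do not represent the format of any specific logging tool.

```python
import json
import uuid


def new_correlation_id():
    """Create an identifier that follows one request across services."""
    return uuid.uuid4().hex


def log_line(service, correlation_id, message):
    """Format one structured log entry as a JSON string."""
    return json.dumps(
        {"service": service, "correlation_id": correlation_id, "message": message}
    )


def trace(lines, correlation_id):
    """Collect, in order, the entries that belong to one request."""
    events = [json.loads(line) for line in lines]
    return [event for event in events if event["correlation_id"] == correlation_id]


if __name__ == "__main__":
    request_a = new_correlation_id()
    request_b = new_correlation_id()
    # Lines as they might arrive at an aggregator from different services.
    aggregated = [
        log_line("ui", request_a, "submit order"),
        log_line("ui", request_b, "open report"),
        log_line("order-service", request_a, "order validated"),
        log_line("db-service", request_a, "insert failed"),
    ]
    for event in trace(aggregated, request_a):
        print(event["service"], "-", event["message"])
```

With a shared identifier, the maintenance team can follow a single failing request through every service instead of searching each log separately.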
Monoliths handle database transactions efficiently and automatically using ACID. Distributed systems built from distributed services must provide a programmatic way of handling database transactions. Analogous to ACID in an RDBMS, the term BASE describes how distributed transactions behave in non-relational databases.
- BA: Basically Available
- S: Soft state
- E: Eventual consistency
In a distributed system such as microservices, a single ACID transaction across services is not possible because consistency is not immediate. It is reached at an undefined later point in time, for example when the system becomes idle or when a front-end service retrieves the data. The database service should remain available for read and write operations, it should maintain a state that can be rolled back if any subsequent transaction in a dependent service fails, and it should eventually become consistent.
The transactional saga is one design pattern for managing distributed transactions.
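A saga splits a distributed transaction into local steps, each paired with a compensating action that undoes it. The following is a minimal in-memory sketch of an orchestrated saga; `run_saga`, `SagaFailed`, and the order-processing steps are hypothetical names, and a real system would call separate services and persist the saga state.

```python
class SagaFailed(Exception):
    """Raised after a step fails and the completed steps were compensated."""


def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    If an action raises, the compensations of all previously completed
    steps run in reverse order, then SagaFailed is raised.
    Returns the names of the completed steps when every step succeeds.
    """
    completed = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as error:
            for _, undo in reversed(completed):
                undo()
            raise SagaFailed(f"step '{name}' failed: {error}") from error
        completed.append((name, compensation))
    return [name for name, _ in completed]


def charge_payment_unavailable():
    """Simulate a payment service that cannot be reached."""
    raise RuntimeError("payment service unavailable")


if __name__ == "__main__":
    audit = []
    steps = [
        ("reserve-stock", lambda: audit.append("stock reserved"),
         lambda: audit.append("stock released")),
        ("charge-payment", charge_payment_unavailable,
         lambda: audit.append("payment refunded")),
        ("ship-order", lambda: audit.append("order shipped"),
         lambda: audit.append("shipment cancelled")),
    ]
    try:
        run_saga(steps)
    except SagaFailed as failure:
        print(failure)
    print(audit)
```

Between the failed step and the end of compensation the data is temporarily inconsistent, which matches the eventual consistency described by BASE.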