How to approach Software Architecture? A First Principle Perspective

Apr 12, 2024

The goal of the Architecture of an App

We make many decisions while building an app (an application, a web app, or a mobile app). What is the best way to think about those decisions?

For example, let’s consider a fresh produce delivery application, “GreenProduce,” that enables users to find fresh produce, buy them, and get them delivered through a mobile application. Let’s assume you are the architect of this application, and let us walk through your meta-level design process and the decisions you may make.

First of all, let’s single out our goals. Our goal is not just to get something working but to write an application that can evolve with the user’s needs and our understanding of the world.

If all we need is something working, soon we can ask ChatGPT. It could generate code in assembly language, which will run very fast. However, such a solution would not work well. When something changes (e.g., requirements change) or something unexpected happens (a bug, a data inconsistency) in the application, we need to adapt, and that is tough to do without understanding the code.

It is essential to understand that most architectural best practices (loose coupling, abstractions, layers, interfaces, etc.) and concepts (services, microservices, MVC, three tiers, databases, etc.) were created to facilitate human comprehension and allow for localized, modular updates without impacting the entire system. Over the application’s life cycle, the code may undergo many changes. Those changes trace one path among many possible futures. This creates uncertainties we need to cope with. I see handling those uncertainties as a primary theme across our architectural decisions.

What kind of changes do we have to handle?

Support more users or users in new markets.
New features or changes to features
Adapting to changes in dependencies or frameworks (e.g., libraries, APIs, cloud platforms), including changes to pricing models
Incorporating technology advancements
Handling regulatory requirements (e.g., General Data Protection Regulation (GDPR), Payment Card Industry Data Security Standard (PCI DSS), The Brazilian General Data Protection Law (LGPD))
Handling new security attacks, e.g., Apache Log4j (CVE-2021-44228), Apple iOS vulnerabilities (CVE-2023-41992, CVE-2023-41991, CVE-2023-41993)
UX improvements as we understand the user better or because user behaviors are shifting
Bug Fixes and Maintenance

In the next section, let’s explore different approaches to cope with these changes.

Agile Methods + Think Deeply

Try 1: Waterfall

When trying to minimize the impact of such changes, the first question is, can we reduce the changes? Can we anticipate the changes we will require and build an architecture for them? The waterfall method tried exactly this, and we found the approach inadequate.

Try 2: Flexible Design

The next logical question is, can we build a very flexible system, one we can change as needed? Sometimes, this is even considered the goal of architecture. The following are some example techniques for flexibility.

Interfaces, Services, APIs
Modular Design
Microservices Architecture
Layered Architecture
Service-Oriented Architecture
Design Patterns
Cloud portability
etc

If techniques for flexibility were (close to) cost-free, we could use them everywhere and aim for maximum flexibility. However, they have costs, too. As Gregor Hohpe pointed out in his article “Architecture: Selling Options,” these techniques behave like stock options: we pay now for the ability to make a change later, so we have to buy them judiciously. Cloud portability is one example (please see the article for more details).
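To make the option analogy concrete, here is a minimal back-of-the-envelope sketch. It is a simplified expected-value comparison, not the model from Hohpe’s article, and every name and number in it (`FlexibilityOption`, the costs, the probabilities) is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlexibilityOption:
    """A flexibility technique treated like an option: pay now for a cheaper change later."""
    name: str
    upfront_cost: float           # effort spent now to build the flexibility (person-days)
    change_cost_without: float    # cost of the change later if we did NOT build it
    change_cost_with: float       # cost of the change later if we DID build it
    probability_of_change: float  # our guess that the change is ever needed (0..1)

    def expected_net_value(self) -> float:
        """Expected saving from the option minus what we pay for it up front."""
        saving_if_exercised = self.change_cost_without - self.change_cost_with
        return self.probability_of_change * saving_if_exercised - self.upfront_cost


# Hypothetical numbers for GreenProduce, in person-days.
cloud_portability = FlexibilityOption(
    name="cloud portability layer",
    upfront_cost=60,
    change_cost_without=200,
    change_cost_with=40,
    probability_of_change=0.1,
)
payment_interface = FlexibilityOption(
    name="payment provider interface",
    upfront_cost=5,
    change_cost_without=30,
    change_cost_with=5,
    probability_of_change=0.5,
)

for option in (cloud_portability, payment_interface):
    value = option.expected_net_value()
    verdict = "worth buying" if value > 0 else "skip for now"
    print(f"{option.name}: {value:+.1f} person-days -> {verdict}")
```

Under these made-up numbers, the cheap interface pays for itself, while the portability layer does not until a platform switch becomes much more likely. The point is only that each technique has a price that should be weighed against how likely the change is.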

Try 3: Agile

Therefore, instead of either extreme, we are now biased towards agile methods. Agile methods release often, get feedback as fast as possible, and use that feedback to find the right architecture. This is a great tool to handle uncertainty in your applications.

For example, we can release the GreenProduce application with limited items for a fixed local area and use the feedback to learn and improve the system iteratively. We need to start with a minimum version that can add value to end users and, after that, keep adding value in short Sprints.

Is just the agile approach enough?

While agile is a great tool, some feedback may come too late. For example, you may find that a change requires a significant overhaul of the database or the API, and such changes are very hard, if not impossible. So, thinking deeply about certain areas helps.

Unfortunately, we have limited time and effort. Hence, we can only think deeply about some things. So, what things should we think deeply about? If we go back to the goal, we get the answer: we need to think deeply about the parts that are hard to change, and we can adjust the other parts as needed. This is a natural extension of Jeff Bezos’s “Reversible and Irreversible Decisions” to software architecture.

What are examples of things that are hard to change?

APIs — hard to change even across full rewrites of the system
Choice of database and database schema — hard to change even across rewrites of the system due to customer data
Frameworks we use to build the system
Cloud or hardware platform we build on top and the level of close integration with platform services
Integration points: external APIs. Can they scale up? Are there alternatives?
The scale of the system
User and the security model
The unique capabilities of the system would be your competitive advantage and hence require careful handling, e.g., handling the risk of perishing produce.

Incorporating what we discussed so far, our answer while designing architecture is to think deeply about the parts that are hard to change, then use an agile approach with user feedback to find the right architecture.

Three Context Factors

Would that be enough to decide on the right architecture? For example, our GreenProduce application can be in any one of the following different contexts.

We are building it for a startup, planning to launch in a limited locale quickly, prove the concept, and get to series B and C. Startups can often attract talent. Likely, the system will be handling 100s-1000s of TPS.
We are building it for a large supermarket chain and plan to make it available region by region. It will be handling 1000s of TPS.
We are building it for the Department of Agriculture of Country X, where we plan to make it available nationwide.

Each instance of the system will have different scale requirements, timelines, and team skill levels. Hence, they will have different architectural considerations. Therefore, the following three factors (context) affect the right architecture, thus creating uncertainty.

Time to market
The skill level of the team
The performance sensitivity of the system (how much load it must face)

The first two follow the project management triangle or triple constraints (time, quality, cost), where cost is often a proxy for the team’s skill level.

One can argue that the first two are not the architect’s responsibility. Still, an architecture that aligns with those constraints has a much better chance of success than one that is blind to them. I am concerned with the success of the project, not with the responsibility boundary.

As for performance sensitivity, the architecture of a system that runs close to its limits and the architecture of a system that has margin to spare often need to be very different: the latter can use simpler techniques, while the former requires the best techniques available.

Considering the above three context factors, we can answer the question of how to make good architectural decisions as follows. We need to think deeply about things that are hard to change and use an agile method with user feedback, while also factoring the context into those decisions.

Based on my 20+ year architecture journey involving projects like Apache Axis2 and many Fortune 500 use cases, I believe the above answer is a good approach. Our architectural decisions need to balance all these factors.

Four Guidelines

Finally, I have found the following four guidelines helpful for making decisions and avoiding common mistakes experienced architects make when handling uncertainty.

Guideline 1: “When can we rewrite the system?”

When we try to think deeply, we get into trouble if we try to plan too far into the future. For example, if our GreenProduce startup tries to design an architecture that will continue to work even when it is a unicorn, it will run into trouble. Likely, the team will get stuck in analysis paralysis and waste time, and their designs will most likely fail when the time comes. Another variation of this is called Google Envy, where we implement advanced capabilities (e.g., scale) we do not need (read Oz Nova’s article “You are not Google”).

To avoid these, it is often useful to bound the scope, and one way of doing that is to ask when we can rewrite the system. Successful applications are rewritten many times. It is not unrealistic for a startup to plan a rewrite beyond Series C, and it should. Often, more load means more revenue or a better valuation, which means we will have more money to invest in rewriting the system. Typically, a system that can handle 10,000s of requests is 10x more complex than a similar system that can handle 100s-1000s of requests. If you can live with a simple design for a year, delay the complex design as long as possible. Choosing the complex design too early will slow you down while increasing risks and providing a terrible return on investment (ROI).

Guideline 2: Leaders need to make decisions and absorb the risks.

Uncertainty in decisions is a major cause of project delays. When we approach the system and balance different considerations, some inevitable uncertainties and risks can’t be reduced to clear answers. For example, how much load does the system need to handle? Often, nobody knows. We tend either to play it too safe, wasting time and energy, or to get stuck in indecision, blocking progress. Risks are the leader’s responsibility: the leader should shoulder them and give clear goals to the teams after hearing the available information. Sometimes, we should wait for more information, but that must be done consciously, knowing that uncertainty often slows projects down significantly.

Guideline 3: Sometimes architects need to tolerate duplications.

With both code and architecture, we say, “Do not repeat anything.” However, it is important to know that other costs are involved, and sometimes fixing the duplication costs you more. For example, code that uses goto statements could minimize duplication because it can reuse every line of code, although such code will be too complex. There is a right granularity below which reuse makes things worse. We need to be aware of that. For example, let us assume that we want to avoid duplicating the log module code in the GreenProduce app and decide to extract it as a service. That will make the log service a bottleneck and add unnecessary network traffic. I have made similar mistakes several times and paid for them in project delays.
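A rough estimate shows why the extracted log service hurts. The sketch below assumes, purely for illustration, that every log line becomes one network call to the shared service. The function name and the numbers are hypothetical.

```python
def remote_log_calls_per_second(requests_per_second: int, log_lines_per_request: int) -> int:
    """Network calls per second the shared log service must absorb,
    assuming one call per log line (an illustrative assumption)."""
    return requests_per_second * log_lines_per_request


# Hypothetical load: 1,000 user requests per second, 10 log lines each.
app_load = 1_000
log_service_load = remote_log_calls_per_second(app_load, 10)
print(f"app handles {app_load} req/s, log service must handle {log_service_load} calls/s")
```

Under these assumptions, the log service has to handle ten times the traffic of the application it serves, while a duplicated in-process log module adds no network calls at all.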

Guideline 4: Think from the user’s point of view obsessively.

As architects who understand how the system works, we can often make implementation easier by morphing the requirements at the cost of user experience. Sometimes, this happens unconsciously. This is one of the common mistakes good up-and-coming architects make.

For example, consider the GreenProduce application. When you search the app for tomatoes, the app can actually ship only a subset of the matching results, depending on your location and the remaining shelf life of each product. Hence, it is complicated to show users only the produce that can be delivered to them (based on distance, the lifetime of the produce, etc.). Instead of making the complicated backend change (e.g., divide the area into cells, precalculate the items for each cell, and use that to calculate what the user needs), we would be tempted to show the user all the produce and tell them it is unavailable when they try to buy it. However, that is a terrible user experience.
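The cell-based backend change mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual GreenProduce design: it uses a flat grid measured in kilometers, judges deliverability from each cell’s center, and all names and constants (`Listing`, `CELL_SIZE_KM`, `MAX_DELIVERY_KM`, and so on) are hypothetical.

```python
import math
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the article.
CELL_SIZE_KM = 5.0                  # edge length of one grid cell
MAX_DELIVERY_KM = 40.0              # farthest distance we are willing to deliver
DELIVERY_SPEED_KMH = 30.0           # average delivery speed
MIN_SHELF_LIFE_ON_ARRIVAL_H = 24.0  # produce must stay fresh this long after arrival


@dataclass(frozen=True)
class Listing:
    item: str
    x_km: float          # where the produce is stocked, on a flat local grid
    y_km: float
    shelf_life_h: float  # remaining shelf life in hours


def cell_of(x_km, y_km):
    """Map a position to the (column, row) of the grid cell containing it."""
    return (math.floor(x_km / CELL_SIZE_KM), math.floor(y_km / CELL_SIZE_KM))


def cell_center(cell):
    col, row = cell
    return ((col + 0.5) * CELL_SIZE_KM, (row + 0.5) * CELL_SIZE_KM)


def is_deliverable(listing, x_km, y_km):
    """True if the listing is within range and still fresh enough on arrival."""
    distance_km = math.hypot(listing.x_km - x_km, listing.y_km - y_km)
    if distance_km > MAX_DELIVERY_KM:
        return False
    travel_h = distance_km / DELIVERY_SPEED_KMH
    return listing.shelf_life_h - travel_h >= MIN_SHELF_LIFE_ON_ARRIVAL_H


def precalculate(listings, cells):
    """Offline step: for each cell, keep the listings deliverable to its center."""
    return {
        cell: [l for l in listings if is_deliverable(l, *cell_center(cell))]
        for cell in cells
    }


def search(index, item, user_x_km, user_y_km):
    """Online step: look up the user's cell, then filter by item name."""
    candidates = index.get(cell_of(user_x_km, user_y_km), [])
    return [l for l in candidates if l.item == item]


listings = [
    Listing("tomato", 2.0, 2.0, 48.0),    # nearby and fresh
    Listing("tomato", 100.0, 0.0, 72.0),  # fresh but too far away
    Listing("tomato", 3.0, 1.0, 12.0),    # nearby but about to perish
    Listing("spinach", 2.0, 2.0, 48.0),
]
index = precalculate(listings, cells=[(0, 0), (1, 0)])
results = search(index, "tomato", user_x_km=1.0, user_y_km=4.0)
print([(l.item, l.x_km, l.y_km) for l in results])
```

With this data, the search returns only the nearby, fresh tomato listing, so the user never sees produce they cannot buy. Using the cell center is an approximation: users near a cell edge may get slightly optimistic or pessimistic results, and smaller cells trade more precalculation for better accuracy.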

We should always be on the lookout to avoid losing the user’s perspective. We should keep asking, “Am I driving the requirements from what the user needs, or from what is easy to implement?” Sometimes, we may have to adjust user requirements due to technical challenges (e.g., the use of client-side consistency models in some use cases). Still, those adjustments should happen only after careful consideration.

Summary

We build software systems to be able to evolve over time and reduce associated costs, or in other words, to reduce risks associated with uncertainty. We have a better chance of doing that when we can approach architectural decisions holistically rather than using architectural techniques from a bag of tools. The article discussed how. I hope you will walk away with the following takeaways.

The techniques we use have costs, too. Use them judiciously, avoiding cases where costs are higher than the gains. Prefer simple techniques over complex ones if the simple ones are sufficient.
Agile development, coupled with feedback, is an excellent tool for reducing uncertainty.
However, some feedback may come too late. So, we should couple agile techniques with deep analysis of parts of the system that are hard to change later.
Time to market, team skill level, and performance sensitivity affect most architecture decisions and should always be factored in.
The following four considerations can help you avoid common mistakes when making architectural decisions.

Bounding the scope using “When can I rewrite the system?” is often useful to avoid being overwhelmed with possibilities.
Leaders need to make decisions and absorb the risks
Sometimes, architects need to tolerate duplications.
Think from the user’s point of view obsessively.

If you enjoyed this post, you might also like my new book, Software Architecture and Decision-Making, where you can find more examples.