Author: Robert C. Martin
“Getting something to work once just isn’t that hard. Getting it right is another matter entirely. Getting software right is hard. It takes a passion for the craft and the desire to be a professional.”
Review & Thoughts
For decades, the parallels between software design and the architecture of physical structures have been a recurring source of inspiration for new paradigms and methodologies in the field of software engineering. British architect and design theorist Christopher Alexander is credited as the first to use the word “pattern” to describe architectural techniques in the 1970s. It wasn’t long before the term was being used to describe programming techniques, with plenty of accompanying literary works to permanently solidify the metaphor as industry canon.
But this is not a book of software patterns. It’s a book about software architecture. More accurately, it’s a book about the principles of software architecture. It’s an end-to-end initiation into the world of designing software systems. Understanding how to design an effective system is instrumental to your ability to contribute to it. Whether or not your job title has the word “architect” in it, you are impacting the architecture of that system with every commit. Design principles like SOLID guide you to compose quality components. Architectural principles guide you to define and arrange those components into a flexible and dynamic structure.
In that regard, the architecture of a building and that of a software application differ. A physical structure must be strong and rigid, resistant to the forces of change. A software structure must be fluid and adaptable, able to shift and evolve with the forces of change. Robert C. Martin’s no-nonsense, axiomatic principles will establish in you an innate sense for software architecture — you will learn to see the system as a living, breathing organism with data flowing through its capillaries. You will learn to feel it.
Clean Architecture offers no shortage of Martin’s trademark style of plain, direct, no-holds-barred advice. He’s a man with unshakeable clarity, and is genuine to a fault. Thus, it is a challenge to criticize his diction, however irreverent it may be at times. Instead, my more specific insights can be found below as commentary alongside some of my favorite excerpts. The only parting comment I will offer in summary is that Clean Architecture has earned a permanent spot on my shelf among the most formative books I’ve read.
Excerpts & Commentary
“[Younger programmers] might insist that everything is new and different nowadays, that the rules of the past are past and gone. If that is what they think, they are sadly mistaken. The rules have not changed. Despite all the new languages, and all the new frameworks, and all the paradigms, the rules are the same now as they were when Alan Turing wrote the first machine code in 1946. But one thing has changed: Back then, we didn’t know what the rules were.”
“Getting something to work once just isn’t that hard. Getting it right is another matter entirely. Getting software right is hard. It takes a passion for the craft and the desire to be a professional.”
What Is Design and Architecture?
“The measure of design quality is simply the measure of the effort required to meet the needs of the customer. If that effort is low, and stays low throughout the lifetime of the system, the design is good. If that effort grows with each new release, the design is bad. It’s as simple as that.”
This is ETC (“Easier to Change”) in a nutshell. Always tend towards the design that’s easier to change later.
“Most modern developers work their butts off. But a part of their brain does sleep — the part that knows that good, clean, well-designed code matters. These developers buy into a familiar lie: ‘We can clean it up later; we just have to get to market first!’ Of course, things never do get cleaned up later, because market pressures never abate.”
“The bigger lie that developers buy into is the notion that writing messy code makes them go fast in the short term, and just slows them down in the long term. The fact is that making messes is always slower than staying clean, no matter which time scale you are using.”
A Tale of Two Values
“Software developers often feel as if they are forced to jam square pegs into round holes. The problem, of course, is the architecture of the system. The more this architecture prefers one shape over another, the more likely new features will be harder and harder to fit into that structure.”
“Dijkstra once said, ‘Testing shows the presence, not the absence, of bugs.’ In other words, a program can be proven incorrect by a test, but it cannot be proven correct. All that tests can do, after sufficient testing effort, is allow us to deem a program to be correct enough for our purposes.”
OCP: The Open-Closed Principle
“Clearly, if simple extensions to the requirements force massive changes to the software, then the architects of that software system have engaged in a spectacular failure.”
ISP: The Interface Segregation Principle
“Depending on something that carries baggage that you don’t need can cause you troubles that you didn’t expect.”
DIP: The Dependency Inversion Principle
“The Dependency Inversion Principle (DIP) tells us that the most flexible systems are those in which source code dependencies refer only to abstractions, not to concretions.”
“We tend to ignore the stable background of operating system and platform facilities when it comes to DIP. We tolerate those concrete dependencies because we know we can rely on them not to change. It is the volatile concrete elements of our system that we want to avoid depending on. Those are the modules that we are actively developing, and that are undergoing frequent change.”
In other words, you shan't be reprimanded for coupling your code with ArrayList and HashMap. Focus your attention elsewhere.
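To make that concrete, here is a minimal Java sketch of my own (not from the book). The names TaxPolicy, FlatRateTaxPolicy, and Invoice are hypothetical. The volatile piece, the tax rule, sits behind an interface, while the stable ArrayList is used directly without guilt.

```java
import java.util.ArrayList;
import java.util.List;

// Abstraction owned by the high-level policy. Hypothetical name.
interface TaxPolicy {
    long taxInCents(long subtotalInCents);
}

// Volatile, actively developed detail. It depends on the abstraction, not the reverse.
final class FlatRateTaxPolicy implements TaxPolicy {
    private final int percent;

    FlatRateTaxPolicy(int percent) {
        this.percent = percent;
    }

    @Override
    public long taxInCents(long subtotalInCents) {
        return subtotalInCents * percent / 100;
    }
}

// High-level code: it knows only the TaxPolicy interface.
final class Invoice {
    // A concrete dependency on ArrayList is tolerated: it is stable platform code.
    private final List<Long> lineItemsInCents = new ArrayList<>();
    private final TaxPolicy taxPolicy;

    Invoice(TaxPolicy taxPolicy) {
        this.taxPolicy = taxPolicy;
    }

    void addLineItem(long cents) {
        lineItemsInCents.add(cents);
    }

    long totalInCents() {
        long subtotal = 0;
        for (long cents : lineItemsInCents) {
            subtotal += cents;
        }
        return subtotal + taxPolicy.taxInCents(subtotal);
    }
}

final class InvoiceDemo {
    public static void main(String[] args) {
        Invoice invoice = new Invoice(new FlatRateTaxPolicy(10));
        invoice.addLineItem(1_000);
        invoice.addLineItem(500);
        System.out.println(invoice.totalInCents());
    }
}
```

Swapping in a different tax rule means writing another TaxPolicy implementation; Invoice does not change.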
“Interfaces are less volatile than implementations.”
“Inheritance is the strongest, and most rigid, of all the source code relationships; consequently, it should be used with great care.”
The coupled relationship of one class that uses another is analogous to a cancerous growth on the skin. Over time, it’s likely to worsen, but it can usually be excised without causing significant damage to the overall structure. The coupled relationship of one class that extends another is analogous to a neurological disease. Every aspect of the behavior of the derived class is at the mercy of its parent.
Inheritance also tends to result in untrustworthy code. It is untrustworthy because you cannot depend on the contracts of a sub-class to match the contracts of its parent. This is precisely the basis for the Liskov Substitution Principle which, like any principle, is almost impossible to enforce. The best way to guarantee your code won’t be sabotaged by renegade sub-classes is to avoid inheritance altogether. And if you must extend, at least avoid overriding concrete methods of a parent class.
Mocking a class for testing purposes is the major exception to this rule. However, be careful when you violate code contracts at the test level. Without discretion, you might create unreliable tests by inadvertently making exceptional behavior acceptable.
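As an illustration of the "avoid inheritance" advice, here is a small sketch of my own, assuming a hypothetical CountingSet that tracks how many elements were ever offered to a set. It wraps a HashSet instead of extending it, so it neither overrides a parent's concrete methods nor depends on how the parent implements them.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example: composition instead of "extends HashSet".
final class CountingSet<E> {
    private final Set<E> delegate = new HashSet<>();
    private int attemptedAdds = 0;

    boolean add(E element) {
        attemptedAdds++;
        return delegate.add(element);
    }

    boolean addAll(Collection<? extends E> elements) {
        // Counted exactly once here. The delegate's own addAll cannot call back
        // into this class, so there is no hidden double counting.
        attemptedAdds += elements.size();
        return delegate.addAll(elements);
    }

    int attemptedAdds() {
        return attemptedAdds;
    }

    int size() {
        return delegate.size();
    }
}

final class CountingSetDemo {
    public static void main(String[] args) {
        CountingSet<String> set = new CountingSet<>();
        set.addAll(Arrays.asList("a", "b", "a"));
        System.out.println(set.attemptedAdds() + " offered, " + set.size() + " stored");
    }
}
```

Had CountingSet extended HashSet and overridden both add and addAll, the inherited addAll would call the overridden add for each element and every addition would be counted twice. The wrapper sidesteps that coupling to the parent's internals.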
“DIP violations cannot be entirely removed, but they can be gathered into a small number of concrete components and kept separate from the rest of the system.”
“Components are the units of deployment. They are the smallest entities that can be deployed as part of a system. In Java, they are jar files. In Ruby, they are gem files. In .Net, they are DLLs. In all languages, they are the granule of deployment.”
With project management systems like Maven, it’s fairly easy to write software in physically separate components, each compiled into a separate jar file. However, for more monolithic applications that consist of a single deployable artifact (or a small number of them), you can still separate components at the source code level. The way you think about modularity needn’t change.
Later in the book, in a bonus chapter written by Simon Brown, an alternative definition of “component” is offered: “A grouping of related functionality behind a nice clean interface, which resides inside an execution environment like an application.” Brown mentions a few different techniques for compartmentalizing (or packaging, in Java terms) software entities: by layer, by feature, or by component. He finds the first two to be sub-optimal and favors the last. Since he defines a component as consisting of related functionality, packaging by component allows you to confine any coupling within the component itself, only exposing abstractions (“a nice clean interface”) to the outside world.
“The classes and modules that are formed into a component must belong to a cohesive group. The component cannot simply consist of a random hodgepodge of classes and modules; instead, there must be some overarching theme or purpose that those modules all share.”
“Classes and modules that are grouped together into a component should be releasable together.”
Again, using a system like Maven to manage a project makes this somewhat trivial. However, if you treat Martin’s advice here as dogma, you’d have more Maven projects and jar files than you could realistically manage. It is the responsibility of the architect to decide when to draw physical component boundaries between independently deployable jars and when to draw those boundaries at the source code level, perhaps using Java packages.
“The SRP tells us to separate methods into different classes, if they change for different reasons. The CCP [Common Closure Principle] tells us to separate classes into different components, if they change for different reasons.”
“We want to make sure that the classes that we put into a component are inseparable — that it is impossible to depend on some and not on the others.”
“The CRP [Common Reuse Principle] says that classes that are not tightly bound to each other should not be in the same component.”
“The REP [Reuse/Release Equivalence Principle] and CCP [Common Closure Principle] are inclusive principles: Both tend to make components larger. The CRP [Common Reuse Principle] is an exclusive principle, driving components to be smaller. It is the tension between these principles that good architects seek to resolve.”
“Any component that we expect to be volatile should not be depended on by a component that is difficult to change. Otherwise, the volatile component will also be difficult to change.”
“One sure way to make a software component difficult to change is to make lots of other software components depend on it.”
“If all the components in a system were maximally stable, the system would be unchangeable. This is not a desirable situation. Indeed, we want to design our component structure so that some components are unstable and some are stable.”
“Database schemas are notoriously volatile, extremely concrete, and highly depended on. This is one reason why the interface between OO applications and databases is so difficult to manage, and why schema updates are generally painful.”
What Is Architecture?
“A software architect is a programmer, and continues to be a programmer. Never fall for the lie that suggests that software architects pull back from code to focus on higher-level issues. They do not! Software architects are the best programmers, and they continue to take programming tasks, while they also guide the rest of the team toward a design that maximizes productivity.”
Alright Mom and Dad, I know what I wanna be when I grow up!
“The architecture of a software system is the shape given to that system by those who build it. The form of that shape is in the division of that system into components, the arrangement of those components, and the ways in which those components communicate with each other.”
“The way you keep software soft is to leave as many options open as possible, for as long as possible.”
“The goal of the architect is to create a shape for the system that recognizes policy as the most essential element of the system while making the details irrelevant to that policy. This allows decisions about those details to be delayed and deferred.”
Martin uses the term “policy” a lot in this book. He is referring to business logic on a broad scale. In other words, the policy is the purpose of the software component. It is the highest level of all. It’s the answer to the question: “What does it do?”
He also uses the term “details” a lot, and defines it thus: “The details are those things that are necessary to enable humans, other systems, and programmers to communicate with the policy, but that do not impact the behavior of the policy at all. They include IO devices, databases, web systems, servers, frameworks, communication protocols, and so forth.” Indeed, interactions with details compose the majority of most software systems! Think about all the code required to wire up a persistence layer or a GUI. Treating the other end of these interactions as merely irrelevant detail is striking, but true! At least, we should strive to make it true. If you can’t swap your MySQL database for an Oracle database without rewriting business logic, there is a lot of room for improvement in that architecture.
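Here is one way that separation might look in Java. This is my own sketch, and AccountStore, DepositPolicy, and InMemoryAccountStore are hypothetical names. The policy owns a small interface, and the storage technology is just one implementation of it.

```java
import java.util.HashMap;
import java.util.Map;

// Boundary owned by the policy. Hypothetical name.
interface AccountStore {
    long balanceInCents(String accountId);

    void saveBalance(String accountId, long balanceInCents);
}

// Policy: the business rule. No SQL, no driver, no vendor.
final class DepositPolicy {
    private final AccountStore store;

    DepositPolicy(AccountStore store) {
        this.store = store;
    }

    long deposit(String accountId, long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("Deposit must be positive");
        }
        long updated = store.balanceInCents(accountId) + amountInCents;
        store.saveBalance(accountId, updated);
        return updated;
    }
}

// Detail: an in-memory stand-in. A MySQL- or Oracle-backed class could implement
// the same interface without DepositPolicy changing at all.
final class InMemoryAccountStore implements AccountStore {
    private final Map<String, Long> balances = new HashMap<>();

    @Override
    public long balanceInCents(String accountId) {
        return balances.getOrDefault(accountId, 0L);
    }

    @Override
    public void saveBalance(String accountId, long balanceInCents) {
        balances.put(accountId, balanceInCents);
    }
}

final class DepositDemo {
    public static void main(String[] args) {
        DepositPolicy policy = new DepositPolicy(new InMemoryAccountStore());
        policy.deposit("acct-1", 500);
        System.out.println(policy.deposit("acct-1", 250));
    }
}
```

The in-memory store also doubles as a test fixture, which is a nice side effect of keeping the database on the far side of an interface.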
“Good architects carefully separate details from policy, and then decouple the policy from the details so thoroughly that the policy has no knowledge of the details and does not depend on the details in any way.”
“There is true duplication, in which every change to one instance necessitates the same change to every duplicate of that instance. Then there is false or accidental duplication. If two apparently duplicate sections of code evolve along different paths — if they change at different rates, and for different reasons — then they are not true duplicates.”
Wow, mind blown. Duplication is not always wrong? That is revolutionary thinking! But actually, it makes perfect sense, and most of us know this subconsciously. Have you ever written some specialized data structure, or wrapped a library class to serve some slightly more specific purpose? Probably. Did you think what you were doing was groundbreaking? Did you envision yourself patenting your invention, publishing it in a prestigious scientific journal, and ultimately receiving the Turing Award in recognition of your unassailable ingenuity? Probably not. You created a convenient solution for a specific problem — a domain-specific problem. The same solution likely exists elsewhere, but for a different problem in a different domain.
Let’s say you need to describe the “size” of strings in three categories: small, medium, and large. Your requirement states that strings under 1,000 characters in length are considered small. Between that and 100,000, they’re considered medium. Beyond that, they’re large. For your project, you will use these classifications to decide how to persist data. Small data will be cached, medium data will be written to flat files, and large data will be stored in a remote database.
After spelunking through the codebase for some time, you discover that an engineer on Team X has already created a class called StringSizer for their project. As if by cosmic destiny, they define their sizes the exact same way! Surely, any diligent programmer would not rewrite the exact same code. No, that wouldn’t be DRY. Instead, you reuse the StringSizer from Team X and proceed happily to complete your project. The code works, the stakeholders are satisfied with your deliverable, and the feature ships.
Fast forward to the next release, and Team X has received some feedback from the end-users. Their feature, an archiving tool that uses StringSizer to provide a user-friendly description of log files, has some new requirements. Turns out, most of these log files are much longer than 100,000 characters. Some are even hundreds of millions of characters long! They quickly get to work updating StringSizer with the new specifications: Strings under 1 million characters are small. Between 1 and 100 million, they are medium. Above that, they are large. Again, the code works, the stakeholders are satisfied with the results, and the update is released to the customers.
Suddenly, complaints start coming in of crashing servers, out-of-memory errors, hard disks flooded with gigantic temporary files, and surprisingly low database activity. Team X’s change has caused your feature to flood the heap and the filesystem with an enormous amount of data. Ouch!
The problem is, even though the code was exactly the same, the requirements were never the same. The high-level policies of these two projects were never aligned. What appeared to be duplication was just an illusion — a dangerous illusion indeed.
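A rough sketch of the alternative, in Java: each feature owns its own thresholds, even though the two classes started out looking identical. The class names PersistenceSizer and LogFileSizer are hypothetical, and I am assuming the upper bound of "medium" is inclusive, which the requirements above don't actually specify.

```java
// Shared vocabulary only. The thresholds live with whoever owns the requirement.
enum SizeCategory { SMALL, MEDIUM, LARGE }

// Your feature: sizes decide where data is persisted (cache, flat file, remote database).
final class PersistenceSizer {
    static SizeCategory classify(long length) {
        if (length < 1_000) return SizeCategory.SMALL;
        if (length <= 100_000) return SizeCategory.MEDIUM; // inclusive bound is an assumption
        return SizeCategory.LARGE;
    }
}

// Team X's feature: sizes describe log files to end users.
// It can change for Team X's reasons without touching the persistence rules.
final class LogFileSizer {
    static SizeCategory classify(long length) {
        if (length < 1_000_000) return SizeCategory.SMALL;
        if (length <= 100_000_000) return SizeCategory.MEDIUM; // inclusive bound is an assumption
        return SizeCategory.LARGE;
    }
}

final class SizerDemo {
    public static void main(String[] args) {
        long length = 5_000_000L;
        System.out.println("Persistence: " + PersistenceSizer.classify(length));
        System.out.println("Log viewer:  " + LogFileSizer.classify(length));
    }
}
```

A five-million-character payload is LARGE for persistence and MEDIUM for the log viewer. The two answers differ because the two policies differ, which is exactly why sharing one class was a trap.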
“Resist the temptation to commit the sin of knee-jerk elimination of duplication. Make sure the duplication is real.”
“Communications across service boundaries are very slow compared to function calls.”
“Ideally, the code that represents the business rules should be the heart of the system, with lesser concerns being plugged in to them. The business rules should be the most independent and reusable code in the system.”
“Architecture should not be supplied by frameworks. Frameworks are tools to be used, not architectures to be conformed to. If your architecture is based on frameworks, then it cannot be based on your use cases.”
The Clean Architecture
“No operational change to any particular application should affect the entity layer.”
Martin’s onion-like description of good architecture features business entities at the core. He describes entities as anything that “encapsulates enterprise-wide critical business rules.” It might be an object with some methods, or a set of data structures and functions. They are the reusable building blocks of the high-level policy of an application.
“Typically the data that crosses [architectural] boundaries consists of simple data structures. You can use basic structs or simple data transfer objects if you like. Or the data can simply be arguments in function calls. Or you can pack it into a hashmap, or construct it into an object. The important thing is that isolated, simple data structures are passed across the boundaries.”
This is why standardized notations like JSON are so useful. Regardless of your programming language, operating system, platform, or framework, you probably have a variety of ways to parse JSON. It doesn’t really matter how you transfer data, as long as both ends know how to use it. Keep it simple and standard.
Presenters and Humble Objects
“At each architectural boundary, we are likely to find the Humble Object pattern lurking somewhere nearby. The communication across that boundary will almost always involve some kind of simple data structure, and the boundary will frequently divide something that is hard to test from something that is easy to test. The use of this pattern at architectural boundaries vastly increases the testability of the entire system.”
Martin’s explanation of the Humble Object pattern is a bit vague. But it boils down to separating stuff that’s hard to test from stuff that’s easy to test. The classical example is a view/presenter pattern for a graphical user interface. The view is the Humble Object, responsible for the hard-to-test stuff, and kept as simple as possible. The presenter is the testable piece, through which we can pass data and easily see the outputs based on the inputs. For example, the presenter might be responsible for turning an epoch timestamp into a human-readable format. The view is just responsible for sending the result to the screen. The goal is for the hard-to-test code at these boundaries to be nothing more than a thin membrane. We get as close as possible to that boundary and ensure maximum correctness of the flow of data as it passes into the aether — the UI, the database, a web service, anywhere but here.
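The timestamp example can be sketched roughly like this in Java. It is my own illustration rather than code from the book. TimestampView and TimestampPresenter are hypothetical names, and the output format and the UTC zone are arbitrary choices.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// The Humble Object: so thin that there is nothing worth testing.
// A real implementation would push the text into a GUI widget.
interface TimestampView {
    void show(String text);
}

// The testable piece: pure formatting logic with no GUI toolkit in sight.
final class TimestampPresenter {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss 'UTC'", Locale.ROOT)
                    .withZone(ZoneOffset.UTC);

    private final TimestampView view;

    TimestampPresenter(TimestampView view) {
        this.view = view;
    }

    void present(long epochSeconds) {
        view.show(FORMAT.format(Instant.ofEpochSecond(epochSeconds)));
    }
}

final class TimestampDemo {
    public static void main(String[] args) {
        // The console stands in for the screen here.
        new TimestampPresenter(System.out::println).present(0L);
    }
}
```

In a test, the view is replaced by anything that captures the string, so the formatting can be verified without ever opening a window.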
Layers and Boundaries
“So what do we do, we architects? The answer is dissatisfying. On the one hand, some very smart people have told us, over the years, that we should not anticipate the need for abstraction. This is the philosophy of YAGNI: “You aren’t going to need it.” There is wisdom in this message, since over-engineering is often much worse than under-engineering. On the other hand, when you discover that you truly do need an architectural boundary where none exists, the costs and risks can be very high to add such a boundary. So there you have it. O Software Architect, you must see the future. You must guess — intelligently. You must weigh the costs and determine where the architectural boundaries lie, and which should be fully implemented, and which should be partially implemented, and which should be ignored.”
This dilemma can make for some challenging code reviews. There’s a lot of nuance and subtlety in these decisions. Don’t fear subjectivity. Be comfortable wading through the gray area, where there is no right or wrong, and trust your instincts.
The Main Component
“Think of Main as the dirtiest of all the dirty components.”
What a burden this sentiment has lifted! It’s okay to break the rules when you are setting up your application for success — the setup code should help the rest of your code behave. Martin calls it “Main” in reference to the ubiquitous “main method” which is the entry point of your application, but unless that application is a single-threaded routine with a simple task, the principle applies elsewhere. Any time you are launching a process of some sort, there will likely be setup code involved. Use this as a place to inject your dependencies, assemble the required components, and give the process the tools it needs to succeed. If that requires breaking a few rules, so be it. Just keep the dirt in the pot.
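For a sense of what "keeping the dirt in the pot" can look like, here is a tiny sketch of my own, with hypothetical names (Logger, Job, ReportJob, ReportMain). Only the Main-like class reads arguments, picks defaults, and names concrete classes; everything it assembles sees abstractions.

```java
// Abstractions the rest of the application depends on. Hypothetical names.
interface Logger {
    void log(String message);
}

interface Job {
    void run();
}

// Clean, inner code: it knows only the Logger abstraction.
final class ReportJob implements Job {
    private final Logger logger;
    private final String reportName;

    ReportJob(Logger logger, String reportName) {
        this.logger = logger;
        this.reportName = reportName;
    }

    @Override
    public void run() {
        logger.log("Generating report: " + reportName);
    }
}

// The "dirty" component: the one place allowed to parse arguments,
// choose defaults, and wire concrete classes together.
final class ReportMain {
    static Job assemble(String[] args, Logger logger) {
        String reportName = args.length > 0 ? args[0] : "daily";
        return new ReportJob(logger, reportName);
    }

    public static void main(String[] args) {
        assemble(args, System.out::println).run();
    }
}
```

The same shape applies to any launch point, not just the literal main method: a thread, a scheduled task, or a request handler can each have its own small assembly step.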
Services: Great and Small
“Services that simply separate application behaviors are little more than expensive function calls, and are not necessarily architecturally significant.”
“Services, in and of themselves, do not define an architecture.”
“Services are, after all, just function calls across process and/or platform boundaries. Some of those services are architecturally significant, and some aren’t.”
“Think of a service in Java as a set of abstract classes in one or more jar files. Think of each new feature or feature extension as another jar file that contains classes that extend the abstract classes in the first jar files.”
“As useful as services are to the scalability and develop-ability of a system, they are not, in and of themselves, architecturally significant elements. The architecture of a system is defined by the boundaries drawn within that system, and by the dependencies that cross those boundaries. That architecture is not defined by the physical mechanisms by which elements communicate and execute.”
The Test Boundary
“From an architectural point of view, all tests are the same. Whether they are the tiny little tests created by TDD, or large FitNesse, Cucumber, SpecFlow, or JBehave tests, they are architecturally equivalent.”
“You can think of the tests as the outermost circle in the architecture. Nothing within the system depends on the tests, and the tests always depend inward on the components of the system.”
“The extreme isolation of the tests, combined with the fact that they are not usually deployed, often causes developers to think that tests fall outside of the design of the system. This is a catastrophic point of view. Tests that are not well-integrated into the design of the system tend to be fragile, and they make the system rigid and difficult to change.”
“Imagine a test suite that has a test class for every production class, and a set of test methods for every production method. Such a test suite is deeply coupled to the structure of the application. When one of those production methods or classes changes, a large number of tests must change as well. Consequently, the tests are fragile, and they make the production code rigid.”
That was hard to read, and even harder to highlight. However, as with drawing architectural boundaries, you must have foresight. You must make the decision to test thoroughly when you believe it is justified. Refactoring legacy code is a prime example of a case where you’d want very thorough tests in place. For new code, use your best judgment. Consider how stable the code is likely to be over time, and test accordingly.
Clean Embedded Architecture
“Firmware does not mean code lives in ROM. It’s not firmware because of where it is stored; rather, it is firmware because of what it depends on and how hard it is to change as hardware evolves.”
“Non-embedded engineers also write firmware! You non-embedded developers essentially write firmware whenever you bury SQL in your code or when you spread platform dependencies throughout your code. Android app developers write firmware when they don’t separate their business logic from the Android API.”
Funny, I had never really thought about the word “software” the way Martin describes it in this book. Software should be about softness. Coupling is the antithesis of softness. Firmware is firm because it is coupled with the hardware. If your program is coupled with third-party frameworks, a certain type of web server, or a particular flavor of RDBMS, how soft can it be?
“Stop writing so much firmware and give your code a chance at a long useful life.”
“Learn what works, then make a better solution.”
“There is much more to programming than just getting an app to work.”
The Database Is a Detail
“From an architectural point of view, the database is a non-entity — it is a detail that does not rise to the level of an architectural element. Its relationship to the architecture of a software system is rather like the relationship of a doorknob to the architecture of your home.”
“The structure you give to the data within your application is highly significant to the architecture of your system. But the database is not the data model.”
Pretty profound wisdom, especially in enterprise software. It can seem like the application is a mere lump of clay being forcibly wrapped around a database. You might shape the surface a bit, but you’re not fooling anybody. Everyone can see that you’ve got a hulking database lurking beneath that conspicuous veneer. The application just reeks of database odor. Even the UI looks like an arrangement of tables and columns! Tragic indeed, an entire system designed around an insignificant detail.
“There is nothing architecturally significant about arranging data into rows within tables. The use cases of your application should neither know nor care about such matters.”
“Many data access frameworks allow database rows and tables to be passed around the system as objects. Allowing this is an architectural error. It couples the use cases, business rules, and in some cases even the UI to the relational structure of the data.”
Okay, I’m starting to suspect Bob knows where I work.
“The organizational structure of data, the data model, is architecturally significant. The technologies and systems that move data on and off a rotating magnetic surface are not.”
The Web Is a Detail
“There are always marketing geniuses out there just waiting to pounce on the next little bit of coupling you create.”
“The GUI is a detail. The web is a GUI. So the web is a detail.”
Frameworks Are Details
“Frameworks tend to violate the Dependency Rule. They ask you to inherit their code into your business objects — your Entities! They want their framework coupled into that innermost circle. Once in, that framework isn’t coming back out. The wedding ring is on your finger; and it’s going to stay there.”
“You can use the framework — just don’t couple to it. Keep it at arm’s length. Treat the framework as a detail that belongs in one of the outer circles of the architecture. Don’t let it into the inner circles.”
The Missing Chapter (by Simon Brown)
“If you make all types in your Java application public, the packages are simply an organization mechanism (a grouping, like folders), rather than being used for encapsulation. Since public types can be used from anywhere in a code base, you can effectively ignore the packages because they provide very little real value. The net result is that if you ignore the packages (because they don’t provide any means of encapsulation and hiding), it doesn’t really matter which architectural style you’re aspiring to create.”
I think the effectiveness of Brown’s point is a bit lost here, since any competent engineer would understand why making everything public is a silly idea. It is far more common to see monolithic packages that contain way too many classes. Your Java packages should conform to the Common Reuse Principle and the Common Closure Principle. They should be modular components of highly inter-related entities, insulated from the outside world. The key takeaway is that Java packages are an extremely useful tool: a precision instrument for composing and enforcing architectural boundaries at the source code level. Learn to use them appropriately.
I'd love to know how you're enjoying this experimental Distilled series. Do you like the format? Is there something you want more or less of? Share your thoughts in a comment!