Intellectual Control, Developer Abstraction, and Testing

Earlier this year I attended the O’Reilly Software Architecture Conference in New York. One of the keynotes was presented by George Fairbanks on the concept of Intellectual Control. Incidentally, George is the author of my favorite software architecture book, Just Enough Software Architecture. You can find the keynote here. He argued that software developers no longer reason enough about their solutions because they focus on automated testing, which provides something closer to statistical control. He reviewed software development in the past (think the 80’s and 90’s), when teams were regularly deploying on a short cadence without any automated testing. They could do this because the developers maintained strict intellectual control over their solutions. That is to say, they fully understood how the solution should behave without running or testing it (obviously they still tested it, just manually), which gave them the confidence to deploy.

Nowadays developers lean heavily on automated unit and regression test suites to give them this confidence to deploy. But I’d say even this confidence is shaky at times with incomplete test suites or untrustworthy tests. George continued that this mentality results in a sort of “whack-a-mole” development where the developer writes code that breaks one or more tests, then the developer fixes the test blindly, often without a full understanding of why the test broke initially or how the fix materially impacted the solution. Rinse and repeat. George postulated that automated testing numbs us to our loss of reasoning.

As I think about the concept of Intellectual Control (IC), I think about how much of software development is abstracted away from the developer. And it seems like the amount of abstraction increases year to year. IDEs, frameworks, and third-party libraries all reduce a developer’s understanding of what is truly happening behind the curtain. The objective is to help us deliver products faster, but at what cost? This abstraction creates a culture of trust in systems that, in at least some cases, haven’t earned it. Relying on a suite of automated tests, unit or otherwise, is just another facet of this trust. Does the test suite truly represent a full validation of all the system capabilities? Is this mindset just creating a bunch of ticking time bombs in production?

I came across this blog which discusses the concept of knowledge debt, which seems to apply here. It argues that a programmer has so much to learn that, in order to get anything done, we have to take on some knowledge debt, that is, use some code or library or capability that we don’t fully understand. This makes a lot of sense: if programmers didn’t deliver any code until they knew exactly how everything underlying it worked (libraries, dependencies, operating system, network, etc.), they would never deliver anything. It’s turtles all the way down. There is always something more to learn.

This all boils down to risk: the risk of taking the time to fully understand all underlying software (and hardware) versus the risk of delivering a poor quality product with potential functional, security, performance, and other deficiencies caused by, or at least supported by, that underlying software. Our job as developers and architects is to balance that risk. We take time to focus on novel or critical dependencies and often ignore those that are familiar or less critical. But even then, libraries can be updated, standards change, and expected practices evolve. When was the last time you took a deep dive into your HTTP client library?

We can also mitigate the delivery quality risk with thorough testing, preferably automated. But tests must cover all aspects of risk, or at least the ones we are concerned with. In his keynote, George postulated that tests create a kind of statistical control where we can run the test suite and get a result, maybe expressed as a number or percentage, representing the compliance of the software. However, intellectual control can apply to testing as well. How much do we understand our tests? How confident are we that they are thorough and cover all aspects that are important to our delivery? The statistics provided by the tests are meaningless if we are not testing the right things.
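To make that last point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and mine, not something from George's keynote: the function `apply_discount`, both test functions, and the assumed requirement that a negative discount must never raise the price. The weak test executes every line of the function, so a line-coverage tool would report 100%, yet it passes without verifying any behavior we actually care about.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce price by percent."""
    if percent > 100:
        percent = 100  # cap the discount at 100%
    return price - price * percent / 100


def weak_test() -> bool:
    # Runs both branches of apply_discount, so line coverage is complete,
    # but it only checks that a float comes back.
    results = [apply_discount(50.0, 10), apply_discount(50.0, 150)]
    return all(isinstance(r, float) for r in results)


def stronger_test() -> bool:
    # Encodes what we believe the behavior should be, including the
    # assumed rule that a negative percent must not raise the price.
    return (
        apply_discount(50.0, 10) == 45.0
        and apply_discount(50.0, 150) == 0.0
        and apply_discount(50.0, -10) == 50.0
    )


if __name__ == "__main__":
    print("weak test passed:", weak_test())
    print("stronger test passed:", stronger_test())
```

The weak test passes and the stronger one fails, because a negative percent slips through and increases the price. Both tests would contribute the same coverage number, which is why the number alone says little unless we understand what the tests assert.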

It is important to maintain IC over the software, and that extends to the tests that verify the software. If we lack IC over the tests, then we lack a true understanding of the overall quality of our delivery. We should balance time spent gaining and maintaining IC of our solution with time spent gaining and maintaining IC of our verification mechanisms.

Software engineer, tech lead, architect blogging about my thoughts and experience with software development, design, architecture, agile, and security.