Code like a systems thinker

Kushal Joshi
8 min read · Oct 8, 2018


Software engineering is unique among professions for two main reasons: its relative youth (around 58 years old) and the steadfast separation of industry practitioners and academics. While there are new studies that use social and organisational theory to examine industry practice, and studies that view software engineering as ‘design’, I’m going to expand on an article about good naming practice from industry, from the perspective of systems thinking and learning loops. I’m still not seeing enough adoption of this view in coder culture. Or I’m missing it (please send links!).

Arlo Belshee’s Good Naming article (2015) expands on the human-computer-interaction problem of naming in software engineering and touches on the ‘systems’ effects of naming, which I will expand on here. I like Belshee’s approach as it doesn’t just repeat Chapter 2 of The Art of Readable Code (Boswell and Foucher, 2011) but starts a further exploration of the systemic impact of bad naming that could lead to increased technical debt, and notes a practical “learning loop” to cycle through when evolving names. To be clear, Belshee considers the Domain (roughly, the context the code is written for) but here I am talking about the codebase and its context as a System (an organised entity of interdependent parts), to expand on Belshee’s ideas. If you haven’t read it and write code (any code), do have a read. Then please come back :)

I had a little trouble with a couple of definitions which I want to clear up, although I’m happy to put this down to perspective and exposure; systems thinking and learning cycles are not discussed as commonly in computer science and amongst developers as they really should be.

Belshee asks:

“We’re good at reading complex code, and our job is to update code. Shouldn’t the definition of technical debt be something about the cost and risk of changing code?”

Freelancing on client projects can involve regular switching between (for me) JavaScript, Python, functional languages (Elm/ReasonML), data analysis languages (R), and often something higher level like pseudo-code or mathematical econometric algorithms, or on a bad day… Bash. This means complexity destroys my time. Logical, reasonable conventions accelerate productivity. More and more developers are moving into polyglot contexts where complexity destroys efficiency.

That is secondary to the critique above, however:

…our job is to update code. Shouldn’t the definition of technical debt be something about the cost and risk of changing code?

This is Belshee asking the obvious question, possibly due to the widely experienced need to communicate what technical debt is and how it impacts everything software engineers do. Viewed from a business-operations and systems thinking background, however, this question is quite frightening: who is asking this question? Is it not obvious? Is every developer and CTO not aware of this? The question sounds more like an obvious fact being explained to someone with the wrong mindset: maybe to someone with localised stone-cutter thinking instead of big-picture cathedral-builder thinking (Stone-cutters and Cathedral-builders, Girard & Lambert, 2007).

The cost and risk, however, are far more highly coupled than when laying masonry. Stone-cutter attitudes may not make a substantial difference to a wall that is being built; but when building the front-end of a new product, or when laying out a new API design or app architecture, stone-cutter thinking can produce badly architected, rushed, unmaintainable, inflexible code that not only leads to functional/behavioural bugs but becomes very hard to maintain, debug, or innovate around. This is not news. All developers have experienced this. The line of thought here is how we can build out from Belshee’s naming loop to recognise the system we are naming in and why we are naming this way. Our job should be to communicate through code, not just update it.

We already know how to do this in other domains. Economics, statistics, various other branches of mathematics, and the sciences all lean on symbolic conventions that form algorithms and refer to actions being applied to data, within a defined context. They do this succinctly and safely. Safely because they are conveying information, not creating a machine that will replicate itself a million times over, and we often forget this.

Cost and Risk

AI and analytics code is notorious for this. The good habits and conventions developed over the last decade of web app development are just starting to filter through to this parallel universe of analytical code written in Python, R and other statistically focussed platforms. Until then we will need to continue whingeing about terrible function names, single letter variable names, and more. Bad naming is an artefact of bad/rushed code that does not respect the ‘system’ the code is in.
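To make the complaint concrete, here is a small sketch of the kind of analytical code I mean, followed by the same computation with names that carry information. The example (a one-predictor least-squares fit) and every name in it are my own illustration, not taken from any particular codebase.

```python
# Typical rushed analytical code: correct, but the names say nothing.
def f(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return b, my - b * mx


# The same computation, with names that tell the next reader what is going on.
def fit_simple_linear_regression(predictor_values, response_values):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    sample_size = len(predictor_values)
    predictor_mean = sum(predictor_values) / sample_size
    response_mean = sum(response_values) / sample_size
    covariance_sum = sum(
        (p - predictor_mean) * (r - response_mean)
        for p, r in zip(predictor_values, response_values)
    )
    predictor_variance_sum = sum((p - predictor_mean) ** 2 for p in predictor_values)
    slope = covariance_sum / predictor_variance_sum
    intercept = response_mean - slope * predictor_mean
    return slope, intercept
```

Both functions return the same numbers. Only the second one tells a developer arriving from a front-end or data-ops context what the numbers are.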

The impact of not evolving names highlights the core challenge in today’s multi-context codebases. Switching from front-end, to back-end, to data-ops, to statistical code, to some cryptographic blockchain brain-fart of a codebase, we, the “builder”, have to piece together and hold functioning systems in our minds. We then log the design of the systems within code for the next contractor brave (gullible?) enough to surf the repo. There are two implications here:

  1. The obvious one: The code is both the blueprint and the ‘cathedral’ and should be treated as such. It is design and product in one. We can go full Deepak Chopra metaphysical quantum existence here but I will leave that for you to expand on if you so desire.
  2. The non-obvious one: The developer that writes the code and any future developer that reads and updates the code are part of a system. This system exists for the lifetime of the codebase. Its output is to provide the features of the codebase with maximum available performance.

Going back to Belshee’s question, the answer becomes more obvious. That second implication means the following: the ‘codebase-system’ needs to provide its required service/feature availability with minimum downtime and maximum flexibility and dependability. In the future, code may change. The unchanged code must be dependable, and understood by humans. In the future there may be innovations, and human developers need to grok existing code fast to be able to iterate or rewrite around innovation objectives. In today’s hyper-competitive software industry, limiting human understanding of code degrades the software organisation’s ability to innovate responsively around market changes.

This can mean either death, for a startup, or exponentially scaling development costs for organisations in which generations of stone-cutter type developers have written locally-optimised code that doesn’t respect the system formed of the codebase and all current-and-future-developers. This seems to be the “cost and risk of changing the code” that Belshee is linking to technical debt and the role of developers.

Root Causes

Belshee:

“So if our definition of technical debt is code that is difficult, expensive, or risky to change, then the root cause of that is code that is hard to scan.”

Behind this simple statement is the system described above. Hard-to-scan code can also be seen as code that is not fit for purpose for the system in which it exists. If we accept for a moment that the purpose of code is both performance and communication, naming becomes a leading, high-order concern. I disagree a little with the root cause, however, as I feel that when “code is hard to scan” there is an underlying reason.

Why would a developer follow or not follow the steps and leave a good or bad name for a function, or library, or test? The root is the mindset: a stone-cutter thinker will be looking to finish the task at hand, attach a quick unit test to ‘prove’ their work, and move on (measure the stone, slap it on the wall, and move on). A cathedral-building systems thinker will hold, amongst other concerns, the context of the system in their mind: the purpose of the code within its domain, the performance relative to how the code has been written, the team of current developers, the lifetime of the system, and the intent and content of what they are communicating to future developers (or to themselves in 3 weeks). The communication part of this thinking is what Belshee’s process seems to be capturing.

Learning loops

Belshee’s naming steps, or stages:

1. Missing

2. Nonsense

3. Honest

4. Honest and Complete

5. Does the Right Thing

6. Intent

7. Domain Abstraction
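As a rough illustration of how one piece of code might travel through the first six stages, here is a sketch based on my own reading of them. The order-pricing scenario, the bulk-discount rule, and every name below are hypothetical and are not taken from Belshee’s article.

```python
# All names and the bulk-discount rule are hypothetical, invented for illustration.
BULK_QUANTITY = 10          # units needed to qualify for the discount
BULK_DISCOUNT_PERCENT = 10  # prices are integers in cents to avoid float noise


# Stage 1, Missing: this logic would sit unnamed in the middle of a longer function.

# Stage 2, Nonsense: extracted, with a name that admits it means nothing yet.
def thingy(items, ledger):
    total = sum(price_cents * quantity for price_cents, quantity in items)
    if sum(quantity for _, quantity in items) >= BULK_QUANTITY:
        total = total * (100 - BULK_DISCOUNT_PERCENT) // 100
    ledger.append(total)
    return total


# Stage 3, Honest: names the one thing we are sure of and flags the rest.
probably_totals_order_and_stuff = thingy

# Stage 4, Honest and Complete: clumsy, but the name now hides nothing.
total_order_with_bulk_discount_and_record_in_ledger = thingy


# Stage 5, Does the Right Thing: the long name exposed three jobs, so split them.
def order_subtotal(items):
    return sum(price_cents * quantity for price_cents, quantity in items)


def with_bulk_discount(subtotal, unit_count):
    if unit_count >= BULK_QUANTITY:
        return subtotal * (100 - BULK_DISCOUNT_PERCENT) // 100
    return subtotal


def record_sale(ledger, total):
    ledger.append(total)


# Stage 6, Intent: the caller's name says why the code runs, not how.
def checkout(items, ledger):
    unit_count = sum(quantity for _, quantity in items)
    total = with_bulk_discount(order_subtotal(items), unit_count)
    record_sale(ledger, total)
    return total
```

The behaviour never changes between stages; only what the next reader can learn from the names does. Stage 7 would go one step further and gather these functions into a domain type such as an order, which I have left out to keep the sketch short.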

Belshee’s “Insight Loop”:

1. Look at something.

2. Have an insight.

3. Write it down.

4. Check it in.

The naming steps and the debt-reduction process smell so much like a learning loop, I find it impossible not to push it further in that direction. In fact, Belshee calls it an “insight loop” in the following section. This is why I said earlier that we need more discussion of this sort in software engineering. What Belshee has noted down is a mental map of how he acts when naming. In 1974 Argyris and Schön defined this as a Theory of Action. Redrawing the steps and process as a learning loop, we see how this maps to single-loop learning.

In the steps laid out, Belshee seems to go around this cycle 7 times to get to the quality of name that he identifies as (my interpretation) maximally performant within the system. We can draw from the work of Argyris and see that in software engineering we already have many double-loop learning mechanisms that can improve this process. We use code reviews by senior developers and testers to check (external information/validation) whether the naming and information communicated is appropriate. We have pair-programming, which allows for rapid learning cycles through the communication between developers as they code. Finally, we have Agile-driven (or Scrum, if you really insist) retrospectives where we examine the system as a whole and discuss the good and bad of the system.
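The difference between the two kinds of loop can be sketched in a few lines. In single-loop learning the action (here, the name) is revised against a fixed standard; in double-loop learning outside feedback can change the standard itself. This is only a toy model under my own assumptions: the candidate names, the two standards, and the `code_review` function are all hypothetical.

```python
from typing import Callable, List

Standard = Callable[[str], bool]
Review = Callable[[str, Standard], Standard]


def single_loop(candidates: List[str], standard: Standard) -> str:
    """Try successive names until one meets a fixed standard."""
    for name in candidates:
        if standard(name):
            return name
    return candidates[-1]  # nothing passed; keep the latest attempt


def double_loop(candidates: List[str], standard: Standard, review: Review) -> str:
    """Run the single loop, then let a reviewer question the standard itself."""
    name = single_loop(candidates, standard)
    revised_standard = review(name, standard)
    if revised_standard is not standard:
        name = single_loop(candidates, revised_standard)
    return name


# Hypothetical successive names for one function, in the order they were tried.
attempts = ["thingy", "total_order_and_record_in_ledger", "checkout"]


def not_nonsense(name: str) -> bool:
    return name != "thingy"


def states_intent(name: str) -> bool:
    return not_nonsense(name) and "_and_" not in name


def code_review(name: str, standard: Standard) -> Standard:
    # The reviewer raises the bar when the accepted name still lists several jobs.
    return states_intent if "_and_" in name else standard


working_alone = single_loop(attempts, not_nonsense)
after_review = double_loop(attempts, not_nonsense, code_review)
```

Working alone, the developer stops at the first name that clears their own bar. With a review in the loop, the bar moves and the name moves with it, which is the role I am suggesting code reviews, pairing, and retrospectives play.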

Reflection

Why do I think this is an appropriate use of learning loops? Argyris and others examined how a firm can behave as a learning organisation. The assumption I’ve made here is that all (past, current, and future) developers that exist within the lifecycle of a codebase form a learning organisation that embeds its learnings within the code. This knowledge primarily exists in the form of names (names for variables, functions, tests, libraries, domain abstractions etc), as described by Belshee’s naming process.

I think this highlights the power and importance of systems thinking for developers. A cathedral-builder mindset and a systems thinking approach can carry significant weight when it comes to reducing cost and risk, and improving future operational performance (the availability, reliability, and flexibility of a software product), for firms leaning on software innovation for their competitive advantage.

(Title Photo by chuttersnap on Unsplash)


Kushal Joshi

Data Science/AI, Blockchain, (Data) Automation and Technology Strategy.