Productivity Squared: Automation and Critical Mass
In my last post, I argued that as developers we’ve been doing it wrong by not engineering our delivery solutions using our core skillset. When we write business software, we sometimes take shortcuts we know are not the best solution, but we normally try to find the best solution, because we know we’ll be living with it. When we write a build script or a Bash script to automate something, we nearly always go for the quickest, nastiest hack, ignoring the fact that we’ll be living with that also, and it has the potential to make our lives miserable and ultimately impact those of our users. Or even worse, we don’t automate at all, and continue to perform a repeated task in an error-prone, ad hoc way.
Without approaching automation strategically, we lose by stumbling into an unmaintainable mess and forgoing core concerns such as testability. But there’s something arguably worse: what we fail to gain. A joined up solution is more valuable than the sum of its parts.
The term critical mass is used in business to discuss network effects. Metcalfe’s Law states that the “effect of a telecommunications network is proportional to the square of the number of connected users of the system (n²).” It seems applicable to social networks. Recent history suggests its relevance for languages, module systems and frameworks, where the “network” is the specific technology and n is the number of users.
Metcalfe showed the figure below to argue that if a network is too small its cost exceeds its value, but if a network gets large enough to reach a critical mass, then the sky is the limit.
The key point of this graph is that, at first, doing quick and dirty things that aren’t part of a “network” may be cheaper. But once we reach critical mass, the value of the network grows non-linearly and rapidly outstrips ad hoc solutions.
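To make the crossover concrete, here is a minimal sketch of the arithmetic. It assumes that cost grows linearly with the number of users and that value grows with n², as Metcalfe’s Law suggests. The function names and the constants `cost_per_user` and `value_per_link` are hypothetical, chosen only for illustration.

```python
def network_cost(n: int, cost_per_user: float) -> float:
    """Assumed linear cost: each additional user costs the same amount."""
    return cost_per_user * n


def network_value(n: int, value_per_link: float) -> float:
    """Metcalfe-style value: proportional to the square of the user count."""
    return value_per_link * n * n


def critical_mass(cost_per_user: float, value_per_link: float) -> float:
    """Solve value_per_link * n**2 == cost_per_user * n for n > 0."""
    return cost_per_user / value_per_link


# Illustrative numbers only: below the crossover the network costs more
# than it returns; above it, value pulls away quadratically.
crossover = critical_mass(cost_per_user=100.0, value_per_link=2.0)
below = network_value(10, 2.0) - network_cost(10, 100.0)
above = network_value(200, 2.0) - network_cost(200, 100.0)
```

With these made-up constants the crossover sits at 50 users: a network of 10 runs at a loss, while a network of 200 returns four times its cost. Under these assumptions the critical mass is simply the ratio of the two constants, so anything that lowers the cost of joining or raises the value of each connection brings the crossover closer.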
We instinctively act on this insight in our work on business applications. Most of us choose to use mainstream libraries and frameworks, because they exhibit network effects. There’s a good chance someone has already implemented a common requirement as an npm package or Spring Boot starter. When building applications themselves, we don’t implement every feature with a different framework that we think is quickest for just that thing.
However, we forget this important lesson when we consider automation. We resort to individual hacks rather than a joined up solution, so each automation is just as hard to write as the previous ones, and doesn’t enable further advances. We solve some problems with Bash scripts on someone’s machine, others in our IDE with a configuration option that isn’t applicable to the whole team, some with build scripts, some with Jenkins or Maven plugins. But nothing provides a core model for automations; nothing provides a consistent way of working with source code; nothing provides a core event model; nothing provides a data model we can interrogate to observe relationships and trace flows; and there is little or no discoverability of automations. There is no programming model which, once learnt, will enable other problems to be tackled quickly. For example, adding dependency vulnerability scanning on every push to the default branch is a per-project task that at best provides a sample for other projects, and doesn’t have anything in common with adding Checkstyle integration to Java projects, or with pulling down many projects to update a dependency. Each task we face usually means starting from scratch.
To achieve network effects and reach critical mass, we need a common model: what I call the API for software.
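As a rough illustration of what a common model could look like, here is a toy sketch in Python. It is not Atomist’s API or any real library. `Event`, `Automation` and `AutomationHub` are hypothetical names, the default branch is assumed to be called `main`, and the `run` functions are stubs that only return a summary string. The point is that two unrelated tasks, vulnerability scanning and Checkstyle, are written against the same event model and are discoverable in one place.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Event:
    """Hypothetical uniform description of something that happened."""
    kind: str      # e.g. "push"
    repo: str
    branch: str
    language: str  # e.g. "java"


@dataclass(frozen=True)
class Automation:
    """One automation expressed against the shared event model."""
    name: str
    description: str
    applies: Callable[[Event], bool]  # should this automation react?
    run: Callable[[Event], str]       # stub: do the work, return a summary


class AutomationHub:
    """Single registry: every automation is registered and listed here."""

    def __init__(self) -> None:
        self._automations: List[Automation] = []

    def register(self, automation: Automation) -> None:
        self._automations.append(automation)

    def discover(self) -> List[str]:
        # Discoverability: list what exists and what it does.
        return [f"{a.name}: {a.description}" for a in self._automations]

    def dispatch(self, event: Event) -> List[str]:
        # Run every registered automation that applies to this event.
        return [a.run(event) for a in self._automations if a.applies(event)]


hub = AutomationHub()
hub.register(Automation(
    name="vulnerability-scan",
    description="Scan dependencies on every push to the default branch",
    applies=lambda e: e.kind == "push" and e.branch == "main",
    run=lambda e: f"scanned dependencies of {e.repo}",
))
hub.register(Automation(
    name="checkstyle",
    description="Run Checkstyle on pushes to Java projects",
    applies=lambda e: e.kind == "push" and e.language == "java",
    run=lambda e: f"ran Checkstyle on {e.repo}",
))

results = hub.dispatch(Event("push", "orders", "main", "java"))
```

Once the model is learnt, a third automation is just one more `register` call, rather than another script, plugin or IDE setting started from scratch.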
Certain things, such as Jenkins plugins, have shown the power of this in specific areas. However, we need something more ambitious and far-reaching, addressing code as well as process, and designed for testability.
We’re building the API for software at Atomist, but the need and the concept are bigger than any one company. Ultimately an automation hub should enable the “network” to scale globally, enabling companies and individuals to share contributions as both running services and reusable modules.
The “systemic value” of n² should grow to power a breakthrough in productivity like that achieved by frameworks such as Rails, Node and Spring Boot. We’re still only scratching the surface of what is possible in DevOps through a more joined up and strategic approach.