Stability vs Innovation at the times of DevOps

Luigi Corsello
Published in Zero Equals False
6 min read · Jul 5, 2019

A report in the Wall Street Journal, picked up by other media such as VentureBeat, hinted at a possible reason why Jony Ive, the talented designer behind many Apple innovations during the Steve Jobs era and beyond, decided to leave the company in 2019. That reason would be the growing focus on Operations, that is, on running and consolidating the business smoothly, rather than on the values that made that very business great, such as disruptive technology and breakthrough innovation.

While I am not commenting on that report, or on the replies that followed or will follow from Apple, the eternal struggle between maintaining and consolidating the status quo and disrupting it through risky innovation can spark some insightful reasoning within the field of IT Operations. In the DevOps era, should engineers and their management focus more on stability or on innovation?

Depending on location, many DevOps job opportunities can be found today through LinkedIn or other job boards. Outside project consultancy, those jobs often seem remarkably similar: a combination of Continuous Integration and Delivery, with all the related tooling and practices supporting software engineering, together with 24/7 maintenance and troubleshooting of the production infrastructure where that software runs.

The latter was once called systems engineering and used to be a different job, with operations engineers and developers living more or less comfortably in their own silos, hating each other. Software engineers, free to create and experiment through code on almost whatever they desired, traditionally pushed for innovation and disruption. If new methods could help reach the goal quickly, why not use them? At the same time, conservative systems engineers struggled at every change or software release to manage the resource capacity and stability of the production platform serving the final customers, without whom there would be no business. They would often form a nasty bottleneck, strongly holding back changes. A field of fierce tension.

Silos are still present today, no matter how much of a DevOps culture a company claims to have, especially where legacy, monolithic applications are present or where defensive management refuses to see the advantages of cooperation and Agile practices. This is a sad reality in many places, where overloaded and sometimes sleep-deprived “DevOps” systems engineers handle, in fact, the (un)pretty combination of two jobs for one salary: doing infrastructure implementation work and night watch shifts like a first-line engineer to ensure continuity, while also supporting developers in their own field during the daytime. An excellent, if shortsighted, recipe for increasing attrition rates.

There are enlightened organizations out there that do understand the advantages and properly apply the DevOps paradigm, in which everything is a continuous flow (development, testing, delivery, deployment and monitoring) and all players cooperate within this flow. A developer knows the production infrastructure and makes sure, helped by automated testing he co-authored together with QA, that his software is always safely deliverable to production. He knows the technical details of, and perhaps had a say in, the infrastructure architecture and the cloud services used, and he is not afraid to read and use the information provided by logging and monitoring tools, whose metrics he defined jointly with the DevOps engineer. It is the developers who report when a software release will double the use of key/value store capacity, rather than the DevOps engineer finding out with a bang after that release is deployed. In the best case, the doubling of capacity is averted at a joint planning meeting, with the DevOps engineer suggesting an alternative implementation before a single line of code has been written.
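As a rough illustration of that kind of early capacity conversation, a team could attach a simple estimate to each planned release and flag it for joint review when projected key/value usage grows too much. This is only a minimal sketch: the `CapacityEstimate` structure, the 1.5x threshold and the sample numbers are assumptions made for the example, not part of any specific tool or process.

```python
from dataclasses import dataclass


@dataclass
class CapacityEstimate:
    """Hypothetical per-release estimate supplied by the development team."""
    release: str
    current_keys: int    # keys stored in the key/value store today
    projected_keys: int  # keys expected once the release is live


def growth_factor(estimate: CapacityEstimate) -> float:
    """Return projected usage as a multiple of current usage."""
    if estimate.current_keys <= 0:
        raise ValueError("current_keys must be positive")
    return estimate.projected_keys / estimate.current_keys


def needs_capacity_review(estimate: CapacityEstimate, threshold: float = 1.5) -> bool:
    """Flag the release for a joint planning discussion (assumed threshold)."""
    return growth_factor(estimate) >= threshold


# Example with made-up numbers: a release that doubles key/value usage.
estimate = CapacityEstimate(release="2.4.0", current_keys=1_000_000, projected_keys=2_000_000)
if needs_capacity_review(estimate):
    print(f"Release {estimate.release}: usage grows {growth_factor(estimate):.1f}x, discuss before coding")
```

The point is not the arithmetic but the timing: the estimate exists before the release is built, so the alternative implementation can be discussed at the planning table rather than after an incident.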

The DevOps engineer, for his part, understands programming, tooling and algorithms, thinks and acts in terms of automating everything, and knows he can trust the software pushed by the developers and tested by QA (another essential part of the equation), but at the same time is not afraid to code when necessary. More or less every line of code then becomes part of the production infrastructure, just like every cloud service used. The IT infrastructure is built and maintained together, in cooperation: no more silos. The DevOps/Operations engineer is no longer the only one responsible for continuity: everybody is. Night shifts can be covered by less skilled first-line support, helped by automation and documentation, not by expensive DevOps people, whose creative work is acknowledged to require a night's rest, just like that of developers. Now operations (DevOps) engineers also take an active part in innovation. Communication has helped ease the tensions.

Here lies an important milestone of the DevOps era: the systems engineer ceases to be the rough “mechanic” with boring and repetitive tasks, like fixing failures caused by buggy software or lifting and fitting a software black box onto a set of servers, and turns into a recognized player in technological innovation. He or she can be an active part of the entire application lifecycle, and can provision and integrate infrastructure and application services on a large scale in a coded, automated fashion, perhaps within seconds. Anyone who has seen complex architectures set up or migrated seamlessly by DevOps teams knows well the advantages of this approach.

Either way, no company is perfect, and idyllic workplaces are hard to find, if they are not just recruiters' propaganda. There’s work to be done: something will need to change from A to B somehow, and every change uses energy and increases entropy. At best, many implementations of the DevOps paradigm will be constantly improving while trying to achieve an optimal balance between stability and innovation. The question here is: where does that balance lie?

The answer is that there is no fixed answer, as that balance can change over time. In some circumstances, as in a startup, the product has the highest priority: delivering the coolest solution very fast may be the only winning bet, at the expense of SLAs or technological perfection. In other environments, as in manufacturing, stability is crucial: one bug and an entire production line can grind to a halt or become useless, wasting a lot of money. Achieving fully guaranteed continuity while keeping up with constant innovation is the aim of the bravest, as long as the workforce does not burn out in the process: some, like Apple or Amazon, certainly have the means to afford this challenging option; others, like a startup, may go bust while trying.

Google went one step further, implementing the third option in its own way: no DevOps engineers to glue development and operations together, which would not fit its scale, but expert developers dedicated to operational reliability (the site reliability engineers, or SREs). They are the natural evolution of systems engineers, who don’t need to build bridges to developers because they already are developers. Simple, isn’t it? Not quite that simple.

Back in a traditional DevOps world, who should then set the balance between operational stability and the desire for technological innovation? Is it up to the technical squads to decide, within the Agile approach? Should the Engineering Lead or the CTO define when to deploy a new release, or when to hold tight and stay stable as we are, slowing down development? Or should those matters ever reach the Board?

Experience from different companies suggests that highly skilled people, in both technology and business, have well-functioning brains and should be capable of reaching a compromise, as long as they accept and recognize each other’s work and priorities. Consensus is the key. In a modern IT company, if all C-level executives or the board needed fine-grained control over SLAs versus product delivery, something would be going dreadfully wrong.

Whatever the positioning, initiative, creativity and, most of all, cooperation should not be lost in favour of any of the options: they should be valued and kept alive even in times of consolidation, when there is temporarily no room for innovation, or in more conservative environments. At a time when R&D and Architecture departments often no longer exist as entities separate from Operations, stifling creativity by treating DevOps engineers, software engineers, designers or any sort of creative souls as disposable would simply be nonsense, and would damage the business in the long run.
