The DevOps Trap

Why the DevOps hype cycle is bad for users.

I’ve never been a server engineer or run a team of server engineers; I’ve always been on the development side of the fence. But I’ve worked alongside server engineers and liaised with infrastructure managers for many years.

There’s a key difference between the two disciplines. Dev teams are all about introducing new features as fast as possible (“move fast and break things”), whereas infrastructure teams have structure and processes designed to ensure that things keep running smoothly and risks are mitigated, and rightly so.

It may seem expedient to just let developers loose on the servers. They wrote the system, aren’t they the best people to deploy it and look after it? But in my view that’d be like letting the police lock people up without the need for any pesky trials or defence lawyers. Things will go wrong, and it’s a little unfair to just blame the police. If left in charge of the whole process, both the police and developers have a culture and training that teaches them to behave in precisely the wrong way for the outcomes you want.

I’m a great believer in having a “balance of powers” in IT teams; dev teams push for changes and infrastructure keep us in check.

In my experience getting your code into production often involves submitting a Change Request as part of some approval process, with each proposed change including:

  • Explanation of what the change is and why it’s needed.
  • Technical implementation instructions to be followed.
  • A risk assessment of the change; things that could go wrong, and what that would mean for the business.
  • Instructions on how to test whether the change has succeeded.
  • Instructions on how to roll back the change if it has not succeeded.
  • Communication plan regarding the change; who are we telling, when, and how.

A board or committee then sits to consider each change and agree a schedule for when it will be applied. Objections are raised, changes of approach are discussed and agreed. (This is based on ITIL, which is something of an industry standard for these things.)
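
As a rough illustration only, the checklist above could be captured as a small record that refuses to go to the board while any section is blank. This is a sketch, not how ITIL or any particular tool models it: the class name, field names and the "no blank sections" rule are all my own assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class ChangeRequest:
    """Hypothetical record with one field per item in the checklist above."""

    explanation: str           # what the change is and why it's needed
    implementation_steps: str  # technical instructions to be followed
    risk_assessment: str       # what could go wrong, and the business impact
    test_plan: str             # how to check the change has succeeded
    rollback_plan: str         # how to undo the change if it has not
    communication_plan: str    # who we are telling, when, and how

    def missing_sections(self) -> list[str]:
        """Return the names of any sections that were left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def ready_for_review(self) -> bool:
        """Assumed rule: the board only sees requests with every section filled in."""
        return not self.missing_sections()


# Example: a request submitted without a rollback plan.
cr = ChangeRequest(
    explanation="Upgrade the database driver to fix a connection leak.",
    implementation_steps="Deploy the new build to each app server in turn.",
    risk_assessment="Driver bug could drop connections; checkout would fail.",
    test_plan="Place a test order and check the connection pool metrics.",
    rollback_plan="",
    communication_plan="Email the support team the day before.",
)
print(cr.missing_sections())  # ['rollback_plan']
print(cr.ready_for_review())  # False
```

The point of the sketch is that the process, not the tooling, is what forces someone to think about rollback before the change goes anywhere near production.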

Yes but that was then, now we have DevOps!

DevOps is a relatively new set of practices that automates the processes between software development and infrastructure teams. Tools like Terraform, Puppet and Docker have been created to give engineers a way to automatically deploy servers and code. This enables “DevOps engineers” to “treat servers as cattle not pets”.

But there are two big glaring problems with this change:

First, developers have to think about their careers, and this makes them prone to marketing fads and to choosing technology for the sake of resume enhancement, rather than good old-fashioned engineering principles. “DevOps is the future!” cry the evangelists, and dutifully, developers up and down the land start trying to persuade management of that fact.

But do you really need to treat your servers as cattle not pets? If all you have is a suburban garden with 3 chickens, you don’t need 3 miles of barbed wire, a tractor and an industrial cattle feeder.

Your todo list app doesn’t need a microservices architecture with 100k lines of DevOps code. It would be built and in the hands of users sooner if you just created a single Rails app and manually deployed it to Heroku.

Second, developers sometimes think that all they have to do in order to work in DevOps is learn the tooling. But there’s a cultural difference that’s being neglected.

Just because you’ve learnt how to build an auto-scaling Kubernetes cluster to serve your todo list app, it doesn’t mean that all the old lessons of running a production environment are irrelevant.

Introducing changes to production still means risk. And that risk needs to be controlled and managed. And the only way to do that is still: managers sitting in rooms enforcing processes. Without that, we’re going to spend the next 20 years re-learning all the process lessons that traditional infrastructure teams learnt over the last 20 years, while users suffer unreliable and insecure apps.

Done well, DevOps is an indispensable approach to managing large numbers of servers. If you’re Snapchat, BuzzFeed or Skyscanner, you absolutely need DevOps. But doing it well means bringing the old lessons of running an infrastructure team with you and finding ways to keep the developer culture of “moving fast and breaking things” in check.