Are You a Devops or an SRE?
Devops is the work that happens on the way to Site Reliability Engineering
In 2014 and 2015, we saw a confluence of events that prepared the world for Devops: cheaper virtualization, easily configured containers, low-cost cloud hosting, and widespread adoption of scrum/agile, to name a few.
Combining the above changes in the right way made it easier to build, deploy, and scale a production application. A movement resembling Devops was inevitable.
Infrastructure teams that took advantage of these changes suddenly had the time to envision a reality where their time wasn’t entirely consumed by keeping their production systems up and patched. Product developers were hearing rumors about how some of their peers were able to spend the majority of their time focused on shipping features.
In just five short years, the scope of Devops has grown so large that the word itself has become almost meaningless. It can cover continuous integration, build and release, continuous deployment, monitoring, alerting, and cloud infrastructure. Any one of these areas can, and often does, encompass an entire job role or team at a larger organization. Or you may just be functioning as a traditional Linux sysadmin with a sexier job title.
So how do you sort through a list of SRE and Devops roles when it is evident that the same job title will mean very different things from organization to organization?
If it’s not somebody’s job, it may end up as nobody’s job
Most engineering organizations would rank many of the responsibilities that can fall under the Devops umbrella as essential aspects of delivering quality software. Hiring managers are fully aware of their needs in these areas but struggle to hire product-centric software engineers with the experience, passion, and drive to push this type of work forward along with all of their other responsibilities.
I believe this reality has led to an increase in Devops-specific job titles despite the many efforts by the Devops community to convince the world the title shouldn’t exist.
A Devops job title sends the signal that 1) the organization finds some importance in most or all of these concepts and 2) the organization can admit it does not yet have all the answers about what its infra/IT/sec/ops future is going to look like.
At a young startup, this level of ambiguity should be expected and welcomed by a job seeker. If you have a personal drive to push forward a Devops-centric engineering culture, this position represents an opportunity for you to be central in building an efficient and automated cloud-native Devops environment.
Devops is about transformation
At Varo, where we seek to be the first national mobile-first bank in the United States, our Devops journey has been very different from what you would see at a marketing startup that can get away with merely deploying its Node.js app to Heroku. There is no one-size-fits-all path here.
Process can replace tooling when you are an agile startup and only have four or five engineers to convince to follow the rules. However, for a large organization with dozens of engineers, Devops will begin to mean something entirely different.
How Varo is using Devops to build a bank
We initially focused a lot on production monitoring and alerting, as well as pushing for a service-ownership culture, when our application was rolled out to our first customers in summer 2017. With this aspect mostly under control, we refocused our priorities on internal tools, a smoother build and release pipeline, and modernizing our AWS infrastructure with Terraform to better meet our long-term needs.
As we began to scale our application and engineering team, the demands on our dedicated Devops team quickly began to change. When I first joined Varo, I was warmly welcomed by the quote “The mission of a DevOps team is to eliminate itself” at the top of our Confluence page. We all knew that asking people nicely to follow the rules would eventually no longer be an option, and that tools and guardrails would need to be constructed and maintained. Tasks that were once easily handled manually by an individual inevitably grow in frequency until they begin to crowd out work on long-term projects. The goal of a Devops team at this stage is to ensure that efficiency stays constant and that observability doesn’t decline as the organization scales.
At this point, specialization became necessary. One or two people can’t handle all of the cloud infrastructure and production monitoring needs, as well as the build chain, while maintaining their sanity, no matter how broad their title is.
The astute reader will recognize that I have begun to describe some of the exact problems that Site Reliability Engineering is meant to solve.
What is Site Reliability Engineering?
We aren’t even past the job description, and we already have a mouthful of complexity to unpack here.
What an SRE job title tells you is that someone on the team probably used to work at Google or has fought their way through all of Google’s SRE book. It doesn’t tell you if their Devops transformation is complete and they are living in an automated wonderland, or if the organization supports this vision with well-maintained SLIs, SLAs, and SLOs.
A fully formed SRE environment assumes that 1) you’ve invested 25 years of engineering time into building and refining your internal monitoring, logging, alerting, visibility, and deployment tooling or 2) you’ve invested a lot of your venture capital into purchasing and configuring best-of-breed tools that provide a small part of what a Google engineer would expect to have at their fingertips.
At Varo, we’ve been able to cheat this timeline a bit by combining robust open-source tools like Prometheus and Grafana and by partnering with New Relic and Splunk when we didn’t have the time to manage things ourselves.
High-quality tooling becomes almost invisible to an engineer, so many people aren’t even aware of the depth, breadth, and complexity of the things that support them every day until they join a startup where none of this work has been done.
Remember, we ended up here when our broad Devops role began to require specialization and one-offs were becoming overwhelming. To be content as a dedicated SRE, you need to be passionate about production systems and their care and feeding, more than their design.
Monitoring and alerting should make you swoon, and you should care passionately about alerting thresholds and how they tie into your organization’s priorities. Automation will still be central to your role, but you’ll be more focused on solving your peers’ problems and your servers’ problems than you will be on directly supporting product engineers’ daily tasks.
The job search
Thinking about applying to that Devops job? Make sure that you are mentally prepared (even better, excited) for the chaos that you are going to face, and that you are ready to define and redefine your own job role every day. Flexibility, friendliness, and an open mind are going to be crucial to your success.
If you are considering an SRE role, ensure that the right tools and team members to support you are actually in place. Your world should revolve around observability, and there is no reason to create an SLA if you haven’t collected the right metrics yet. Be wary of SRE jobs at small organizations that may be making many assumptions about how easy it will be for you to create/purchase/configure all the tooling you need to be effective.
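To make the point about metrics coming first a little more concrete, here is a minimal Python sketch of how an availability SLI and the remaining error budget might be computed from request counts. The function name, the request counts, and the 99.9% target are all hypothetical and chosen only for illustration; they are not a description of Varo’s tooling. Notice that the calculation is impossible without the underlying counts, which is exactly why a target written down before the metrics exist is not worth much.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # measured fraction of good requests
    slo_target: float        # the objective the SLI is compared against
    budget_remaining: float  # share of the error budget left; negative if overspent


def evaluate_slo(good_requests: int, total_requests: int, slo_target: float) -> SloReport:
    """Compare a measured availability SLI against an SLO target (hypothetical helper)."""
    if total_requests <= 0:
        # No traffic data means there is nothing to measure an objective against.
        raise ValueError("no requests recorded; collect metrics before setting a target")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")

    sli = good_requests / total_requests
    allowed_failure = 1.0 - slo_target   # the error budget, as a fraction of requests
    actual_failure = 1.0 - sli           # the fraction of requests that actually failed
    budget_remaining = 1.0 - actual_failure / allowed_failure
    return SloReport(sli=sli, slo_target=slo_target, budget_remaining=budget_remaining)


if __name__ == "__main__":
    # Illustrative numbers only: 1,000,000 requests, 500 of them failed.
    report = evaluate_slo(good_requests=999_500, total_requests=1_000_000, slo_target=0.999)
    print(f"SLI {report.sli:.4%}, error budget remaining {report.budget_remaining:.0%}")
```

In a real environment the counts would come from whatever monitoring system is already in place, and an alert would typically fire on how quickly the budget is being spent rather than on a single snapshot like this one.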
For any role, be clear about what you want in your job search, and ask lots of questions about the team, the tooling, and where things are headed.
Conclusion
Devops can be a great job title BECAUSE it doesn’t mean any one thing. You will get to work on deploy scripts, respond to a production incident, and run a training session on Git all in one day without questioning the value you are providing. If you are most passionate about production, see the value in a Devops culture, but don’t want to solve all these problems yourself from the ground up, look for an SRE role at a more mature organization to make sure you stay happy and productive.
Passionate about Devops and SRE? Just happy that someone else will be handling that for you? Come join us at Varo.
Mark Ferree is a lifelong software developer who migrated into Devops. He is a polyglot who gets excited about learning new things.