Playing a Product role in SRE

Anthony Fairweather
Dec 24, 2019


A few months back an ex-colleague of mine approached me about an article he was writing on the importance of having Product Managers in Site Reliability Engineering (SRE) teams and whether I could offer up some insight.

I read the article and immediately fired off a quick response that could be included. After mulling it over for a few days, I realised there was much more I could have offered, which has led me to jot down my own thoughts to share with you.

Bunny in a line of lego storm troopers
Photo credit — @Clement127 on Flickr https://www.flickr.com/photos/clement127/

SRE, DevOps, Infrastructure, Platform and/or Reliability Engineering teams are typically made up of people from a predominantly technical background: the kind of characters who would much rather be nose down in a terminal console than teasing out customer motivations in a focus group, or writing up a slide deck to convince people why we need to embed measurement and feedback into our products. I jest, but in reality this is just the tip of the iceberg for a tech org at any reasonable scale. In my opinion, organisations need to work harder at balancing the set of skills within these teams, so that they are not made up solely of engineers who are more interested in writing code than in getting bogged down with the ‘Why are we doing this?’

From my perspective, a good product manager brings a complementary set of skills to your SRE function with one golden objective: ensuring we build the right things.

Easy, right? Well, actually, no. In a previous role I worked a lot on service design within the public sector. Before you all begin to yawn, I’ll tell you this: I regard the skills and approaches I learned from the Government Digital Service (GDS) as the cornerstone of my learning. It’s through their methodologies that I learnt first hand how products based on proper user insight are worth overwhelmingly more than those created in isolation.

It’s probably got something to do with the number one principle of the GOV.UK Service Standard: understand users and their needs. So when you’re sat in front of a GDS assessment panel, trying to blag it with some assumption about how Jonny in Swindon renews his car tax, you’re likely to get found out (and fail). Without evidence that you’ve done the hard work upfront, you can’t prove you’re following the standard, and you therefore can’t launch your service. The moral of the story: you need to go back and do the hard yards.

However, done right, that single guiding principle offers so much value. In my experience it results in products and services that delight your users, because it establishes the needs of the user first and foremost, giving you the confidence that the thing you’re building is the right thing (i.e. it solves an unmet need). The only remaining challenge (the easy bit) is building a ‘thing’ that enables users to complete their task successfully, above all other things.

In an SRE function, your customers (users) are predominantly your engineers. With such an accessible user group, it’s quite easy to find out what they want. The problem is that they will probably all name the symptom, or a bunch of solutions, tools or features that would make them happier. The hard part, and one of the key roles of the PM, is to question them, dig deeper and identify the core problem that needs solving. If you can tease that out, you’re halfway there.

Onto the next part: identifying what outcome we hope to achieve by solving the problem. Sure, we want everyone to be happy and to work in a way that suits them, but at the end of the day we have to keep the business context in mind. If it doesn’t drive an actual change (one that results in business improvement), then we shouldn’t be focussing our attention on it.

So what’s a business improvement? We’re an SRE department; we don’t directly drive conversion or retention, so how can we have an impact? The answer is much more significant than you might expect. Let’s take mean time to release as an example (I’ll call it MTTR, though that acronym more commonly stands for mean time to recovery). CI/CD pipelines are complex because we need to be sure the code and infrastructure we deploy are bulletproof. Shaving precious time off these processes without making reliability tradeoffs can have a significant business benefit when you are working with more than a handful of product teams.

Or, from another perspective, simply removing friction from these processes can shift the engineering culture towards smaller, more frequent changes, each of which carries less risk.
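
As a rough illustration of the arithmetic behind the release-time example, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the merge and release timestamps, the function names, and the figures for minutes saved, release frequency and team count. None of it comes from a real pipeline.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical (merged_at, released_at) pairs for three recent changes.
changes = [
    (datetime(2019, 12, 2, 9, 0), datetime(2019, 12, 2, 9, 50)),
    (datetime(2019, 12, 3, 14, 0), datetime(2019, 12, 3, 14, 40)),
    (datetime(2019, 12, 4, 11, 0), datetime(2019, 12, 4, 12, 0)),
]


def mean_time_to_release(pairs):
    """Average elapsed time between a change being merged and released."""
    seconds = mean(
        (released - merged).total_seconds() for merged, released in pairs
    )
    return timedelta(seconds=seconds)


def hours_saved_per_week(minutes_saved_per_release, releases_per_team, teams):
    """Waiting time recovered each week if every release gets faster."""
    return minutes_saved_per_release * releases_per_team * teams / 60


print(mean_time_to_release(changes))     # 0:50:00
# Assumed: 10 minutes saved per release, 20 releases per team per week, 12 teams.
print(hours_saved_per_week(10, 20, 12))  # 40.0
```

With these invented numbers, a ten-minute saving per release adds up to roughly a working week of waiting time recovered every week, which is the kind of figure that lets an SRE team talk about its work in business terms.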

What else does a PM actually bring to this party? Where does their expertise enhance a team’s capabilities?

Lego figure having a party!
Photo credit — @Clement127 on Flickr https://www.flickr.com/photos/clement127/

N.B. I should probably reiterate that these are all my own views.

First and foremost I’ll tell you what a technical PM is not there to do:

— Design or architect a technical solution.

Now, from my own POV, not being qualified to do this is one of my greatest assets. There’s no doubt I’m technically ‘literate’, but having never been an engineer (if you ignore some flirting with front-end code in the 90s), I find it pretty simple to remove the tech from the equation and instead weigh up the solution on its merits against the business requirements we have identified.

Having very little bias towards a particular tech stack or solution makes this sooo much easier. Why? Because (I believe) all engineers carry an unconscious bias towards the how: they inadvertently think about how to build said solution. I wouldn’t have the faintest idea how to write it in Python, Terraform, Ansible or anything else. Instead, as a Product Manager, I just have a crystal clear understanding of the problem we are trying to solve and the outcome I want it to achieve. It’s by impressing this upon the team that we collaboratively get to the most appropriate technical solution.

So, in summary, SRE needs Product because we’re not engineers: we observe situations from a different perspective and hopefully bring a whole lot of experience from other disciplines into the mix, from stakeholder management to UX research and data analysis.

In writing this post I’ve ended up with more content than I can feasibly fit into a single post, so I’ll break down the core skills (IMHO) that a PM brings to the party in some subsequent posts.
