Learning to Love Metrics

Penny Wyatt · Published in Error Handling · 5 min read · Apr 26, 2021

I used to hate metrics.

When I was a junior QA engineer and we didn’t have them, I thought they were a tool just for clueless upper management. I don’t need them; I know what the quality is like, I test the product all day!

But our manager wanted them, and I pushed back. “It’s a distraction”, “They’ll be inaccurate”, “They’ll push people to do the wrong things”, “It’ll become a weapon to bludgeon us with”.

When that argument didn’t work, I decided that if we had to have team metrics, we were going to have the best damn metrics in the software industry. I spent the best part of a year, on and off, trying to find the perfect software quality metric — one not prone to bias, that encapsulated all aspects of quality, that measured the quality before release and not afterwards (when it’s too late). Unsurprisingly, I failed.

Over time, as I matured (and became part of the “clueless” upper management myself), I came to have a more nuanced view of metrics, and eventually, gradually, became a fan. Here are some things I’ve come to understand:

Metrics are useful at any level

Management tend to be the people asking loudest for metrics, because — let’s be honest — we don’t know much. We aren’t seeing the day-to-day details of everything that’s happening. We aren’t in every scrum meeting, every Support call, every pairing session, every bug triage session. But neither is anyone else.

As a QA engineer, I thought I knew everything that I needed to know about the quality of our product. But there was so much that I didn’t know I didn’t know. Was I focussing on the issues that users cared about? Was I making a difference? Was one thing quietly regressing while I was preoccupied with another?

A good set of metrics is better than personal observations, because it can tell you more than any individual person can possibly see for themselves.

There is no perfect metric

I thought that we could avoid the downsides of metrics by just finding the right one. But that’s the wrong approach. Every metric can be easily gamed if people are incentivised to. Every metric lies. Every metric hides some problems and overemphasises others. It took me far too long to realise that.

Yes, every metric is a bad metric. But 10 or 20 bad metrics, combined with their stories (see below), create a more complete picture that’s far greater than the sum of its parts.

Every metric needs its story

Of course, it’s not always practical to report on 10–20 different metrics at a time. As the engineering leader reporting to fellow members of the business leadership team, I chose 1 or 2 that covered the areas most relevant to them. But even then, it was never just a raw number — it was always a number and a story. I’d say things like:

“We had two critical incidents this month, but things are actually pretty good despite the high number. Personally I wouldn’t have classified these as critical incidents as they were pretty rare cases. But the product team decided that they wanted to treat these more seriously and go through the full process, and that’s a really good cultural sign.”

“No critical incidents this month, which is great, but I still have concerns. The product team found one serious issue through luck just before release, which has me worried. They’re looking into our processes to see what went wrong.”

Metrics become dangerous when they become detached from their stories. Without the real stories, those numbers start to tell fictional stories of their own. As leadership, it’s our responsibility to make sure that doesn’t happen.
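If it helps to picture it, here’s a tiny sketch of what “a number and a story” could look like as a record. The shape and field names are mine, not any particular tool’s — the only point is that the number and its narrative travel together.

```python
from dataclasses import dataclass

@dataclass
class MetricReport:
    """A metric that refuses to travel without its context."""
    name: str
    value: float
    story: str  # the narrative that keeps the number honest

# Illustrative example, echoing the kind of report described above
april = MetricReport(
    name="critical incidents",
    value=2,
    story=(
        "Both were rare edge cases. The product team chose to treat them as "
        "critical and run the full process, which is a good cultural sign."
    ),
)

print(f"{april.name}: {april.value:g}. {april.story}")
```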

Metrics don’t need to be objective…

Engineers love solid, objective reality. But more often than not, in the software industry, reality is far less important than the users’ perception of reality.

It doesn’t matter if your performance tests say that your feature is lightning-fast, if it still feels slow to the end-users. This doesn’t mean that measuring the objective performance is a waste of time. But it might not be enough.

Perception is often underrated because it’s harder to measure. But as with any “unmeasurable” problem, I look to the basics — how do we know there’s a performance problem in the first place?

Customers are complaining to Support? Count the support tickets. Twitter complaints? Count the tweets. Nasty emails to the CEO? Count them!

These are all subjective, but definitely measurable. Just be aware that subjective metrics tend to be less reliable in the short term.
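To make that concrete, here’s a minimal sketch of “count the complaints”, using made-up records for the signals above (support tickets, tweets, emails to the CEO). In practice the data would come from your support desk, social listening tool and inboxes.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

@dataclass
class Complaint:
    source: str     # "support", "twitter" or "ceo_email"
    received: date
    topic: str      # free-text tag, e.g. "performance"

# Made-up data standing in for real exports
complaints = [
    Complaint("support", date(2021, 4, 1), "performance"),
    Complaint("twitter", date(2021, 4, 3), "performance"),
    Complaint("support", date(2021, 4, 7), "login"),
    Complaint("ceo_email", date(2021, 4, 9), "performance"),
]

# Subjective signals, but perfectly countable:
# this month's performance complaints, broken down by channel.
perf_by_channel = Counter(
    c.source
    for c in complaints
    if c.topic == "performance"
    and c.received.year == 2021
    and c.received.month == 4
)

print(perf_by_channel)
# Counter({'support': 1, 'twitter': 1, 'ceo_email': 1})
```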

…But objectivity makes them powerful

Developer A knows the product inside-out. She’s really excited about the new feature she’s building. She’s been showing early prototypes to users who are super excited about its possibilities.

Support engineer B also knows the product inside-out. He spends all day working with customers whose instances are offline, and whose sysadmins are stressed and upset.

If you ask both A and B for their assessment of the product’s quality, do you think they’re going to say the same thing?

Herein lay the greatest issue with my viewpoint as a junior engineer. Yes, I knew the product’s quality extremely well. But like all humans, I’m prone to bias. I was more likely to think that the software was poor quality, because my role at the time was to look for problems (confirmation bias) — even if the problems I found weren’t those that the customers would consider important. And I was more influenced by the last observation I’d made than by the big picture (recency bias).

Of course, objective metrics are still biased, but in more consistent and predictable ways.

So where does that all leave us? A good set of metrics:

  • Measures the same thing from multiple angles,
  • Contains both objective and subjective measures,
  • Combines stories with numbers to capture nuance that the numbers alone miss,

With the goal of:

  • Reducing bias from people’s roles, viewpoints and natural outlook,
  • Capturing subjective perceptions from end-users,
  • Focusing on the bigger context, so that the most recent event doesn’t overshadow the overall picture, and
  • Telling you more than any single person could know,

So that you can:

  • Focus your efforts where they are most needed,
  • Know when an effort is not working, and change your approach,
  • Know — and celebrate! — when improvement has been achieved.

These days, measurement is such a natural part of the team culture at every level. You measure a thing, you do a thing, you measure if it changed. It’s hard to imagine how we ever got by without it.
