Measuring a modern ProdDev team

define: ProdDev
 — “Product Development”… a term used to identify those who believe in a non/less-siloed approach to the digital software product lifecycle.

Why measure?

“How well is product and engineering executing?”

This is a question nearly everyone who has ever sat in an executive meeting has heard. It is consistently one of the top perceived problems of product and engineering leadership.

Other departments have clear metrics that they are trying to hit, and their performance is judged by those numbers: SQLs, CAC, closed deals, upsells, page impressions, trials, etc.

Gotta have some numbers

With ProdDev it is hard to identify what we should be judged by… especially SaaS ProdDev teams.

How will we know if “they” are doing well? It sounds like a reasonable question, but it has proven very difficult to answer… mainly because many rely on easy-to-measure-but-shitty vanity metrics or lagging indicators.

Here is my latest take on how we could be judged…

Three by Two

Three categories of measures, each with two specific measures.

  • Output
     — Velocity
     — Date-based Commitments
  • Health
     — Defect Escapees
     — Story Cycle Time
  • Outcomes
     — Fiscal
     — Value

Now let’s explore these areas a bit…

Output

— Output is the easiest to measure. It feels clean, safe, and makes most executives feel like they have something to sink their teeth into. Mmmm, nice and juicy velocity or number of features delivered.

Worst, period, indicator, period, by far, period.

Focusing on this exclusively inevitably leads to a feature factory culture.

Let’s look at the two most popular ways to measure output.

Velocity
 — If you have been around agile practitioners, I’m sure you’ve heard of velocity. Basically, it is the speed of delivery, usually measured within a team’s consistent, time-boxed iteration (sometimes called a sprint).

For those not familiar with velocity, here is a scenario: let’s say we estimate that one screen is worth 3 points. If the team completes 10 of those similar-sized tasks in two weeks… its velocity is 30 points for those two weeks. Simple, right? Except the more estimates are used to judge performance, the more bloated (and therefore useless) they become… inescapable psychological physics.
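The scenario above can be sketched in a few lines of Python. The `Story` class and `velocity` function are hypothetical names made up for illustration, not part of any particular tracking tool.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    points: int      # estimate in story points
    completed: bool  # True if finished within the iteration


def velocity(stories):
    """Sum the points of the stories completed in one iteration."""
    return sum(story.points for story in stories if story.completed)


# Ten similar-sized screens at 3 points each, plus one that did not finish.
iteration = [Story(f"Screen {n}", 3, True) for n in range(1, 11)]
iteration.append(Story("Screen 11", 3, False))

print(velocity(iteration))  # 30
```

Notice that the number depends entirely on the estimates: call each screen 5 points instead of 3 and "velocity" jumps with no extra work delivered.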

Date-based Commitments
 — Accountability != hitting date-based commitments. Nothing is easier than asking a ProdDev team “how long will it take to get X done?” The horrors of this strategy are well-documented. So, instead of beating up on commitments, let’s focus on other ways to plan without the shittiness of date-based, specification-driven commitments.

We will get into more ways to manage accountability, but DHH’s Signal vs Noise post below is a great example of a better way to do “dates” in modern ProdDev.


Health

The everyday habits. The quality of the work. The flow of the work.

The overall health of a modern ProdDev team can and should be a major part of the culture… here are the two ways I’ve found best to measure health.

Defect Escapees (impact)
 — Release a thing… customer finds a bug. That is a defect escapee in its simplest form. I learned this term while I had the pleasure of working with an absolute master of ProdDev quality experience. Even he would agree that the goal is not zero bugs… the goal is the highest-quality experience possible for our customers.

I threw “impact” in there because not all bugs are created equal. Impact of new code can be handled in a plethora of ways. Feature flags combined with continuous delivery and a solid “alpha, beta, general release” flow is my preferred approach. Bottom line, measuring the Defect Escapee metric is valuable for a team.

“… only impacts a small number of users, within a limited blast radius, and for a short period of time…”
Architecting a culture of learning (with good teams every failure is a learning)
Huge fan of feature flags and continuous integration combined with regular promotion to production
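One rough way to put a number on escapees and their impact is sketched below. Everything here is an assumption for illustration: the `Defect` fields, the "customer"/"internal" labels, and especially the impact formula (users affected times hours exposed), which is just one possible reading of "small number of users, short period of time".

```python
from dataclasses import dataclass


@dataclass
class Defect:
    title: str
    found_by: str         # "customer" or "internal" (hypothetical labels)
    users_affected: int
    hours_exposed: float  # how long the bug was live before a fix or rollback


def escapees(defects):
    """Defects that made it past the team and were found by customers."""
    return [d for d in defects if d.found_by == "customer"]


def impact(defects):
    """Assumed impact score: user-hours of exposure across all escapees."""
    return sum(d.users_affected * d.hours_exposed for d in escapees(defects))


release = [
    Defect("Typo on settings page", "internal", 0, 0.0),
    Defect("Export fails for beta users", "customer", 20, 2.0),
    Defect("Checkout error for everyone", "customer", 5000, 6.0),
]

print(len(escapees(release)))  # 2
print(impact(release))         # 30040.0
```

Both customer-found bugs count as one escapee each, but the one caught behind a beta flag barely moves the impact score. That is the argument for feature flags in a single number.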

Story Cycle Time
 — This is probably my favorite metric of “everyday” health for a team. If each of your stories is one unit of work that can be tested and shipped in isolation… AND the average time it takes to move one from started to peer-reviewed to tested is less than a day, you have a world-class SaaS ProdDev culture.

Story cycle time measures nearly all of the parts of the process in an amazingly effective way.

* Time in Started: A measure of our story writers, whether that be a product owner, a tech lead, or a designer… if your stories are in the “started” phase for long periods of time, your story decomposition might need some work.
* Time in Finished: Usually this means it is in a “Peer review” state. If a story is here for a long time, it means you may have a culture-of-responsive-feedback problem or the initial quality was poor.
* Time in Delivered: Usually this is when a story has been deployed to a CI/QA/RC/Staging environment and it is in QA’s hands. We want the feedback loop to be super tight, so things should not be in this state for very long either.
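If your tracker records when a story enters each state, the breakdown above is simple to compute. This is a minimal sketch: the `transitions` dictionary is a made-up input format, and the closing "accepted" state is an assumption about how your workflow marks a story as done.

```python
from datetime import datetime, timedelta

# Workflow states in order; "accepted" is an assumed final state.
STATES = ["started", "finished", "delivered", "accepted"]


def time_in_states(transitions):
    """Map each state to how long the story sat in it.

    `transitions` maps a state name to the datetime the story entered it.
    """
    return {
        current: transitions[following] - transitions[current]
        for current, following in zip(STATES, STATES[1:])
    }


def cycle_time(transitions):
    """Total time from started to accepted."""
    return transitions["accepted"] - transitions["started"]


story = {
    "started": datetime(2024, 3, 4, 9, 0),
    "finished": datetime(2024, 3, 4, 12, 0),
    "delivered": datetime(2024, 3, 4, 13, 30),
    "accepted": datetime(2024, 3, 4, 16, 0),
}

for state, duration in time_in_states(story).items():
    print(state, duration)

print(cycle_time(story) < timedelta(days=1))  # True
```

Averaging these durations across all stories in an iteration shows which phase is the bottleneck: decomposition, peer review, or QA.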

Story Cycle Time is the bee's knees!


Outcomes

The ultimate measure… the measure that means the most to all involved. The measure we should all be striving to keep at the forefront of our process and culture… tangible customer outcomes.

Fiscal
 — Conversions… Upsells… New Revenue… Reduced Churn… CLV… ARPU… these are the lifeblood of business. Without the sustainability that comes from revenue (better yet, profit) there is no business to have. However, these are often lagging indicators, which is why my absolute favorite metric is…

Value (quant/qual)
 — “Test Driven Product Management”

That is a term I heard John Cutler use in a podcast interview. How I interpret it is: “When you are about to ship a thing, know what change you expect to happen… then once it is shipped MEASURE TO SEE IF THAT CHANGE HAPPENED”.

It doesn’t have to just be the standard [M/W/D]AU usage metrics. It can be qualitative analysis. It can be specific percentages of clicks on a certain area.

Whatever it is, it should be an indicator of value-add to the customer. The customer should be happier about the job they are hiring your product to do for them after each change.
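Borrowing the "test" framing literally, the expectation can be written down before shipping and checked afterwards. This is only a sketch of my interpretation: the `Expectation` class, the metric name, and the numbers are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Expectation:
    metric: str
    baseline: float  # value measured before the change shipped
    target: float    # value we expect to reach if the change adds value

    def met_by(self, observed):
        """Return True if the observed value reached the target."""
        return observed >= self.target


# Written down BEFORE shipping (hypothetical metric and numbers).
expectation = Expectation("export button click rate", baseline=0.10, target=0.12)

# Measured AFTER shipping.
clicks, sessions = 130, 1000
observed = clicks / sessions

print(expectation.met_by(observed))  # True
```

The important part is the order: the target is committed to before the release, so the measurement can actually fail.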


Final thoughts…

Measuring ProdDev teams is hard, but worthwhile. It is a cultural thing that should permeate every facet of your decision-making. We all want to be successful and be valued, so making sure you are measuring the right thing is one of the most important things you do as a leader.

Nearly impossible to measure, but second only to value-based outcomes, is learning. If anyone has insight into how best to measure learning, I’d love to hear about it!