Estimation and Accountability

Published in

New Agile Paradigm

6 min read · Jul 12, 2024

Over in Redditland, and here in Medium, for that matter, I’ve seen a bunch of articles to the effect of “why bother estimating software?” My mind boggles at this. The reasons ought to be obvious, but clearly, they’re not — hence me sitting down and writing this article.

Let me start with an analogy.

Imagine a patient asking a doctor, “how long will the operation take?” and being told, “Well, I’m going to start cutting and I’ll see how it goes.” Or worse, imagine if that question came from the anesthesiologist or a nurse, both of whom have to plan materials and scheduling and all manner of support for said operation. Y’see, that operation on that patient doesn’t occur in a vacuum. Estimates exist to say, “I need this amount of resources for this amount of time.” Otherwise, planning can’t happen.

Your user story, Dear Developer, is but one cog in a larger machine, and that larger machine needs to be managed.

The pushback on estimation seems to mostly come from the Devs. Since I’ve been writing code for a lot of decades, I can understand the concern with giving an estimate, then not being able to stay within those bounds. “Autonomous Team” is the rallying cry. “You don’t get to dictate what we build.”

However, Agile’s notion of Autonomous Teams can be taken too far. Does Autonomous mean scope creep is ok? Wait, let me rephrase that — when asked to build a rubber raft, can the team, instead, build the Titanic? Clearly no. But neither can they deliver a water wing and call it good.

Autonomy means “we get to be creative in thinking up ways to take the requirements and deliver a solution which passes acceptance testing”. It absolutely does NOT mean we’re going to go away, think up something and come back with a fait accompli.

The Agile Manifesto ought to have a 5th value: We value Transparency and Solution Collaboration above Egos.

Y’see, Management absolutely has the right to say, “Well…yes…yes, we do have the right to weigh in on what’s being built. We’re paying for it.” Moreover, if the collective team (not just the Devs, but PO and all stakeholders, including Management) can’t think through what to build at a reasonable level, there are bigger issues.

So about those estimates. How do we generate meaningful ones that everyone’s happy with?

Harkening back to that poor anesthesiologist, it’s about setting expectations and planning for resources — but also recognizing that Shit Does Happen and if it does, estimates need to be revised. It’s not a one-and-done process.

I’ve written other articles about User Story (or, more generally, “Change Request Artifact”) structure, so I’ll omit the logic and details surrounding those and simply state that a minimum Definition of Ready MUST include three things:
1) A statement of intent, from a customer point of view which hopefully speaks to a single vertical slice (code path) of functionality.

2) A collection of Gherkin-phrased tests which, when passed, prove that the delivered functionality meets expectations.

Without reproducing my prior articles here, suffice to say that a bug is when expected behavior does not align with observed behavior. Thus, without stating expectations (intent/customer PoV) and collecting observations (Gherkin-based tests), you cannot detect bugs.
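To make that definition concrete, here is a minimal Python sketch of "expected versus observed". The function name `find_bugs` and the scenario wording are my own illustrative assumptions, loosely phrased after Gherkin's Given/When/Then, and are not taken from any real test framework.

```python
def find_bugs(expected: dict[str, str], observed: dict[str, str]) -> list[str]:
    """Return the scenarios whose observed behavior differs from the expectation."""
    return [
        scenario
        for scenario, expected_behavior in expected.items()
        if observed.get(scenario) != expected_behavior
    ]


# Hypothetical scenarios and behaviors, invented for illustration only.
expected = {
    "Given a valid login, when the user signs in": "shows the dashboard",
    "Given an empty cart, when the customer checks out": "shows an empty-cart message",
}
observed = {
    "Given a valid login, when the user signs in": "shows the dashboard",
    "Given an empty cart, when the customer checks out": "crashes",
}

bugs = find_bugs(expected, observed)
print(bugs)

# With no stated expectations there is nothing to compare against,
# so no bug can be detected no matter what is observed.
print(find_bugs({}, observed))
```

The second call is the point: take away the stated expectations and the same crash goes undetected.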

So what’s the 3rd part? Tasks.

Tasks answer the question, “What does the delivering team need to do, step by step, to produce functionality which passes all of the tests in that collection?”

Imagine telling a robot how to get from your house to the store. You’d probably start with something like “Drive south to the end of the block and turn left.” But what if, immediately after that turn, there’s construction blocking the road? How does that affect this thin vertical slice (code path)?

Answer: It doesn't!

Why? Because “WITH construction at 5th and Elm” is a different use case. THIS code path is “WITH no blockages or impediments along the way”. So that construction can’t be there! Otherwise it’s a different vertical slice / code path / user story.

So, with that in mind, it’s pretty easy to instruct our robot on the 3 required turns to get from here to Safeway. And if there’s construction at 5th and Elm, it’s just as easy to know what to do to get around that when the time comes. The point being, it’s easy to evaluate the process of tracing a path from a trigger event to a fully manifested expected behavior, step by step. In fact, I’d go so far as to say if you CAN’T do this, you’re at best a coder (as opposed to a Developer) and at worst in need of finding another line of work.

Nothing wrong with being a coder. We all start there: very junior, without much exposure to real-world situations, and with only, if we’re lucky, classroom assignments and whatever we can glean from Stack Overflow. Nothing wrong with that at all. Over time, as we gain experience, we are able to wisely apply it to similar situations, taking into account all factors involved, to come up with elegant solutions. Hopefully.

So here’s the choice. We can, as a team, determine what that first step (drive to the end of the block, go left) is collectively before the start of the sprint as part of “refinement”, or we can trust one person to do so after the start of the sprint in a vacuum without transparency. I’m trusting the choice here to be obvious, and metrics on this work pattern bear me out.

At some level, THE TEAM should collectively think through the steps which need to be taken to get the code out the door and these should be listed as tasks (part 3), attached to the change request artifact. Those tasks should be estimated in real-world time units — we’re not sequencing anything here, we’re estimating.

Embarrassment at possibly picking the wrong first step should never be an excuse for opaque tasking. This was the impetus behind planning poker — to generate those discussions to flesh out what person A has thought of (for better or worse) and person B hasn’t. We’re just applying more structure and rigor to that same process here — which is a good thing.

In the end, the team gets to say, “based on everything we currently know, it’ll take 2 hours to modify this method on this object to handle this new code path” and maybe 8 or 10 other tasks they’ve identified.
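As a rough sketch of how such a task list rolls up into a story-level number, assume a hypothetical `Task` record with invented descriptions and hours (none of this comes from a real tracking tool):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One step the team identified during refinement (hypothetical structure)."""
    description: str
    hours: float  # real-world time units, not story points


def story_estimate_hours(tasks: list[Task]) -> float:
    """Roll the task estimates up into a single story-level estimate."""
    return sum(task.hours for task in tasks)


# Illustrative tasks only; the descriptions and hours are invented.
tasks = [
    Task("Modify the method to handle the new code path", 2.0),
    Task("Automate the Gherkin scenarios for this slice", 3.0),
    Task("Update the API documentation", 1.5),
]

total = story_estimate_hours(tasks)
print(f"{len(tasks)} tasks, {total} hours")
```

The arithmetic is trivial on purpose. The value is in the list itself: every line is a step someone had to think through out loud, in front of the team.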

I’ve found, over the years, that this is sufficiently granular that teams can do this with high accuracy. And in my experience, the teams that do so tend to be the same teams that produce bug-free code.

When do we do this and to which artifacts? Great question, glad you asked!

Stories (and features, epics, capabilities — they’re all identically structured change requests, just at different scope levels) which are candidates for inclusion in forthcoming work (this sprint, next sprint, coming off the kanban backlog, whatever) are candidates for this level of rigor. Ones which aren’t — aren’t. It’s that simple.

Ones which aren’t should STILL have estimates, but those estimates should have a huge disclaimer: “The estimate you’re about to read is high level, with a big standard deviation, subject to overwriting by task-based estimates when the time comes. Proceed with caution.” Such estimates should be treated as the fuzzy, fog-shrouded guesstimates they are.

So I’m proposing a few things here:

1) Everything should have an estimate, even if it’s “sometime between now and 3 years from now”.
2) Estimates consist of several parts, not just a number: a duration, a source and a plus-or-minus. An estimate of “22.3 hours, task-based, plus or minus 2 hours” is far better than “20 hours, seat of the pants, plus or minus 30 hours”, but BOTH ARE VALID.
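One way to picture an estimate with those three parts is as a small record rather than a bare number. The sketch below is only an illustration: the `Estimate` class and its field names are hypothetical, and clamping the lower bound at zero is my own assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """An estimate in three parts: a duration, a source, and a plus-or-minus."""
    hours: float
    source: str        # e.g. "task-based" or "seat of the pants"
    plus_minus: float  # uncertainty, in hours

    def low(self) -> float:
        # Assumption: a duration cannot be negative, so clamp at zero.
        return max(0.0, self.hours - self.plus_minus)

    def high(self) -> float:
        return self.hours + self.plus_minus

    def describe(self) -> str:
        return f"{self.hours} hours, {self.source}, plus or minus {self.plus_minus} hours"


task_based = Estimate(22.3, "task-based", 2.0)
guesstimate = Estimate(20.0, "seat of the pants", 30.0)

# Both are valid estimates; the narrower range is simply more useful for planning.
for estimate in (task_based, guesstimate):
    print(estimate.describe(), "->", estimate.low(), "to", estimate.high())
```

Carrying the source and the range along with the number means nobody downstream can mistake a fog-shrouded guess for a task-based commitment.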

Let’s circle back to the folks asking, “Why bother estimating?” Short answer: because resources need to be planned.

But the longer and far more important answer depends on two different questions: “Why bother estimating to the Nth degree something we’re not going to get to for 4 months?” and conversely, “Why bother estimating to the Nth degree something that’s going to be in the next sprint?”

The answer to the 4-month question is, “you shouldn’t”, but if management is silly enough to TASK you to do that, you need to raise the fact that this is going to take TEAM CAPACITY away from higher-priority activities.

The answer to the next-sprint question is, “you should” or “you’d better” — because if you don’t understand (i.e. haven’t thought through, transparently, with the team) what you’re signing up to build and you’re just going to “start cutting and see how it goes”, management has every right to go find another surgeon.

#bugfreecode #softwaredevelopment #agile #scrum #userstory #softwareestimates

Written by Steve Ciccarelli