How to estimate in software teams — Part 1

Dan Draper · Published in [Run]time · Feb 5, 2017 · 8 min read

TL;DR: Software estimation is hard. Here’s how to make it easier.

You may recall my earlier post on why software estimation is usually terrible. If you missed it, you may wish to go and read it before reading this article as it lays the foundations for what I describe here.

After more than ten years of attempting to establish good estimation practices in software teams, I feel I’ve arrived at something passable. It’s not a silver bullet, and probably not even great, but it’s not bad: an approach that has proved to be a fairly reliable tool for forecasting delivery and sharing progress with eager stakeholders.

Stories

It all starts with a story, typically a user story that describes a particular behaviour. For example, a story for an online store might be: “As a Guest, I want to add an item to a shopping cart”.

Stories are often a subject of debate when teams consider the wording or the level of detail they should contain. My philosophy is that a story should not be too prescriptive as to the specific behaviour the software should execute. Rather, it should focus on the value it adds to the user’s experience. It can be helpful to think of a story as a promise for a conversation. The details should evolve with our understanding of the problem.

“It can be helpful to think of a story as a promise for a conversation.”

No matter how you approach writing stories in your team, each story should have a set of agreed acceptance criteria: the minimum requirements of the feature being added. These could be general in nature, such as “must work in Internet Explorer 9”, or specific to the story, such as “a visual indication that the product has been added to the cart is displayed to the user”.

I’ll likely cover more on story writing in a future post but for now, if you need help with stories check out Jeff Patton’s User Story Mapping.

Relative Sizing

Once the team has a backlog of stories you’re ready to size them. If you recall from the previous article, humans tend to be bad estimators when it comes to time. However, estimating uncertainty or relative effort can provide a basis for sizing work.

No doubt my opinions here will attract some controversy but in my experience estimating for time in teams never (or at least rarely) works. Instead, the teams I work with use a relative points scale.

When using a relative points scale, the members of the team assign a point estimate to each story in the backlog to give an indication of its relative complexity. Usually complexity will be some measure of effort and uncertainty.

If you’re familiar with relative points estimation you’ll know that there are many different flavours of it. I’ve tried several, and in practice they’re all fairly comparable. It’s good to try different approaches, but give each one time to settle in before assessing it.

Having said that, my favourite estimation scheme uses a Fibonacci scale of effort and complexity. Here it is:

1 point: Low effort, low uncertainty

Implementation of the feature is either trivial or obvious, even to junior members of the team.

2 points: Medium effort or medium uncertainty

We understand a little about the story, but some information is still unclear, or the work could take some time to complete.

3 points: High effort and/or high uncertainty

We don’t understand very much about how the feature might work and there could be a high degree of effort either in research or in the implementation itself.

5 points: Very high effort and/or very high uncertainty (“Oh, sh*t”)

Basically, I call this the “Oh-sh*t” measure. If the team’s emotional response to the story is one of fear or high stress, chances are it’s a 5.

8 points: Story is not clear/too broad — it must be broken down

In practice no story should ever be an 8. We use it as a placeholder for a story that is poorly written or very broad. It should be broken down into multiple stories or rewritten.

An example of an 8 point story is “A user can purchase a product”. While valuable, the story is vague and ambiguous. Some analysis of the problem (perhaps via sketches) should allow the team to break it into multiple stories that cover the steps a user would take to buy a product from the store (browsing and viewing of products, adding products to a cart and processing a payment).
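If your team records estimates in a script or a tool, the scale is small enough to encode directly. Here is a tiny sketch (entirely illustrative, not from any particular tool) that treats the scale as data and enforces the rule that an 8 is only a placeholder:

```python
# The relative points scale as data (values and meanings from this article).
SCALE = {
    1: "Low effort, low uncertainty",
    2: "Medium effort or medium uncertainty",
    3: "High effort and/or high uncertainty",
    5: "Very high effort and/or very high uncertainty",
    8: "Not clear / too broad: must be broken down",
}

def validate_estimate(points: int) -> None:
    """Reject values that are off the scale, and flag 8s as placeholders."""
    if points not in SCALE:
        raise ValueError(f"{points} is not on the scale {sorted(SCALE)}")
    if points == 8:
        raise ValueError("8 is a placeholder: split or rewrite the story")

validate_estimate(5)  # fine
try:
    validate_estimate(8)
except ValueError as err:
    print(err)  # 8 is a placeholder: split or rewrite the story
```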

In an ideal world, all of our estimates would be 1s. Consider an archer shooting at a target: it’s easy to imagine that the further she is from the target, the less certain she is of hitting it. Relative sizing estimates are the same. The less certain we are about how to implement something, the less accurate our estimates will be.

Incidentally, this is why a progressively increasing scale like the Fibonacci scale is used: the gaps between adjacent values widen as estimates grow, which mirrors the way estimation error grows with uncertainty. It puts a higher weighting on the estimates for which we have low certainty. Alex Yakyma has a good explanation from a mathematical perspective (though he uses a slightly different scale).
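To make the intuition concrete, here is a small sketch (my own illustration, not Yakyma’s maths), assuming estimation error grows roughly in proportion to the size of the work: snapping noisy guesses to the widening Fibonacci buckets absorbs that error instead of inviting false precision.

```python
# Illustrative: why widening buckets suit noisy estimates.
# Assumption (mine): error grows roughly in proportion to size.
FIB_SCALE = [1, 2, 3, 5, 8]

def snap(raw: float) -> int:
    """Snap a raw effort guess to the nearest value on the scale."""
    return min(FIB_SCALE, key=lambda p: abs(p - raw))

# A story whose "true" effort is about 4, guessed with +/-30% noise,
# lands anywhere in roughly 2.8..5.2 -- and every guess in that range
# snaps to a 3 or a 5, so the scale tolerates the noise by design.
for guess in (2.8, 3.5, 4.0, 4.6, 5.2):
    print(guess, "->", snap(guess))
```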

So of course a story backlog of all 1s would be great, but it’s rarely practical. Instead, try to write your stories as granularly as possible. That way, you move as close to the metaphorical target as you can.

Holistic Process Estimation

Another consideration when estimating stories is the holistic process. That is to say, the estimate should cover more than just writing the code that implements the feature.

These might include gathering information from stakeholders, quality assurance and cross-browser testing, deployment, and sign-off once the story has been delivered (say, by a product manager).

Your team may have “definitions of done” specific to your process or product, but when estimating it’s good to cast the net wide. Poor estimates are so often the result of unexpected non-technical work.

It can also be helpful for the technology lead or product manager to set expectations prior to estimating. For example, the team may decide to support only Internet Explorer 9 and later. Knowing this can help the team frame their estimates.

“Poor estimates are so often the result of unexpected non-technical work.”

Planning Poker

Planning poker (or sometimes “poker estimation”) is one of the ways in which a team can collaboratively assign a relative size to a story.

Teams can use playing cards for Planning Poker.

The emphasis here is on establishing a shared view of the uncertainty or effort relating to a piece of work.

The process starts with one of the team members describing the first story in the product backlog. The rest of the team can ask questions, have a discussion about constraints or even opine on potential solutions.

Then, when everyone is ready, each person simultaneously holds up a numbered card indicating their relative estimate for the work, taking care not to show it until the reveal (hence “poker” estimation).

Inevitably, a range of estimates will be revealed (though I have occasionally seen a six-person team all estimate the same value). The team then debates the result: those who estimated differently from the others must justify their estimates to the rest of the team.

The goal is for the entire team to agree on a value, and each member can either argue their case or change their estimate. In the case of a stalemate, a team lead or delivery manager can set the estimate and move on to the next story.

In the rare cases where the team cannot agree, you can set the estimate to the most common value or, if the values are split, the higher of the options.
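As a concrete illustration of that fallback rule, here is a minimal sketch (the function name and structure are my own): take the most common estimate and break any tie in favour of the higher value.

```python
from collections import Counter

def settle_estimate(votes: list[int]) -> int:
    """Fallback when discussion stalls: the most common vote wins,
    and a tie is broken in favour of the higher value."""
    counts = Counter(votes)
    best = max(counts.values())
    tied = [vote for vote, n in counts.items() if n == best]
    return max(tied)

print(settle_estimate([3, 3, 5, 2]))  # -> 3 (clear majority)
print(settle_estimate([3, 3, 5, 5]))  # -> 5 (split: take the higher)
```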

The point here is to eliminate or reduce “information asymmetry”. Because each team member gets a chance to argue their case, they can share information or perspectives that other members may not have considered.

Furthermore, because estimates are hidden until the “reveal”, team members do not bias each other’s scores. Juniors, for example, are less likely to feel pressure to estimate in line with their more senior peers.

The net result is a balanced team assessment of the complexity of a story, with far less information asymmetry.

What to estimate?

So far we have limited the discussion to the estimation of stories — things that directly add value to a user. But what about defects or bugs? Should those be estimated too? I won’t keep you in suspense: the answer is almost always no.

Fixing a bug in a piece of software shouldn’t be considered adding value to the user. Rather, it is repairing value that was supposedly already delivered. You don’t get to count the points twice!

You can think of the number of points delivered as analogous to the revenue generated by a retail store. Suppose a salesman sells televisions in the electronics department. He has a sales target to hit and every sale he makes counts towards that target.

Now let’s say that on one occasion the salesman sells a swanky new TV to a customer and records $1800 towards his monthly target. A few days later, however, the customer comes back to the store complaining that the TV is not so swanky after all. In fact, the TV isn’t receiving any channels.

The salesman, after a short investigation, realises that there is something faulty with the TV and it needs to be repaired under warranty. He works with the service department and gets the now fully functioning TV back to the customer a week later, just in time for the latest series of Game of Thrones.

Even after the return and the repair, the revenue counted towards the salesman’s target is still $1800. Despite all the trouble, he hasn’t added any further tangible value to the customer.

The fault with the TV is just like a bug in a piece of software. The bug’s behaviour was never intended, and by fixing it we are ensuring that the value we meant to deliver to the user is actually delivered.

“…there is an incentive for teams to release defect-free products…”

The number of effort points completed in a period of time is known as the team’s velocity. It is a measure of the value added to the product and not just of the work done.

This approach ensures that there is an incentive for teams to release defect-free products as any introduced defects could have an impact on velocity in the future (stay tuned for an upcoming post on Technical Debt).

Whether your team works in sprints or on a more continuous cadence (a la Kanban or XP), measuring velocity can be useful for forecasting. In part 3 of this series, I’ll cover how to use velocity to track progress against a release and share high level information with stakeholders.
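To show what that forecasting can look like in practice, here is a minimal sketch (the function and the numbers are my own, purely illustrative): average the velocity of the last few periods, then divide the remaining backlog points by it to estimate how many periods remain.

```python
def forecast_periods_remaining(recent_velocities: list[float],
                               remaining_points: float) -> float:
    """Naive forecast: remaining work divided by average recent velocity."""
    velocity = sum(recent_velocities) / len(recent_velocities)
    return remaining_points / velocity

# e.g. the team completed 18, 22 and 20 points in the last three sprints
# and 120 points of estimated stories remain in the release backlog.
print(forecast_periods_remaining([18, 22, 20], 120))  # -> 6.0 sprints
```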

Estimating in software teams is usually terrible, but it doesn’t have to be. By democratising input, following a robust process and measuring the delivery of value rather than work done, you’ll find that in time your teams will deliver consistently and reliably.

If you have a different perspective on estimating or just want to share your experiences please write a response. I’d love to hear from you!

Dan Draper is VPE/CTO, Nerd, Coder and Producer of the forthcoming film, Debugging Diversity.