If Estimating Work Needs To Be Reliable, Why Use Story Points Instead of Time?

The Argument for Relative Estimation Using Story Points

Published in Serious Scrum · 14 min read · Oct 1, 2020

We should choose the most appropriate method for measuring something.

I have often asked myself the question, “Why do my teams fail to fully grasp story point estimation?” Thinking about it, I had to take a different approach to explain it to them.

One of the impediments to our understanding of any concept is our failure to recognise how our cognitive biases affect our perceptions and decision-making capacity. We shall discuss these biases in another article, but let’s use an analogy to illustrate the point.

Have you ever watched a sporting event between two teams then listened to the coaches of those teams after the game? Although they both witnessed the same game they will give opinions of what happened during the game that can at times almost be polar opposites. What is more, if you are a neutral viewer, their differing opinions can vary from your own observations. How can that be the case? Paraphrasing Einstein’s “Theory of relativity”, these differences depend upon the perspective from which you are observing. Our mindset and our beliefs influence our capacity to accept our observations — even if they differ from what we believe to be a reality.

I have given explanations and coaching on various subjects and concepts in the past. Something I have learned is that asking if there are any questions, or if everyone understands (closed-ended questions), has little value in identifying how well your message has been understood. An approach that has stood me in good stead to check that understanding is to ask one of my audience to explain back to me, in their own words, what they have understood. Try it with something simple; it is very revealing.

So, let’s recall the opening sentence of this piece:

I have had to ask myself the question, “Why do my teams fail to gain a useful, practical understanding of story point estimation?” Thinking about it, I had to take a different approach to explaining it to them.

I shall attempt to do so here. Can I give an explanation that someone reading this article could repeat, and so explain the concept well to somebody else? Will it achieve the aim of sharing a common perspective to facilitate meaningful discussion and practical use of story points for estimation?

Even with previous experience of working on projects (either waterfall or agile), story points can be hard to grasp since they are an abstract concept. They can actually be harder to grasp if you are used to estimating work based on time. It seems instinctive to estimate in time with regard to the delivery of pieces of work, so why even bother with an abstract concept in place of what seems natural?

What are Story Points?

Story points are a sequential series of sizes used to estimate what is involved in completing a piece of work. The sequence of story points is typically a modified version of the Fibonacci sequence: 0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100 (although other versions exist, the modified Fibonacci is the most common). Cards are used for these values in a technique called planning poker. This has the aim of promoting discussion of the piece of work being estimated, to facilitate sufficient understanding of what is required to estimate the work.

In planning poker, there can be two additional cards: a question mark (signifying that an individual does not understand enough to give any estimation) and a coffee cup ☕️ (“Man, I need to take a break”). Sometimes it helps to take a quick break if your eyes are starting to glaze over. Perhaps you need time for a domain expert to join the discussion to give some further information that will improve the understanding of what you need to do for this piece of work.

What is a story? That is a whole other conversation, not only to describe what a story is, but also what a story is not. For our purposes here, a story is a description of a user need that requires a solution; it describes the user motivation. How do you code it? Your decision. How do you test it? Your decision. The solution should meet the user need while observing technical excellence in its delivery and composition. A fuller description of user stories will form a future article. Mike Cohn provides a good short description.
https://www.mountaingoatsoftware.com/agile/user-stories

Estimation Using Time

I would suggest that, when enquiring about someone’s capacity to complete a piece of work, the most common question is, “How long will it be before the work is finished?” Someone providing an estimate of how long some work will take would usually answer in hours or days. The same was the case for software developers during the 20th century. Traditional project management uses the same approach, since Project Managers need to work with budget and time (deadline) constraints, so estimating using time is a natural fit for that paradigm.

If developers and software teams could reliably provide accurate estimations using time, then there would be no need to seek another approach to providing estimates. Since the turn of the century and the advent of agile approaches to software development, however, estimation using time was identified as something that needed to be improved to provide greater reliability. Story points are an outcome of various attempts (gummy bears, NUTs, t-shirts etc) to use abstract units for estimation.

That raises the question, “What is wrong with using time?” Well, estimates using time can work. They can also be reliable. The issue with using time in practice can be explained with a couple of definitions: ideal time and elapsed time. Confusion between the two is a large part of what can make time estimates unreliable. Ideal time assumes, not always consciously, that:

the person doing the work has everything they need to complete the work available to them from the start of that job (knowledge, resources etc),
it is the only job that the developer will work on until the job is complete and,
there will be no interruptions that could impede the progress of that work.

If you can meet those three conditions and the developer’s estimate is accurate then the job should be delivered on time.

The challenge here is that these three conditions can be difficult to ensure. An additional complication is the frequent incidence of emergent conditions in the implementation of the code required to provide the solution for the job. Coding implementation cannot always be anticipated in full, and unexpected challenges often add time to the completion of the work. The ideal estimated time, plus the factors that cause delays and interruptions, results in what is known as elapsed time: the actual chronological time required to complete the work.

There is a body of research work on the reliability of estimations using time. This is a list of research reviews on this subject:

The time required to complete work will eventually be factored into the average velocity. Whether we are using hours or days for ideal time does not matter, since the instinct to think of days in terms of hours per day is compelling. The miscommunication in providing estimates based on time can be explained with an example. Suppose a developer, working a five-day, forty-hour week, takes on a task and estimates it to be one and a half days of work. If work starts at lunchtime on a certain day then the manager could expect the work to be completed by the end of the next day. The problematic assumption here is that the developer day is eight hours long.

When we start to factor in something as obvious as the amount and duration of meetings, we can begin to see how available development time is consumed. It does not take much consideration to interpret the impact this has on the concept of developer (ideal) days. For situations and products where the conditions for using ideal time can be satisfied, go with that if you choose to. I would propose, though, that this is not the case for the majority of products, projects and situations. The idea of finding an alternative to estimates of time was widely adopted into agile software development approaches worldwide, which speaks to an appreciable recognition of the need, even given the difficulty of accepting change.
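To put rough numbers on the gap between ideal and elapsed time, here is a minimal sketch in Python. The one-and-a-half-day estimate and the eight-hour day come from the example above. The figure of five focused hours per day is purely an assumption for illustration, as is the function name.

```python
HOURS_PER_IDEAL_DAY = 8  # the manager's implicit assumption in the example


def elapsed_working_days(ideal_days, focus_hours_per_day):
    """Working days needed when only `focus_hours_per_day` hours
    are actually available for the task each day."""
    if focus_hours_per_day <= 0:
        raise ValueError("focus_hours_per_day must be positive")
    ideal_hours = ideal_days * HOURS_PER_IDEAL_DAY
    return ideal_hours / focus_hours_per_day


estimate = 1.5  # ideal days, as in the example

# What the manager hears: every working hour goes to the task.
print(elapsed_working_days(estimate, 8))  # 1.5

# Hypothetical: meetings and interruptions leave 5 focused hours a day.
print(elapsed_working_days(estimate, 5))  # 2.4
```

Under that assumption the same twelve ideal hours take almost a full extra working day to deliver, even though the estimate itself was accurate.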

Estimation Using Story Points

The challenge when starting to use story points is that a story point is not a standardised unit of measurement. So, if we cannot define a story point precisely then how do we define multiples thereof?

One of the benefits of using story points is to enable us to dissociate estimations from time. The irony is that when beginning to use story points, association with periods of time can help some team members to start to “visualise” what story point values mean. That will change with experience of estimation, but you have to start somewhere.

The key point from my perspective in coaching story points is to recognise that they represent size or volume. The volume is another way of describing capacity. When we estimate a story point value, we need to consider three aspects of the work we are estimating:

The effort required to complete the work
The complexity of the work
The remaining uncertainty of the work that we are trying to estimate.

Uncertainty can encapsulate dependencies on other work, unfamiliarity with code bases or missing detail in the story. It also includes the degree of risk to completing the story, which could be mitigated. The definition of ready should be used as an aid to assess how well we can determine our knowledge of the three factors mentioned above.

What we are seeing here — in terms of estimating effort, complexity and uncertainty — is a measurement or estimation using three coordinates. We are already familiar with a three coordinate (three plane Cartesian) method for calculating size. Remember that story points are about size.

We can use length, height (depth), and width to calculate or estimate the capacity/volume of a container. We can think of a sprint as a container within which we put stories of different sizes.

Think of sprints/iterations as containers.

When we are estimating story size though, our three coordinates are effort, complexity and uncertainty. We will not have precise sizing for each of these but we do have a way to improve how we judge those. A key point to note here is that, just as with container sizes, the relative values of effort, complexity and uncertainty can vary, yet the story can have the same, or very similar, size.

Stories with different values for effort, complexity and uncertainty/risk can have the same size. NB: the use of a formula above is for illustration only. We should not “calculate” our story points but merely consider all aspects in our estimate.
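The container analogy can be sketched in a few lines of Python. Multiplying the three dimensions is an assumption borrowed from the length × height × width analogy, and the scores are invented. As noted above, this is for illustration only and not a way to calculate story points.

```python
def illustrative_volume(effort, complexity, uncertainty):
    """Container-style 'volume' of a story. Illustration only:
    real story points are judged by the team, not computed."""
    return effort * complexity * uncertainty


# Hypothetical stories with (effort, complexity, uncertainty) scores.
stories = {
    "A": (2, 3, 4),  # little effort, fairly uncertain
    "B": (4, 3, 2),  # more effort, well understood
    "C": (1, 6, 4),  # tiny but complex and uncertain
}

volumes = {name: illustrative_volume(*dims) for name, dims in stories.items()}
print(volumes)  # {'A': 24, 'B': 24, 'C': 24}
```

Three quite different "boxes" end up the same size, which is the point of the analogy.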

The hardest aspect of using story points is getting started in the first place. This is because the standards organisation for a story point is the team itself. The process of switching from time estimations to using a nebulous, undefined unit requires a Copernican shift in mindset. The experience gained in initial sprints using story points is well worth the discomfort that many can feel. What we achieve during those initial few sprints is to gain experience of the stories, including our relative accuracy in sizing them. This needs to be reflected upon in retrospective meetings to confirm team confidence in the estimations they have made. Lessons learned about how estimates could have been improved are also valuable.

The secondary benefit of those early sprints is that the total points completed in each sprint enable us to start to calculate an average velocity for the team. We are now getting a view of the capacity for our sprints. I did mention that we should aim to dissociate story points from time, although I should confess that I have allowed teams to draw some parallel to time in the initial stages because it helped their understanding at that stage.

Was I making a rod for my own back? Sure, but let’s try something and learn from it. We can coach and guide our teams but what we want to achieve is understanding, not compliance.

When the team can understand the idea of a sprint as a container and that stories that we estimate are smaller boxes that fill that container during planning, the idea of estimating without association to time starts to make sense.
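As a minimal sketch of the "boxes in a container" idea, the Python below pulls stories into a sprint in priority order while they still fit. The story names, sizes and capacity are hypothetical, and skipping a story that does not fit is just one possible policy. A real team would have that conversation during planning rather than follow a rule.

```python
def plan_sprint(backlog, capacity):
    """Fill the sprint 'container' from a prioritised backlog.

    backlog:  list of (name, story_points), highest priority first
    capacity: points the team expects to complete, e.g. average velocity
    """
    planned, remaining = [], capacity
    for name, points in backlog:
        if points <= remaining:  # the box still fits in the container
            planned.append(name)
            remaining -= points
    return planned, remaining


# Hypothetical backlog and capacity.
backlog = [("login", 5), ("search", 8), ("export", 13), ("tooltip", 2)]
planned, spare = plan_sprint(backlog, capacity=20)
print(planned, spare)  # ['login', 'search', 'tooltip'] 5
```

Nothing here refers to hours or days. The only question asked is whether the box fits in the space that is left.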

Story Point Defined

With the experience of having worked on stories over a few sprints, we can look at them and decide, “Which are the smallest stories here that are worthy of an estimate?” Some work is so trivial in size that more effort could be expended in estimating it than in doing the work. The smallest story point value we have is half a point. Do the team consider that a value they can use? If so, then identify the SBIs (Sprint Backlog Items) that the team can agree would represent that story point size. The next value up is one story point, and we do need to identify SBIs that represent this value (if we cannot, are we decomposing stories enough? That is a subject for another article). Which are the previous SBIs that the team can agree represent that size? Complete that process and we now have a team-agreed standard for the story point.

Velocity

How do we capture velocity? Well, simply speaking, it is the sum of the story points completed during a sprint. Average velocity? The sum of the story points completed over a number of sprints, divided by that number of sprints. No surprises there! There is a potential issue in teams where the velocity is captured using Jira, or perhaps in how Jira is set up.

Jira captures the story points for SBIs that are “done”. It does not capture partially done work, so it would benefit teams not to over-reach in their commitment in early sprints. More work can be pulled in if capacity remains. SBIs with no estimate, obviously, contribute nothing to the recorded velocity. Also, Jira only captures story points that are assigned to SBIs, which opens the debate about which SBIs should have story points assigned.
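The counting rules described above can be sketched in Python as follows. This does not use the Jira API. The dictionary fields ("points", "status") and the sample data are hypothetical stand-ins for whatever your tool records.

```python
from statistics import mean


def sprint_velocity(items):
    """Sum the points of SBIs that are done.
    Partially done work and unestimated SBIs contribute nothing."""
    return sum(
        item["points"]
        for item in items
        if item["status"] == "done" and item["points"] is not None
    )


# Three hypothetical sprints.
sprints = [
    [  # 5 + 3 = 8: one SBI unfinished, one done but never estimated
        {"points": 5, "status": "done"},
        {"points": 8, "status": "in progress"},
        {"points": None, "status": "done"},
        {"points": 3, "status": "done"},
    ],
    [  # 13 + 2 + 5 = 20
        {"points": 13, "status": "done"},
        {"points": 2, "status": "done"},
        {"points": 5, "status": "done"},
    ],
    [  # 8 + 5 + 1 = 14
        {"points": 8, "status": "done"},
        {"points": 5, "status": "done"},
        {"points": 1, "status": "done"},
        {"points": 3, "status": "in progress"},
    ],
]

velocities = [sprint_velocity(s) for s in sprints]
print(velocities)        # [8, 20, 14]
print(mean(velocities))  # 14
```

In the first sprint the unestimated SBI took up capacity but is invisible in the velocity, which is why the question of what gets estimated matters.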

Let’s bear in mind that we are using story points to identify capacity for planning purposes. That capacity is analogous to velocity. I would suggest a guiding principle that if a stand-alone SBI occupies development capacity in a sprint then it should be estimated. Tasks below a story, or even spikes if they are below a story, will be included in the story estimate. If they are independent of the story then estimate them. This helps near-future planning as well, for some number of sprints ahead. We should bear in mind that average velocity often varies over time.

Using story points to determine capacity and velocity can become complicated where management only want certain pieces of work estimated. Limiting what is estimated does not demonstrate an understanding of the main use of story point estimates. This should prompt a discussion to see how the motivation of management outside of the team can be satisfied without disrupting the work of the team unproductively.

I am a firm believer that we should not adjust estimates during the sprint. If we find that an SBI was significantly under-estimated then the team should — having estimated the extra capacity that the work will involve — meet with the Product Owner to determine if the sprint goal can still be met or if a reorganisation and prioritisation of work is required. The sprint retrospective is an ideal opportunity to reflect upon previous estimates to agree on our accuracy or otherwise. Did we miss something that we can use going forwards, to improve our estimates?

Image of a team meeting — Photo by You X Ventures on Unsplash

Relative Estimation

Story points start to shine when we are able to size new stories relatively. I am sure everyone has seen the example of comparing animal sizes by way of explanation.

I discussed earlier in this piece how we can identify stories (or other SBIs) that we have completed previously, to define half a story point or one story point. The efficacy of relative estimation overall can be increased by first identifying the typical range of story point sizes that the team tend to have in their sprints, given their current practice of story composition and refinement.

For ease of explanation, let us assume that we are dealing with a team that rarely have stories of greater than 13 story points in a sprint. That would be expected to cover most teams and products. To both reduce the burden of identifying previously completed stories (of a size that the team agrees on) and also to simplify the process of relative estimation, we can use benchmark sizes. This involves using every other story point size for relative estimation.

We should identify a number of example stories that the team agree are representative of each benchmark size, remembering that “boxes” with differing dimensions can have the same capacity. Let us use an example where the team agree that sizes 2, 5, and 13 would work for them. We identify example stories for those sizes as a team.

I would actually suggest starting with one size — the most common — then include the other sizes once the team are happy with their examples for the first size.

Once the team have examples for their benchmark sizes they can perform relative estimations in this way:

Showing initial benchmark sizes for this example

Since a five is more than twice the size of a two, and a 13 is more than twice the size of a five, the initial comparison (closest to a 2, 5 or 13) should be easily made.

Which story point size is the new story closest to?

For this illustration, let’s say that it is closest to five story points.

Final comparison to assign a story point size.

Et voilà! Is the new story a good match for the five-point story examples, or is it bigger or smaller? You now have your story point size.
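The two comparisons can be sketched in Python as below. The scale and the benchmark sizes (2, 5, 13) are the ones used in this example. The assumption that "bigger" or "smaller" means moving one step along the scale is my reading of using every other size as a benchmark, and the function name is hypothetical.

```python
# Modified Fibonacci scale quoted earlier in the article.
SCALE = [0.5, 1, 2, 3, 5, 8, 13, 20, 40, 100]

# Benchmark sizes agreed by the team in this example (every other size).
BENCHMARKS = [2, 5, 13]

STEPS = {"smaller": -1, "same": 0, "bigger": 1}


def assign_size(closest_benchmark, comparison):
    """Step 1 (team judgement): pick the closest benchmark size.
    Step 2: compare with that benchmark's example stories and move
    one step down or up the scale, or stay put."""
    if closest_benchmark not in BENCHMARKS:
        raise ValueError(f"{closest_benchmark} is not a benchmark size")
    if comparison not in STEPS:
        raise ValueError("comparison must be 'smaller', 'same' or 'bigger'")
    return SCALE[SCALE.index(closest_benchmark) + STEPS[comparison]]


print(assign_size(5, "same"))     # 5
print(assign_size(5, "bigger"))   # 8
print(assign_size(5, "smaller"))  # 3
```

The judgement calls stay with the team. The code only shows why three benchmarks are enough to reach every size from 1 to 20.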

Summary

OK, cards on the table (appropriate, don’t you agree?). Relative estimation is not a “one size fits all” approach; it is another tool in your team estimation toolbox. To perform relative estimation, the team still need a good understanding of a story, and planning poker is a great way to have that discussion. There will be some stories that the team assess are best estimated using planning poker, and others that they decide can be more readily estimated relatively.

For my teams working with relative estimation, the important takeaway is that they do not feel the need to satisfy requests for “time to complete”. The work is committed for this sprint (or PI if you work with SAFe) and will be delivered in that iteration.
