Why the Fibonacci Point System is Terrible for Sprint Estimations

Jorge Yau
Jul 24, 2020 · 5 min read

The Fibonacci Point System

We employ the following point system in our sprints to size feature complexity and estimate velocity. Each point builds upon the previous point’s complexity requirements.

  • 1 — Small one line fixes, copy changes, or design tweaks.
  • 2 — Small logic changes with potential regressions.
  • 3 — Standard feature that often requires tracking and AB testing.
  • 5 — Larger features than 3 that usually involve creating new screens and components.
  • 8 — Very large feature that often requires significant refactoring or designing a new architecture.
  • 13 — Epics. Projects that are too large to scope. Highly complex multi-sprint efforts that require significant planning and testing.

Using Fibonacci pointing to estimate how much work fits in a two-week sprint is a common practice in the software industry. Teams use estimates to determine sprint goals, calculate velocity and staffing needs from burn-down charts, and plan quarterly roadmaps. Nevertheless, as hard as they try to estimate correctly, teams keep falling behind. In my experience, people end up not taking estimation seriously and sometimes abandon it entirely.

It’s a common joke among developers to say,

“Oh just take whatever work you have to do and multiply it by 3.”

We laugh it off, but there is some truth behind it.

Why it’s Bad for Sprint Estimation

While Fibonacci pointing is good for measuring the complexity of a project, by itself it is a poor system for measuring the actual time and work needed to complete a feature.

Here is where the fallacy lies. Take this scenario for example:

  • Person A has FIVE 2 point tickets. Total points: 10
  • Person B has TWO 5 point tickets. Total points: 10

On paper and from management’s perspective, Person A and Person B have the same amount of work: 10 points each. In reality, however, Person A has a relatively light sprint, with work that requires little to no planning and minimal code review time of about 5 minutes per ticket.

Meanwhile, Person B has an impossible sprint that is severely underestimated because of the unknowns that typically arise with stories of 5 points and above. On top of that, it takes HOURS if not DAYS for larger pull requests to be reviewed, updated, and reviewed again until they are approved. The effort of writing unit tests and QA-ing a feature also scales linearly with how complex the feature is. So for a 5 point feature, Person B is actually doing significantly more work to have everything done “correctly”. Otherwise, they are taking “shortcuts” by writing hacky code, bypassing code review, or skipping unit tests. Either that, or they are working overtime, which WILL lead to developer burnout.

The problem then becomes: how do we convey this to others?

The Solution

A more accurate estimate would be that Person B has 30 points of actual work compared to Person A’s 10 points. This is calculated using the table below, where the actual amount of work is the result of multiplying the story’s Fibonacci complexity by a linearly increasing scaling factor.

Figure: Framework for calculating actual work from Fibonacci complexity.
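
As a worked check on the 10 and 30 point totals from the Person A and Person B example, write c_i for the Fibonacci complexity of ticket i and s(c) for the scaling factor. The totals are consistent with s(2) = 1 and s(5) = 3. Those two factors are inferred here from the totals themselves, not read from the table, so treat them as an assumption.

```latex
\begin{align*}
W   &= \sum_{i} c_i \, s(c_i) \\
W_A &= 5 \times (2 \times 1) = 10 \\
W_B &= 2 \times (5 \times 3) = 30
\end{align*}
```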

The framework is simple: estimate the complexity of a feature, multiply it by the scaling factor, and use the result as the actual amount of work required to complete the feature without taking shortcuts. If that work cannot fit within a sprint, break up the ticket until it does. If shortcuts can be taken and repaid later, take them, but create a debt ticket to repay them. Make sure to communicate this debt ticket to management so they understand the cost of taking shortcuts.
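
Here is a minimal Python sketch of that framework, under stated assumptions. The scaling factors for 2 and 5 points follow from the Person A and Person B totals above; every other factor, the function names, and the capacity of 30 are hypothetical placeholders for illustration, not values from the framework table.

```python
# Hypothetical scaling factors keyed by Fibonacci complexity.
# Only the entries for 2 and 5 follow from the Person A / Person B example;
# all other values are illustrative guesses, not the author's table.
SCALING_FACTORS = {1: 1, 2: 1, 3: 2, 5: 3, 8: 4, 13: 5}


def actual_work(complexity: int) -> int:
    """Return the complexity multiplied by its scaling factor."""
    if complexity not in SCALING_FACTORS:
        raise ValueError(f"unsupported point value: {complexity}")
    return complexity * SCALING_FACTORS[complexity]


def sprint_load(tickets: list[int]) -> int:
    """Total actual work for a list of ticket complexities."""
    return sum(actual_work(points) for points in tickets)


def tickets_to_split(tickets: list[int], capacity: int) -> list[int]:
    """Tickets whose actual work alone exceeds the sprint capacity."""
    return [points for points in tickets if actual_work(points) > capacity]


person_a = [2, 2, 2, 2, 2]  # five 2-point tickets
person_b = [5, 5]           # two 5-point tickets

# Raw points next to actual work for each person.
print(sum(person_a), sprint_load(person_a))
print(sum(person_b), sprint_load(person_b))

# With an assumed capacity of 30, list the tickets that should be broken up.
print(tickets_to_split([3, 8, 13], capacity=30))
```

Keeping the factors in one table means a team can tune them after a few retros without touching the rest of the calculation.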

But…points vary from person to person

One common reason point estimates are considered inaccurate, and why teams stop doing them, is that every developer has a different level of experience, aptitude, and perspective. What is a 2 pointer for one person could easily be a 3 pointer for another, and vice versa. This is often the case for newer teams or teams switching to sprints, and management finds the variation confusing and unpredictable. However, as the team becomes more experienced and develops a rhythm, its Fibonacci complexity estimates will become more accurate over time.

Retrospective Hindsight

It’s no wonder that teams are always missing their sprint goals: they are doomed to fail without management truly understanding why. The developers know why. So many subtle nuances and so much scope creep arise as a feature grows in complexity, and they are difficult to anticipate and plan for. In retros, teams end up talking about the same underlying issues over and over again rather than actually using the retro to improve the agile and estimation process. The impact on the team’s morale becomes noticeable and detrimental. By this point leadership may get involved, even if things are otherwise going well, because deadlines have been missed and quarterly roadmaps have been thrown out the window.

Generally, I’ve noticed that these issues begin to arise with 3 point features and occur frequently with 5 and 8 point stories. Completing those larger features often involves many hidden tasks, such as refactoring code, handling edge cases, discovering and addressing race conditions, and debugging issues found in code review and QA. Not only that, but resolving these issues often involves other people who are already busy themselves, which causes downtime from blockers. This downtime is very disruptive to flow, is mentally taxing due to context switching, and adds waiting time for replies and impromptu meetings to reach a decision. The framework would account for these hidden costs at estimation time, and would give a much better frame for judging how scalable the architecture is and how well the team is doing.

Pair Programming

One way to alleviate downtime and improve estimation is pair programming. It is often much better to tackle problems and design solutions together, where each person can catch fallacies the other may not notice. You’ll discover issues sooner, learn A TON from each other, and spread domain and technical knowledge across the team, all while building friendships along the way.

Conclusion

My takeaway for any reader is to try it out. If humans are great at one thing, it’s underestimating. This framework accounts for that: it estimates how complex a new feature is and how much work it will take to build, while conveying the hidden costs of building larger features to management. Try it out and let me know what you think.

The Startup

Medium's largest active publication, followed by +752K people. Follow to join our community.

Jorge Yau

Written by

Jorge Yau

Senior Web Engineer at Stash. I write about NYC, tech, and the immensity of life.
