Why Software Development Time Estimation Doesn’t Work & Alternatives
Originally published at innoarchitech.com on August 15, 2015.
Time estimation of software development tasks without statistics doesn’t work. I would also argue that the time, cost, and effort required to estimate and track time with traditional methods are not worth the perceived business value they create.
Having said that, setting target goals and deadlines, along with the associated progress tracking, can and should be done in many cases and can certainly provide value. The key is to use minimal-impact, statistically driven techniques, which we’ll discuss later.
The goal of this article is to analyze the benefits and disadvantages of traditional time estimation, and to discuss viable alternatives to the traditional approach.
Let’s define what I mean by traditional time estimation. In this article, time estimation refers to the practice and process of trying to determine the amount of time (usually hours) required for a single software engineer to complete a given task (task, story, epic, etc.) before beginning development. This also includes any subsequent progress tracking and target date commitments made based on these estimations.
A task may also need to be broken into many smaller tasks that must each be estimated as well. In this case, the estimation process must be applied to each subtask, and then accumulated to get the overall task estimate.
The ability to accurately estimate the time and delivery date of a given software development task depends on the relative size of the task, and I would argue that the estimation error grows exponentially with that size. In other words, the bigger the relative size, the more inaccurate the estimate becomes, and it does so at an exponential rate.
The Reasons for Time Estimation and Tracking
The first reason is that sometimes software releases and/or feature introductions need to be driven by a fixed date. Examples of this include having a customer deliverable deadline, an upcoming industry conference or event, or an excellent marketing opportunity that’s time-sensitive. This is a legitimate reason and should be addressed accordingly.
The second potential reason is that estimates make certain people happy, typically senior managers or executives. These folks are interested in metrics, status updates, and reporting in order to gain visibility, transparency, and insight into departmental initiatives, deliverables, and actions taken that are geared towards achieving the company’s objectives and goals.
These metrics and reports are designed to provide indication and measurement of the effectiveness, progress, and impact as a result of these actions and deliverables. This is also a legitimate reason if the metrics provided are meaningful, valuable, and accurate enough to warrant the time and cost overhead.
Product development, which in this article refers to the product management, software engineering, and QA departments collectively, is also usually expected to provide metrics and status updates regarding key deliverables. In the tech business, these key deliverables are chunks of software that we call features, enhancements, and so forth.
The metrics and progress updates most often requested of product development are: How long will a task take? What is the delivery date? Is the project on track?
The benefit here is that it helps certain people have an idea as to when to expect the aforementioned key deliverables to arrive. This is the case regardless of whether these key deliverables need to arrive at a given time due to one of the fixed date scenarios mentioned above. There is also the possible benefit of deadline-driven motivation, but there are reasons discussed later why this could actually hurt instead of help.
The last reason is that it helps calculate time-based productivity and planning metrics like sprint velocity. In this case, I am not going to consider it a benefit for multiple reasons, especially given that velocity is subjective and non-statistically determined.
While the fixed date scenario does have potential business and motivational value, I would argue that the informational reason has value to individual members of the business, but not necessarily to the business as a whole.
The Disadvantages of Time Estimation and Tracking
Before I get into the disadvantages, I’ll give an analogy involving a sales scenario. Suppose a salesperson is trying to close a deal with a very large corporation for a SaaS product.
Large companies often require multiple phone conversations, demos, internal meetings with their key personnel, iterations of contracts and service level agreements, budget and cycle considerations and timing, approvals, security audits, and so on, before finalizing a product purchase.
Sales will then report to senior management that this deal could take 6 months to a year to close, but has no way to determine the exact date or actual time due to the factors described. Typically senior management accepts and understands the lack of a closing date estimate, and doesn’t question the inability to accurately estimate one. They may give suggestions to help speed the process, but again, reporting a guess about the timeline and reasons for delay is often good enough.
Delivery times and target dates of non-minor software development tasks are equally difficult to estimate accurately, and yet estimates are often required. The main issue is that as soon as an initial time estimate and target delivery date are given, these values tend to get set in stone, create unrealistic expectations, and are viewed as commitments. Estimates are not commitments.
Why Is Software Time Estimation Rarely Accurate?
So why is traditional software time estimation rarely accurate, and as a result often not valuable or worth the required cost, effort, and time?
Firstly, time estimates rarely take into account the following:
- The productivity and experience level of the engineer, particularly if multiple people are involved
- PTO, late arrivals, early departures, sickness, etc.
- Unforeseen defect and customer requests, troubleshooting, challenges, system/environment issues, software/library issues, learning and ramp-up, design/architecture, required research, etc.
- The fact that software engineers notoriously underestimate
- Unforeseen issues with maintainability, architectural flaws/imperfections, scalability, performance, testability, etc.
- Time associated with spikes, R&D, design, architecture, mockup, prototype, POC, etc.
- Administrative work and non-engineering related requests
In virtually all cases, some of the above occur and result in an inaccurate estimate. It’s very difficult to take many of these into account since they only become apparent once coding begins. A common practice is to add padding to the estimates to account for these unknowns. The padding is itself arbitrary, however, so the padded estimate is inherently inaccurate as well.
Most software organizations and senior management realize the downsides of the Waterfall method, and have embraced and implemented some form of lean and/or agile methodology as a result.
One may think that the only way to get a better estimate of time and delivery date is to try to subtask and think of everything up front, which involves more requirements gathering and documentation, mockups, prototypes, UML diagrams (sequence, use case, etc.), and so on.
This is exactly the opposite of being agile, however, and is an immediate regression to waterfall methods. The primary problem with waterfall is that requirements always change once people actually see and use the software, which means that even doing all of this up-front work still results in an incorrect estimate due to rework and changes.
Requirements almost always change mid-development as well. These changes require additional tasking and estimation, but are rarely reflected in the initial estimate, which at that point is considered a commitment. The project immediately appears to be behind schedule, although it isn’t.
Parkinson’s law states that work expands so as to fill the time available for its completion. This certainly applies to software engineers, and more specifically, an engineer will work to the full time of a conservative estimate that includes padding, even though they could potentially finish the job in a quarter of the time (for example) without the estimate. This therefore represents a lost opportunity to get more done faster.
Estimates are also not usually updated once the team realizes that a task is behind. The engineer(s) will typically try to speed things up and get more done in less time, or work longer hours and weekends. In the first case, they can’t suddenly multiply their productivity, and there would be a significant quality loss and risk of missing requirements even if they could. The latter case is the fast lane to burnout and reduced overall productivity from lack of downtime, rest, etc.
Lastly, the software engineers are some of the most technical people in the company, and typically have a background in programming, mathematics, and other very technical topics. These same engineers, across all companies, continuously estimate tasks and therefore target dates incorrectly. If anyone in the company should be able to accurately estimate anything, it’s probably these folks, and yet they rarely do.
In summary, traditional up-front planning, tasking, and time estimation, along with the resulting time-tracking administrative tasks, slow progress down and burn up time that would be better spent on things like improving, creating, and/or innovating new products and services. In addition, these types of administrative tasks tend to reduce engineers’ productivity, efficiency, and creativity.
The Cost of Time Estimation
Assume the national average salary for software engineers is $95K. Adding the usual 25% for taxes and benefits brings it to $118,750. Assuming a 40-hour work week and 52 weeks per year, this is approximately $57 per hour.
Now assume a team of 10 engineers who each spend 5% of their week working on meticulous tasking and the associated time estimation and tracking administrative work. This means that 2 hours are spent per developer on this per week, which equates to $114 per week per developer, $1,140 per week across the team, and therefore $59,280 per year (52 weeks).
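The arithmetic above can be checked with a short script. This is only a sketch: the function and parameter names are made up for illustration, the hourly rate is rounded to the whole dollar as in the text, and a 52-week year is assumed. Under those assumptions the weekly team figure of $1,140 comes to $59,280 per year.

```python
def estimation_overhead(salary=95_000, overhead_rate=0.25, hours_per_week=40,
                        weeks_per_year=52, team_size=10, share_of_week=0.05):
    """Rough yearly cost of estimation and tracking work (illustrative only)."""
    loaded_salary = salary * (1 + overhead_rate)             # salary plus taxes and benefits
    hourly_rate = round(loaded_salary / (hours_per_week * weeks_per_year))
    hours_per_developer = hours_per_week * share_of_week     # hours spent estimating per week
    weekly_per_developer = hourly_rate * hours_per_developer
    weekly_team = weekly_per_developer * team_size
    return {
        "loaded_salary": loaded_salary,
        "hourly_rate": hourly_rate,
        "hours_per_developer": hours_per_developer,
        "weekly_per_developer": weekly_per_developer,
        "weekly_team": weekly_team,
        "yearly_team": weekly_team * weeks_per_year,
    }


if __name__ == "__main__":
    for label, value in estimation_overhead().items():
        print(f"{label}: {value:,.2f}")
```

Changing `team_size` or `share_of_week` shows how quickly the overhead grows for larger teams or heavier tracking processes.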
The time and cost involved in creating and tracking inaccurate estimates results in no actual software or value creation. I would argue that there’s an additional opportunity cost component that should be considered here as well.
Alternatives to the Traditional Approach
Statistics work, and should be used as appropriate. What follows is a discussion of alternative techniques to estimate software development task lead time, target dates, and progress measurement using simple statistical methods.
The ability to accurately estimate software development tasks, and the variation in this accuracy due to the difficulties described above, is clearly random given what we see in practice. Events and systems that are random in nature, such as software development time estimation, can only be described using statistics and are referred to as being _stochastic_. The trimmed definition of this term according to Wikipedia is:
The term stochastic occurs in a wide variety of professional or academic fields to describe events or systems that are unpredictable due to the influence of a random variable. In this regard, it can be classified as non-deterministic (i.e., “random”) so that the subsequent state of the system is determined probabilistically.
The random variable in this case is the time estimate and resulting target date for a given development task and its accuracy. There is no question that this is stochastic and can therefore not be estimated without using probability techniques.
The problem is that velocity and/or software development estimates rarely involve probabilities and statistics, and yet must. The resulting estimates are therefore nonsense since we try to deterministically produce the value of a stochastic random variable. This would be considered a laughable practice by any statistician.
In David Anderson’s book Kanban (read it), he discusses the concept of a target lead time coupled with a due-date performance metric as an alternative to estimation and target date commitment of individual development tasks. He also talks about time estimation as being considered a costly activity, while target date commitments are low-trust activities due to unkept promises stemming from inaccurate estimates.
An example of a target lead time with a due-date performance metric would be:
Producing a deliverable that the engineers think is about one week’s worth of work, in exactly one week about 90% of the time.
So 90% of the time (the due-date metric) the engineer(s) will deliver the finished task in exactly one week (the target lead time). The other 10% of the time it may be delivered either early or late.
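As a rough sketch, the due-date performance metric can be computed from a list of recorded lead times. The function name and the sample numbers below are hypothetical, and "on time" is assumed here to mean delivered within the target lead time.

```python
def due_date_performance(actual_lead_times, target_lead_time):
    """Fraction of tasks delivered within the target lead time.

    Both arguments use the same unit (working days in the example below).
    """
    if not actual_lead_times:
        raise ValueError("at least one completed task is required")
    on_time = sum(1 for days in actual_lead_times if days <= target_lead_time)
    return on_time / len(actual_lead_times)


if __name__ == "__main__":
    # Hypothetical lead times, in working days, for ten "one week" tasks.
    recorded = [3, 4, 5, 5, 4, 5, 6, 5, 4, 5]
    print(f"{due_date_performance(recorded, target_lead_time=5):.0%}")
```

With the sample data, nine of the ten tasks finish within five working days, so the team could quote a one-week target lead time with roughly 90% due-date performance.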
In practice, an engineer (or team) decides on an approximate lead time based on gut feel, experience, educated guess, etc. This process requires almost zero actual time. Once decided, the task is worked on and the actual lead time is recorded once complete, along with some way to categorize the relative size for the task. Over time, a data set is created that allows for statistical analysis and much more realistic estimates (lead times) and due-date performance metrics.
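One possible way to keep such a data set is sketched below. The record fields, the `lead_time_at` helper, and the sample values are all assumptions made for illustration, and the nearest-rank percentile is just one simple way to turn recorded lead times into a target lead time.

```python
from dataclasses import dataclass


@dataclass
class CompletedTask:
    """One finished task. All field names are illustrative."""
    name: str
    size: str            # relative size label chosen by the team
    guessed_days: float  # quick gut-feel lead time, decided up front
    actual_days: float   # lead time recorded once the task is done


def lead_time_at(history, size, percent=90):
    """Smallest recorded lead time that covers `percent`% of past tasks of
    the given size (nearest-rank percentile)."""
    days = sorted(task.actual_days for task in history if task.size == size)
    if not days:
        raise ValueError(f"no completed tasks recorded for size {size!r}")
    rank = max(1, -(-percent * len(days) // 100))  # ceiling of percent * n / 100
    return days[rank - 1]


if __name__ == "__main__":
    # Hypothetical history of ten tasks that were all guessed at three days.
    actuals = [2, 2.5, 3, 2, 4, 3, 2.5, 3, 2, 6]
    history = [CompletedTask(f"task-{i}", "medium", 3, days)
               for i, days in enumerate(actuals, start=1)]
    print(lead_time_at(history, "medium", percent=90))
```

Reading the result as "90% of past medium tasks were done within this many days" gives a target lead time and a due-date performance figure from the same data.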
One method that I use for characterizing the relative size of development tasks is a variation of the tee-shirt sizing method. Each task is given a relative size corresponding to five tee-shirt sizes, along with a very rough lead time estimate (for a single developer) as shown.
XS: Half day or less
S: Half day to one day
M: Two to three days
L: One week
XL: One to two weeks
In the case that a task may take more than two weeks (e.g., epics), the task should be broken into subtasks that can be assigned one of these sizes.
As the database of sizes and actual lead times grows, the lead time estimates and associated due-date performance metrics become more meaningful and statistically significant.
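To make this concrete, the sketch below groups recorded lead times by tee-shirt size and reports the sample count, the mean, and the standard error of the mean, which shrinks as more tasks are recorded. The function name, the record layout, and the sample data are hypothetical, and the standard error is only one of several ways to express how much to trust the numbers.

```python
import math
import statistics
from collections import defaultdict


def summarize_by_size(records):
    """Summarize (size, actual_days) pairs per tee-shirt size."""
    by_size = defaultdict(list)
    for size, days in records:
        by_size[size].append(days)

    summary = {}
    for size, days in by_size.items():
        count = len(days)
        # The sample standard deviation needs at least two data points.
        std_error = statistics.stdev(days) / math.sqrt(count) if count > 1 else None
        summary[size] = {
            "count": count,
            "mean_days": statistics.mean(days),
            "std_error": std_error,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical records: (size, actual lead time in days).
    records = [("S", 1.0), ("S", 0.5), ("S", 1.0), ("S", 1.5), ("M", 3.0)]
    for size, stats in summarize_by_size(records).items():
        print(size, stats)
```

A size with only a handful of records, such as "M" in the sample data, gets no standard error at all, which is a useful reminder not to quote a due-date performance figure for it yet.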
One last point. While the engineering team should be the only people ever estimating software development tasks, sometimes a fixed date is required as previously discussed. In these cases, it is very important that the scope is tailored accordingly, and the project is prioritized by “must haves”, “high wants”, and “nice to haves”, just in case the full scope can’t be completed by the target date.
Let me start by saying that, in my estimation, 99.99% of the time projects are “behind” due to incorrect estimates and not because of product development issues. This is a result of everything discussed and is the key takeaway of this article.
Unrealistic time estimates and the subsequent perceived date commitments can also incentivize the wrong things. They incentivize speed, which often results in reduced quality, increased defects, missed requirements, rework, unhappy customers and CS staff, technical debt, and additional maintenance cost and time.
Fixed or target deadlines can be very important and have value, but should be handled using appropriate techniques as described. For these situations, it’s better to take the time to do things properly with high quality standards and perhaps a somewhat reduced scope for initial versions. If done correctly, the overall time and cost will be less in the end.
If you or your company want an idea of how long a task will take and an approximate target date for delivery, then do yourself, your team, and your company a favor and start trying to get to these values by using statistical methods. It’s the only way to go.