Daniel Vacanti
45 min read · Mar 7, 2019

***Note: this is a very early, unedited version of this article. I will hopefully update any spelling/grammar/reference mistakes in time, so please forgive any obvious errors or omissions. Feedback welcome!***

Don’t Be a Ditka,
Or, Can You Prioritize Your Way to Optimizing Value?
Or, CD3 is Bollocks

TL;DR: CD3 is often touted as a prioritization/sequencing method used to help an organization maximize its value delivery. However, CD3 requires a specific set of assumptions to be true in order to provide an optimal result. Those assumptions are simply not valid in most complex product development environments. In those contexts, focusing on duration by making items as small as possible is the strategy that wins out over the long term. In short, it’s not “outcomes over output”; it’s “output over outcomes”.

How good are you at estimating value?

How good do you think an organization that is worth more than $1 billion is at estimating value? Do you think such an organization is better at estimating value than you? Do you think it should be?

The billion-dollar organization that we will use to test this theory is a professional American football team called the New Orleans Saints. The Saints themselves play in an organization of other teams known as the National Football League (NFL). As you can imagine, much of an NFL team’s ability to compete is dependent on its ability to attract the very best players possible — the Saints being no exception.

The main mechanism by which each team in the NFL selects new players to add to its roster is called the NFL Draft. The draft is held once a year in the Spring just before training for the new season commences. Draft candidates are players that typically have been playing football at an American college or university for several years, thus giving each NFL team’s professional scouts, coaches, and management an opportunity to see how the prospects play.

During the draft, there are multiple rounds of picks, with each team getting exactly one pick per round (to start out, at least). In each round, the teams take turns using their one pick to select a single player, with the order of picks running in reverse order of the previous year’s standings. That is, the team with the worst record picks first and the team that won the previous year’s Super Bowl (the NFL championship game) picks last. This same order is used for every round. As the draft proceeds, teams are allowed to “trade” pick order with each other. For example, a team with the 28th pick in the first round may choose to “trade up” with another team to get an earlier pick (the 7th pick, perhaps). That trade offer may include later draft picks, other players, money, or some combination thereof. Once a player has been selected by a given team, that team then has the sole right to offer that player a contract (or release the player and allow him to be pursued by other teams). For the initial contract period a selected player can only play for the team that drafted him (yes, it is only male players that are drafted at this time — my hope is that this will change in the future).

Returning to the New Orleans Saints, in 1999, then head coach Mike Ditka staked his whole draft on the acquisition of a single player — a running back by the name of Ricky Williams. According to Richard Thaler in his book Misbehaving, “Ditka decided that the only thing stopping the Saints from winning a championship was the acquisition of one player…That year, the Saints owned the number twelve pick [in the first round], and Ditka was worried that Williams would be snapped up before their turn came, so he announced publicly that he would be willing to trade away all of his picks if he could get Williams (not the smartest negotiation strategy). When it was the Washington Redskins’ turn at the fifth pick and Ricky Williams was still available, the Saints were able to complete the trade Ditka wanted, although at a very steep price. Specifically, to move from the twelfth pick to the fifth pick, the Saints gave up all the picks they had in the current draft plus their first- and third-round picks the following year.” To be clear: Ditka valued the right to select one and only one player, Ricky Williams, as equivalent to nine other picks over two years.

Another nuance of the draft is that all teams in the NFL have to operate under what’s known as a “salary cap”. The salary cap is a monetary limit, set by the league, that the aggregate sum of the salaries of all the players on a given team is not allowed to exceed. For example, the 2018 salary cap was $177 million. That means if you sum up the salaries of all the players on a given team, that sum is not allowed to be larger than $177 million.

The constraint of the salary cap also influences draft picks because, as you can imagine, players that are picked earlier in the draft are considered “premium” players and thus usually demand a higher starting salary than players picked later. This puts pressure on teams to value their picks properly to either avoid paying too much for a new team member or to avoid missing a player who could have a significant impact on the team’s performance (thus increasing the team’s overall value).

Not only do teams have to estimate player values, but they also have to prioritize their best candidates, as they only get to pick once per round for a total of seven rounds. In other words, a team would not want to wait until their seventh-round pick for a player they value highly, because chances are that the player would have been drafted by another team in an earlier round.

Thaler sums both these points up as follows: “So high picks end up being expensive in two ways. First, teams have to give up a lot of picks to use one (either by paying to trade up, or in opportunity cost, by declining to trade down). And second, high-round picks get paid a lot of money.”

At the time of the 1999 draft, coach Ditka had been around professional football for almost 40 years. The argument could be made that his four decades of experience would make him an excellent judge of value. Did it? Did he and the billion-dollar franchise known as the New Orleans Saints make the right value decision when it came to selecting Ricky Williams?

You’ll have to read on to find out, but I’m hoping all of this sounds familiar.

Prioritization in Product Development

Most of my work in the past (especially the very recent past) has been focused on answering the extremely important customer-centric question, “When Will It Be Done?” (WWIBD). In giving guidance to answer that question, the implicit assumption I have been making is that people/teams/organizations/etc. are actually currently working on (or starting to work on) the right things. This is admittedly a very dangerous assumption, as more often than not I see teams killing themselves to deliver items of dubious value against unrealistic delivery timelines. It turns out that for WWIBD to have any kind of relevance, we first need to answer a potentially much more important question, “When Should It Begin?” (WSIB). The “When” word in WSIB implies that WSIB is simply a matter of timing. While that is true, it is only partially true. The bigger component of WSIB is actually priority. What I mean is that if a team has 1,000 items in its backlog, it would be ridiculous to think that it could start all of those items at the same time (despite most product managers’ desire to do so)^. After all, if we had unlimited resources, then there would be no need to prioritize or to ask WSIB — the answer would always be “right now”. In the real world, however, the best answer to WSIB for the great majority of items is “not now”. Thus, the answer to WSIB is really a function of priority.

According to Klaus Leopold, “Prioritizing things is, in principle, a wonderful activity. You bring order to chaos and have the satisfactory feeling of direction. You know which assignment comes next on the list and the level of satisfaction increases as more work on the list is completed.”

So, if prioritizing is so great, then there must be an easy way to do it, right? I’m assuming, of course, that you want to prioritize such that you are optimizing the delivery of customer value. If not, then you might as well stop reading now. The wonderfully good news is that we are spoiled for choice when it comes to techniques on how to prioritize. I’m sure you’ve heard of at least one or more of the following:

· Stack ranking based on value — this is a fairly standard approach where items are ranked 1..n based on some value assessment (usually some type of fictitious business case — ROI — where strangely enough no idea ever loses money).

· HiPPO (Highest Paid Person’s Opinion) — this is where whoever makes the most money (or has the highest title) gets to decide what is worked on.

· Eurovision Song Contest / American Idol Voting — this is where the group attempts to reach consensus by voting on all possible candidates, with order determined by the top vote getters.

· CD3 (Cost of Delay Divided by Duration, which is a specific implementation of the Weighted Shortest Job First [WSJF] algorithm) — where items are ranked in descending order according to their CD3 calculation (more on this later).

· Throwing darts, Crying, Curling up in the corner in the fetal position — need I say more?

(Before we continue, I’d like you to make a mental note of which one you think is best at optimizing the delivery of customer value.)

For a more detailed discussion on prioritization schemes, please see Chapter 5 of Klaus Leopold’s excellent book, “Practical Kanban”.

Cost of Delay

Of all the prioritization methods mentioned above, the one that seems to have curried the most favour amongst the Agile intelligentsia is CD3. The reason for CD3’s popularity is quite sensible: CD3 gives the best economic framework upon which to base prioritization decisions (or so the argument goes).

To demonstrate this, let’s take a closer look at how CD3 actually works. Any discussion of CD3 must first, of course, begin with a discussion about Cost of Delay (CoD). My favourite way of explaining just what CoD is, is to borrow (steal) from the Don himself, Don Reinertsen. Don’s explanation of CoD goes something like this (for more information on this topic, please see https://www.youtube.com/watch?v=OmU5yIu7vRw ):

Let’s start with the assertion that time is money. Mathematically, that can be expressed as:

t = m

A corollary to this is that a change in time equals a change in money:

∆t = ∆m

The problem with this equation as currently stated is that the units on either side of the equivalence do not match. On the left side of the equation, time is in units of seconds, days, weeks, etc., and on the right side of the equation, money is in units of dollars, pounds, euros, etc. In its current form, this equation will not be valid until we get the dimensions to match. The way we achieve that is to insert a partial derivative:

∆t (∂m / ∂t) = ∆m

(Don’t get too hung up on the math itself here, it’s not really that important.)

It is that partial derivative term that is the CoD. In more layperson’s terms, CoD is the change in the total lifecycle profit for an item with respect to a change in the availability date of that item — where the units of CoD are always in terms of money per unit time (e.g., $/week). The core idea here is that the total amount of lifecycle profit that an item will generate is a function of when that product becomes available. CoD, therefore, is the amount of decrease in cumulative profit if we delay the introduction of that item. Put another way, lifetime profit of an item is dependent on its availability date, and CoD is a measure of the rate of that dependency. That’s a bit abstract and hand-wavy, so let’s see if we can demonstrate that concept a little better with a chart.

Assume we have a graph with time across the bottom (the x-axis) and total lifecycle profit for an item up the side (the y-axis). Now what we are going to do is — for a single item — determine what is the absolute earliest date that this particular item can be delivered, and, if it is delivered on this earliest date, what is the total lifecycle profit we will get (in most cases, that earliest theoretical delivery date is today — whether that is realistic or not is another matter). For example, let’s say the earliest date we could deliver a particular feature is on January 1, 2019 and let’s say if it is delivered on January 1, 2019 that we can expect to make $1 million in total profit over the life of the feature. What we do, then, is on our graph we would go across the bottom and find January 1, 2019 and at that point on the x-axis we would plot a point on the y-axis that corresponds to $1 million — as shown below.

We would then do this same exercise for every successive relevant time interval until we get to a date at which it would no longer be feasible to deliver the product. Again, for example, let’s say the time interval we are interested in is months, so the next point on our graph that we would plot is the total lifetime profit if the feature is delivered on February 1, 2019. If that profit is $900,000, then we would plot a dot for $900,000 at February 1, 2019. We would do the same thing for March, April, and so on until we reached a date that no longer made sense. If we followed that procedure, we would come up with a graph that might look something like:

The way we would figure out the CoD for any given point (date) along this curve would be to calculate the slope of the tangent line at that point.

To get an average CoD between any two points (dates) we could calculate the slope of the line between those two dates.
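To make that concrete, here is a tiny Python sketch (mine, purely for illustration) of the “slope between two dates” idea, using the made-up numbers from the example above: deliver on January 1 for $1 million of lifetime profit, or on February 1 for $900,000.

```python
from datetime import date

def average_cod(date1, profit1, date2, profit2):
    """Average Cost of Delay between two delivery dates, in money per week."""
    weeks_of_delay = (date2 - date1).days / 7
    return (profit1 - profit2) / weeks_of_delay  # profit lost per week of slip

print(average_cod(date(2019, 1, 1), 1_000_000,
                  date(2019, 2, 1), 900_000))    # roughly $22,580 per week
```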

Note, again that CoD is always communicated in terms of money per time. If you hear anyone talk about CoD in any other units then you can be sure they are not talking about CoD. Also note that the above curve is a Lifetime Profit curve. It is NOT a Cost of Delay curve. In your studies, you may have come across something like Figure X that was labeled as a CoD curve. To be honest, I’m really not sure what those things are, but what I can say is that what I’m talking about here is definitely not that. Got that?

The reason that CoD is important is because it now gives us an economic framework upon which to base prioritization decisions. Let’s say that we are on a team that is ready to start work on a new feature and we are trying to decide between two competing feature options. For the sake of simplicity at this point, let’s assume both features will take the exact same amount of time to complete (we’ll handle the case of differing duration shortly) and that we won’t be able to steal resources, work overtime, etc. to get either feature done faster. One way we can make a decision about which feature to work on is to calculate CoD for each and choose to work on the one with the higher CoD. It’s slightly counterintuitive because it is often the case that Feature A will have a higher lifecycle profit than Feature B, but on the date delivered, Feature B may have a higher CoD. In that case, it would be more economically feasible to prioritize Feature B over Feature A (a more detailed discussion about this is beyond the scope of this article, so I invite you to look up all the great resources of Don Reinertsen in this regard).

In the immortal words of Klaus Leopold, “When discussing Cost of Delay, the economic perspective is automatically integrated into the decision-making process. It does not deal with fictional units multiplied by imaginary measurement figures. It deals with values which can be quantified in monetary units and can be compared within the entire company, or even across enterprises.”

A quick word on Cost of Delay vs. Delay Cost. This can seem like a pedantic distinction, but it is an important one, as most literature gets this wrong. As defined earlier, CoD is the rate at which lifetime profit changes with a change to the availability date of the item. So, let’s say that a particular item has a CoD of $5,000/week. If that item is delayed by 6 weeks, then the company loses $30,000 of potential lifetime profit. That $30,000 is known as the “Delay Cost”. Unlike CoD, Delay Cost is expressed in units of money only. I mention this here only in an attempt to clear up some confusion around the two terms. For the rest of this article, CoD and Delay Cost will be defined in terms of what I have explained here.

Cost of Delay Divided by Duration

Again, stealing from Klaus:

“Let’s do a thought experiment. We are in a prioritization meeting and there are three different features — A, B and C — before us and we should put them into a sequence. For all three features, we have calculated the Cost of Delay and the Cost of Delay begins immediately. When considering how to do the sequence, we must take into account how long it takes to complete the individual features, because time is an economic component. Let’s assume we know the completion time for the three features:

• Feature A has a low Cost of Delay at €5,000 per week but has a long completion time of ten weeks.

• Feature B has a low Cost of Delay at €5,000 per week and has a fairly short completion time of five weeks.

• Feature C is finished fairly quickly in five weeks but has a very high Cost of Delay at €10,000 per week.

These three features can be illustrated in a diagram as blocks (see Figure 5.10). Regardless of which order the features would be worked, the total Cost of Delay at the beginning is always CoD(A) + CoD(B) + CoD(C) = €20,000 and must be amortized. The same applies for the duration, as the completion time for all three features is always t(A) + t(B) + t(C) = 20 Weeks. We would like to know which sequence is best to reduce the total Delay Cost as quickly as possible.

Figure: Reducing Cost of Delay

“To begin with, let’s choose the sequence ABC. When we start with Feature A, the €20,000 Cost of Delay from A, B, and C, remains until the work on Feature A is completed, i.e. ten weeks. The Cost of Delay for Feature A is removed once completed. The Cost of Delay for B and C remain, €15,000 per week, as long as Feature B is being worked on, which is five weeks. After a total of 15 weeks, the Cost of Delay for B can also be removed and remaining is only the Cost of Delay for C, €5,000 per week, which needs five weeks to be completed. Figure 5.11 illustrates this cost-reduction process. The area created gives us the total Delay Cost, which can be quantified using an area formula for the three rectangles that are formed:

Total Delay Cost = (10 × 20) + (5 × 15) + (5 × 10) = 200 + 75 + 50 = €325,000

Figure: Poor Sequencing with High Cost of Delay

“Let’s try to minimize the area. It makes the most sense to reduce the highest Cost of Delay as quickly as possible, so we choose sequence CBA. While we are working on C, the total Cost of Delay of €20,000 are present. However, as soon as Feature C is completed, five weeks later, its high Cost of Delay of €10,000 is removed. The remaining Cost of Delay from B and A is relatively low in comparison. Next, we work on Feature B and after five weeks another €5,000 in Cost of Delay is removed. Finally, we dedicate ourselves to Feature A, and after ten weeks, the remaining €5,000 Cost of Delay is removed. As can easily be seen in Figure 5.12, the area is much smaller than it was for Sequence ABC. Here, too, we can quantify the area:

Total Delay Cost = (5 × 20) + (5 × 10) + (10 × 5) = 100 + 50 + 50 = €200,000

“Sequence CBA saves a total of €125,000 over 20 weeks when compared to Sequence ABC. This is because we thought out the sequencing before starting the work and decided wisely as to the order of the work. Take a moment to really consider this. Saving €125,000 was achieved simply by changing the order of work — not a single person had to work harder or faster. This is the strength of Cost of Delay!”

The good news is that you don’t always have to draw these rectangles and calculate the resultant areas to get a proper item sequencing. Like everything in math, you have to learn the absurdly hard way to do a calculation before learning the easy, shortcut way (anyone remember how to “complete the square” to solve a quadratic equation? Why the f*ck didn’t we just learn the quadratic formula to begin with? And don’t even get me started on derivatives…). The mathematical shortcut to the above graphical analysis is to calculate a “Cost of Delay Divided by Duration” (CD3) for each feature. Optimal sequencing is then obtained by sorting the CD3 numbers for each item from highest to lowest and then working on items in that order. For example, in Klaus’ thought experiment above, the CD3 for each feature is:

Feature A CD3 = (€5,000/week) ÷ 10 weeks = €500/week²

Feature B CD3 = (€5,000/week) ÷ 5 weeks = €1,000/week²

Feature C CD3 = (€10,000/week) ÷ 5 weeks = €2,000/week²

(For what it is worth, I have no idea what the units money/time² mean — this is a detail that all writeups on CD3 that I have seen have failed to address.)

In this example, Feature C’s CD3 is highest, then Feature B, then Feature A. The correct sequencing given by CD3, therefore, is CBA, which, you will recall, is the exact same sequencing that Klaus came up with in his example.
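If you prefer code to rectangles, here is a minimal Python sketch (mine, not Klaus’) of both the CD3 shortcut and the area calculation, using the numbers from the thought experiment above:

```python
# Features from Klaus' thought experiment: Cost of Delay in euros/week, duration in weeks
features = {
    "A": {"cod": 5_000, "duration": 10},
    "B": {"cod": 5_000, "duration": 5},
    "C": {"cod": 10_000, "duration": 5},
}

def cd3(feature):
    """Cost of Delay Divided by Duration (units: money per week squared)."""
    return feature["cod"] / feature["duration"]

def total_delay_cost(sequence):
    """Sum of (CoD still 'bleeding' x weeks worked) as we move through the sequence."""
    remaining_cod = sum(f["cod"] for f in features.values())
    cost = 0
    for name in sequence:
        cost += remaining_cod * features[name]["duration"]
        remaining_cod -= features[name]["cod"]  # this feature stops costing us once delivered
    return cost

print(sorted(features, key=lambda n: cd3(features[n]), reverse=True))  # ['C', 'B', 'A']
print(total_delay_cost("ABC"))  # 325000
print(total_delay_cost("CBA"))  # 200000
```

Sorting by CD3 and brute-forcing the areas agree here, which is the whole point of the shortcut.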

This discussion has provided a rough proof that CD3 produces an overall item sequencing that minimizes total delay cost and thus maximizes value delivered (Figure 7–11 of Don Reinertsen’s book “The Principles of Product Development Flow” shows this pictorially as well). The theory is sound and seems to be backed up by rigorous quantitative analysis.

So, what’s the problem? Is there even a problem?

Problems with CD3

Let’s look at the deficiencies of the inputs to CD3 first. These initial deficiencies aren’t fatal, but they are necessary to understand before our more detailed exploration of the flaws of CD3 in certain contexts later.

What’s called “Duration” in CD3 is what I have referred to as “Cycle Time” in my previous books (see https://leanpub.com/actionableagilemetrics and https://leanpub.com/whenwillitbedone ). Cycle Time is simply a measure of the total amount of elapsed time it takes for an item to complete — exactly the data we need for the denominator of the CD3 calculation. However, you will also know from my previous work that Cycle Time itself is stochastic. That is, the Cycle Time for a given process is not a single number but rather it is a probability distribution. What that means is that it is impossible to know before an item is started precisely how long it will take to complete (i.e., in the form of a single number).

In many contexts, the exact same argument can be made for value (or its proxy, CoD). To be sure, there are contexts where value can be deterministically calculated beforehand (contractual obligations and regulatory compliance, to name but two) but I believe that for the vast majority of scenarios in the domain of complex work, it is impossible to precisely determine value upfront. We could, however, as in the case of Cycle Time, reasonably come up with a probability distribution that describes value in a given context.

All of this means that we are most likely operating in a world where both CoD and Duration are stochastic, yet the inputs that I showed in the previous example (and, shamefully, the inputs you will see in many other writeups of CD3) assume single, deterministic, precise numbers. That practice is just silly.

Most CD3 examples show single numbers as inputs when really those inputs should be probability distributions

However, as I just mentioned, the fact that both CoD and Duration are themselves probabilistic is not fatal — thankfully we have statistical tools at our disposal to handle such a case. The solution to this problem would be to use something like Monte Carlo Simulation (MCS) that takes in probability distributions for both CoD and Duration and calculates CD3 for a particular feature as a range of possible outcomes. The resultant CD3 distributions for all features could then be compared and sequenced according to an acceptable level of risk. This is exactly the approach that Klaus Leopold, Prateek Singh, Todd Conley, and I took several months ago as we ran multiple simulations to better understand the stochastic nature of CD3. In short, our results confirmed that even in a probabilistic world, CD3 gave the best value-maximizing answer when it comes to the prioritizing/sequencing of work to be done (again, within an acceptable degree of risk).
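To give a flavour of what that looks like (an illustrative sketch, not the actual model we built), you could sample CoD and Duration from assumed distributions and end up with a distribution of CD3 values for each feature. The triangular ranges below are purely made-up assumptions.

```python
import random

def simulate_cd3(cod_low, cod_mode, cod_high, dur_low, dur_mode, dur_high, trials=10_000):
    """Monte Carlo: sample CoD (money/week) and Duration (weeks), return sorted CD3 samples."""
    samples = []
    for _ in range(trials):
        cod = random.triangular(cod_low, cod_high, cod_mode)
        duration = random.triangular(dur_low, dur_high, dur_mode)
        samples.append(cod / duration)
    return sorted(samples)

# Purely illustrative ranges for one feature
feature_c = simulate_cd3(cod_low=1_000, cod_mode=10_000, cod_high=15_000,
                         dur_low=3, dur_mode=5, dur_high=9)

# Compare features at an acceptable level of risk, e.g. the 15th percentile of their CD3 distributions
print(feature_c[int(0.15 * len(feature_c))])
```

Running the same sampling for every candidate feature and comparing, say, the 15th percentiles gives a sequencing that accounts for the uncertainty rather than pretending it away.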

No, the weakness of CD3 is not one of stochastic inputs — again, it is one of timing.

The Saints Come Marching In

Which brings us back to the 1999 New Orleans Saints. Recall that the ’99 Saints traded all their picks in that year’s draft plus the first and third round picks in the next year’s draft to get the player they wanted, Ricky Williams. Also remember that the Saints had to make their pick and sign the player’s contract all before that player played even one second of professional football. That meant that the Saints were not only placing a very high estimated value on Williams in that they traded up to get him (thus almost ensuring they would have to pay him more money) but they also highly prioritized his pick as they gave up 8 subsequent picks over two years to get him.

Any thoughts on how this turned out for New Orleans and Mike Ditka? As I’m sure you can guess, not very well. “Williams played four years for the Saints and was a very good but not transformative player, and the team could have used the help of all the players they might have acquired with the draft picks they traded away.” To say that New Orleans did not do well that year is a bit of an understatement. They ended up as the second-worst team in the league. Again, according to Thaler, “Clearly, snagging Williams was not enough to turn the team around, and Ditka was fired.”

Uncertain Value

The estimation and prioritization problems faced by the Saints when making their draft picks are the same problems that face most every product development team. That is, teams have to make decisions about estimation and sequencing usually before work is ever started. However, as we all know, value is determined by the customer, and therefore any exact value determination can only be made after work is delivered — not before! A better way of saying that is that organizations are prioritizing work when they have the least amount of information and the greatest amount of uncertainty.

Product Development teams prioritize work when they have the least amount of information and the greatest amount of uncertainty.

That statement is the fundamental reason why CD3 does not work as well in the complex domain as most proponents think it does.

It’s even a little trickier than that. If you think it is hard to make a value determination BEFORE work has started (and it is), it isn’t much easier to calculate value delivered even after it has been delivered. Take Google’s G Suite bundle of SaaS-based business applications, for example. At the time of this writing, Google charges $5/user/month for a bundle of online office tools. The G Suite bundle itself has dozens if not hundreds if not thousands of included features. What is the value of any single one of those? The lazy answer would be to say that we calculate a baseline of revenue, then release a feature, then calculate how much revenue went up and, voila, we have our value determined. The problem is, that’s not quite how it works. If Google adds spell check to its document editing application, and revenue goes up, then it is reasonable to assert that the revenue didn’t increase just because of that one feature. A person made the decision to buy because of the cumulative effect of spell check plus all the other features. The thought experiment to prove this: would you buy a document editor that only had spell check and no other features like bold, paragraphs, or even a way to enter text? Probably not. That means the value of the spell check feature has to be amortized across the $5/user/month along with the thousands of other features that already exist. So, is spell check worth $0.01/user/month? $0.10/user/month? Or $0.000001/user/month? I don’t know, and I bet Google doesn’t know, either.

And we still haven’t even discussed the very real possibility that the delivered feature could actually turn out to have negative value. How many times has Google or Apple or whomever released a “UI update” that caused users to leave in droves? Even if the delivered feature resulted in zero change to net subscribers (that is, on balance, none were added and none left), the feature would still have had a negative impact on lifetime profit because the cost to develop it was not offset by new revenue from new users (not to mention the opportunity cost of the other things we could have done with the resources spent building the feature).

So, we are left with the fact that value is uncertain going into development and it is also likely uncertain (and quite conceivably negative) coming out. But remember, our CD3 sequencing calculation is made BEFORE work is started and value validated — if it can be validated. What happens if we prioritize/sequence our work based on certain value assumptions but then find out that our actual realized value is dramatically different? In Klaus’ example, we assumed Feature C had a CoD of €10,000/week. What if, when we delivered it, we found that the real CoD was actually €1,000/week?

No problem, CoD proponents would say, because we just acknowledged that value is stochastic, so we just need to come up with a probability distribution that covers all likely outcomes as input to CD3. That’s true, but, again, what if the actual, realized distribution is much different than what we assumed^^? Going back to Klaus’ experiment, what if we assumed a probability distribution of CoD for Feature C that ranged between €1,000/week and €15,000/week, but when we delivered Feature C, the real CoD distribution turned out to be -€5,000/week to -€1,000/week? That is, what if the realized value is in a range that we never even considered to begin with? In short, what if we were just plain wrong about our initial assessment of value?

This is exactly the case of Mike Ditka, Ricky Williams, and the ’99 New Orleans Saints. It is the fundamental reason why CD3 does not work — and it happens all the time. How did the Saints (and how do we) get it so wrong? This was a team worth hundreds of millions of dollars — they had an almost unlimited stream of capital from which to work to get these value decisions right. They employed dozens of experts who collectively had centuries’ worth of experience. Yet they still made this glaring error. And as I alluded to earlier, they are not alone in the NFL when it comes to their inability to estimate. In his book Misbehaving, Thaler gives several other examples of how teams consistently overvalue picks and make the wrong economic tradeoffs when it comes to the draft. Among the reasons he cites:

1. People are overconfident. They are likely to think their ability to discriminate between the ability of two players is greater than it is. (stack ranking fallacy)

2. People make forecasts that are too extreme. In this case, the people whose job it is to assess the quality of prospective players — scouts — are too willing to say that a particular player is likely to be a superstar, when by definition superstars do not come along very often. (e.g., inflated business cases)

3. Present bias. Team owners, coaches, and general managers all want to win now. For the players selected at the top of the draft, there is always the possibility, often illusory, as in the case of Ricky Williams, that the player will immediately turn a losing team into a winner or a winning team into a Super Bowl champion. Teams want to win now! (Too much WIP fallacy)

In the above, replace the word “player” with “feature” or “project” or “initiative” and you will understand why we routinely get this stuff wrong. The problem of overconfidence, extremism, and bias in our ability to determine value upfront, coupled with the fact that value is going to change over time anyway, is so prevalent, in fact, that I would argue it is necessary to assume that we don’t know anything about value initially (probabilistically or otherwise) and that any sequencing decision we make based on upfront value assumptions will necessarily lead to suboptimal economic outcomes.

Uncertain Duration

The same arguments can be made about duration. How many times have you “estimated” the duration of an item before you started working on it, only to find that the actual time spent on development was drastically different once delivered — either much shorter, or, more likely, much, much longer? Yet again, that initial CD3 calculation and sequencing is made on our very wrong idea of duration BEFORE work starts. So, if you are working on something and you find it is taking too long to complete and thus the economics of CD3 have changed, do you immediately stop working on that item and start something else with a higher CD3? Or do you try to break the item up into smaller pieces and deliver those? But what if those smaller pieces also change the economics of CD3 — especially when measured against all of the other items that we are waiting to work on? More on this in just a bit.

New Items Show Up

Speaking of other items that we are waiting to work on, we still need to discuss the problem of sequencing as pertains to CD3. In the Klaus thought experiment and CD3 example calculation above, you’ll remember that we came up with a supposedly economical sequencing of CBA. But what happens in the very real case where Feature D shows up while we are working on (i.e., not finished with) Feature C and Feature D actually has a higher CD3 number than C? As you are probably aware, this happens all the time! There are some CD3 proponents who would say that the second we have information about a higher CD3 item, we should immediately stop work on the lower one in favour of working on the higher one. This is a very slippery slope because what if we start working on Feature D and Feature E shows up with a higher CD3 than Feature D? Would we stop D to work on E? If we followed this logic to its extreme conclusion, nothing would ever get done. While a bit reductio ad absurdum, I hope you can appreciate the point that things aren’t as straightforward as they might first appear.

The more mature solution to the problem of new items showing up (and the one that I would hope most CD3 proponents would argue for, though they never mention it) would be to finish Feature C in the first place and then recalculate and re-sequence having included any and all valid items that may have shown up since the last sequencing. But does that answer even lead to optimal value realization over the long term? Or is there yet another option that requires less work and yields an even better result (otherwise known as nirvana)? It turns out there is, but before we go there let’s first quickly review where we’ve been thus far.

Inputs to CD3

· Value is stochastic — in real world modeling, value input into CD3 must be a probability distribution and not a single, precise number.

· Duration is stochastic — in real world modeling, the duration input into CD3 must be a probability distribution and not a single, precise number.

Assumptions for CD3 to work:

1. Value In = Value Out — Value at time of prioritization/sequencing (distributional or otherwise) must equal realized value at time of customer delivery

2. Duration In = Duration Out — Duration at time of prioritization/sequencing (distributional or otherwise) must equal total elapsed time for customer delivery

3. No New Items — optimal sequencing requires that no new items arrive that have an economic impact on items already in progress and/or already sequenced.

It should be obvious from the previous discussion that if realized value is dramatically different than estimated value, if items take longer or shorter than originally assumed, or if new items show up that change economic considerations, then CD3 will by definition provide the wrong sequencing answer.

Any one of the above assumptions actually holding in the real world is extremely rare. The chances of all of them holding simultaneously, so that we make the right prioritization decision, are infinitesimal.

The fatal flaw of CD3 is that you are trying to solve an economic optimization problem at the time when you have the least amount of information and the greatest amount of uncertainty — before work starts. In other words, using CD3 as your prioritization scheme is a risk management nightmare. Worst of all, you’ve masked these deficiencies in a mathematical model that provides a false sense of security. My favourite quote on risk management sums up the problem of CD3 quite nicely: “The biggest risk is that you have a losing strategy when you think you have a winning one.” [Jeff Yass, founder Susquehanna International Group]

The biggest risk is that you have a losing strategy when you think you have a winning one.

CD3 is Bollocks

So, what would happen if you built a simulation that more closely modeled what happens in the real world? Would CD3 still give the best prioritization answer to maximize economic outcomes? (spoiler alert: it doesn’t) Or is there a better option? (spoiler alert: there is).

Those are exactly the questions that Prateek Singh from Ultimate Software and I set out to answer. Remember, of course, we are answering this question from the perspective of product development contexts where the three CD3 assumptions don’t hold. If you live in a world where value is known upfront (and validated to be exactly what you thought it was after delivery), where duration is known upfront (and validated to be exactly what you thought it was after delivery), and where the arrival of new items does not affect the economics or delivery of existing items, then congratulations. You’ve hit the jackpot. You should never, ever leave your job. Ever. And CD3 will work perfectly for you in your context, so use it. If, however, you are like the rest of us poor slobs where those assumptions are almost never true, then please read on.

To model what happens in the real world, you need to set up a simulation where you have variable value inputs, variable value outputs, variable duration inputs, and variable duration outputs. Further, you need to extend that simulation to include new items that arrive at variable times thus forcing a reprioritization. For a more detailed discussion around how such a model might be set up and how the results of the simulation might be interpreted, please see Appendix A.

Once you have this simulation modeled properly, it then becomes a case of trying different prioritization schemes to determine which algorithms maximize delivered value over the long term. Some schemes that Prateek and I considered: CD3, CoD (highest value first), Duration (shortest time first), pure random selection, etc. Once we had our model in place, it was time to put it to the test.
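As a rough illustration (a sketch of the idea, not our actual simulation code), the schemes boil down to different selection rules applied to a backlog of items that carry estimated CoD and estimated duration:

```python
import random

# Each backlog item is assumed to be a dict carrying *estimated* "cod" and "duration" values
def pick_by_cd3(backlog):
    return max(backlog, key=lambda item: item["cod"] / item["duration"])

def pick_by_cod(backlog):           # highest value first
    return max(backlog, key=lambda item: item["cod"])

def pick_by_duration(backlog):      # shortest job first
    return min(backlog, key=lambda item: item["duration"])

def pick_at_random(backlog):
    return random.choice(backlog)
```

The simulation then simply applies whichever rule is being tested every time capacity frees up, and tallies the value actually realized at delivery.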

Output Over Outcomes (Simulation Results)

The simulations we ran presented some very interesting results. When we compare the results of testing our model at varying degrees of confidence, we observe the following.

Case I: CD3 Assumptions Are Met

Let us first look at the case where Value estimates are precise and the original estimated value is what we get after the feature (or project) is released. (It should be noted here that in this case we are also not resizing items — a point that will become very important a little later). The base case (CD3, Value = Original, Size = Original) is represented as final value delivered after 40 weeks and the other prioritization methods are displayed as percentage deltas.

The above results show that CD3 is clearly superior to random prioritization. This is exactly what we would expect from our previous discussion. In this case, all CD3 assumptions are valid and therefore it should be no surprise that CD3 yields the best results. In most cases it delivers 70% or more additional value compared to the random prioritization scheme (68% more on average).

What is surprising here, however, is how close the Duration Ascending prioritization method is to CD3. All results with that scheme are within 10% of those of CD3 and on average 10% worse than CD3. In other words, if we were very good at precisely estimating value, and did not right size our items, prioritization using CD3 would be only about 10% superior to simply ordering by shortest Duration! The question of whether the exercise of collecting detailed value estimates for each project is worth the 10% gain is very context specific. If the effort of getting to these accurate estimates is minimal (it rarely is), then pursuing CD3 is a good option; otherwise prioritizing purely by duration would be nearly as effective.

Case II: CD3 Assumptions Not Met

If we reassess both value and duration after the feature is delivered (such that value and duration at the time of prioritization may not necessarily equal the actual value and duration once the item is delivered), the results change quite a bit. In the chart below, CD3 is again used as the base case. Items are still not being right sized and value is being reassessed after delivery.

In this case of value being reassessed (or in other words, estimates of value being inaccurate), prioritization in ascending order of duration (shortest job first) on average produces 18% more value than CD3. From these results, we can conclude that if getting accurate estimates of value is close to impossible, purely ordering by duration will produce optimal results without incurring the extra cost of time spent estimating value.

The main reason for Duration beating out CD3 in these simulations is that by only concentrating on duration, we are putting out a greater number of features. We can fit more of the short duration items into the 40-week simulation length that we have chosen. If we are willing to accept that we are not good at forecasting value (and not be a Ditka), the more items we do the greater our chances become of delivering a high amount of value. In other words, because value cannot be determined ahead of time, we increase our chances of releasing a highly valuable feature by releasing more features. Without stressing the system, the best way to deliver more features is by prioritizing the backlog by duration so that more features can fit into the given timeframe (we’ll explore this conclusion after the rest of the simulation discussion).

Case III: Right Sizing of Items

What happens if we introduce right sizing into the picture? Now we examine the cases where we are not good estimators of value and we actively break features down for delivery whenever they are above a certain threshold. In these cases, whenever a new project is pulled to be worked on, we evaluate its size and if the project is larger than a given size (5 weeks in this case) we randomly break down the project into 4 pieces and redistribute the value between the 4 pieces as well. These smaller pieces are added back to the backlog and we then re-run the prioritization algorithm (CD3, Duration Ascending or Random) to pick the next project.
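A sketch of that right-sizing step might look like the following. The way the duration and value get redistributed across the four pieces (random weights) is my assumption about one reasonable way to model it, not a description of our exact implementation.

```python
import random

THRESHOLD_WEEKS = 5
PIECES = 4

def maybe_right_size(item, backlog):
    """If an item is bigger than the threshold, split it into PIECES and re-queue the pieces."""
    if item["duration"] <= THRESHOLD_WEEKS:
        return item  # small enough: work it as-is
    weights = [random.random() for _ in range(PIECES)]
    total = sum(weights)
    for w in weights:
        backlog.append({
            "duration": item["duration"] * w / total,  # pieces sum to the original duration
            "value": item["value"] * w / total,        # value gets redistributed as well
        })
    return None  # nothing picked yet; the caller re-runs the prioritization scheme
```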

Random now performs much better than it has in our previous comparisons. It is, on average, only 18% behind the CD3 prioritization. Prioritization by duration, though, is, on average, 51% more effective than the pure CD3 prioritization. This is a major difference, and we are not even accounting for the time spent acquiring value estimates in order to do prioritization via CD3. Similar to the last set of results, if we are not good at estimating value (which none of us really are), prioritizing by duration is a superior strategy. If we add to that the active sizing of the items, where we break items up into smaller deliverable, valuable pieces, prioritization by duration delivers over 50% more value.

Resizing features into smaller chunks helps us increase the number of short duration features that are in our backlog. This, in turn, increases the number of features we can deliver if we are sorting by duration. As we deliver more, smaller features, we are able to place multiple bets which all have the possibility of being high value. The combined effect of having more small options to choose from and choosing purely by using “shortest duration first” gives us a major advantage over all other prioritization methods.

A Final Important Simulation Result

The positive effects of “Right-Sizing” are not limited to just the Duration prioritization scheme. Right sizing increases the pool of available (small) options and this benefits every prioritization scheme tested. Honestly, this is probably the most important result. Even if you have no control over the prioritization scheme, you will almost always be better off making items as small as possible. The wonder of small batches never ceases to amaze.

Right Sizing Wins

Yes, you read the preceding section right: in the contexts where assumptions are not met, CD3 is not the winner. Shortest duration (and right sizing) is.

This actually makes sense upon reflection. In contexts where value is extremely uncertain, it is best to place as many bets as you can in order to maximize your chances of one of those bets paying off. Venture Capital firms do this all the time. Their expectation in terms of managing their portfolio is that only about 10% of their investments will actually pan out, but it is impossible for them to tell you at the time they make those investments exactly which ones will pay off. Poker players face this all the time, too. In Texas Hold ’Em, a player would never go all in on the first bet just because she got dealt two aces as her hole cards, even though two aces is a rather high-value hand. In fact, most poker players will tell you that “bankroll management” is one of the most important aspects of a long-term winning strategy. The idea is to get out of pots as quickly as possible (i.e., short duration!) and live to fight another day. The way to win a poker tournament is to survive (still have chips in front of you) and one way to do that is to be in as many hands as possible for as cheap as possible because you don’t know which one is going to win big. See Annie Duke’s excellent book “Thinking in Bets” for more context around how to make multiple decisions under extreme uncertainty.

These results confirm the whole point of this paper. In most complex product development contexts that are dominated by uncertainty, the best prioritization/sequencing scheme is to work on as many “short” items as possible.

In complex product development contexts that are dominated by uncertainty, the best prioritization/sequencing scheme is to work on as many “short” items as possible.

Ravings of a Madman

If you will humor me for a bit, I have to spend a couple of paragraphs ranting about certain arguments I hear when others try to justify the inadequacy of CoD and CD3. Please forgive me for indulging in this venting. Some of this section is reactionary, some of it is petty on my part, all of it makes me angry. You can certainly skip this section in its entirety with no loss of continuity. But if you read on, maybe you will find a nugget or two worth retaining.

Only Have to be Better Than the Next Best Thing

You’ve probably heard this joke before:

“Steve and Mark are camping when a bear suddenly comes out and growls. Steve starts putting on his tennis shoes.

Mark says, ‘What are you doing? You can’t outrun a bear!’

Steve says, ‘I don’t have to outrun the bear — I just have to outrun you!’”

Whenever objections are raised about the deficiencies of a given approach, idea, or whatever, this joke is dragged out as a justification for mediocrity. The point of the joke is that we need not strive to be the best, we only need to strive to be better than the previous “best” thing. While that is technically true, I think this line of reasoning is a wholesale cop-out. Anyone who uses it should be ashamed of themselves. Why not always strive for process excellence and the best we can possibly be in everything we do? Why not continually look for better options that give more accurate answers with less effort? Come on, community. We can do better. Much better.

The worst part about this argument as it pertains to CD3 is that it is not necessarily accurate. As this paper has shown, CD3 may not — and probably doesn’t — give any better answer than the prioritization scheme that you are using right now. Don’t buy the snake oil!

The CD3 Numerator Is Most Important

Ugh. I see these types of CD3 comments all the time:

· “It’s the numerator that matters far more” [Joshua Arnold discussing CD3 in his blog, http://blackswanfarming.com/cost-of-delay-divided-by-duration/]

· “Lesson: Our assessment of VALUE is probably a good deal more important than our forecasting of duration in many cases.” [John Cutler, https://hackernoon.com/better-decisions-by-forecasting-cycle-time-as-a-team-6d36690f511f ]

From a pure mathematical perspective, these types of arguments make no sense whatsoever (with fractions, the numerator and the denominator matter equally — otherwise you don’t have a fraction).

Worse than that, this advice is extremely misleading (if not potentially wrong). We just proved that if anything, when maximizing economic value, making duration as short as possible matters much more than any notion of estimated value. Need I point out that in CD3, duration is the denominator — not the numerator?

Now, I’m not saying that Joshua or John don’t know what they are talking about, but these statements in particular don’t stand up to scrutiny.

Where’s the Beef?

I have asked several times for real-world evidence that CD3 actually works. Apart from Joshua’s excellent case study [http://blackswanfarming.com/experience-report-maersk-line/] (which, unfortunately, in my opinion comes to an incorrect conclusion after taking all the right actions) there is a dearth of actual, empirical, verifiable evidence that CD3 yields optimal economic outcomes. If CD3 works so well in complex product development domains, why are people so shy about sharing their experiences? “Absence of evidence does not mean evidence of absence” is the retort I get whenever I ask for proof. Again, that is technically true. But they said the same thing about the earth being the center of the universe.

CoD is Easy to Quantify

The reality is that in complex product development contexts, it is difficult if not impossible to gather the data needed to both quantify and validate CoD. I’ve already alluded to this fallacy with the G Suite example above, but please don’t take my word for it. Watch this video from Don Reinertsen on CoD: https://www.youtube.com/watch?v=OmU5yIu7vRw (at about the 24-minute mark, Don starts to talk about how to calculate CoD). Don actually uses the words “it’s not that hard” when describing how to quantify CoD. However, as you watch the video, I would say what he’s describing is anything but “not that hard” (to get the calculation right, that is). Further, it’s quite obvious from Don’s example that he is speaking from the perspective of a large organization with a well-staffed finance department. If you are a small-to-medium business (which, by the way, represents about 99% of all businesses in the United States) with no dedicated finance department, or if you are a startup where your finance officer is also your IT support specialist is also your office admin, well, then, good luck trying to quantify CoD.

It’s the Conversation that Matters Most

I am trying to decide which of the arguments in this section makes me the angriest. This one is very close to the top of the list. Yet another justification for CoD/CD3 is that it is not the actual number that matters, but the conversation around getting the number. First, this is the same line of reasoning employed to justify the use of story points as an estimation technique. As far as I am concerned, it has been soundly proven that using story points does not materially improve the quality of estimates (in fact, in many cases, story points make estimations worse). Secondly, though, what this paper has shown is that most upfront conversations around CoD are waste. There is simply too much uncertainty to wade through to make accurate decisions. You don’t drive out uncertainty by talking, you drive out uncertainty by doing. Therefore, the better strategy by far is to short circuit any initial conversation, start work as quickly as possible, and break up items into the smallest pieces possible in a just-in-time manner as more information is gained.

Conclusion

It would be easy to say that the 1999 NFL draft was an aberration and that the rest of the league actually does much better at drafting players. But it turns out that 1999 is simply an entertaining example of an endemic and very well documented valuation problem in the NFL. For more information on this, I’d point you to Richard Thaler’s book, “Misbehaving”, as well as his published paper on the topic, “Overconfidence vs. Market Efficiency in the National Football League”. Take a look at those resources and every place you see “NFL team” substitute in your mind “Product Development Organization” and you will start to understand why CD3 is flawed.

****

Let’s bring this back to the complex product development domain, however. As with most things concerning flow, Don Reinertsen is way ahead of me and everyone else in the community. In his “Flow” book, he actually discusses the contexts where decision algorithms like CD3 are most applicable. Specifically, he states that CD3 only works “when we have reliable estimates of delay cost and task duration.” I would highlight the words “reliable” and “and” in that sentence. Most domains lack reliable estimates of either delay cost or task duration; far fewer have reliable estimates of both. What this paper has shown is that in those situations, you are far better off to control for duration and simply choose value at random.

Allow me to also reiterate that if you are lucky enough to live in a context where value and duration are well known upfront (and easily verified to be accurate after delivery) then by all means use CD3 — it really is a good value optimization algorithm in those settings. Some examples of those domains might be a highly regulated industry where compliance demands implementation of some features; or a contractual environment where you are legally bound to deliver certain functionality; etc. However, I believe most companies in the complex product development domain don’t fit this categorization.

Further, I would never say not to use relevant information when making a decision. If you have any reliable information around CoD, then you should definitely use it. Even then, however, you will get tremendous benefit from breaking your items up into the smallest possible pieces. Just remember that in contexts where uncertainty dominates, less time should be spent in upfront planning and estimation, and more time should be spent actually doing the work (driving out uncertainty). The ideal is to validate that you have either a winning or losing strategy as quickly as possible and move on to the next thing. Your optimal strategy over the long term is to place as many bets as possible because you don’t know which ones will hit.

So, to sum up, in order to maximize economic value over the long term:

1. If you focus on one thing, focus on duration (break items up if it looks like they are taking too long to finish).

2. Prioritization and sequencing order doesn’t matter. Choosing at random is as good a method as any (assuming #1).

3. Pay attention to Ageing — you’ll need an objective measure of “how long is too long” for an item to complete. See my two books listed below for more information.

4. Limit WIP

Epilogue

Most teams (and organizations) make suboptimal prioritization/value decisions every day because they are forced to make those decisions under:

  • Conditions of scarcity (not enough time, money, or people)
  • Conditions of stress (customers want their requests handled right now and delivered yesterday)
  • Conditions of uncertainty (imperfect information about their current state and future state)

These poor decisions adversely affect their ability to effectively, efficiently, and predictably deliver value to their customers. While teams will not be able to change these conditions, they can learn to make better decisions by embracing them.

And remember, CD3 is bollocks! Good luck!

^It is still possible to start too many things at once, but that is where the principles of flow and pull systems come in. If you aren’t paying attention to those, then it doesn’t matter what prioritization scheme you use — they will all be equally bad.

^^From a mathematical perspective, another option here is that the chosen input value probability distribution itself is *not* stationary; that is, the value probability distribution changes over time. This change could result from changing customer needs, changes in business climate, etc. A non-stationary input value distribution is a very likely possibility, which emphasizes the point that it would be extremely dangerous to assume that the value distribution going into the CD3 calculation is the same as the real value distribution that exists at the time the item is delivered.
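To make that concrete, a toy model of a non-stationary value distribution might look like the following sketch; the 1% per week decay rate is purely illustrative and not derived from any of the simulations in this article.

```python
import random

def realized_value(delivery_week, rng):
    # Illustration only: the mean of the value distribution drifts over time,
    # so a value "known" at week 0 can systematically overstate the value that
    # is actually realized when the item is finally delivered.
    drift = max(0.0, 1.0 - 0.01 * delivery_week)   # assumed 1% decay per week
    return rng.uniform(0, 10_000) * drift

print(realized_value(0, random.Random(1)), realized_value(80, random.Random(1)))
```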

Appendix A: Simulation Set Up

Let us consider a fictional company, MakeMoney Inc. This is a for-profit company that releases a subscription product and can create value with each feature delivered. As features are delivered, the company makes money, and we keep a running tally of how much money the company has made over a period of time (100 weeks). MakeMoney Inc. works with a strict WIP limit of one feature at a time. The company only picks up the next feature when the previous one has been completed and is delivering value to customers. While MakeMoney Inc. is very good about limiting WIP, it does not do the following things –

  • There is no specified prioritization scheme for which project to pick up next.
  • Projects are not right-sized and can range anywhere from 1 to 20 weeks.

We can set up these parameters in order to simulate the company's results and find out how much money it would make. We will assume that MakeMoney is working in this fashion for 100 weeks. There are 10 initial features available to be worked on in the backlog, and every 2 weeks a new feature idea is added to the backlog. Each feature delivered can produce value anywhere in the range of 0 to 10,000 points of value. These values are randomly assigned once the feature is delivered. Listed below are the results at the various percentiles after running Monte Carlo simulations with the above parameters.

We can interpret these results as follows: after 100 weeks of delivering features, MakeMoney Inc. has a 90% chance of delivering at least 18,601 points of value and a 10% chance of delivering at least 45,831 points of value. We will use these results as our base case in order to make comparisons with other results as we change some parameters.
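For readers who want to experiment, here is a minimal sketch of how one run of this base case might be coded. The original simulation code isn't reproduced in this article, so the uniform distributions for value (0 to 10,000 points) and duration (1 to 20 weeks), as well as the simulate_base_case helper itself, are assumptions made purely for illustration; only the parameters described above (a WIP limit of one, 10 initial features, a new idea every 2 weeks, a 100-week horizon) come from the setup.

```python
import random

def simulate_base_case(weeks=100, rng=None):
    """One run of the base case: WIP limit of one, random selection, no right-sizing."""
    rng = rng or random.Random()
    # Assumed distributions: duration uniform over 1-20 weeks, value uniform over 0-10,000 points.
    new_feature = lambda: {"duration": rng.randint(1, 20), "value": rng.uniform(0, 10_000)}
    backlog = [new_feature() for _ in range(10)]        # 10 initial feature ideas
    arrived, week, total_value = 10, 0, 0.0
    while week < weeks:
        # A new feature idea lands in the backlog every 2 weeks, busy or not.
        while arrived < 10 + week // 2:
            backlog.append(new_feature())
            arrived += 1
        if not backlog:                                 # idle until the next idea arrives
            week += 1
            continue
        item = backlog.pop(rng.randrange(len(backlog))) # pick the next feature at random
        week += item["duration"]                        # only one feature in progress at a time
        if week <= weeks:
            total_value += item["value"]                # value only counts once delivered
    return total_value

# Monte Carlo: repeat the 100-week run many times and read off the percentiles.
rng = random.Random(7)
runs = sorted(simulate_base_case(rng=rng) for _ in range(10_000))
for pct in (10, 50, 90):
    print(f"{pct}th percentile: {runs[int(len(runs) * pct / 100)]:,.0f} points of value")
```

The exact numbers will differ from the ones quoted above because the underlying distributions are assumed, but the relative comparison between schemes is what matters.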

These base-case results by themselves don't tell us much. What would be interesting is to find out what happens if we play with the prioritization scheme and the right-sizing of projects. Let us change these one at a time.

What happens if we start asking the team to estimate the duration and value of projects before they are started and use these estimates to do CD3 (Cost of Delay Divided by Duration) prioritization? It is important to note that these are purely estimates. The actual value delivered will still be random for each project, and the actual duration is still random and not precisely determinable. What do the simulations tell us in this case?
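In simulation terms, the only change from the base case is how the next item is selected. The sketch below (which builds on the base-case sketch above) shows one way to model noisy estimates and CD3 selection; the plus-or-minus 50% noise band is an assumption for illustration only, not the error model used in the actual simulations.

```python
def with_estimates(item, rng):
    # Estimates are produced up front and are deliberately imprecise; the true
    # value and duration used when the item is actually worked stay unchanged.
    noise = lambda: rng.uniform(0.5, 1.5)              # assumed +/-50% estimation error
    item["est_value"] = item["value"] * noise()
    item["est_duration"] = max(1, round(item["duration"] * noise()))
    return item

def pick_by_cd3(backlog):
    # CD3: estimated cost of delay (here, the estimated value) divided by
    # estimated duration; the highest-scoring item in the backlog is pulled next.
    best = max(backlog, key=lambda i: i["est_value"] / i["est_duration"])
    backlog.remove(best)
    return best
```

To use it, each feature would be passed through with_estimates as it enters the backlog, and the random backlog.pop(...) line in the base-case sketch would be replaced by pick_by_cd3(backlog).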

The results show major improvements in the amount of value produced. MakeMoney Inc. now has a 90% chance of delivering 54,475 points or more and a 10% chance of delivering 91,294 points or more. Comparing these to the random prioritization results, we can see that across the percentiles just introducing CD3 prioritization drives up the value delivered by 135% on average.

What happens, though, if we don't spend any time estimating value and estimate just duration? We will now pull projects purely in ascending order of estimated duration. The same conditions as before apply: our duration estimates are still not precise, and the actual duration might vary from the initial estimate.
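Modelling this is the same as the CD3 case except for the selection rule: only the (still noisy) duration estimate is consulted. A minimal sketch:

```python
def pick_shortest_job_first(backlog):
    # Duration-only prioritization: pull the item with the smallest estimated duration.
    best = min(backlog, key=lambda i: i["est_duration"])
    backlog.remove(best)
    return best
```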

Comparing these results to the original random prioritization, we once again see major improvements. In fact, the improvements in this case are even more pronounced. While CD3 prioritization improved the results on average by 135%, duration prioritization (Shortest Job First) improves results on average by 178%. In other words, we gained value by spending time estimating duration and value, but estimating only duration and using that for prioritization delivered even more value.

The next set of simulations explores the effect of introducing right-sizing of projects. We are going to go back to the initial configuration of randomly picking projects for MakeMoney Inc. This time, though, instead of working on projects regardless of their duration, we will right-size them. We will break a feature up into 4 random smaller deliverables if the feature is deemed to be too big (more than 5 weeks). We will pick the first of these pieces to work on immediately and put the others in the backlog for later selection. After each project is finished, the next selection will be made from the backlog at random.
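A sketch of what that right-sizing step might look like is below. The 5-week threshold and the split into 4 pieces come from the description above; how duration and value are divided among the pieces isn't specified, so a random proportional split is assumed here.

```python
def right_size(item, rng, threshold=5, pieces=4):
    # Anything deemed too big (more than the threshold in weeks) is broken into
    # 4 smaller deliverables; smaller items pass through untouched.
    if item["duration"] <= threshold:
        return [item]
    cuts = sorted(rng.random() for _ in range(pieces - 1))
    fractions = [b - a for a, b in zip([0.0] + cuts, cuts + [1.0])]
    return [{"duration": max(1, round(item["duration"] * f)), "value": item["value"] * f}
            for f in fractions]
```

The first piece returned is started immediately; the rest go back into the backlog and compete with everything else at the next (random) selection.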

The results of introducing right-sizing are even more impressive than those of changing prioritization schemes. While the CD3 and duration prioritizations produced 135% and 178% improvements respectively, the average improvement produced by introducing right-sizing is an outstanding 567%. We are almost guaranteed to produce 4 times the value of our base case and are likely to produce 5–6 times the value. Right-sizing also relies far less on up-front estimates of value and duration, which we know to be highly inaccurate.

The immediate takeaway from the above results is that right-sizing (i.e., making smaller bets) has a more powerful impact on value delivered than changing prioritization schemes. The primary reason this works is that we are releasing more features, more often, and have the potential to collect value for each feature delivered. CD3 and duration-based prioritization also improve value delivery for the same reason: by their nature they prioritize shorter projects and hence encourage more frequent delivery. In the absence of right-sizing, though, these approaches start to suffer when estimates are wildly inaccurate (which is very often the case in the real world). This is not to say that these techniques are ineffective. The bigger point we have made so far is that, if you are going to make one intervention to increase the value being delivered, right-sizing projects is a vastly superior intervention to changing prioritization schemes.

What happens if we treat our prioritization scheme results as the base cases and add right-sizing to them? We will start with CD3, where developers at MakeMoney Inc. estimate value and duration and then apply CD3 to prioritize projects. When a project is pulled to start, it is right-sized in the same manner as described above. The estimated value is randomly redistributed among the 4 pieces of the original project. The first of these pieces is worked and the other 3 are added to the backlog for future prioritization. The results are shown below –
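Putting the two together, the pull step might look like the sketch below, reusing the pick_by_cd3, with_estimates, and right_size helpers from the earlier sketches. Re-estimating each piece so that the leftover pieces can compete on CD3 at later selections is an assumption about the mechanics, not a detail taken from the text.

```python
def pull_next_cd3_with_right_sizing(backlog, rng):
    # Select by CD3, right-size the winner, and put the leftover pieces back
    # into the backlog so they are re-prioritized alongside everything else.
    item = pick_by_cd3(backlog)
    pieces = [with_estimates(p, rng) for p in right_size(item, rng)]
    backlog.extend(pieces[1:])
    return pieces[0]           # this piece is worked immediately
```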

The results are as we should expect. The two techniques combine to produce an average improvement of 714% over the base case. The addition of right-sizing makes CD3 214% more effective on average than base CD3 itself. Clearly, CD3 works, but it works miracles when combined with right-sizing. A more accurate way to say this would be: right-sizing works miracles, but the miracles become even more pronounced when CD3 is added to the mix.

What happens when we do the same with duration-based prioritization (Shortest Job First)? Again, all the same right-sizing rules apply. Below are the results of the simulations where we use right-sizing with duration-based prioritization –

Not surprisingly, these perform better; somewhat surprisingly, though, these simulations perform 1580% better than the base case on average. The combination of right-sizing and duration-based prioritization also performs 498% better on average than pure duration-based prioritization. Right-sizing creates multiple small options to pick from, and each of these options can potentially deliver a large amount of value. Working in duration order allows more and more of these individual features to be released to production. The combination of the two results in the most value being delivered.

We can summarize the broad findings from these sets of Monte Carlo simulations as the following two points –

· Regardless of prioritization scheme, right-sizing will outperform systems where projects are not broken up.

· When we are right-sizing our projects, the prioritization scheme that performs best of the three schemes tested is ascending order of duration, or Shortest Job First.

References

Adventures with Agile. “Cost of Delay: Theory & Practice with Donald Reinertsen” https://www.youtube.com/watch?v=OmU5yIu7vRw

Arnold, Joshua. “Cost of Delay Divided by Duration” http://blackswanfarming.com/cost-of-delay-divided-by-duration/

Arnold, Joshua. “Experience Report — Maersk Line” http://blackswanfarming.com/experience-report-maersk-line/

SBE Council. “Facts & Data on Small Business and Entrepreneurship” http://sbecouncil.org/about-us/facts-and-data/

Cutler, John. “Better Decisions (By Forecasting Cycle Time as a Team)” https://hackernoon.com/better-decisions-by-forecasting-cycle-time-as-a-team-6d36690f511f

Duke, Annie. “Thinking in Bets”

Leopold, Klaus. “Practical Kanban”

Massey, Cade and Richard H. Thaler. “Overconfidence vs. Market Efficiency in the National Football League” http://www.nber.org/papers/w11270.pdf

Reinertsen, Donald. “The Principles of Product Development Flow”

Thaler, Richard H. Misbehaving: The Making of Behavioral Economics (pp. 279–280). W. W. Norton & Company. Kindle Edition.

Vacanti, Daniel. “Actionable Agile Metrics for Predictability”

Vacanti, Daniel. “When Will It Be Done?”