Predicting release dates — with dice

JC Plessis
Aug 25, 2017 · 6 min read

Everybody wants to know the future. When will I receive the item I bought online? When will I arrive at my destination? When will we be able to release a particular set of features? Of course, there is no easy way to answer any of these questions. The best we can do is try to predict when something is likely to happen.

Let’s predict the easy way

In Agile methods there are two distinct approaches: estimates vs. no estimates. The former requires developers to estimate how long tasks are likely to take. The latter bases its predictions on the number of tasks completed, regardless of size. Whatever your team's choice, you have to measure how much work you have accomplished over the last few sprints.

The simple approach is to predict using a burn-up chart. You plot your data:

  • the work you have already accomplished,
  • a horizontal line representing the amount of work needed to reach your target.

Then you draw a line that follows the general trend of your progress, and you have your prediction.

If you try it yourself, you'll see that this method gives a very precise prediction. Not accurate, but precise (watch this video if the difference is not obvious to you: What's the difference between accuracy and precision?). In my example it tells us we will be done three quarters of the way through sprint 8.

But reality might hit hard, as this method reduces all the randomness in the data to one fixed value: the curve's trend.

Monte Carlo simulation

So let's take it one step further and use the Monte Carlo simulation approach. The idea is pretty simple: run a lot of random simulations and analyse what happens.

I had already encountered this in my job. I was working for a fund management company that wanted to analyse where to invest its money. We applied random scenarios 10,000 times to potential investments. If an investment performed well under a scenario, it got a good grade. Based on these grades, we selected the best investments.

[Go board image, source: http://grikdog.blogspot.fr/2010/10/go.html]

This approach is also used in AlphaGo, the AI that beat top Go players last year and again this year. To determine whether you are winning or losing, you could play the game out until it ends. But at about 1 minute of analysis per move, playing out the hundreds of remaining moves of an early-game position would take about 6 hours. This is not a viable option.

Instead, AlphaGo uses a Monte Carlo simulation:

  • choose moves at random until the end of the game is reached,
  • decide whether the outcome is good or bad,
  • repeat this many times,
  • compute the ratio of wins to losses to get an idea of whether the situation is good or bad.

This approach, coupled with other AI improvements, led to AlphaGo's victory.
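AlphaGo's actual evaluation is far more sophisticated, but the rollout idea can be sketched on a toy game. Here is a minimal Python illustration of my own (using one-pile Nim instead of Go, purely as a simplification): play random moves to the end of the game, then use the win ratio over many playouts as a position score.

```python
import random

def random_playout(pile):
    """One-pile Nim: players alternate taking 1-3 stones, and taking
    the last stone wins. Both players move at random. Returns True if
    the player to move from `pile` (>= 1) ends up winning."""
    current_player_wins = True
    while True:
        pile -= random.randint(1, min(3, pile))
        if pile == 0:
            return current_player_wins
        current_player_wins = not current_player_wins

def win_ratio(pile, playouts=10_000):
    """Monte Carlo evaluation: the win ratio over many random
    playouts gives a rough score for the position."""
    wins = sum(random_playout(pile) for _ in range(playouts))
    return wins / playouts
```

For example, `win_ratio(1)` is 1.0 (the player to move always takes the last stone), while `win_ratio(2)` hovers around 0.5 under random play.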

Back to our case

In our case, Monte Carlo simulation can be used like this:

  • We have data: the outcomes of previous sprints and a target.
  • We simulate what would happen to 1,000 virtual teams, picking a sprint outcome at random from our data for each simulated sprint.
  • We compute when each virtual team finishes the project and analyse the results.
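These three steps can be sketched in a few lines of Python. The sprint history and the 40-story target are the example numbers used later in this article; `sprints_to_finish` is just an illustrative name:

```python
import random
from collections import Counter

# Example data: stories completed in the last 5 sprints,
# and a 40-story release target.
history = [6, 4, 7, 3, 5]
target = 40

def sprints_to_finish(history, target):
    """Simulate one virtual team: each simulated sprint, draw an
    outcome at random from the historical data, until the cumulative
    story count reaches the target. Returns the number of sprints."""
    done, sprints = 0, 0
    while done < target:
        done += random.choice(history)
        sprints += 1
    return sprints

# Simulate 1,000 virtual teams and see when each one finishes.
finishes = Counter(sprints_to_finish(history, target) for _ in range(1000))
for sprint in sorted(finishes):
    print(f"sprint {sprint}: {finishes[sprint]} teams finished")
```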

Let’s take an example

Imagine our team already has a 5-sprint history with these outcomes:

  • Sprint 1: 6 stories completed
  • Sprint 2: 4 stories completed
  • Sprint 3: 7 stories completed
  • Sprint 4: 3 stories completed
  • Sprint 5: 5 stories completed

And our next release contains 40 stories.

I will manually simulate a virtual team using a normal six-sided die. I'll roll the die, and if I get result X, I'll consider that my virtual team completed as many stories as my real team did in sprint X. Since I only have 5 data points, if I roll a 6 I'll just roll again.
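The same die-rolling procedure translates directly into Python (the dictionary maps a die face to the corresponding historical sprint):

```python
import random

stories_in_sprint = {1: 6, 2: 4, 3: 7, 4: 3, 5: 5}  # our real team's history
target = 40

done, sprint = 0, 0
while done < target:
    roll = random.randint(1, 6)  # roll a normal six-sided die
    if roll == 6:                # only 5 sprints of history: roll again
        continue
    done += stories_in_sprint[roll]
    sprint += 1
print(f"Virtual team finished {done} stories at the end of sprint {sprint}")
```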

Here is my virtual team's result:

This virtual team completed the 40 stories of our release at the end of sprint 9. But we don't know whether this is an average team, a lucky team or a catastrophic one. So we will repeat the process 1,000 times to get a more representative simulation. I won't do it by hand, though; I have a spreadsheet for that: Simple Monte Carlo

It presents the cumulative work of each virtual team:

Computes when each virtual team is done:

And plots the results:

We see that:

  • some very fast teams finished by the end of sprint 7 (one team even by the end of sprint 6),
  • 450 virtual teams finished at the end of sprint 8,
  • the bulk of the teams finished at the end of sprint 8 or 9 (as our manual example did),
  • by the end of sprint 11, almost every team was done.

From the burn-up chart we were expecting to finish around the end of sprint 8; now we would be a bit more cautious and aim for the end of sprint 9 or 10.

A different strategy

Adrian Fittolani took a different path in this article: Agile Project Forecasting — The Monte Carlo Method. He uses a Monte Carlo simulation to generate a lot of random Takt Times, and from those Takt Times he predicts project completion. I won't go into the details; check his article for that. I was expecting our methods to give very similar results, but his seems to shift the predicted project completion even further out.
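For illustration only, here is my rough paraphrase of a Takt-Time-style simulation in Python. His actual spreadsheet may work differently, so treat the per-story sampling and all the names here as my own assumptions: convert each historical sprint into a time-per-story, then sample one Takt Time per remaining story and sum them.

```python
import random

history = [6, 4, 7, 3, 5]              # stories completed per sprint, as before
takt_times = [1 / n for n in history]  # fraction of a sprint per story
target = 40

def project_duration(takt_times, target):
    """Sample a Takt Time for each of the remaining stories and sum
    them, giving a simulated project duration in sprints."""
    return sum(random.choice(takt_times) for _ in range(target))

# Simulate 1,000 virtual projects and look at the distribution.
durations = sorted(project_duration(takt_times, target) for _ in range(1000))
print(f"median finish: sprint {durations[500]:.1f}")
print(f"85th percentile: sprint {durations[850]:.1f}")
```

Because each story is sampled independently, the extremes average out more than in the sprint-by-sprint simulation, which may explain part of the difference between the two methods.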

I've made a copy of his spreadsheet and fed it with my team's data: Takt Time Simulation

What we can see:

  • 28% of teams are done by the end of sprint 8 (about 45% in my simulation),
  • 86% are done by the end of sprint 10 (98% in my simulation),
  • 2 teams are still finishing in sprint 13.

Further thoughts

What is striking is that even though there is a random component, the results are very stable from one run to the next. Part of the explanation can be found in this video (in French): La puissance organisatrice du hasard — Micmaths

We can use Monte Carlo simulation for many purposes: AI, statistics, search and rescue, physics, engineering. You can also check this video by Physics Girl and Veritasium: Calculating Pi with Darts
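The darts idea from that video is itself a tiny Monte Carlo simulation, easy to sketch in Python: throw random points at the unit square and count how many land inside the quarter circle.

```python
import random

def estimate_pi(darts=100_000):
    """Throw random 'darts' at the unit square; the fraction landing
    inside the quarter circle of radius 1 approximates pi / 4."""
    inside = sum(random.random() ** 2 + random.random() ** 2 <= 1.0
                 for _ in range(darts))
    return 4 * inside / darts
```

With 100,000 darts the estimate usually lands within a few hundredths of pi, and as with the sprint simulations, the result is remarkably stable from run to run.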

However, we should stay critical:

  • we assume the future will be like the past,
  • we assume our data is relevant,
  • we have already found 3 different methods giving 3 different results.

And we know these hypotheses are not always true:

  • we might build up technical debt that slows us down,
  • we might have had early successes that are not reproducible.

Moreover, this is a prediction based on estimations, not a commitment. Keep your eyes open to detect major fluctuations, and update the prediction as often as possible.

This article is already long enough, so I'll leave these questions for another day:

  • Why should I try to predict the future?
  • How should I react to this prediction?
  • What should I do when my prediction changes?

Links

Adriano's article: http://scrumage.com/blog/2015/09/agile-project-forecasting-the-monte-carlo-method/

Monte Carlo method use cases: https://en.wikipedia.org/wiki/Monte_Carlo_method#Applications

How many moves in a Go game?: https://en.wikipedia.org/wiki/Go_and_mathematics

La puissance organisatrice du hasard — Micmaths: https://www.youtube.com/watch?v=2Wq6H8GMVm0&t=816s

Calculating Pi with Darts: https://www.youtube.com/watch?v=M34TO71SKGk

What's the difference between accuracy and precision? — Matt Anticole: https://www.youtube.com/watch?v=hRAFPdDppzs

My copy of Adriano's spreadsheet: https://drive.google.com/open?id=1EhF12DV4vfDb5JN_M1ELqsEEY8v8KuEZ8q7u46XNOd0

My spreadsheet: https://drive.google.com/open?id=1er1ojzS8Nau7NjuPzf5UiXZ2HInyCjsXyM119mSnIUc
