Forecasting

troy.magennis
Forecasting using data
Jul 7, 2017 · 15 min read

Chapter 1

This chapter introduces the basic concepts of forecasting and sets the scene for how we will explore this topic in this book. By the end of this chapter you will understand what makes a “Great” forecast and what simple practices can set you on your journey to become a “Superforecaster.”

Goals of this chapter –

  • What is forecasting?
  • What makes a good forecast?
  • What makes a good forecaster?

Forecasting, defined

Forecasts are used to assist human intuition when making decisions. This book is about the decisions we make day to day in a software engineering and delivery context, and how to improve them. It’s about using data to avoid our reptilian brains leading us astray through copious amounts of wishful thinking and misinformation. This book is about forecasting using data, with the goal that even a little data will outperform our biased intuition.

There are many definitions of “Forecasting,” and some of them are likely correct. I’m not going to argue with anyone else’s definition of forecasting; instead, I’ll give you the characteristics of what a forecast is to me and let you decide for yourself.

Forecasts have –

1. A statement about some future outcome or event not yet known (an answer to a question about the future)

2. A statement about the level of uncertainty, such that the forecast has a chance of coming true but isn’t guaranteed (an answer to an agreed level of uncertainty)

3. A way of eventually testing the actual outcome against the forecast (otherwise it’s conjecture, and my guess is as good as yours)

Given these characteristics, we can apply them to some real-world forecasts you are used to consuming. We listen to the news and want to know what the weather will be like tomorrow. The weather forecast gives us a general range of temperatures to expect and a chance of rain, snow or inclement conditions. Is it always right? No, I think we have all had cases where it isn’t, but is it better than your gut instinct? Yes. Why? Because meteorologists have used historical data to refine their models over many years, and use data from instruments to guide those models. If (when) they are wrong, they look to understand why and make the model better over time. They have been doing this for decades, and being significantly wrong is less and less likely.

Putting this against the characteristics a forecast needs as stated earlier –

1. Question: “Will it snow tomorrow?”

2. Uncertainty: “80% chance of snow.”

3. Feedback: “Did it snow?” If not, why did we get it wrong? Are we wrong about as often as the stated uncertainty says we should be?
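That third point is a calibration check: when we say “80% chance of snow,” it should actually snow on roughly 80% of those days. Here is a minimal sketch of that check in Python, using invented forecast records purely for illustration.

```python
# A minimal calibration check: group past forecasts by their stated
# probability and compare it with how often the event actually happened.
# The records below are invented for illustration.

forecasts = [  # (stated probability of snow, did it actually snow?)
    (0.8, True), (0.8, True), (0.8, False), (0.8, True), (0.8, True),
    (0.3, False), (0.3, False), (0.3, True), (0.3, False), (0.3, False),
]

for stated in sorted({p for p, _ in forecasts}):
    outcomes = [snowed for p, snowed in forecasts if p == stated]
    observed = sum(outcomes) / len(outcomes)
    print(f"Said {stated:.0%}: snowed {observed:.0%} of the time "
          f"({len(outcomes)} forecasts)")
```

A forecaster who says “80%” and is right about 80% of the time is wrong the appropriate amount; being wrong occasionally is expected, while being wrong much more (or less) often than stated is the real miss.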

Forecasts are a mechanism for humans to go on the record with an expectation about the future. It’s then up to the receiver of the forecast to decide what to do next. The actions taken on the forecast are where success or failure is decided. Forecasts play the role of aligning expectations with actuals, helping us see more clearly what action is most appropriate next. That is why we need to make every effort to forecast reliably; without forecasts we have no way of taking appropriate action with enough time to make a difference.

To put a final bow on what a forecast is to me, here is what I once tweeted:

Forecasts are essential to decision making. Formulating the right question is important. Being honest about how much uncertainty is clouding the ability to forecast accurately is vital. Deciding how much effort to expend in achieving those forecasts is an important skill. Before we start learning how to forecast, let’s take a quick look at what makes a good forecast and a great forecaster.

What is a “good” forecast?

Good is a relative term. If you have no information, even a rough idea of what might happen in the future can help make a better decision, meaning the forecast was useful (a subset of good). Dropping a rock off the edge of a cliff face on a moonless night and counting the number of seconds before it hits something solid is enough to know if it’s too high to jump. To judge a good forecast we need some idea about what is already known and the consequences of being wrong. These factors decide how “good” is “good enough.”

One good example of forecasting, and of presenting forecasts, is the Google Maps navigation product. Ignoring how it works for the moment, let’s focus on how it presents the results.

Figure 1–1 — Google® Maps navigation results table-view

Figure 1–1 and Figure 1–2 show how navigation forecast results are displayed for the trip from where I live to where we dine on the all-too-rare date night with my lovely wife. There are a few key points to discuss here –

1. It returned multiple options using different methods and roadways, not just the one it deemed “best”; it left that decision up to me.

2. It didn’t give my arrival time, it gave a duration of travel time from when I choose to leave. It helps me see the impact of leaving at various times.

3. And although not evident here, once I begin travelling, the forecast duration to complete the rest of the journey is constantly updated as more data arrives (from other cars travelling similar routes) and Google is able to compute a more recent transit time.

Contrast this to I.T. and software forecasts commonly performed today –

1. We give a single option, the one the team collectively assumed given the current team size and the current understanding of the feature or project concept.

2. We give an actual date result, ignoring that the start date may change (because when has that ever happened before!).

3. Once we begin, the original date forecast is defined as a team commitment, and new information is discarded in preference to hitting that original date at all costs.

Figure 1-2 — Google® navigation results map-view. Helps people understand the options being presented

Figure 1-2 is particularly lovely in representing the travel route options. It clearly shows the different options, from public transit to freeway choice if I choose to drive myself. Beyond being a well-presented image, this map helps me navigate once I do commit to one of the options, giving me turn-by-turn directions and updated travel times based on constantly changing conditions.

It also communicates whether the original forecast is still valid. This is important: if my original plan is at risk due to a traffic accident or a bridge closure caused by waves breaking over a floating bridge[1], I’m told immediately. Software project forecasts rarely have this early warning about progress. Sure, we find out about it eventually, then scramble to recover, but a key reason for having a forecast is knowing EARLIER that the forecast isn’t turning into reality. This allows a lighter fix to be applied early enough to make a difference, rather than a heavy project change (cutting key features after lots of effort) late in delivery.

I guess my main barometer for whether a forecast is “good” is that it communicated information well. If a forecast helps me “see” the options available, and judge the impact of one option over another in a way I wouldn’t have seen through gut instinct alone, then it’s a good forecast for me! If the options it portrays lead me to make a bad decision in retrospect, then it was a bad forecast, and I won’t rely on that source of information willingly again.

The final point I want to drill home about good forecasts is that the most beneficial forecasts highlight the unlikely and carry high information value. Cold, icy and snowy temperatures at the South or North Poles are generally expected. A forecast (and actual) temperature at the North Pole that is 30 degrees Celsius (54 degrees Fahrenheit) HOTTER than normal[2] carries a high level of information and is a significant surprise. In this case, it’s the unexpected nature of the forecast or observation that adds value. Forecasts that illuminate unexpected results are the gold nuggets, especially when there is still time to do something about improving an undesirable outcome!

Valuable Forecasts Change Decisions and Behavior

A valuable forecast is one that changes behaviors in advance of an undesirable outcome. Forecasts should highlight results that are unexpected, not just confirm the expected.

To be able to highlight the abnormal, we need some feeling for normal. It is the use of historical data that drives our belief about what is normal. Sir Francis Bacon described being able to spot deviations as the key to science.

“To do science is to search for repeated patterns. To detect anomalies is to identify values that do not follow repeated patterns. For whoever knows the ways of Nature will more easily notice her deviations and, on the other hand, whoever knows her deviations will more accurately describe her ways. One learns the rules by observing when the current rules fail.” — Sir Francis Bacon

What makes a good forecaster?

In the book “Superforecasting: The Art and Science of Prediction” (Tetlock & Gardner, 2016) the authors note that we are all forecasters. In our daily lives we constantly have to make decisions about future outcomes, whether it be for work (should I take the job at Google or SpaceX) or pleasure (will I catch more fish here, or over there). The Superforecasting book follows an experiment that looked to uncover how and why some people forecast better than others. The experiment posed questions to participants and gave them a chance to forecast on-the-record about a wide variety of subjects. These forecasts were tested against the eventual outcomes and the participants scored on how well they did at various intervals of time prior to the event unfolding. The results are astounding, and in the authors’ words, “…it turns out that forecasting is not a ‘you have it or you don’t’ talent. It is a skill that can be cultivated.”
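One standard way to score probabilistic forecasts against eventual outcomes, and the approach used in that research, is the Brier score: the mean squared difference between the stated probability and what actually happened (0 or 1), where lower is better. A minimal sketch, with invented forecasts:

```python
# Brier score: mean squared error between stated probabilities and outcomes.
# 0.0 is perfect; always answering "50%" earns 0.25. Forecasts are invented.

def brier_score(predictions):
    """predictions is a list of (stated probability, outcome as 0 or 1)."""
    return sum((p - outcome) ** 2 for p, outcome in predictions) / len(predictions)

careful_forecaster = [(0.9, 1), (0.2, 0), (0.7, 1), (0.4, 0)]
coin_flipper = [(0.5, 1), (0.5, 0), (0.5, 1), (0.5, 0)]

print(f"Careful forecaster: {brier_score(careful_forecaster):.3f}")  # 0.075
print(f"Coin flipper:       {brier_score(coin_flipper):.3f}")        # 0.250
```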

It turns out it is more important to have a good process and strategy when forecasting than to have the highest IQ, the highest salary, the sexiest job title or even detailed knowledge of the field. In fact, the more you (think you) know about a field, the more your brain will lead you astray with a plethora of cognitive biases. Being ignorant pays dividends (finally, my time to shine).

The strategies and techniques for becoming a better forecaster aren’t mathematically heavy, but they are common-sense heavy. Anyone can be a good forecaster; it just takes practice, curiosity and awareness of one’s own biases. Here are some of the key tactics that were found successful by those in the study and that you might consider.

Answer the right question — triaging and splitting

So many questions. So little time. Don’t stress the easy questions that won’t change a decision. Look for the harder questions that will alter a current misconception or open up new opportunities.

Quickly decide if it’s possible to answer a question posed. A question like “is this true love?” is probably not knowable in advance, no matter the hormone levels. “Is it likely possible to design, code and release the new checkout page one month before Valentine’s Day?” is a better question.

Sometimes it helps to break a bigger question into a smaller one that drives the answer for the whole question. For example, “Will we have the final art and images at least one month before Valentine’s Day?” Answering this simple question might make answering the first question avoidable. If it takes just one reason to miss delivery, look for ways to see whether one of those reasons is likely to occur.

Balance general and contextual views — start general then adjust by context

Good forecasters start by knowing the basic odds of an outcome, then adjust those odds with local context. These basic odds can come from logical inference, for example two candidates in an election start with 50/50 odds each, or from historical data in statement form: “a bridge of this span has never cost less than $150 million.”

This type of starting perspective is called an “Outside View” (terminology popularized for project forecasting by Bent Flyvbjerg, which we talk about later). It has less to do with the exact thing being forecast and everything to do with the basic chance given prior outcomes. Starting with a base rate helps avoid cognitive bias. We aren’t engaging with the specifics of this project or endeavor until we anchor on unemotional facts. There is still plenty of time to add our biases; we are just choosing not to start from there!

The chance of each candidate winning a two-party runoff starts at 50%. The cost of similar rail projects of this length has been £320 million. The job of the forecaster is to move that chance or estimate up or down based on new information.
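As a small illustration of starting from the outside view and then adjusting, here is a sketch using Bayes’ rule. The base rate and the poll likelihoods are invented numbers, not data from any real election:

```python
# Start from a base rate (the outside view), then move it up or down as new
# information arrives. All numbers here are invented for illustration.

def update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability after one piece of evidence (Bayes' rule)."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)

chance = 0.5  # outside view: a two-candidate runoff starts at 50/50

# Inside view: a favourable poll. Suppose such a poll appears 70% of the time
# when a candidate goes on to win, and 40% of the time when they lose.
chance = update(chance, likelihood_if_true=0.7, likelihood_if_false=0.4)

print(f"Adjusted chance of winning: {chance:.0%}")  # about 64%
```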

Incorporating New Information — But not over-reacting

Good forecasters balance how new information is incorporated into their opinions. If there are ten experts saying it will take three months, they don’t cherry-pick the single person who says one month. And they don’t ignore that person either. They find out what that person knows that the others don’t and see if the other ten would alter their opinion given that new information.

On receiving new information, consider how much you believe it is valid, and how much it should move your opinion. Trust is earnt from prior performance, not personality, volume of voice or position on an org chart.

Feedback loops — Constant desire to understand why forecasts are right and wrong

Good forecasters look eagerly at failures (theirs and others’) and learn why they happened. The goal is to improve over time, not to be paralyzed in advance yearning for a perfect, risk-free forecast while leaving people without any guidance to make decisions. This learning feedback loop is an often-overlooked obligation of performing forecasts. If there is no desire to improve forecast reliability, the forecast is more of a belief than a well-considered analysis.

Learn over time

The goal of every forecast model we build is that it will improve in reliability over time. Avoid expecting perfection. Aim to be better than the current alternatives, and learn why.

Not estimating or forecasting at all is a red flag for me. It means that the forecaster isn’t trying to get the right answer, they are trying to avoid being wrong (a different problem). I’m not saying forecasts shouldn’t be performed with the least effort possible; I’m saying avoidance is a sign of no learning. A problem.

A key test for me when presented with a forecast is to ask how many prior forecasts have been tested against actual outcomes, and whether the reasons for any misses were incorporated back into the way this forecast was produced. If there is no appetite for feedback, then what you are really being presented with is “a more educated guess.” Now my judgement of reliability is simply how much I trust you!

When data conflicts — deciding which side to err on

Good forecasters know which direction to sway opinions in the face of uncertainty (for safety, or to avoid ruin). Radio and TV news stations know that they get more complaint calls when they predict clear weather and it turns out to rain or snow. If they get a variety of computerized model results, some showing rain and some showing clear at about equal odds, they will say “rain” more likely than not. The public is far more forgiving when rain is forecast and it stays dry than when clear weather is forecast and it rains; they shoot the messenger rather than blame random chance.

If a project running late might alter a decision, then erring on the longer side might make sense. Rather than hide the bias you added, make it clear that the forecast outcomes include shorter durations, but that your belief is the stated longer forecast is safer for decision making. Discuss these added biases and present both the longer and shorter forecasts, especially if a decision would change given those options.

Early in my forecasting work, I determined that the distribution of how long software teams take to complete work (cycle time) followed a predictable pattern. It could have been one of about five known patterns of occurrence (probability distributions); I published the one that made forecasts ever so slightly long, to err on the side of a longer average cycle-time duration. It took another three years until I had more data showing I was probably right, but initially I guessed, and guessed in a “safe” direction in case people actually paid attention and used that information!
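To make that concrete, here is a minimal sketch of forecasting from a cycle-time distribution. The text above doesn’t name the distribution, so a Weibull with made-up parameters is assumed purely for illustration; in practice you would fit the pattern to your own team’s cycle-time data:

```python
# Simulate how long it might take to finish the remaining items by drawing a
# cycle time per item from an assumed Weibull distribution. The parameters and
# the choice of distribution are illustrative, not fitted to real data. This
# also assumes one item in progress at a time, a deliberate simplification.
import random

def simulate_total_days(remaining_items, scale=4.0, shape=1.5):
    return sum(random.weibullvariate(scale, shape) for _ in range(remaining_items))

trials = sorted(simulate_total_days(remaining_items=20) for _ in range(10_000))
p50 = trials[len(trials) // 2]
p85 = trials[int(0.85 * len(trials))]
print(f"Half of the simulated outcomes finish within {p50:.0f} days")
print(f"85% finish within {p85:.0f} days (the 'err on the long side' answer)")
```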

Answer Better Questions

The question “What will the weather be tomorrow?” isn’t really the question. The underlying question is often: what will I wear, is my baseball game likely to be washed out, will it be windy enough to go kite surfing, is it going to snow and leave a good layer of powder for me to play in? When we get asked “when will it be done?” it’s likely that isn’t the real question either.

There seems to be an insatiable search for the answer to “When will we be done?” Yet this is probably the least important question to answer. It’s the “Are we there yet?” question from the back seat of the car on vacation, asked by any number of children and bored toddlers. It’s most often a sign of frustration, and the answer isn’t going to comfort those asking, because they really want it to be “now,” and no other answer is adequate.

To answer the real underlying question, we need to understand the impact of the answer we give and the impact of that answer being wrong. If it’s just an “interest” question then the answer can be “probably,” but if being late risks human life or financial ruin, then we might need to answer in more detail. Until we understand the real question and work with the person asking to understand impact, answering is premature.

There are two common dysfunctions when attempting to answer the WWIBD (when will it be done) question. The first is that every request for a forecast is assumed to be life and death. The second is the assumption that an accurate answer isn’t possible, so a flat-out guess is made. Neither is good. When asked to forecast, try to understand how the answers you give will be used; start asking questions about the answer you are charged with giving –

1. How is this forecast answer being used?

2. What is the impact if we are wrong, either too high or too low?

3. Is there a forecast answer that will tip a decision one way or the other?

4. Do you already have a preferred answer, for example have you already told the CEO the release date?

Knowing the real ground rules for the forecast you are about to give helps formulate a plan that answers the question with the least effort. You can simply stop if you hit a tipping point, or at a “pretty sure” answer for a very low-impact decision. You will still, however, be asked “when will we be done?” When asked, look to answer a better question, for example,

1. How big is the feature compared to known others?

2. How long would that take if we did nothing else?

3. How much of that could I get by next month if we focused on this feature alone?

Why are these questions better? They have fewer built-in assumptions. They are relative to something known, or they reduce uncertainty by fixing one of the inputs. Whenever given a forecasting problem, always look to refine it to the easiest question that, once answered, will help make a decision or highlight a looming calamity.
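As an example of fixing one input to reduce uncertainty, the third question above can be sketched with nothing more than historical throughput and a simple Monte Carlo resample. The weekly throughput history below is invented; a real forecast would use the team’s own recent numbers:

```python
# "How much of this could we finish in the next four weeks?" answered by
# resampling past weekly throughput. History is invented for illustration.
import random

weekly_throughput = [3, 5, 2, 6, 4, 4, 7, 3]  # items finished in recent weeks

def simulate_items_done(weeks, history, trials=10_000):
    outcomes = [sum(random.choice(history) for _ in range(weeks)) for _ in range(trials)]
    return sorted(outcomes)

outcomes = simulate_items_done(weeks=4, history=weekly_throughput)
likely = outcomes[int(0.15 * len(outcomes))]  # 85% of trials did at least this many
print(f"In 85% of simulated months we completed at least {likely} items")
```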

This book describes the three key questions with which most other questions can be answered: How Big, How Long and How Much. These are the basis of most software and IT related forecast needs. Answering them fast, and answering them better than the alternatives with less effort, is our eternal forecasting goal.

Kanban, Scrum, SAFe and Forecasting

Forecasting is needed no matter what process you use to develop your software or IT features and projects. A good process choice might make forecasting easier or even unnecessary, but in general the outcome is the same. You have something you need people to do or build, and you want to answer questions about how long, how much or what options are available to you and your organization. Process doesn’t change this fact; it just lays out the rules of how.

Nothing in this book should be read as “stop anything that is working for you.” On the contrary, throughout this book you will hear a common mantra: do what works until you find something else that also works and is less effort. Your development process choice falls squarely into this belief system. Do what works for you.

Forecasting is more of a communication mechanism. It shows that, after some analysis, the following options are possible. You still have to make a decision; there should always be a human in the loop for important decisions, because much of the complexity lies in what’s NOT stated: assumptions that are left un-aired. The common Agile software development processes actually spell out some of these assumptions. How long until we expect to “see something” (a working software demo), and how things are prioritized for start order, are good examples of assumptions spelled out by process.

The forecasts we present need to spell out the rest of the assumptions. For example: team sizes, how work is split into smaller chunks before starting, and what extra work might be necessary to deliver the final product to consumers. These assumptions will be present no matter what process we use.

Summary

This chapter has introduced what forecasting is and how to do it better. Forecasting is a necessary skill, and we all forecast multiple times a day in our work and home lives. The key is to adopt good practices and learn how to improve. Good forecasters aren’t “born”; they just have better habits for forming opinions, habits that can be learnt with practice.

[1] There are two main bridges that cross a large lake (Lake Washington): the I-90 and the 520. These bridges are some of the longest floating bridges in the world, where the roadway floats on pontoons. In high wind, waves form and break over the roadway. It’s exciting.

[2] https://www.theweathernetwork.com/news/articles/arctic-storms-bring-another-winter-heatwave-to-north-pole/79190#

Next chapter: Chapter 2 — Forecasting, Strategy
