Mental Models & Product #4: Probabilistic Thinking

Isabel Gan
Mental Models & Product
12 min read · Feb 27, 2021

It’s that time of the year where (at least at Indigo) the new fiscal year’s starting up, quarterly planning’s underway, and new project ideas are springing up. You might be trying to find ways to pitch a new idea to your leadership, or find ways to uncover what the upcoming quarter will look like, or maybe decide which Pinterest recipe you are going to make for the week (I know I’m trying to figure that out).

Writing my last article made me think a lot about what my circle of competence looks like, not only in my portfolio but also in my life. That got me thinking about the difference between the things I can control and the things I can’t, and about navigating that bigger question by focusing on what I can control. Team capacity? Can’t control that. Broad portfolio? Can’t really control that. Managing stakeholder expectations? I can control that. A bloated backlog? I can control that through strategic decisions. Which brings me to the question that started this series: “how can I make better, well-informed decisions?”

Introducing probabilistic thinking: a common framework mentioned in business school and other fields like psychology and even cancer research.

What is probabilistic thinking?

Probabilistic thinking is an excellent mental model for enhancing the precision and effectiveness of our decisions: it uses our knowledge, beliefs, logic, and math to estimate the likelihood of any specific outcome.

Fun fact — most of our lives are based on probability. For example, I’ve recently been running long distances, and this is planned based on the time available for the run and my body’s physical ability to complete the run. This is through asking questions that challenge my assumptions, like “can I run 10km and get home for a shower to make it on time for my next meeting?” or “will my body be able to run the distance based on other physical activities that I do?”

Our lack of perfect information about the world gives rise to all of probability theory, and its usefulness. We know now that the future is inherently unpredictable because not all variables can be known and even the smallest error imaginable in our data very quickly throws off our predictions. The best we can do is estimate the future by generating realistic, useful probabilities. — Farnam Street

Most of the time, given that we live in complex social systems, outcomes are not binary and are usually probabilistic. Most of the time, we cannot guarantee that a specific scenario will occur 100% of the time. However, we can increase or decrease our certainty based on any new evidence that may come in.


On Farnam Street, Shane Parrish discusses three important aspects of probability to understand when learning about probabilistic thinking.

  1. Bayesian thinking
  2. Fat-tailed curves
  3. Asymmetries

Let’s start with Bayesian thinking.

The core concept is that as we encounter new information, we should take into account what we already know. This helps us use all relevant prior information as we make decisions.

The Bayesian view of probability (grounded in Bayes’ Theorem) measures the plausibility of an event given incomplete knowledge. It starts with a prior statement of knowledge (usually in the form of a prediction). To improve the state of knowledge, an experiment is designed and executed to gather new evidence. The prior and the experiment’s results have a joint distribution (the probability of two events happening together) that leads to a new and improved belief. This can be seen in this fun little equation:

P(A|B) = [P(B|A) P(A)] / P(B)

P stands for probability, A stands for the prior knowledge/belief, and B stands for new evidence. P(A) is the probability the prior belief is true, while P(B) is the probability that the new evidence is true. P(A|B) is the probability of the prior belief if the new evidence is true, while P(B|A) is the probability of the evidence if the prior belief is true.


Let’s look at an example. Say Alice has been trying for a baby and wants to figure out whether she is pregnant. Let’s say that pregnancy tests in this example are 99% reliable: 99 out of 100 people who are pregnant will test positive, while 99 out of 100 people who are not pregnant will test negative. Finally, assume that before taking the test, Alice’s prior probability of being pregnant is 1%. So if Alice tests positive, there’s a 99% chance she’s pregnant, right? Wrong.

Looking at the above equation: P(A|B) = [probability of testing positive if she is pregnant*probability of being pregnant prior to getting tested)]/probability of testing positive whether or not she is pregnant.

P(B) covers both true positives and false positives. The false-positive contribution is the false-positive rate (0.01) multiplied by the proportion of people who are not pregnant (0.99), which comes to 0.0099. The true-positive contribution is the test’s sensitivity (0.99) multiplied by the prior (0.01), also 0.0099. Adding the two gives P(B) = 0.0198.

So, P(A|B) = (0.99 × 0.01) / 0.0198 = 0.5

This means that the probability that Alice is pregnant if she tests positive is 50%.
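The same arithmetic can be sketched in a few lines of Python, assuming, as the calculation above does, a 1% prior probability of pregnancy and a 99%-reliable test:

```python
# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Assumes a 1% prior probability of pregnancy and a 99%-reliable test,
# matching the numbers in the worked example above.
def bayes_posterior(prior, sensitivity, false_positive_rate):
    # P(B): total probability of a positive test (true + false positives)
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return (sensitivity * prior) / p_positive

posterior = bayes_posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.01)
print(round(posterior, 2))  # 0.5
```

Even a 99%-reliable test only gets us to a 50% posterior, because the prior is so low. This is why the prior matters just as much as the evidence.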


Never, ever leave a number all by itself. Never believe that one number on its own can be meaningful. If you are offered one number, always ask for at least one more. Something to compare it with. — Hans Rosling

All this to say, our initial belief is only as valid as the evidence behind it. Bayesian thinking encourages us to use all relevant prior information in making decisions. Hans Rosling makes a similar point in his book “Factfulness” (note: I will probably reference this book multiple times in this series): compare and divide numbers to avoid misjudging their importance.

From an abstract point of view, Bayes’ Theorem focuses on not labelling our prior knowledge as right or wrong (not binary). This will be an ongoing cycle of challenging and validating what we believe we know. It’s about uncovering what we might already know that we can use to better understand the reality of the situation.

Next up, fat-tailed curves.

A fat-tailed distribution exhibits large kurtosis: an unexpectedly thick end, or “tail”, toward the edges of the distribution curve.

Before we chat about fat-tailed curves, we need to learn about its friendly sibling, the normal distribution curve.

The normal distribution curve is bell-shaped (thus, also known as the bell curve) and is described by two basic terms: the mean (average) and the standard deviation (amount of variation or dispersion). Most values cluster around the mean, and the rest taper symmetrically toward either extreme. Here are a few approximately normal examples:

  1. Height
  2. The sum of many dice rolls
  3. The number of heads in many coin tosses
  4. Measurement errors
  5. Students’ average grades

Now let’s talk about fat-tail distribution curves.

Think of the fat-tailed curve as the bell curve’s dangerous, volatile brother. They seem similar enough, with common outcomes clustering together. But like most brothers, they have distinctive traits that tell them apart; for this pair, the differences lie in the tails. In bell curves, the extremes are predictable. Fat tails, on the other hand, exhibit leptokurtosis (fatter tails and a higher peak at the mean), which means there is a greater chance of extreme positive or negative events.

How does this relate to a mental model we can use? Let’s use an example.

Suppose we hear that we have a greater risk of dying from choking on our own spit (which I embarrassingly do way too frequently during 1–1 meetings) than of being killed in a war. The priors (in this case, the statistics) seem to back it up: 500 people in our country choked on their own spit and died last year, whereas only 100 people died from war. It might seem that the risk of war is low since recent deaths are few, but the shape of the curves tells a different story.

In the case of the risk of another war breaking out, it follows more of a fat-tailed curve where there is a greater chance of extreme negative events. On the other hand, dying from choking on our own spit follows more of a bell curve, where the outliers have a fairly well-defined scope.
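The difference in tail behaviour is easy to see with a quick simulation. This sketch (illustrative numbers only) compares how often “extreme” values show up under a normal distribution versus a Pareto distribution, a standard stand-in for fat tails:

```python
import random

random.seed(42)

N = 100_000
THRESHOLD = 10  # an arbitrary cutoff for an "extreme" event

# Thin-tailed: standard normal draws; values beyond 10 standard
# deviations are essentially impossible.
gauss_extremes = sum(abs(random.gauss(0, 1)) > THRESHOLD for _ in range(N))

# Fat-tailed: Pareto draws (shape alpha = 1.5); extreme values
# keep showing up no matter how much data we collect.
pareto_extremes = sum(random.paretovariate(1.5) > THRESHOLD for _ in range(N))

print(f"normal extremes: {gauss_extremes}, fat-tail extremes: {pareto_extremes}")
```

Under the normal curve the count is zero; under the fat-tailed one, a few thousand draws exceed the cutoff. This is exactly why bell-curve intuition fails for fat-tailed risks.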

By ensuring that we are thinking ahead and positioning ourselves to benefit from an unpredictable future that may vary in extremes, we can utilize fat-tailed curves to be one step ahead of the game in contingency planning.

Lastly, asymmetries.

Most daily decisions involve uncertainty about the probability of outcomes, arising from ambiguity and incomplete knowledge. Making decisions under ambiguity can produce favourable and unfavourable asymmetric effects.


Two mechanisms drive this asymmetry. First, while unfavourable information may lower our estimate of a good outcome occurring, it also reduces aversive uncertainty. Second, when information is interpreted subjectively, unfavourable information is less likely to be integrated into our evaluation of a possible outcome.

Probability estimates are more likely to be wrong in the “over-optimistic” direction than the “under-optimistic” one. Many uncertain outcomes are inherently asymmetric, with longer downside tails than upside. It is easy to focus on the “most likely” case while forgetting to calculate the true expected-value impact of several such asymmetries combined.

Shane Parrish uses a really great example of people’s ability to estimate the effect of traffic on travel time. If we were to leave on time:

  • Arriving 20% early: rarely happens
  • Arriving 20% late: happens most of the time

Our estimation errors are asymmetric, skewing in a single direction.
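This skew is easy to reproduce. The sketch below (all numbers made up for illustration) models door-to-door trip times with a log-normal distribution, a common stand-in for delays since a trip can only be a little shorter than planned but arbitrarily longer:

```python
import random
import statistics

random.seed(7)

# Simulated trip times around a 30-minute "planned" trip
# (baseline and spread are made-up illustration numbers).
trips = [30 * random.lognormvariate(0, 0.3) for _ in range(50_000)]

planned = statistics.median(trips)  # the typical trip we plan around
average = statistics.mean(trips)    # what we experience on average

late = sum(t > planned * 1.2 for t in trips) / len(trips)
early = sum(t < planned * 0.8 for t in trips) / len(trips)

print(f"average vs planned: {average:.1f} vs {planned:.1f} minutes")
print(f">20% late: {late:.0%}, >20% early: {early:.0%}")
```

The average trip runs longer than the planned one, and arriving more than 20% late happens noticeably more often than arriving more than 20% early: the errors skew in one direction.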

By recognizing when we are overconfident in our probabilistic estimates, we can be careful in high-stakes situations to plan contingencies for missing those targets.

How does that tie into product?

As I’m writing this article, I’m thinking a lot about our upcoming Q1 roadmap planning and how we can prioritize while being in a ton of ambiguity. How do we combine our assumptions, our resources, our priorities, and the fiscal year’s ambiguity into the upcoming roadmap?

By using the probabilistic thinking mental model, we want to think in shades of probability: identify what matters, understand the odds of something taking place, and check our existing assumptions (and what we don’t know) before making a decision. In the case of roadmap planning, how can I gauge the upcoming priorities based on the probability of impact and the probability of becoming a company-wide priority? And based on that, how can I anticipate capacity changes during the quarter and their impact on our ability to take on said commitments?

As product managers, we have to make many (many many many) decisions by figuring out the probable accuracy of unreliable information: stakeholder input, market trends, fluctuating (or not) product metrics. Especially in e-commerce, customer expectations can vary drastically within a month, and planning for the upcoming three months can feel like throwing a dart blindfolded.

In previous roadmap planning sessions, I have always relied on digging deep into market expectations and key business priorities for the upcoming quarter before shortlisting the quarterly candidates that can be taken into a prioritization framework (namely RICE, in my case). This also includes company-wide programs and OKRs that will drive many of the decisions that unfold in the quarter.

Instead of talking about different prioritization frameworks, let’s take a step back and find ways to identify the “big rocks” (large company-wide initiatives, pervasive customer pain points, etc.) that can help prevent unnecessary changes and complete overhauls of the plan. While identifying the “big rocks”, we also need to leave room for contingencies that may arise from agile environments, organizational changes, and/or market updates.

With probabilistic thinking, the key here is to improve the accuracy of our decisions by generating realistic and useful probabilities.

The first thing to do is to uncover the status of each candidate. This can be done by grouping them into current, near-term, and future columns.

Candidates under the current column usually have defined areas of focus, specified scope, a clear indication of priority, and even specs and design (depending on your Definition of Ready with your teams). This means that current candidates usually have a higher probability of certainty and of execution. This may include candidates that are currently in development, but also “discovery” candidates that are being planned, analyzed, and designed before being brought to our development teams.

Candidates under the near-term column usually have a wider area of focus compared to those in the current column, with some flexibility for scope changes based on market expectations, competitive findings, or customer feedback. These candidates usually have a conservative probability of certainty and execution, given that the scope is not 100% defined yet. They could be projects and programs that have had a company-wide introduction but no formal kickoff. With candidates like these, we will need to allocate time to understand effort (if there are proposed solutions) and impact on our customers.

Lastly, the future column comprises candidates with a broad scope and wide room for flexibility regarding scope changes. These are high-level candidates based on ideas that require more input from decision-makers to understand the clear job to be done. They have a low probability of certainty and execution and are usually known as “blue sky” candidates. These could be program ideas that ladder up to company-wide goals but have no clear outline of roles and responsibilities, nor clarity on the program’s objectives or outcomes.

The higher a candidate’s probability of certainty and execution, the lower the probability of asymmetric errors, given that there is less ambiguity around candidates with higher certainty and clearer steps to execute.
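As a rough, hypothetical illustration (the candidate names, probabilities, and impact scores below are entirely made up, not a real framework), one way to combine these probabilities with impact is a simple expected-value ranking:

```python
# Illustrative only: names, probabilities, and impact scores are made up.
candidates = [
    # (name, column, P(certainty & execution), estimated impact 1-10)
    ("Checkout redesign",  "current",   0.9, 6),
    ("Loyalty program v2", "near-term", 0.6, 8),
    ("Blue-sky AI search", "future",    0.2, 9),
]

# Expected value = probability the work actually ships * its impact.
ranked = sorted(candidates, key=lambda c: c[2] * c[3], reverse=True)

for name, column, p, impact in ranked:
    print(f"{name:20s} ({column:9s}) expected value = {p * impact:.1f}")
```

Even a high-impact “blue sky” idea can rank below a modest current candidate once its low probability of execution is priced in, which is the point of thinking in shades of probability rather than in absolute priorities.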

In contrast, candidates that fall under the future column (and even the near-term one) should be treated like a fat-tailed distribution: we should assume that their unpredictable future may produce unexpected, extreme outcomes. They could suddenly become a top business priority due to a market change or business goal that overrides existing priorities, or they could turn into complete throwaway work that took a lot of time and effort to build. For highly unpredictable candidates like these, it is important to challenge and validate what we think we know. As with Bayes’ Theorem, we can look at our priors (that is, the belief that a candidate will become a priority and get worked on) and assign each candidate a probability of being true. But we cannot let those priors get in the way of taking in new information during the quarter that may challenge them.

It is our job to ensure that when planning our roadmap, we have a clear, prioritized backlog as a contingency plan in case our priors turn out to be false. As the quarter progresses, we should be committed to taking in any new information that may reduce the probability of a prior being true, or even replace it completely. The key is to have other candidates that we are confident will bring joy to our customers, that will “keep the fridge full” (the fridge being our backlog), and that give our teams meaningful work to do.

The key when planning a roadmap under uncertainty is to set clear primary goals and KPIs, so that stakeholders understand our product strategy, our available resources, and our constraints. While doing so, it is also important to stay agile enough to challenge and validate what we believe we know, so that we can adjust our roadmap when expectations shift.

Our goal is to get early buy-in during our planning stage by collaborating with stakeholders and our teams to uncover existing assumptions, identify use cases, highlight important discovery work that needs to be done, and commit to objectives collectively as a group. This would help identify major themes and anticipated impacts/trade-offs that would help support the associated probability to each candidate.

Roadmap planning can be an extremely daunting task, especially with highly uncertain quarters, the multitude of prioritization frameworks out there, and the variation in product portfolios within each team. However, we can set aside our instinct for determinism (believing something is either true or false) and embrace uncertainty by identifying the most likely outcomes and the best decisions to make. When it comes to uncertain outcomes, instead of trying to find an absolute answer, we can update our knowledge by incorporating new, relevant information as it becomes available, while planning enough time throughout the quarter to do so.

Hey, this is fun. So why not join the party at my newsletter? 🎉


Growth PM @ Unbounce | writing about all things product & mental models