The Only Proven Method To Identify Your Riskiest Assumption

Startups live and die on the validity of their assumptions. As a startup founder, you make assumptions about your customer, your product, and your market. Being wrong in any one of these areas can take you out of the game.

This post was originally published by Neo Innovation on their official blog. I am republishing it here so that Startup Patterns readers can have easy access to these tools and tactics for small teams.

It’s common these days for startup founders to talk of “lean” and of validating their assumptions with data. But very few actually walk the walk and use data to calculate risk in a way that is truly scientifically rigorous. Some assumptions are riskier than others. And at any particular stage there is likely one that is disproportionately more responsible for your overall success than the others. Figuring out which it is can save you valuable time and money, and maybe mean the difference between the life and death of your company.

You probably have a list of assumptions you’re making right now, as you stumble toward product-market fit, not sure which one is concealing the ticking time-bomb that will blow up in your face. If you have done a business model canvas, and you should have, you’ve likely generated a dozen or more assumptions. For some of them you may have an intuitive sense of the risk level, based on how uncertain you feel about them or how much impact they would have on your overall business model. But how can you know which one is really the riskiest? It turns out there is only one scientifically valid method to identify it quantitatively.

Risk, The Lean Startup Way

Before I outline the specific method I use, let’s look at how risk plays out in the life of a startup. At Neo, we like to point our clients to this graphic:

This diagram shows how we approach experiments over the life of a project. In the beginning of any project our level of uncertainty is very high, and thus only a correspondingly low level of fidelity in our experiments is justified. We start out using very lightweight methods to validate assumptions. Tactics like interviews, paper prototypes, and simple landing pages help us cheaply and quickly evaluate whether or not we have a problem worth solving.

As we collect data from those lightweight experiments and reduce our level of uncertainty about the product we are building and the market we will be selling to, we earn the right to move to gradually higher-fidelity experiments, eventually evolving into a fully functional system. We would then continue to iterate on that system using the same data-informed techniques, only at a larger scale with more fine-grained experiments.

Over time, we have developed a sense for which types of experiments to run in which order. For example, we always start with a focus on the customer, making sure they have a painful enough problem to solve. Then we move to testing the value proposition in a variety of marketing channels, and so on.

Once the product is built, and customers are signing up and actually paying for it, it becomes a lot trickier to prioritize assumptions. We have found a customer problem that is painful enough that they will pay money for the solution. But evolving from your first batch of customers to a scalable business model is a whole different game. It will require engineering and automation, for which you’ll need to hire a team, and maybe take investment. The stakes become much higher, since other people’s money and livelihoods are now on the line. Prioritizing your riskiest assumption at this stage is incredibly hard, and incredibly important.

Not Quite Scientific Method

I have seen any number of loose, hand-wavy methods proposed to organize your risks and calculate which ones are the biggest threat to your company. Some suggest you build a simple 2×2 grid with easy/hard on one axis and critical/trivial on the other, and prioritize those items that fall in the upper left quadrant. Others leverage engineering estimates to score the different assumptions, prioritizing those that are going to be easiest to do first. Still others suggest asking advice from advisors, investors, or outside experts.

For example, in a quick Google search on identifying your riskiest assumptions, I uncovered a few widely shared posts on the topic.

This one, by Grace Ng, got the most shares from what I can tell. It’s a very good article in which she outlines a very reasonable process for operating in a lean way to validate your assumptions. Many of the techniques she suggests we also do here at Neo. But when it comes to identifying your risks, Ng’s article falls short, simply referring to a template that doesn’t really say anything about how to actually get to the riskiest one quantitatively.

Another article, by Ryan Hoover, founder of Product Hunt, emphasizes the importance of identifying your riskiest assumption. But, again, he doesn’t offer any solid verifiable way of calculating it. It seems like just a gut-feel approach.

A third, by Shardul Mehta, is actually called How to Identify the Riskiest Parts of Your Product Strategy. Yet there is no mention of how to put your assumptions into any kind of risk-based ordering using quantitative data. He, too, seems to rely on a gut-feel approach to prioritizing risks.

Finally, this post by Diana Kander offers the most quantitative-looking method of calculating risk. In it she suggests building a matrix that combines your confidence in the assumption with the impact of being wrong to get to an overall risk score. It’s definitely a good instinct to weigh uncertainty against impact in a numerical way, and this is a method many management consultants and advisors also recommend. It’s popular, and deceptively convincing, but it’s still wrong, technically.

To be clear, I have a lot of respect for each of these authors. But I am going to poke holes in this one aspect of each of their posts. Much of what they are suggesting around organizing your experiments is absolutely correct. But in terms of measuring and prioritizing risks, all of these methods are wrong from an evidence-based, data-driven standpoint. And I’ll tell you exactly why.

Risk = Probability * Impact

In his excellent book, “The Failure of Risk Management: Why It’s Broken, and How to Fix It,” Douglas Hubbard lays out a compelling case for how non-quantitative, gut-feeling estimates of risk are worse than ineffective; they are actually dangerous. He starts by defining risk in a very specific way. Risk is nothing more nor less than the quantitatively defined impact of an event, times the (also quantitatively defined) probability of that event occurring. If you’re not dealing with a quantified measure of these two factors, you’re not really measuring risk.
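To make that definition concrete, here is a minimal sketch of the arithmetic in Python. The assumption names and dollar figures below are invented purely for illustration; they are not from Hubbard’s book.

```python
# Risk = probability of the event * impact of the event, both quantified.
# The assumptions and numbers below are made up purely for illustration.
assumptions = {
    "monthly churn exceeds 10%": {"probability": 0.30, "impact_usd": 250_000},
    "CAC rises above $80":       {"probability": 0.15, "impact_usd": 400_000},
}

for name, a in assumptions.items():
    risk = a["probability"] * a["impact_usd"]
    print(f"{name}: expected loss ${risk:,.0f}")
```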

Hubbard spends the first two thirds of the book outlining several of the major approaches to risk measurement, and then systematically critiques each one. Here are some key lessons from the book.

Experts can be wrong. When asked to provide numerical scores on risk factors, even very experienced domain experts can be off by several orders of magnitude, both from each other’s answers and from any relevant historically validated data. There is just something about human psychology that makes it very hard for us to accurately and reliably calculate probabilities. Unless a team is calibrated to calculate risk (which is accomplished by practicing probability estimates, seeing how far off you were and why, and then iteratively improving your estimates), it is going to give very divergent risk scores that are based on nothing remotely scientific or quantifiable.
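If you want to try that kind of calibration exercise yourself, a rough sketch of the bookkeeping might look like this. The predictions and outcomes below are hypothetical; the point is simply to compare your stated confidence with how often you were actually right.

```python
from collections import defaultdict

# A tiny calibration check (hypothetical numbers): for each past prediction,
# record the probability you stated and whether the event actually happened.
predictions = [
    (0.90, True), (0.90, True), (0.90, False), (0.90, True),  # calls you were "90% sure" about
    (0.60, False), (0.60, True), (0.60, False),                # calls you were "60% sure" about
]

by_confidence = defaultdict(list)
for stated_probability, happened in predictions:
    by_confidence[stated_probability].append(happened)

for stated_probability, outcomes in sorted(by_confidence.items()):
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {stated_probability:.0%} confidence -> right {hit_rate:.0%} of the time")
```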

The worst of these expert-led, non-scientific risk measurement practices, according to Hubbard, is the impact vs. probability matrix. The numerical scores are not based on any verifiable probability data and are therefore nothing but guesses. What is worse, these practices have the effect of making us think we’ve properly accounted for risk when we really haven’t at all.

Thus, the only truly scientifically proven way to calculate risk, the method used by the risk management pros, actuaries, and statistics geeks, is an approach that uses validated historical data on both the impact and the probability of occurrences, and which continuously updates as you collect new information.

Use Data To Calculate Probability, Then Iterate

So basically, in order to really measure risk, you have to use a quantitative method that combines probability and impact in a systematic way, and you have to iteratively update it as you collect more data.

Here is my step by step method for doing just that with your startup:

Step 1: Build a financial model of your startup.

The first thing to do is to make sure that you have a reasonably accurate financial model of your startup, like a standard pro-forma P&L. You can usually do it in a spreadsheet in less than an hour. You basically need to be able to track all of your costs and revenues across time, and calculate your bottom line on a rolling, cumulative basis. Don’t forget to cover all aspects of your customer acquisition funnel, including cost of acquisition through any paid marketing. I require this spreadsheet exercise of all the founders I work with. For many, it is the first time they have put together a spreadsheet model of their business at this level of detail.
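If a spreadsheet isn’t handy, the same kind of toy pro-forma model can be sketched in a few lines of Python. Every figure below is a hypothetical placeholder; swap in your own prices, acquisition costs, and fixed costs.

```python
# A toy pro-forma P&L: one row per month, with a rolling cumulative bottom line.
# Every figure is a hypothetical placeholder.
months = 12
price_per_customer = 100      # monthly revenue per paying customer
cac = 40                      # paid-marketing cost to acquire one customer
fixed_costs = 8_000           # salaries, hosting, tools, etc. per month
new_customers_per_month = 150

customers = 0
cumulative = 0
for month in range(1, months + 1):
    customers += new_customers_per_month
    revenue = customers * price_per_customer
    costs = fixed_costs + new_customers_per_month * cac
    cumulative += revenue - costs
    print(f"month {month:2d}: revenue ${revenue:,}  costs ${costs:,}  cumulative ${cumulative:,}")
```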

Step 2: Collect data on each stage of your funnel.

You can really only use this probabilistic method when you already have customers traversing your product funnel. Your spreadsheet should have several weeks’ worth of data in it so that you can get decent conversion percentages from each stage of the funnel. I won’t suggest a specific number of customers, because it depends on your model.
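As a rough sketch of what that bookkeeping might look like, here are hypothetical weekly counts for a three-stage funnel, and the conversion rate between each pair of adjacent stages:

```python
# Weekly counts at each funnel stage (hypothetical data; use your own).
weekly_counts = {
    "visits":    [1200, 950, 1100, 1300],
    "signups":   [130, 88, 120, 150],
    "purchases": [14, 8, 13, 17],
}

stages = list(weekly_counts)
for upper, lower in zip(stages, stages[1:]):
    rates = [b / a for a, b in zip(weekly_counts[upper], weekly_counts[lower])]
    avg = sum(rates) / len(rates)
    print(f"{upper} -> {lower}: weekly rates {[round(r, 3) for r in rates]}, mean {avg:.1%}")
```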


Step 3: Assign a value to a single conversion, based on the financial model.

Given the data you have collected for each stage, you should now be able to calculate the value of each conversion based on its contribution to your revenue goal. Start as high up the funnel as you can. You should be able to determine the value of each visitor to the site, the value of each click on the call to action button, the value of each “send me more information” request. This economic framework is the key to prioritization.

For example, say customers purchase your product for $100, and 10% of the visitors to your site end up purchasing. Each visit is then worth $10, because of the probability that a visitor will convert. This is just basic sales thinking, right? Well, do that for the whole funnel, at each conversion stage.
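Here is one way you might script that back-of-the-envelope calculation for a multi-stage funnel, working backwards from the purchase price. The conversion rates below are hypothetical; substitute the means from your own data.

```python
# Work the value of a conversion back up the funnel from the purchase price.
# The conversion rates are hypothetical; use the means from your own data.
purchase_value = 100.0
conversions = [
    ("visit -> signup", 0.10),
    ("signup -> trial", 0.50),
    ("trial -> purchase", 0.20),
]

value = purchase_value
stage_values = {"purchase": value}
for name, rate in reversed(conversions):
    value *= rate                               # discount by the conversion probability
    upstream_stage = name.split(" -> ")[0]
    stage_values[upstream_stage] = value

for stage, v in stage_values.items():
    print(f"one {stage} is worth ${v:.2f}")
```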

Step 4: Use a histogram to measure variance.

Now we get to the interesting part. Based on the data you have after a few intervals, you should be able to graph a simple distribution curve as a histogram representing the fluctuations in that stage’s conversion rate. The shape of the distribution tells you about the volatility (or variability) of that stage’s conversion rate. And here is why that matters. Variables that have settled into a consistent level will prove much harder to change, and thus represent less opportunity for improvement. But the areas of the funnel with wide variance tell you two things: first, that you’re at high risk in that area, because there is a greater probability that a given customer will fall far afield of the mean of the distribution; and second, that customers do not react consistently to that phase of your experience. There is opportunity here to move toward a more stable conversion rate.

Essentially, these distributions for each stage of your funnel allow you to extrapolate the probability that a given conversion rate will fluctuate up or down, and by how much. Combined with the values you calculated for each stage, you’ve now got a quantitative measure for which areas of your funnel are weakest, and those are the best opportunities for improvement.
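Putting Steps 3 and 4 together, a minimal sketch might combine the week-to-week spread of each stage’s conversion rate (a one-number stand-in for the shape of its histogram; you could plot the actual histogram with something like matplotlib’s plt.hist) with the per-conversion value from Step 3, and rank the stages by that exposure score. All of the rates and values below are hypothetical.

```python
import statistics

# Weekly conversion rates per stage (hypothetical) and the dollar value of one
# conversion at that stage from Step 3. Replace with your own spreadsheet data.
stages = {
    "visit -> signup":   {"rates": [0.08, 0.12, 0.07, 0.15, 0.09], "value": 10.0},
    "signup -> trial":   {"rates": [0.49, 0.51, 0.50, 0.48, 0.52], "value": 20.0},
    "trial -> purchase": {"rates": [0.10, 0.30, 0.05, 0.25, 0.18], "value": 100.0},
}

exposure = {}
for name, s in stages.items():
    spread = statistics.stdev(s["rates"])  # how much this rate swings week to week
    exposure[name] = spread * s["value"]   # volatility weighted by what a conversion is worth

# The widest, highest-value swings are your biggest risks and best opportunities.
for name, score in sorted(exposure.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: exposure score {score:.2f}")
```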

And while my example uses a product sign-up and revenue generating funnel, you can use this same method for any workflow that generates regular, time-series data.

Next Steps

By this point, you have come a long way toward using a scientifically valid method of risk analysis. You are using quantitative data and applying probability to actual measurable impact. You have the data you need to order your whole backlog of experiments by highest risk to the business.

Happy prioritizing, and let me know if you get stuck!

If you want to start looking at more rigorous modeling methods, such as Monte Carlo simulations, drop me a line, and I’ll be happy to chat with you about it.

Here are my slides from The SF Startup CTO Summit


Originally published at www.startuppatternsbook.com on June 11, 2015.