Success Metric vs. Fail Condition — To the Pain!

A while back we wrote about the difference between an assumption and a hypothesis and Steven Diebold wrote a great response. Steve schooled us on a few topics and pointed out our lack of clarity in some areas. One thing he brought up deserves more debate: The Success Metric.

I reject the idea of a success metric.

Bonus: For a more complete version of how to design great experiments, here is our open source Real Startup Book.

Declaring Victory!

A success metric is the idea that, for any hypothesis, there is a metric that will show the hypothesis is “validated,” in the jargon of lean startup and experiment design. Simply put, it shows that the idea is a good one.

Setting the success metric seems easy. For example, our hypothesis might be:

The value proposition “Faster download speeds for your BitTorrent client” (Version A) will generate more sign ups than the value proposition “Conceal your IP address when downloading Game of Thrones with BitTorrent.” (Version B)

(Not the greatest hypothesis, but let’s roll with it for now.)

The metric for measuring this hypothesis will be the % of unique visitors that sign up on version A vs. version B. If the conversion rate for B is 5%, and the conversion rate for A is 25%, the hypothesis is considered validated. Victory!
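As a rough sketch of how that comparison could be computed, the Python below takes sign-up and visitor counts for each version and returns the conversion rate. The counts and the function name `conversion_rate` are made up for illustration; they are chosen only to reproduce the 25% and 5% figures above.

```python
def conversion_rate(signups: int, unique_visitors: int) -> float:
    """Share of unique visitors who signed up."""
    if unique_visitors <= 0:
        raise ValueError("unique_visitors must be positive")
    return signups / unique_visitors


# Hypothetical traffic numbers, not real data.
rate_a = conversion_rate(signups=50, unique_visitors=200)  # Version A
rate_b = conversion_rate(signups=10, unique_visitors=200)  # Version B

a_beats_b = rate_a > rate_b
print(f"A: {rate_a:.0%}, B: {rate_b:.0%}, A beats B: {a_beats_b}")
```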

Now let’s look at some situations where it’s a bit harder to declare a clear victory.

Flying Penguins

If our hypothesis is “some penguins can fly” we can very easily set a success metric that would prove this hypothesis. If we see at least one flying penguin (outside of the cinema), then clearly some penguins can fly.

So we go look at 10 penguins in the zoo and…they can’t fly.

But maybe these are the wrong kind of penguins. We can go to a different city, go to another zoo, and look at another 20 penguins.

They still can’t fly.

Maybe it’s only penguins in zoos that can’t fly. So we get a boat, go to Antarctica, and look at 1,000 penguins in the wild.

They still can’t fly.

But maybe they just don’t like to fly while people are watching! Clearly they wouldn’t have wings if they couldn’t fly, so we probably just haven’t found the flying ones...yet.

The Slippery Slope of Failure

This is a common problem with startups:

“Maybe these customers didn’t want to buy our product, but I’m sure if we keep looking we’ll find the ones that will.”

If we define our success metric at 20%, when the conversion comes in at 19%…it’s close enough.

When the conversion is 15%…there’s room for improvement.

When it’s 10%…clearly we need to spend more time optimizing.

When it’s 5%…well…some people are still interested!

When it’s 1%…maybe we’re not explaining it well enough.

When it’s 0%…did we forget to install analytics?

It’s almost impossible to accept failure. There’s always a potential rationalization. After all…we just haven’t succeeded…yet.

Just like the penguins.

Success Metrics make for bad science.

The Scientific Method!

There is a general problem well known to science…

We can never prove a hypothesis. We can only fail to disprove it.

We can try over and over to disprove our hypothesis until we have failed so often that we give up and provisionally accept it as true.

That’s why we have the Theory of General Relativity instead of the Fact of General Relativity. Although the Theory of General Relativity allows us to launch rockets into space and lets our phones geolocate us, it’s still just a theory. Eventually, we may find some situation where the theory breaks down and won’t explain all the facts (e.g. quantum physics). Then we’ll have to come up with a new theory.

So, instead of trying to prove a hypothesis with a success metric, we should try to disprove the hypothesis with a Fail Condition.
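One way to picture a Fail Condition is as a threshold that is written down before the experiment runs and cannot be nudged afterwards. The sketch below is only an illustration of that idea: the class name `FailCondition`, its fields, and the 20% threshold are hypothetical, not something the lean startup literature prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class FailCondition:
    metric: str
    fail_below: float  # hypothetical threshold, fixed before the experiment

    def is_invalidated(self, observed: float) -> bool:
        """True when the observed value falls short of the pre-set threshold."""
        return observed < self.fail_below


condition = FailCondition(metric="signup conversion rate", fail_below=0.20)

for observed in (0.25, 0.19, 0.05):
    verdict = "invalidated" if condition.is_invalidated(observed) else "not invalidated"
    print(f"{observed:.0%}: hypothesis {verdict}")
```

The point of the design is that 19% is treated exactly like 5%: both fall below the line that was drawn in advance.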

Setting the Fail Condition

How many penguins do we need to observe before we are convinced that penguins can’t fly?

10? 50? 1,000? The more penguins you look at, the higher your level of confidence in your conclusion that penguins can or cannot fly.
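One way to put a number on that intuition is sketched below. It assumes penguins are observed independently and at random, which the post itself does not specify. If none of n observed penguins can fly, the largest share of fliers still consistent with that result at a given confidence level is 1 - (1 - confidence)^(1/n). The function name is ours.

```python
def max_plausible_flier_share(n_observed: int, confidence: float = 0.95) -> float:
    """Upper bound on the true share of fliers when 0 of n_observed can fly.

    Solves (1 - p) ** n_observed = 1 - confidence for p, i.e. the largest p
    under which seeing zero fliers still happens (1 - confidence) of the time.
    """
    if n_observed <= 0:
        raise ValueError("n_observed must be positive")
    return 1 - (1 - confidence) ** (1 / n_observed)


for n in (10, 50, 1000):
    bound = max_plausible_flier_share(n)
    print(f"{n:>5} penguins, none flying: at most {bound:.1%} can plausibly fly")
```

Under these assumptions the bound is about 26% after 10 penguins and well under 1% after 1,000, so each extra grounded penguin shrinks the room left for flying ones.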

Science has clear criteria for what is an acceptable level of confidence (for example, the five-sigma threshold particle physicists use before claiming a discovery), but we don’t have that luxury in entrepreneurship or in lean startup.

Fortunately, we don’t need it. We don’t need to prove to everyone that penguins can’t fly. We just need to prove it to ourselves. Because ultimately, our goal is to build a business.

If our business were to sell cool flight goggles to penguins, we would need to know that high wind speed while flying is a serious problem for most penguins. If most penguins can’t fly, this is probably not a good business.

So what % of penguins need to be able to fly for this business to be worth investing our time? 50%? 30%? 1%?

Focusing on just Adélie penguins: if we need a market of 2 million penguins to make this a profitable business, and there are about 3.75 million Adélie penguins, then we need a little over 50% (about 53%) of the penguins we survey to be able to fly for this business to work. So how many do we need to look at?
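The arithmetic behind that threshold is a single division. This quick sketch uses only the two figures quoted above (the variable names are ours) and shows that the required share comes out slightly above one half.

```python
needed_market = 2_000_000      # penguins we need as customers (from the post)
adelie_population = 3_750_000  # approximate Adélie population (from the post)

required_flier_share = needed_market / adelie_population
print(f"Required share of flying penguins: {required_flier_share:.1%}")
```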

If we look at 10 and NONE can fly, then even allowing for the large uncertainty that comes with such a small sample (the true share of fliers could plausibly still be as high as roughly 30%)…this is a bad business.
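To see how damning ten grounded penguins are, here is a small sketch, assuming the ten were sampled independently and at random. It asks how likely it would be to see zero fliers if the share of fliers really were high enough for the business (2 million out of 3.75 million, per the figures above). The variable names are illustrative.

```python
required_flier_share = 2_000_000 / 3_750_000  # from the post's figures
n_observed = 10

# Chance that every one of the observed penguins is a non-flier
# if the true share of fliers were exactly the required share.
p_zero_fliers = (1 - required_flier_share) ** n_observed
print(f"Chance of seeing 0 fliers in {n_observed}: {p_zero_fliers:.4%}")
```

That works out to roughly 1 chance in 2,000, so a sample of 10 is already enough to trip the Fail Condition for this particular business.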

Semantics

This is more than just semantics.

Of course, a very smart and practiced individual might be able to set a success metric and be very rigorous when applying it.

19%? Nope…we set a Success Metric of 20%, let’s scrap this business.

Those are words no entrepreneur will ever utter.

As entrepreneurs, we are biased towards our vision, towards optimizing, towards self-delusion. The purpose of lean startup is to guard against this sort of cognitive bias.

So stop trying to validate your ideas, invalidate them instead!

Key Takeaways

  • You can never validate a hypothesis, only fail to invalidate it.
  • Set a Fail Condition instead of a Success Metric.
  • Penguins can’t fly.


This post was originally published on Grasshopper Herder: Lean Startup Blog.