Experimentation is one of the best things that has happened in product innovation. Data is replacing opinion.
Unfortunately, many teams have adopted experimentation practices from academia. There are three main differences between product and academic experimentation:
- The goal of product experimentation is maximization of impact. The goal of academic experimentation is minimization of false positives.
- The majority of product costs occur after the experiment is complete. Academic experimentation does not have post-experiment costs.
- The stage of product maturity impacts how you should balance the benefit and cost side.
Consider Three Bets
Bet #1: a single roll of a fair die. If you roll a 1, 2 or 3, you win $1,000,000. If you roll a 4, 5 or 6, you lose $1,000,000.
This bet has an expected payout of $0. The expected upside is equal to the expected downside. Most people would refuse this bet due to loss aversion.
Bet #2: a single roll of a fair die. If you roll a 1, 2, 3 or 4, you win $1,000,000. If you roll a 5 or 6, you lose $1,000,000.
With this bet, you win twice as often as you lose (67% of the time versus 33%), leading to an expected payout of $333,333. On the surface it might look attractive, but you have to weigh it against a 33% probability of a catastrophic outcome.
Bet #3: 100 rolls of a fair die. On each roll, if you roll a 1, 2, 3 or 4, you win $10,000. If you roll a 5 or 6, you lose $10,000.
This bet has the same expected payout as Bet #2. However, you would be wise to take this bet, as you have a 99.95% chance of achieving a profitable outcome. With a high number of rolls and a smaller stake on each, the variance is significantly reduced from Bet #2.
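The arithmetic behind the three bets is easy to check. The short Python sketch below is an illustration rather than part of the original analysis: the function names are made up for this example, it assumes every roll is independent, and it treats "profitable" as strictly more winning rolls than losing ones.

```python
from math import comb, sqrt


def expected_payout(p_win: float, stake: float, rolls: int = 1) -> float:
    """Expected total payout: each roll wins `stake` with probability p_win, otherwise loses `stake`."""
    return rolls * stake * (2 * p_win - 1)


def payout_std(p_win: float, stake: float, rolls: int = 1) -> float:
    """Standard deviation of the total payout over independent rolls."""
    return 2 * stake * sqrt(rolls * p_win * (1 - p_win))


def prob_profit(p_win: float, rolls: int = 1) -> float:
    """Exact binomial probability that winning rolls strictly outnumber losing rolls."""
    return sum(
        comb(rolls, wins) * p_win**wins * (1 - p_win) ** (rolls - wins)
        for wins in range(rolls // 2 + 1, rolls + 1)
    )


# (label, probability of winning a roll, stake per roll, number of rolls)
bets = [
    ("Bet #1", 3 / 6, 1_000_000, 1),
    ("Bet #2", 4 / 6, 1_000_000, 1),
    ("Bet #3", 4 / 6, 10_000, 100),
]

for label, p_win, stake, rolls in bets:
    print(
        f"{label}: expected ${expected_payout(p_win, stake, rolls):,.0f}, "
        f"std dev ${payout_std(p_win, stake, rolls):,.0f}, "
        f"P(profit) = {prob_profit(p_win, rolls):.4f}"
    )
```

Bets #2 and #3 share the same expected payout, but spreading the stake over 100 rolls cuts the standard deviation by a factor of ten, which is why an overall loss becomes so unlikely.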
Publishing in academic journals is closer to Bet #2. You might publish only a few papers per year, and a retracted paper can be very damaging to its author. A paper entitled Career Effects of Scandal: Evidence from Scientific Retractions finds that “Relative to non-retracted control authors, faculty members who experience a retraction see the citation rate to their earlier, non-retracted articles drop by 10% on average.” With these risks, a reasonable strategy is to proceed only when you are confident that you will not need to retract. In fact, some academics want to tighten the threshold for statistical significance (i.e. the accepted false positive rate) from a p-value of 0.05 to 0.005.
Evaluating product experiments is closer to Bet #3. A product team might conduct a hundred experiments in a year, so a mistake on any one experiment matters little. What matters is avoiding too many false negatives, which mean missing out on upside you could have delivered to your product.
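To see why false negatives can dominate at this volume, here is a rough sketch of a year's portfolio of experiments under two significance thresholds. Everything numeric in it is an assumption chosen for illustration (the share of good ideas, the size of a real effect measured in standard errors, and an equal gain and loss per shipped feature), and it models each experiment as a one-sided z-test, which may not match how your team evaluates results.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def power_one_sided(alpha: float, effect_in_se: float) -> float:
    """Power of a one-sided z-test when the true effect equals `effect_in_se` standard errors."""
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STANDARD_NORMAL.cdf(z_crit - effect_in_se)


def expected_portfolio_value(
    alpha: float,
    n_experiments: int = 100,   # assumed experiments per year
    share_good: float = 0.2,    # assumed share of ideas with a real effect
    effect_in_se: float = 2.0,  # assumed size of a real effect, in standard errors
    gain: float = 1.0,          # assumed value of shipping a good feature
    loss: float = 1.0,          # assumed cost of shipping a feature with no effect
) -> float:
    """Expected net value of shipping every experiment that passes the threshold."""
    power = power_one_sided(alpha, effect_in_se)
    shipped_good = n_experiments * share_good * power       # true positives
    shipped_bad = n_experiments * (1 - share_good) * alpha  # false positives
    return shipped_good * gain - shipped_bad * loss


for alpha in (0.05, 0.005):
    print(
        f"alpha={alpha}: power={power_one_sided(alpha, 2.0):.2f}, "
        f"expected value={expected_portfolio_value(alpha):.2f}"
    )
```

Under these assumed inputs the stricter threshold ships fewer bad features but discards many more good ones, so it delivers less total value. Different inputs can flip the result, which is the point: the threshold should follow from your own costs and benefits.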
Costs are also very different between writing scientific papers and shipping software. There are two types to consider: upfront costs and future costs.
There is a large variation in the upfront cost to build and run an experiment. Product experiments are typically cheaper than academic research, which can span years. In either case, these upfront costs shouldn’t influence whether you accept or reject the experiment (see Sunk Cost Fallacy).
The more interesting costs to consider are the future costs. When writing an academic paper, you publish it and you are effectively done. With product development, most of your costs may be incurred in the future.
Future product costs will depend on the specific feature and may vary from 0x to 10x your original investment. If the change is trivial (e.g. changing the color of links), then there may be no additional complexity. But in other cases, you may be adding a new interaction on a primary page in your user experience. Every future modification to this page will need to consider how it impacts your feature.
There are many factors that determine the future cost. Features that introduce a lot of additional complexity are more expensive. Features that have a longer life span are more expensive. Features in parts of the product that change frequently are more expensive.
With that in mind, you should consider what the future costs will be before accepting an experiment. High future costs reduce the capacity of the team. If the expected costs are small, you may require only a small expected benefit. If the expected costs are large, you should require a large expected benefit.
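One way to make this trade-off explicit is a simple ship/no-ship rule. The sketch below is a hypothetical illustration, not a formula from any particular framework: the function and parameter names are invented here, costs and benefits are assumed to be in the same units, and the future cost is expressed as a multiple of the original investment, following the 0x to 10x range mentioned earlier.

```python
def should_ship(
    expected_benefit: float,
    original_investment: float,
    future_cost_multiple: float,
) -> bool:
    """Accept the experiment only if the expected benefit exceeds the expected future cost.

    The original investment is already spent (a sunk cost), so it is used only
    to scale the future cost, never subtracted as a cost in its own right.
    """
    expected_future_cost = future_cost_multiple * original_investment
    return expected_benefit > expected_future_cost


# Hypothetical numbers: the same benefit clears the bar for a trivial change
# (0x future cost) but not for a feature with heavy upkeep (3x).
print(should_ship(expected_benefit=20_000, original_investment=10_000, future_cost_multiple=0.0))  # True
print(should_ship(expected_benefit=20_000, original_investment=10_000, future_cost_multiple=3.0))  # False
```

In practice both inputs are uncertain estimates, so a rule like this is a prompt for discussion rather than a mechanical gate.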
Product experimentation has largely been patterned after academic publishing. However, it should now be evident that both the cost and benefit functions are different. The right experimentation framework for a product team will balance false positives, false negatives, future costs and future benefits.
The balance depends on the stage of the product. For example, as a start-up, the main thing to focus on is avoiding false negatives (i.e. rejecting potentially good ideas that could help your product get traction). The product may not be around in a year, so why worry about the future cost? Talking to and observing customers, to see what they tell you and what they do, will probably be the best indicator of whether a feature should be built and whether it was built the right way.
Determining the proper experimentation framework is one of the most important tasks for a product team as it impacts every subsequent investment.
— — —
Read more about how we run experiments at Convoy here: The Power of Bayesian A/B testing.