Dealing With Uncertainty Using Scientific Approaches

Nick Kharas

Published in The Startup

May 19, 2020

Why testing and experimentation are key to successful business outcomes

[Co-authored with Mircea Davidescu]

Testing is the vaccine for business uncertainty that has been effectively used by winners. Image modified from DBCLS.

Only three things in life are certain — death, taxes, and uncertainty.

Real life is not deterministic. It is probabilistic. If you are an executive decision maker, you might already know that. As a case in point, let’s compare two of the most iconic retailers that reinvented retail in America and the world: Amazon and JCPenney. While Amazon is reaching unbelievable heights every quarter, JCPenney is languishing at the bottom of a bankruptcy barrel. How did it all come down to this? Wasn’t JCPenney, at one point, ahead of its time, being one of the first retailers to go online way back in 1994? Yet, once disaster struck, they never recovered.

This “disaster” started with a series of missteps between 2010 and 2013, when JCPenney launched a massive rebranding exercise along with changes to its marketing and discount strategies. The efforts that were put in place to win more consumers instead ended up alienating them. You can put the blame on a lot of decisions, but the picture at 10,000 feet is clearer: JCPenney implemented huge changes, and a lot of them at once, without testing them on their consumers.

Implementing changes without any testing or experimentation is like playing Russian Roulette with your business.

Great dynasties like JCPenney are not killed by making wrong decisions: they are killed by not knowing they were wrong until it was too late. Before overhauling your business strategy at great expense, why not validate if the change you propose works on a smaller scale? Give yourself a chance to fail. Amazon’s colossal success story was not because they were never wrong, but because they built a culture that fostered innovation and provided safe spaces to fail. How did they do it?

The key is in testing and experimentation, something that the most successful companies in the world, like Facebook, Google, Amazon, and Microsoft, have ingrained in their culture. They expose their newest ideas to a smaller subset of their users. This is the “test” group. At the same time, they maintain a “control” or “placebo” group of users who haven’t been exposed to any change yet. This group will likely not change its behavior, but if it does, you can be sure it was not due to the test. The control group provides a “baseline” against which the incremental effect of any behavioral change in the “test” group is measured.
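To make the idea of a baseline concrete, here is a minimal sketch of measuring the incremental effect of a change. It assumes the outcome is a simple conversion (a user either converts or does not); the function names and the example counts are made up for illustration.

```python
def conversion_rate(conversions: int, users: int) -> float:
    """Share of users in a group who converted."""
    if users <= 0:
        raise ValueError("a group needs at least one user")
    return conversions / users


def incremental_lift(test_conversions: int, test_users: int,
                     control_conversions: int, control_users: int):
    """Return (absolute, relative) lift of the test group over the control baseline."""
    test_rate = conversion_rate(test_conversions, test_users)
    control_rate = conversion_rate(control_conversions, control_users)
    absolute = test_rate - control_rate
    # Relative lift is undefined when the baseline rate is zero.
    relative = absolute / control_rate if control_rate > 0 else float("nan")
    return absolute, relative


# Hypothetical numbers: 2,000 users saw the change, 10,000 did not.
abs_lift, rel_lift = incremental_lift(260, 2000, 1150, 10000)
print(f"absolute lift: {abs_lift:.3f}, relative lift: {rel_lift:.1%}")
```

The control rate is what would have happened anyway, so only the difference between the two rates is credited to the change.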

Time and again, these companies have succeeded because they failed fast and moved on to something new. If your “test” group does not respond to any change and continues to show consistent behavior with the “control” group, you have failed on a much smaller scale. However, you have probably saved millions of dollars on a change that wouldn’t have worked anyway.
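One common way to check whether the test group really behaves differently from control, rather than just fluctuating by chance, is a two-proportion z-test. The sketch below is one option among several and assumes a yes/no outcome such as a conversion; the function name and counts are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_test: int, n_test: int,
                          conv_control: int, n_control: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_test = conv_test / n_test
    p_control = conv_control / n_control
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conv_test + conv_control) / (n_test + n_control)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_test - p_control) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 260 of 2,000 test users vs. 1,150 of 10,000 control users.
z, p_value = two_proportion_z_test(260, 2000, 1150, 10000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A large p-value means the data are consistent with "no difference", which is the small-scale failure described above.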

“When there is fear of failure, there will be failure.” -General George S. Patton

Testing and experimentation isn’t new at all. We have been doing it for years. A good example is conducting clinical trials, where you test whether or not a new drug will work. It is only recently that the FAANG companies have turned this into a philosophy for business. And why not? We all know how massively successful they are. So, shouldn’t we all start doing experiments? What can go wrong?

A poorly designed test can kill your business just as effectively as not testing. This is because poorly designed tests give biased and unreliable outcomes, providing misleading information that you mistake for useful insights. How do we save ourselves from ourselves then? As any critical thinker would say: ask questions!

Why do you want to run this test?
What metric will correctly measure the effect of the change?
Who should be a part of your test?
How long should your test run?

Why?

What is the desired effect of the change? Do you want to earn higher ad revenue? Or do you just want to sustain user engagement? This is not a data science or analytics question: this is business strategy. Talk to your chief executives and key decision makers about which metrics they care about, and make sure there is a clear goal for the test.

“You’ve got to be very careful if you don’t know where you are going, because you might not get there.” — Yogi Berra

Your “why” does not always need to be about the future. Sometimes, you need to measure the effect of a change that just happened. COVID-19 radically altered people’s spending habits overnight. Everyone from grocery chains and restaurant owners to apparel retailers could do absolutely nothing but watch their businesses get either overwhelmed or decimated. You cannot roll back what happened. However, two months into the new reality, you can still formulate hypotheses from past events, test them using the data you have, and use your findings to make informed decisions for the future.

What?

The “what” should be a measurable effect: a metric that you want to track. You can collect a lot of engagement metrics, but not all of them will be relevant for your change. If you want to measure user retention, then you need to measure your Daily Active, Weekly Active, or Monthly Active user metrics. If you want to make more money from ads, you would be better off measuring the click-through rate, that is, the ratio of clicks to ad impressions on your app.
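As a rough illustration, both kinds of metric can be computed from very simple inputs. The sketch below assumes an event log of (user id, date) pairs and raw click and impression counts; the data layout and the names are assumptions made for the example, not a description of any particular analytics tool.

```python
from collections import defaultdict
from datetime import date


def click_through_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by ad impressions."""
    if impressions <= 0:
        raise ValueError("need at least one impression")
    return clicks / impressions


def daily_active_users(events):
    """Count distinct users per day from (user_id, day) pairs."""
    users_by_day = defaultdict(set)
    for user_id, day in events:
        users_by_day[day].add(user_id)  # a set ignores repeat visits
    return {day: len(users) for day, users in users_by_day.items()}


# Hypothetical event log: u1 visits twice on May 18 and once on May 19.
events = [
    ("u1", date(2020, 5, 18)),
    ("u2", date(2020, 5, 18)),
    ("u1", date(2020, 5, 18)),
    ("u1", date(2020, 5, 19)),
]
dau = daily_active_users(events)
ctr = click_through_rate(clicks=30, impressions=1200)
print(dau, f"CTR = {ctr:.1%}")
```

Weekly or monthly active users follow the same pattern, with the day replaced by the week or month as the grouping key.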

Having a clearly defined “what” will also help you home in on the parts of the test that matter, and make it cleaner to analyze and use in decisions. You’re not doing your ad test to find out if your customer logs on to your website at a different time. However, don’t overdo it and get tunnel vision. You likely want to see the test’s impact on a few of your KPIs before you roll the initiative out further.

Who?

Ultimately, who you run a test on will have a big impact on whether the test is useful. There are both business and technical considerations to keep in mind, and they fall into three buckets:

What type of customer is the target of your test?
How big of a sample size is needed?
How should I divide test and control?

When you are thinking about your consumer, your goal should be to decide the scope of the test such that you can get cleaner and more powerful insights. You can segment your consumers on static or behavioral attributes. A smaller targeted test will prove more reliable for insights than testing across the board, even in segments where you never intended to roll out the product.

Having a target in mind is important not only for segmenting consumers, but also for understanding the desired effect size. A bigger test will give you more statistical power, which lets you reliably measure a smaller effect size. Remember, though, that the larger your test, the costlier it will be to execute. Therefore, decide on the minimum effect size that would still matter to the business, and make your sample just large enough to measure that incremental impact reliably.
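The trade-off between sample size and minimum effect size can be sketched with the standard normal-approximation formula for comparing two conversion rates. This is a simplified planning estimate: it assumes a yes/no outcome, equally sized test and control groups, and the conventional (but arbitrary) defaults of 5% significance and 80% power. The function and parameter names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline_rate: float, min_detectable_effect: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed in EACH group to detect an absolute lift."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    if min_detectable_effect == 0:
        raise ValueError("the minimum detectable effect must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / min_detectable_effect ** 2)


# Hypothetical 10% baseline conversion rate.
n_large_effect = sample_size_per_group(0.10, 0.02)  # detect a 2-point lift
n_small_effect = sample_size_per_group(0.10, 0.01)  # detect a 1-point lift
print(n_large_effect, n_small_effect)
```

Halving the effect you want to detect roughly quadruples the required sample, which is why it pays to agree on the smallest effect that matters before the test starts.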

After a certain point, what really matters is not how big the test sample is, but how representative it is of your target population. The gold standard for testing has always been randomization, but there are also more nuanced strategies available.
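One simple way to randomize is to hash each user ID together with the experiment name and use the result to pick a group. This is only one possible approach, shown here as a sketch: the experiment name `new-checkout`, the 10% test fraction, and the user ID format are all hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str, test_fraction: float = 0.1) -> str:
    """Deterministically place a user in 'test' or 'control'."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "control"


groups = [assign_group(f"user-{i}", "new-checkout") for i in range(10_000)]
test_share = groups.count("test") / len(groups)
print(f"share of users in test: {test_share:.1%}")
```

Because the assignment depends only on the user and the experiment, a user always lands in the same group, and including the experiment name keeps one test's split independent of another's.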

How Long?

We all love fast feedback. Your experiment design should too. When you think about how long your test should run, your goal should really be to get insights as fast as possible. The longer you run your experiment, the more expensive it will get, either by pushing more customers into a bad test, or through the opportunity cost of not rolling out a good idea sooner. Thankfully, we do have statistical techniques that help optimize the length of the experiment. For example, you can use a multi-armed bandit approach with feedback loops that present the effective versions of the test more often and avoid those that are not. Once you start getting very convincing results indicating that a test is working, you can stop the test early.
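To give a feel for how a multi-armed bandit shifts traffic toward what works, here is a small simulation using Thompson sampling, one of several bandit strategies. The conversion rates are invented for the simulation; in a real test they are unknown and only the observed successes and failures are available.

```python
import random


def thompson_sampling(true_rates, rounds: int, seed: int = 0):
    """Simulate a Bernoulli bandit and return how often each version was shown."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        # Draw a plausible conversion rate for each version from its Beta posterior.
        samples = [rng.betavariate(s + 1, f + 1)
                   for s, f in zip(successes, failures)]
        arm = samples.index(max(samples))  # show the version that looks best
        pulls[arm] += 1
        # Simulated user response; in production this is real feedback.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Hypothetical versions converting at 4% and 12%.
pulls = thompson_sampling([0.04, 0.12], rounds=5000)
print(pulls)
```

Early on both versions are shown about equally often, but as evidence accumulates the weaker version is shown less and less, so fewer customers are pushed into the bad variant.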

“It is a tale told by an idiot, full of sound and fury, signifying nothing.” - William Shakespeare

Finally, what have we learnt?

Effective testing and experimentation is all about thinking of “tomorrow, and tomorrow, and tomorrow”: creating a long-term vision of what your company wants to do, and planning your tests to blaze that trail. Plan your testing strategy for the long term, not just in terms of basic ideas but also in terms of metrics, sample sizes, and preventing overlap between tests. Too many companies either don’t understand the value of testing and don’t do it at all, or they go wild with poorly designed tests that in the end don’t tell them anything. Make sure your tests signify something, and tell you an actionable story that you can use to steer the business.