Published in The Startup

Experimentation and Causal Inference

What Harvard Business Review Didn’t Tell You About Experiment

Run controlled experiments to make smart business decisions

Photo by Adeolu Eletu on Unsplash

The other day, while doing some research for my doctoral thesis, I stumbled upon a Harvard Business Review article. At first, I felt inspired by the fact that experimentation has (finally) gained some traction in a prestigious business journal like HBR. As a Data Scientist, I’m always inspired by the discovery of a method’s new business value.

However, I was left with mixed feelings, to say the least, after finishing the entire story. In essence, the HBR article promotes the idea of running experiments but doesn’t delve into the specifics, let alone the more advanced topics.

This lack of technical depth, and of popular understanding of the topic, reminds me of the importance of connecting academic research with industry.

In this post, we will learn the types of experimental designs, their differences, and when to use what in business scenarios.

What are experiments?

An experiment is a procedure carried out to test a hypothesis. To reach valid conclusions, the process needs to be implemented meticulously to eliminate confounders.

Depending on how they are implemented, experiments can be divided into two categories: lab and field. For the record, I’m leaving natural experiments out of our discussion since they are beyond our control.

A lab experiment is one that is carried out in a lab environment. Researchers have strict control over the experimental process, including participant recruitment, variable control, surroundings, metrics, etc.

A field experiment is one that is administered in a real-life environment. Researchers can only decide who gets the treatment and nothing else, so the field design has a low level of control. Also, participants do not know they are being studied.

Why experiments?

With a high level of control, well-designed experiments, or Randomized Controlled Trials (RCTs), are considered the gold standard of causal inference. This is because random assignment balances confounding variables across groups on average, so the intervention is the only thing that differs systematically between them. Thus, we can attribute the difference in the outcome variables to the intervention.
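To make the point about confounders concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the confounder (`engagement`), the outcome model, and the true effect of 2.0 are invented, not taken from any real study. It compares a naive contrast where users opt in by themselves with a contrast where a coin flip decides who gets the treatment.

```python
import random
import statistics

def simulate(n=20_000, true_effect=2.0, seed=0):
    """Compare a self-selected contrast with a randomized one (toy data)."""
    rng = random.Random(seed)
    # Hypothetical confounder: prior engagement drives both opt-in and outcome.
    engagement = [rng.gauss(0, 1) for _ in range(n)]

    def outcome(e, treated):
        # Assumed outcome model: baseline + confounder + treatment + noise.
        return 10 + 3 * e + true_effect * treated + rng.gauss(0, 1)

    def diff_in_means(assignments):
        y = [outcome(e, t) for e, t in zip(engagement, assignments)]
        treated = [v for v, t in zip(y, assignments) if t]
        control = [v for v, t in zip(y, assignments) if not t]
        return statistics.fmean(treated) - statistics.fmean(control)

    # Self-selection: more engaged users are more likely to opt in.
    self_selected = [e + rng.gauss(0, 1) > 0 for e in engagement]
    # Randomization: assignment is independent of engagement.
    randomized = [rng.random() < 0.5 for _ in range(n)]

    return diff_in_means(self_selected), diff_in_means(randomized)

observational, experimental = simulate()
print(f"self-selected estimate: {observational:.2f}")
print(f"randomized estimate:    {experimental:.2f}")
```

Under these assumptions, the randomized estimate should land close to the true effect, while the self-selected estimate is inflated because engaged users both opt in more and score higher anyway.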

How do lab and field differ?

They come in different forms and at different costs. In comparison, the in-house lab option is cheaper and reaches results more quickly. Potentially, there are three ways in which they may depart from one another (Coppock and Green 2015).

1. Different Research Subjects

Lab experiments rely on convenience sampling and care less about having a representative sample. For example, many lab studies recruit university students, who do not reflect the general population. This is called selection bias. As a result, the experimental findings may not generalize to a wider population.
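A small sketch can show why a student-only sample may mislead. It assumes, purely for illustration, that the treatment effect shrinks with age; the function `individual_effect`, the age range, and all numbers are hypothetical. Both experiments below are properly randomized, yet they answer the question for different populations.

```python
import random
import statistics

rng = random.Random(1)

def individual_effect(age):
    # Assumed heterogeneity: younger people respond more strongly.
    return 5.0 - 0.08 * (age - 20)

def run_experiment(ages):
    """Randomly assign treatment and return the difference in means."""
    treated, control = [], []
    for age in ages:
        baseline = 50 + rng.gauss(0, 2)
        if rng.random() < 0.5:
            treated.append(baseline + individual_effect(age))
        else:
            control.append(baseline)
    return statistics.fmean(treated) - statistics.fmean(control)

# Hypothetical customer base aged 18 to 70.
population_ages = [rng.uniform(18, 70) for _ in range(50_000)]
# Convenience sample: only the university-aged slice of that base.
student_ages = [a for a in population_ages if a < 25]

lab_estimate = run_experiment(student_ages)
field_estimate = run_experiment(population_ages)
print(f"student-only estimate:  {lab_estimate:.2f}")
print(f"full-customer estimate: {field_estimate:.2f}")
```

The student-only estimate is internally valid for students, but under this assumed effect pattern it overstates what the whole customer base would experience.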

Business Reflections:

  • Do your recruits represent your customer base?
  • What is your current business need: to explore or to test a hypothesis?
  • To what extent do you want to scale up?

2. Different Context Treatment

Lab experimental conditions are artificial. Participants know perfectly well that they are being observed and will act accordingly. What’s more, lab participants often learn about the intervention through oral instructions instead of experiencing it in a real-life scenario.

As a result, we may not observe the behaviour we are looking for. People change their behaviours when they know they are being observed, often toward what they believe is socially acceptable. This is called the social desirability effect, and it is closely related to the Hawthorne effect.

In comparison, the field experiment does not let participants know they are being studied. Thus, it generates results closer to how people would behave in real life.

Business Reflections:

  • What are the confounding variables?
  • Is it possible to completely control for the spillover effect?
  • For lab experiments, do the experimental conditions look real enough?
  • Do our customers know they are being observed? How do we control for the social desirability effect?

3. Metrics

Lab and field experiments collect feedback in different ways. Lab experiments collect user opinions and behaviours in a structured way, by asking clear questions and observing behaviours directly. Field experiments lack such clarity and often collect feedback days or even months after the intervention has been administered.

Behavioural psychology shows that it takes time for our brains to process information, which may lead to different observations over time. In this sense, a long-term field experiment is better placed to generate accurate information.

Business Reflections:

  • Customers may change their opinions and behaviours over time as they become used to the product.
  • A/B tests under lab settings do not result in the most credible conclusions.
  • It’s critical to track long-term changes in customers, perhaps through frequent follow-ups. How do we deal with customer churn and selection bias?

Do they differ?

No. To a large extent, lab and field experiments lead to the same results (Coppock and Green 2015). Naturally, the next question that comes to mind is: do we go for the cheaper option (lab) all the time? To answer this question, we need to bring business factors into the equation.

1. Resource

  • Time: what is the timeline? Can you wait 3 months for any findings?
  • Budget: money talks. Large-scale experiments are expensive.
  • Manpower: to what extent is your team capable of scaling up?

The rule of thumb is to go for a lab experiment when resources are limited.
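One way to put a rough number on the resource question is a back-of-the-envelope sample size calculation. The sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 * ((z_alpha + z_beta) * sd / effect)^2. The function name `sample_size_per_group` and the default significance level and power are my own illustrative choices, and the result is only an approximation.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(min_effect, std_dev, alpha=0.05, power=0.80):
    """Approximate participants needed per group (two-sided test, equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_beta) * std_dev / min_effect) ** 2
    return ceil(n)

# Effects expressed in standard-deviation units (std_dev = 1).
large_effect_n = sample_size_per_group(min_effect=0.5, std_dev=1.0)
small_effect_n = sample_size_per_group(min_effect=0.1, std_dev=1.0)
print(f"detect a 0.5 SD effect: {large_effect_n} per group")
print(f"detect a 0.1 SD effect: {small_effect_n} per group")
```

Because the required sample grows with the inverse square of the effect size, detecting a subtle effect can take roughly 25 times as many participants as detecting one five times larger, which is often what pushes a study from a quick lab run to a costly field rollout.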

2. Stage of research

  • Early-stage: a mini-lab experiment is great for hypothesis generation.
  • Later stages: a field experiment can test and validate the hypothesis.

In general, lab experiments help us understand the causal mechanism, and field experiments answer the real-life questions.

3. The scope of generalizability

  • Limited scope: if you just want to test out some initial product ideas, a lab experiment is good enough. Selection bias isn’t a threat.
  • Full scale-up: an iterative process of business and research design is recommended, as there is so much at stake if customers behave inconsistently. Selection bias kills the business.


While experimental designs are great choices for quantifying causal effects, they fail to identify the causal mechanism behind latent variables. In particular, human beings are complex animals with subtle emotions. To better understand business needs, we should combine experimental designs with qualitative approaches, like interviews and focus groups.

1. Experiments are useful.

2. Lab and field lead to the same results. Most of the time.

3. Three determinants: resource, stage of development, and generalizability.

4. Experiments can’t do it all, and we need to bring back qualitative methods.

Enjoyed reading this one?

Please find me on LinkedIn and Twitter.

Also, check my other posts on Artificial Intelligence and Machine Learning.




Leihua Ye, PhD

Data Scientist @ Walmart; PhD @ University of California. Data Science | Experimentation & Causal Inference | Career Development
