Book Summary: “Trustworthy Online Controlled Experiments” [Part III.]

Weonhyeok Chung
3 min read · Oct 13, 2022



This series is my summary of the book on A/B testing, “Trustworthy Online Controlled Experiments” (by Ron Kohavi, Diane Tang, and Ya Xu).

Link to other parts of the series:

Part I. Introductory Topics for Everyone

Part II. Selected Topics for Everyone

Part IV. Advanced Topics for Building an Experimentation Platform

Part V. Advanced Topics for Analyzing Experiments

Part III. Complementary and Alternative Techniques to Controlled Experiments

This part introduces techniques that complement controlled experiments. It is often difficult to understand the underlying mechanisms of users’ behavior from experiments alone. Also, when experiments raise ethical issues or are too costly to run, alternative methods can be considered.

Photo by Yannes Kiefer on Unsplash

Ch10. Complementary Techniques

Summary: Techniques that complement experiments include “log-based analysis”, “human evaluation”, “user experience research (UER)”, “focus groups”, “surveys”, and “external data”.

(1) Log-based analysis enables analysts to understand users’ views, behaviors, and interactions, which helps interpret experiments. However, it is hard to learn the specific reasons behind users’ actions from logs alone.

(2) Human evaluation hires people to collect feedback on a new product. But hired raters can react differently from general users.

(3) User experience research (UER) can use technologies such as eye-tracking to collect detailed data about individual users.

(4) With a focus group, the researcher can ask open-ended questions and follow up on participants’ answers. But the sample size is smaller than in UER, and the preferences of a focus group can differ from those of general users.

(5) Researchers can draw a sample from the population to run a survey. However, survey bias can exist; for example, some people do not respond to surveys.

(6) Analysts can compare their results against tech reports or research that uses external data. However, they may have limited knowledge of how that external research was conducted.
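To make point (1) concrete, here is a minimal sketch of log-based analysis. The log schema and numbers are hypothetical, not from the book: each event records that a user saw or clicked something, and from such a log we can compute a behavioral metric like per-user click-through rate (CTR) — though, as noted above, the log cannot tell us *why* a user clicked.

```python
from collections import defaultdict

def per_user_ctr(events):
    """Compute clicks / impressions for each user from raw log events."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for user_id, event_type in events:
        if event_type == "impression":
            impressions[user_id] += 1
        elif event_type == "click":
            clicks[user_id] += 1
    # Only users with at least one impression have a defined CTR.
    return {u: clicks[u] / impressions[u] for u in impressions}

# Hypothetical event log: (user_id, event_type) pairs.
log = [
    ("u1", "impression"), ("u1", "click"),
    ("u1", "impression"),
    ("u2", "impression"), ("u2", "impression"),
]
print(per_user_ctr(log))  # {'u1': 0.5, 'u2': 0.0}
```

In practice the same aggregation would run over billions of logged events, but the idea is the same: behavioral metrics are derived entirely from what users did, not from what they intended.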

New or curious concept (or questions): I wonder whether CTR metrics differ from the “log-based analysis” described in this chapter, since some information beyond the logs may be included.

Ch11. Observational Causal Studies

Summary: When a firm cannot run experiments (e.g., due to ethical issues), researchers can conduct an observational study instead. Examples of observational techniques are interrupted time series, interleaved experiments, regression discontinuity design (RDD), instrumental variables (IV), propensity score matching (PSM), and difference-in-differences (DID).
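As one illustration, here is a minimal sketch of the difference-in-differences (DID) estimator mentioned above. All numbers are made up for illustration: we compare how a mean outcome changed in a treated group versus a control group, before and after a product change, under the usual parallel-trends assumption.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """DID effect = change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Hypothetical mean engagement per user, before and after the change:
effect = did_estimate(treated_pre=10.0, treated_post=14.0,
                      control_pre=9.0, control_post=11.0)
print(effect)  # 2.0, i.e. (14 - 10) - (11 - 9)
```

The control group's change (here +2) stands in for what would have happened to the treated group without the change, so only the excess change (+2) is attributed to the treatment.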

New or curious concept (or questions): For “interleaved experiments”, the book says results from two alternatives X and Y are mixed together, but I don’t understand why.

Three takeaways (from Part III.):

(1) Experiments alone may not be enough to fully understand users. In such cases, surveys or user experience research can help analysts better understand users’ behavior.

(2) It is important to understand the advantages and limitations of each technique.

(3) Observational studies complement experiments when experiments are hard to conduct. Causal inference techniques such as RDD, IV, PSM, and DID can be used.

Please feel free to leave any comments or questions! Thank you for reading my post.
