Trustworthiness is vital to A/B testing

Our efforts to ensure a reliable A/B testing platform

Qike (Max) Li
Wish Engineering And Data Science
3 min read · May 24, 2022

Contributors: Max Li, Chao Qi, Eric Jia

At Wish, we deeply value the reliability of our experimentation platform. We published ‘Measure A/B Testing Platform Health with Simulated A/A and A/B Tests’ and ‘A/A Testing Establishes Trust in Experimentation Platform’, two publications that demonstrate our efforts to ensure a reliable A/B testing platform. This post summarizes those studies.

A/B testing plays a critical role in decision-making at data-driven companies; it typically provides the final go/no-go call for a product launch. Inaccuracies in A/B testing can therefore degrade every business decision derived from A/B tests. At Wish, we continuously evaluate the reliability of our A/B testing platform. In the published articles, we share our efforts to ensure a trustworthy A/B testing platform by controlling type I errors (false positives) and type II errors (false negatives) through A/A tests, pre-assignment tests, and simulated A/A and A/B tests.

Type I and type II error (Image by Ming Gong from Wish)

We run various A/A tests for different scenarios, such as different steps in the conversion funnel (e.g., impression, product click, adding to the shopping cart, and product purchase), client-side and server-side experiments, logged-out and logged-in experiments, etc. Since the experiment buckets (e.g., control and treatment) in A/A tests are identical, any statistically significant result returned from an A/A test is a false positive.
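To make this concrete, here is a minimal sketch (not Wish's production code) of how repeated A/A comparisons expose the false positive rate: both buckets are drawn from the same distribution, so roughly 5% of tests should reach significance at alpha = 0.05. The metric, baseline rate, and sample sizes are illustrative assumptions.

# Minimal A/A simulation sketch: both buckets are identical by construction,
# so any significant result is a false positive. Expect the observed rate
# to sit close to alpha; a materially higher rate would flag a problem in
# randomization or metric computation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)
alpha, n_tests, n_users = 0.05, 1000, 5000

false_positives = 0
for _ in range(n_tests):
    # Hypothetical per-user conversion metric; control and treatment share
    # the same underlying rate.
    control = rng.binomial(1, 0.03, size=n_users)
    treatment = rng.binomial(1, 0.03, size=n_users)
    _, p_value = stats.ttest_ind(control, treatment)
    false_positives += p_value < alpha

print(f"Observed false positive rate: {false_positives / n_tests:.3f}")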

We run a pre-assignment test for each experiment. A pre-assignment test is a retrospective A/A test that uses data from the X (e.g., 60) days before the start date of an experiment. A statistically significant result from the pre-assignment test indicates a biased A/B test.
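Below is a hedged sketch of what such a retrospective check might look like: the experiment's bucket assignment is applied to each user's pre-period metric, and a difference between buckets before the experiment even starts signals bias. The column names (bucket, pre_period_metric) and the use of Welch's t-test are illustrative assumptions, not Wish's actual schema or statistic.

# Sketch of a pre-assignment (retrospective A/A) check on pre-period data.
import pandas as pd
from scipy import stats

def pre_assignment_test(df: pd.DataFrame, alpha: float = 0.05) -> bool:
    """Return True if the pre-period metric differs significantly between
    buckets, which would indicate a biased assignment."""
    control = df.loc[df["bucket"] == "control", "pre_period_metric"]
    treatment = df.loc[df["bucket"] == "treatment", "pre_period_metric"]
    _, p_value = stats.ttest_ind(control, treatment, equal_var=False)
    return p_value < alpha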

We also run hundreds of simulated A/A tests and A/B tests every day to measure the false positive rate and false negative rate, respectively. The simulated A/A and A/B tests reproduce the metric calculations of real A/A and A/B tests but with simulated offline randomizations. Further, in the simulated A/B tests, we simulate various scenarios of feature impact and evaluate the power (1 − false negative rate) of the A/B tests in those scenarios.
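As an illustration, the sketch below simulates an A/B test with a known injected lift and estimates power as the fraction of simulations that detect it. The baseline conversion rate, relative lift, and sample size are assumed values for demonstration only, not figures from our platform.

# Simulated A/B test sketch: inject a known lift into treatment and measure
# how often the test detects it (power = 1 - false negative rate).
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=11)
alpha, n_sims, n_users = 0.05, 1000, 5000
baseline_rate, relative_lift = 0.03, 0.10  # assumed 10% relative lift

detections = 0
for _ in range(n_sims):
    control = rng.binomial(1, baseline_rate, size=n_users)
    treatment = rng.binomial(1, baseline_rate * (1 + relative_lift), size=n_users)
    _, p_value = stats.ttest_ind(control, treatment)
    detections += p_value < alpha

print(f"Estimated power at a {relative_lift:.0%} lift: {detections / n_sims:.2f}")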

The more we improve our A/B testing platform, the more we realize that the devil is in the details. A systematic evaluation of the health of the A/B testing platform is paramount. Stay tuned for more studies in this area.

Thanks to Chengxi Shi, Song Wei, Todd Hodes, and Rob Resma for creating the A/A tests, and to Ming Gong for making the image. We are also grateful to Pai Liu for his support, and to Pavel Kochetkov, Lance Deng, and Delia Mitchell for their feedback. Data scientists at Wish are passionate about building a trustworthy experimentation platform. If you are interested in solving challenging problems in this space, we are hiring for the data science team.
