FinML — Optimising AB tests with multi-armed bandits

Hendrik
Tide Engineering Team
1 min read · Jun 1, 2021

Key takeaways

  • There is a cost to running an A/B test: for the duration of the experiment, part of the traffic is served the inferior variant. The expected difference in reward between always serving the optimal variant and serving the variants the experiment actually chose is called the Bayesian regret.
  • Multi-armed bandits address this problem by mathematically formulating the exploration/exploitation trade-off and allocating traffic so as to minimise the regret (see the sketch after this list).
  • Multi-armed bandits are especially useful when the opportunity cost of lost conversions is high, or when we are aiming to optimise revenue with low traffic.
  • There is a contextual version of multi-armed bandits, where the recommendation depends on user attributes. Will recommends against using it, though, as it introduces too much complexity (a model on top of a model).

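To make the regret idea concrete, here is a minimal sketch of one common bandit algorithm, Thompson sampling on a two-armed Bernoulli bandit. The talk does not specify which algorithm was used, and the conversion rates (10% vs 12%), seed, and traffic volume below are made-up numbers for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
true_rates = [0.10, 0.12]   # hypothetical conversion rates per variant
n_users = 10_000

# Beta(1, 1) priors on each arm's conversion rate
successes = np.zeros(2)
failures = np.zeros(2)
regret = 0.0
best_rate = max(true_rates)

for _ in range(n_users):
    # Sample a plausible conversion rate for each arm from its posterior,
    # then serve the arm whose sample is highest. Exploration happens
    # automatically while the posteriors are still wide; as evidence
    # accumulates, traffic concentrates on the better variant.
    samples = rng.beta(successes + 1, failures + 1)
    arm = int(np.argmax(samples))

    converted = rng.random() < true_rates[arm]
    successes[arm] += converted
    failures[arm] += 1 - converted

    # Regret: expected conversions lost by not always serving the best arm
    regret += best_rate - true_rates[arm]

print(f"pulls per arm: {successes + failures}")
print(f"cumulative regret: {regret:.1f} expected conversions")
```

A fixed 50/50 A/B split over the same traffic would accumulate regret linearly (here roughly 10,000 × 0.5 × 0.02 = 100 expected conversions), whereas the bandit's regret grows much more slowly once it identifies the better arm.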
Video

FinML — Optimising AB tests with multi-armed bandits

Supporting materials
