Is poker AI ready to make money?

Yuxi Li
4 min read · Jul 13, 2019

There is a Science paper titled Superhuman AI for multiplayer poker. It introduced Pluribus, an AI that defeated top poker professionals. Naturally, one may wonder: how about implementing it and letting it make money?

This is the first paper I have read in a top journal or conference on multi-player Texas Hold'em poker. It should foster more research on poker AI and multi-player games, which have a wide range of applications, e.g., auctions, finance, fraud prevention, negotiation, and cybersecurity.

Let's review recent developments in poker research. Michael Bowling et al. published a Science paper in 2015, Heads-up Limit Hold'em Poker is Solved, and another Science paper in March 2017, DeepStack: Expert-level artificial intelligence in heads-up no-limit poker, which defeated poker professionals. Noam Brown et al. published a NIPS paper in December 2017, Safe and Nested Subgame Solving for Imperfect-Information Games, and a Science paper in January 2018, Superhuman AI for heads-up no-limit poker: Libratus beats top professionals.

Now let’s talk about some issues.

First, current research on poker, including the paper just published, focuses on simplified scenarios that are still far from the real game.

As mentioned in the Facebook blog about Pluribus, "there were six players at the table with 10,000 chips at the start of each hand". That is, stacks were reset every hand, so the experiments effectively measured many isolated first hands rather than a full session. Real poker is more complex, and strategies are usually affected by varying stack sizes, etc.

Second, the policies designed in the paper are mixed and stochastic, but not adaptive: they will not exploit the weaknesses of opponents. Are such "fixed" policies optimal? Not 100% sure. Probably not.
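To make this concrete: a fixed equilibrium (mixed) strategy is safe against every opponent, but it also profits from none of them. A minimal sketch in Python, using rock-paper-scissors as a stand-in for any zero-sum game (the payoff table and strategies below are illustrative, not from the paper):

```python
# Payoff to the first (row) player for each pair of actions.
PAYOFF = {
    ("R", "R"): 0, ("R", "P"): -1, ("R", "S"): 1,
    ("P", "R"): 1, ("P", "P"): 0, ("P", "S"): -1,
    ("S", "R"): -1, ("S", "P"): 1, ("S", "S"): 0,
}

def expected_payoff(my_mix, opp_mix):
    """Expected payoff of one mixed strategy against another."""
    return sum(p * q * PAYOFF[(a, b)]
               for a, p in my_mix.items()
               for b, q in opp_mix.items())

equilibrium = {"R": 1/3, "P": 1/3, "S": 1/3}    # Nash mixed strategy
always_rock = {"R": 1.0, "P": 0.0, "S": 0.0}    # highly exploitable opponent
best_response = {"R": 0.0, "P": 1.0, "S": 0.0}  # adaptive counter-strategy

print(expected_payoff(equilibrium, always_rock))    # → 0.0: safe, but no profit
print(expected_payoff(best_response, always_rock))  # → 1.0: full exploitation
```

The equilibrium strategy breaks even even against an opponent who always plays rock; an adaptive player would win every hand. That is the gap between "unbeatable" and "maximally profitable".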

Third, poker is complex, and there are many possible hands. Tens of thousands of hands against top professionals can say something. However, even so, can we call it "superhuman"? Is it statistically significant? Only a few players have won the WSOP Main Event, or even reached the final table, more than once; the game has high variance and a luck component.
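One way to frame the significance question: the standard error of a measured win rate shrinks only with the square root of the number of hands played. A rough sketch, with hypothetical numbers (not the paper's; the paper used variance-reduction techniques precisely because raw win rates are so noisy):

```python
import math

def win_rate_ci(mean_mbb, sd_mbb, n_hands, z=1.96):
    """Approximate 95% confidence interval for a poker win rate,
    measured in milli-big-blinds per hand (mbb/hand)."""
    se = sd_mbb / math.sqrt(n_hands)
    return mean_mbb - z * se, mean_mbb + z * se

# Hypothetical numbers: per-hand standard deviations in no-limit hold'em
# are on the order of thousands of mbb, so raw win rates need large samples.
low, high = win_rate_ci(mean_mbb=50, sd_mbb=5000, n_hands=10_000)
print(round(low), round(high))  # → -48 148: the interval spans zero
```

With these illustrative numbers, 10,000 hands cannot distinguish a 50 mbb/hand winner from a break-even player, which is why variance reduction and sample size matter so much in evaluating poker AIs.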

Fourth, there should be comparisons with the state of the art. It is time-consuming and costly to play against human players. It is relatively easy for two or more poker AIs to form a multi-player table and play millions of hands or more, and the results would be more convincing. We have decent poker AIs like DeepStack anyway. When we do research, we usually compare against baselines.

Fifth, are the results of AlphaGo convincing? Theoretically, we cannot prove that the AlphaGo algorithm finds an optimal strategy; AlphaGo, too, validated its power by experiments. However, we can basically trust its results. One key factor is that in games like Go and chess, both players observe the current state perfectly, and the game is deterministic. In such games, we can usually interpret a win as being stronger.

Sixth, hopefully we will see real-world poker AI that considers a sequence of hands, varying stack sizes, different opponent styles, etc. Hopefully we will see competitions among top poker AIs. This is a great start for poker AI, and we will see it become stronger.

I should mention that there may be errors, so comments and criticisms are welcome.

So, is poker AI ready to make money? YMYD: Your Money, Your Decision!

— — — — — — — — —

Notes:

  1. This Science paper is not very technical, so it is relatively easy to understand, and the Facebook blog contains most of the content.
  2. If you want to make money with poker AI, now may be the time to prepare for it. Personally, I prefer a deep learning approach, which has shown powerful representation capacity on many problems. Deep reinforcement learning is gaining traction. It would be desirable to implement both DeepStack and Pluribus and run competitions among variants of them, which may also help improve them or inspire a new poker AI.
  3. Michael Bowling said AlphaGo and DeepStack are cousins. DeepStack uses Counterfactual Regret Minimization (CFR), which follows the paradigm of policy iteration. Libratus and Pluribus also use CFR.
  4. When discussing the contribution to multi-player games, the authors of the Pluribus paper should mention a recent result: DeepMind published Capture the Flag results on arXiv in July 2018, and in Science in May 2019, Human-level performance in 3D multiplayer games with population-based reinforcement learning.
  5. The authors compare the computation requirements for execution with AlphaGo. The AlphaGo paper was published in 2016, and the AlphaGo Zero and AlphaZero papers came later; it would be proper to compare with the latter two. Also, DeepStack, published in Science in March 2017, can be executed on a laptop.
  6. There are two ways to play poker, tournaments and cash games, and strategies usually differ between them. Current poker research focuses on cash games, in simplified scenarios. A superhuman poker AI should excel in at least one, preferably both, of these two formats.
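Note 3 mentions CFR; its core update, regret matching, is easy to sketch. Below is a toy self-play regret-matching trainer on rock-paper-scissors, not the Pluribus implementation: each player accumulates regret for the actions it did not take, plays in proportion to positive regret, and the *average* strategy converges toward the uniform equilibrium.

```python
import random

ACTIONS = 3  # rock, paper, scissors
PAYOFF = [[0, -1, 1],   # payoff to the row action vs the column action
          [1, 0, -1],
          [-1, 1, 0]]

def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # no positive regret yet: play uniformly

def train(iterations=20000, seed=0):
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        actions = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for p in range(2):
            opp = actions[1 - p]
            realized = PAYOFF[actions[p]][opp]
            for a in range(ACTIONS):
                # regret: how much better action a would have done this round
                regrets[p][a] += PAYOFF[a][opp] - realized
            for a in range(ACTIONS):
                strategy_sums[p][a] += strategies[p][a]
    # the average strategy, not the final one, converges to equilibrium
    return [[s / sum(sums) for s in sums] for sums in strategy_sums]

avg = train()
print([round(p, 2) for p in avg[0]])  # close to [0.33, 0.33, 0.33]
```

Full CFR applies this same update at every information set of a game tree with counterfactual weighting; this matrix-game version only shows the regret-matching core.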
