Check Mate, GPT: On the Dangers of Mistaking Reasoning for Pattern Recognition

Jonas Persson
4 min read · Jun 4, 2023


In 1892, a four-year-old boy in Havana was watching his father play chess with a friend. Having no idea what the game was about, he watched intently as the players took turns moving the white and black pieces on the board. By the end of the first game, he had intuited the rules — how the pieces move and that the ultimate goal is to checkmate the opponent’s king. During the second game, the boy realized that his father had made an illegal move with his knight. He waited politely until the game was over before pointing this out.

At first he was inclined to dismiss my statement with the characteristic tolerance of a father who hears something foolish issue from the mouth of his offspring. My earnest protestations, arising from the exultation of having acquired some new and interesting knowledge, and the doubtful look of his opponent, caused him to believe that he might, after all, have been guilty of deceiving the other player. He knew, however, that I had never seen a game of chess before, and he felt safe in informing me very politely that he doubted very much whether I knew anything of what I was saying.

My reply was to challenge him to a game of chess. Whether he felt that I had suddenly contracted dementia, or whether he wanted to save himself from further embarrassment in the presence of his friend, I do not know; but he sat down to play me, evidently foreseeing an early capitulation on my part.

When he saw that I knew how to handle the pieces, he became visibly disconcerted.

When the game reached its close, I cannot say whether it was amazement, mortification, or pleasure that affected him most; for I had beaten him in my very first game of chess.

One game was all it took for the spellbound four-year-old to figure out what was going on with the pieces. Compare this to GPT-4. OpenAI’s latest regurgitation engine has made waves for seemingly acquiring skills it was never expressly taught simply by churning through, and looking for patterns in, terabytes of text data. These data most likely include tens of thousands of chess games written down in PGN notation, and — lo and behold — the engine seems to have picked up the rules of chess. Except, not really.

Me: Do you play chess? Let’s play. I am white.

GPT-4: Yes, I can play chess! However, as an AI language model, I don’t have the capability to physically move the pieces on a physical board. But I can certainly play a game of chess with you in a text-based format if you’d like! Please let me know if you’d like to proceed with a text-based chess game.

Me: Yes. I am white. 1. h4.

GPT-4: Great! The game has begun. As Black, I will respond with 1… e5.

Me: Rh3

GPT-4: I’m sorry, but that move is not legal as the rook is not allowed to move through other pieces or land on a square occupied by another piece. Please make another move.

Playing the rook to h3 is, in fact, a perfectly legal move in the position, but the engine did not realize this. After white’s (admittedly unorthodox) first two moves, GPT-4 was utterly lost — metaphorically speaking. As this example shows, the notion that the regurgitation engine has intuited the rules of chess is illusory. Play by the book and GPT-4 will respond with legal, reasonable-looking moves for quite some time; play one or two moves out of left field, and it will start to founder immediately.
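The legality of the move is easy to verify mechanically, which makes GPT-4’s refusal all the more telling. The sketch below is not a chess engine — it checks only the one thing that matters in this position: after 1. h4, the h2 and h3 squares are empty, so the rook on h1 has a clear path to h3. (The square-set representation and helper function are illustrative inventions, not any standard library.)

```python
# Minimal legality check for the position after 1. h4 e5.
# Not a full rules engine: it only verifies that the white rook
# on h1 has an unobstructed vertical path to h3.

# Occupied squares at the start: every square on ranks 1, 2, 7, 8.
start = {f + r for f in "abcdefgh" for r in "1278"}

# Apply the two opening moves: h2-h4 and e7-e5.
occupied = (start - {"h2", "e7"}) | {"h4", "e5"}

def rook_file_move_clear(frm, to, occupied):
    """True if a rook can slide along a file from `frm` to `to`:
    every square strictly between them is empty, and the target
    square itself is unoccupied."""
    file = frm[0]
    lo, hi = sorted((int(frm[1]), int(to[1])))
    between = {file + str(r) for r in range(lo + 1, hi)}
    return not (between & occupied) and to not in occupied

print(rook_file_move_clear("h1", "h3", occupied))  # True: 2. Rh3 is legal
```

A dozen lines of deterministic logic settle what terabytes of pattern-matched text could not.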

Capablanca, to recap, figured out the rules after a single game; GPT-4, despite trawling through tens of thousands of games, is clueless. There are important lessons to be learned here, not only about the far superior efficiency of the human mind (Capablanca was a prodigy, but most inquisitive children would probably master the rules after watching a dozen games or so) but also — and more importantly — about the fact that, when sufficiently advanced, pure pattern recognition can mimic rule-bound, deductive reasoning. But they are distinct. Playing chess with GPT-4 is to enter a Potemkin village. Sneak away from Main Street into an alley — do something unexpected — and you immediately realize that the impressive-looking houses are all propped-up set pieces.

For practical purposes, any reliance on GPT-4 for deductive reasoning tasks is a recipe for disaster, easily exploitable by a creative adversary — be it a chess player or opposing counsel. In addition to the direct risks, there are more insidious dangers. By mistaking pattern recognition for deductive reasoning, we become oblivious to the fact that the system is rigged — doomed to perpetuate the status quo under the guise of ineluctable logic. This is extraordinarily dangerous. To change the world for the better, new gambits and strategies are necessary; if, instead, we use GPT-4 as a reasoning engine, we will never discover those moves — or we will mistakenly believe that they are illegal.

On a more theoretical level, the chess example calls into question the notion that regurgitation engines are approaching some kind of Artificial General Intelligence (whatever that means).

--

Jonas Persson

Gadget-loving Luddite. Rabbit-hole tumbler. Schlemiel. Law student. Poetry, privacy, politics.