Can a Pile of Matchboxes Be Intelligent?

Andrei Gaspar
Sopra Steria Norge
Aug 30, 2023 · 6 min read


Cover Image by Author

There’s been an enormous amount of hype around AI in recent years, and rightfully so; some of the latest breakthroughs rival science fiction. AI beats world champions at board games, creates paintings and photographs, writes poems, composes music, generates videos, writes code, folds proteins, and diagnoses patients with rare diseases. It seems like there’s no limit to what it can do.

The word AI gets thrown around a lot, and it gives the illusion that there is one single thing that keeps getting exponentially more competent. When that competence reaches certain thresholds, we start attributing sentience to it, because we associate certain kinds of competence with sentience. For example, in 2016, world champion Go player Lee Sedol commented that AlphaGo made some surprising moves that felt inspired and creative. In 2022, a Google engineer claimed that the company’s chatbot, LaMDA, had become sentient.

One of the competencies we often mistake for sentience is a system’s ability to learn and adapt. It intuitively feels as though this kind of behavior requires intelligence and at least some degree of sentience. There is, however, a fascinating analog system designed in 1961 that can help put things into perspective.

Teaching Matchboxes to Play

We are already desensitized to computers being able to learn in the digital space. However, way before personal computers, a system was designed that could beat a human player at the game of tic-tac-toe (Noughts and Crosses), and it did this by learning the game from scratch.

The system is called MENACE (Matchbox Educable Noughts and Crosses Engine), and it was built by artificial intelligence researcher Donald Michie in 1961.

The concept is simple. I’ll provide a step-by-step breakdown to reproduce the mechanism.

Step 1 — Get Some Matchboxes and Colored Beads

The first step is to have some empty matchboxes and colored beads at hand — a lot of them. We’ll circle back soon to the exact math, but it’s not important at this point.

Matchboxes and Colored Beads (Image by Author)

Step 2 — Assign A Color to Each Square on the Board

Once we have our supplies ready, we can check which color beads we have available and assign a bead color to each square on the board.

Assign a Color to Each Square (Image by Author)
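If we wanted to mirror this step in code, the assignment is just a lookup table. Here’s a minimal Python sketch; the specific colors and the 0–8 square numbering are my own choices, not necessarily the ones Michie used:

```python
# Hypothetical color assignment: one bead color per board square (0-8,
# read left to right, top to bottom). Any nine distinct colors work.
SQUARE_COLORS = {
    0: "white", 1: "lilac",  2: "silver",
    3: "black", 4: "gold",   5: "green",
    6: "amber", 7: "red",    8: "pink",
}

# Reverse lookup: given a drawn bead, which square does it point to?
COLOR_TO_SQUARE = {color: square for square, color in SQUARE_COLORS.items()}
```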

Step 3 — Create a Matchbox for Every Legal Board State

Our next task is to dedicate a matchbox to each legal board configuration that requires us to make a move. At this point, we run into limitations due to the sheer number of possible states, so we need to be resourceful. The total number of legal board positions is 5,478, and that’s way too many matchboxes for anybody to manage.

There are, however, some strategies for reducing the number of matchboxes required.

Here are the strategies adopted by Donald Michie:

  • MENACE always starts the game. That means that the matchboxes only have to account for the possible moves of a single player.
  • Rotated board positions can be stored in the same matchbox because the board configuration is the same regardless of the angle we look at it from.
  • Mirrored board positions can also be stored in the same matchbox because the strategy for choosing the next step remains the same.

Adopting these strategies reduces our required matchboxes to a total of 304.

A Matchbox that Represents the State of the Board (Image by Author)

Once we have our matchboxes ready, we draw the board configuration on top of it, and for each empty square, we place the appropriate color beads into the matchbox. These colored beads represent our move possibilities.

We repeat this for each of the 304 board states, leaving us with 304 matchboxes, each with a board state drawn on top and beads representing the available legal moves inside.
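The rotation and mirroring tricks described above boil down to picking one canonical representative for every family of symmetric positions. Here’s a hedged Python sketch of that idea; the board encoding and function names are mine, not part of the original MENACE design:

```python
# Represent the board as a 9-character string of "X", "O", and "-" (empty),
# read left to right, top to bottom. All rotations and mirror images of a
# position share a single matchbox.

def rotate(board: str) -> str:
    """Rotate the 3x3 board 90 degrees clockwise."""
    return "".join(board[i] for i in (6, 3, 0, 7, 4, 1, 8, 5, 2))

def mirror(board: str) -> str:
    """Mirror the board left to right."""
    return "".join(board[i] for i in (2, 1, 0, 5, 4, 3, 8, 7, 6))

def canonical(board: str) -> str:
    """Pick one representative among the (up to) 8 symmetric variants."""
    variants, b = [], board
    for _ in range(4):
        variants.append(b)
        variants.append(mirror(b))
        b = rotate(b)
    return min(variants)  # any deterministic choice works; min() is convenient

# An X in the top-left corner and an X in the bottom-right corner are the
# same position as far as the matchboxes are concerned:
assert canonical("X--------") == canonical("--------X")
```

Deduplicating positions this way is what brings the box count down to the 304 mentioned above.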

Step 4 — Make a Move

Now that we have our matchboxes ready, it’s time to make a move.

Making the first move requires us to:

  • Find the matchbox with the empty board on it. Since the board is empty, this matchbox will have all the available bead colors inside, representing every square of the board.
  • Shake up the beads inside the matchbox and remove a single bead from the box without looking. Make the selection as random as possible.
  • Place your move on the square that is represented by the bead color.

Shake, Pick, Move (Image by Author)

After the opponent makes their move, repeat the process by finding the matchbox with the corresponding board drawing on it.
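As a rough sketch of this step, assuming each matchbox is modelled as a small record of its board state and its current beads (the data-structure layout and names are mine):

```python
import random

# A hypothetical model of one matchbox: the board state drawn on top, plus
# the beads currently inside it (one color per empty square to start with).
matchbox = {
    "board": "---------",  # the empty board, i.e. MENACE's first move
    "beads": ["white", "lilac", "silver",
              "black", "gold", "green",
              "amber", "red", "pink"],
}

# Which square each bead color points to (same assignment as in Step 2).
COLOR_TO_SQUARE = {"white": 0, "lilac": 1, "silver": 2,
                   "black": 3, "gold": 4, "green": 5,
                   "amber": 6, "red": 7, "pink": 8}

def make_move(box: dict) -> tuple:
    """Shake the box, draw one bead at random, and return (color, square)."""
    color = random.choice(box["beads"])  # "pick without looking"
    box["beads"].remove(color)           # the bead stays out until the game ends
    return color, COLOR_TO_SQUARE[color]

color, square = make_move(matchbox)
print(f"Drew a {color} bead, so the move goes on square {square}.")
```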

Step 5 — Reward or Punish the Moves

After each move, we keep track of which box and bead were used, for example by leaving the drawn bead in front of its box, so that we can reconstruct our move sequence once the game ends. We’re adopting a technique called reinforcement learning to help the matchboxes learn to play.

Here’s the strategy (a small code sketch follows the list):

  • If we win, we put 3 beads of the same color back into the box.
  • If we lose, we don’t put the bead back into the box.
  • If we draw, we put the bead back into the box.
  • We do this for all the boxes that were part of our move sequence.
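Here is a minimal sketch of that update, assuming we recorded every (matchbox, bead color) pair used during the game; the function name and data layout are mine:

```python
# Bead accounting after a game, following the scheme above: three beads of
# the drawn color go back after a win, just the drawn bead after a draw,
# and nothing after a loss (the drawn bead was already removed during play).

def reinforce(move_history: list, outcome: str) -> None:
    """Update every matchbox that took part in the finished game."""
    for box, color in move_history:
        if outcome == "win":
            box["beads"].extend([color] * 3)
        elif outcome == "draw":
            box["beads"].append(color)
        # on a loss the bead is simply not returned

# Example: we won, and one recorded move drew a "gold" bead from this box.
box = {"board": "---------", "beads": ["white", "lilac"]}
reinforce([(box, "gold")], "win")
print(box["beads"])  # ['white', 'lilac', 'gold', 'gold', 'gold']
```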

Step 6 — The Matchboxes Learn

Amazingly enough, if we play enough games using this strategy, our matchboxes accumulate the bead colors most likely to lead to a winning move sequence and thus learn to play a near-perfect game. Within roughly 150 games, the matchboxes reach a level where they consistently draw or win against human players.
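To get a feel for what “enough games” means, here is a hedged, self-contained simulation sketch. It is my own simplification, not Michie’s tournament setup: the virtual matchboxes store bead counts rather than physical beads, the opponent plays uniformly at random, and the symmetry reduction from Step 3 is left out for brevity.

```python
import random
from collections import defaultdict

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return "X" or "O" if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "-" and board[a] == board[b] == board[c]:
            return board[a]
    return None

# Virtual matchboxes: board string -> bead count per empty square.
# Boxes are created lazily with one bead per square (a simplification).
boxes = {}

def get_box(board):
    if board not in boxes:
        boxes[board] = {i: 1 for i, cell in enumerate(board) if cell == "-"}
    return boxes[board]

def play_one_game():
    board, history = "---------", []
    for turn in range(9):
        if turn % 2 == 0:                     # MENACE plays X and moves first
            box = get_box(board)
            beads = [sq for sq, n in box.items() for _ in range(n)]
            if not beads:                     # the box is empty: MENACE resigns
                return "loss", history
            square = random.choice(beads)     # draw a bead at random
            box[square] -= 1                  # the bead stays out until the end
            history.append((board, square))
            mark = "X"
        else:                                 # the opponent plays a random move
            square = random.choice([i for i, c in enumerate(board) if c == "-"])
            mark = "O"
        board = board[:square] + mark + board[square + 1:]
        if winner(board):
            return ("win" if mark == "X" else "loss"), history
    return "draw", history

def reinforce(history, outcome):
    returned = {"win": 3, "draw": 1, "loss": 0}[outcome]  # beads put back per move
    for board, square in history:
        boxes[board][square] += returned

tally = defaultdict(int)
for _ in range(2000):
    outcome, history = play_one_game()
    reinforce(history, outcome)
    tally[outcome] += 1
print(dict(tally))  # cumulative totals; losses should become rarer over time
```

Run for a few thousand games, the loss share in the tally shrinks noticeably, which is the whole trick: the bead counts drift toward the moves that keep paying off.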

It is completely mind-boggling that a pile of matchboxes could learn, through experience, to beat a human player.

Virtual Matchboxes

Theoretically, we could use the same approach to beat humans at chess or Go. The problem is that we quickly run out of matchboxes. If we had a matchbox for every single grain of sand from all the beaches and deserts in the world, we still wouldn’t have enough matchboxes to model chess.

But here’s the thing. We don’t need matchboxes because we have computers, and what’s great about computers is that they can handle the little virtual matchboxes at mind-numbing speed. The other great thing is that they can manage an astronomical number of them.

Also, we have strategies and algorithms to do more with fewer virtual matchboxes since using the MENACE approach at a certain scale becomes wasteful and impractical.

The interesting thing, though, is that we’re essentially doing the same thing with today’s AI; we’ve just developed far fancier ways of going about it. We have strategies where we make the AI play against itself, pit a generator against a discriminator, or simply feed it a gigantic dataset.

Parting Thoughts

Although surprisingly fruitful, our approach to teaching a machine anything relies on extreme brute force. Instead of beads, we shuffle bits at the speed of light, and at the end of the day, we end up with a pile of matchboxes we call weights. These weights produce incredible results, but other than efficiency, there’s nothing special in the digital domain. They’re just a pile of virtual matchboxes, sitting stagnant, lacking any wants, needs, opinions, or feelings.

If we provide the correct input, the output seems intelligent, which could be because it inherently is, but it could just as well be that it seems that way because, within it, we see a reflection of ourselves.

In closing, while the marvels of AI continue to unfold day by day, it is essential for us to appreciate them for what they really are: extremely useful tools enabled by the power of modern computing. The pile of matchboxes is intelligent by some definition of the word; however, it is about as sentient as today’s AI.
