Solving Sudoku: Part I

Oct 13, 2017

A few weeks ago I was in my flat, preparing for a job interview, when the unthinkable happened. The internet went down. After literally seconds of unending tedium, I noticed something on the coffee table. A timeless scroll featured in many a London household, the Metro. After skimming through three-day-old sports news, I arrived at my destination only to find that tragically, the crossword had already been filled in.

Truly disaster stricken, left with no other choice that didn’t require a noose, I began the Sudoku. After filling in two boxes and realising my deplorable and pathetic state, a thought occurred that would prove to be my salvation. Whilst it may be boring to solve one Sudoku, perhaps it would be interesting to solve any Sudoku. This blog series is about just how interesting that problem was.

For any readers who enjoy solving meaningless problems as much as me, I encourage you to follow along with the tutorial so you can fix, break and improve upon what I’ve done. I used Python 3, specifically Anaconda, and recommend PyCharm as a snazzy, free editor. I’ll assume you’re familiar with Python syntax and data structures but will try to explain the rest as best as I can.

For those readers who have lives, just stick around for the ride, and bear with me if and when it becomes mind-numbing.

Rules of the Game

Sudoku is a game about putting digits between 1 and 9 into a square, 9x9 grid, subdivided into 9 boxes. Usually they’ll give you some numbers to start with, like so:

However we can’t just start putting numbers anywhere we please, lest we induce chaos. This ain’t ‘Nam — there are rules. To complete Sudoku, each row, column and box (call these units) must contain all the digits between 1 and 9. Since each unit has 9 spaces, the numbers in each unit must be unique.

**Figure 2:** Units containing cell B2 and associated peer cells

Above we can see all of the peers for the 8 in the top left box (cell B2): every other cell that shares a row, column or box with it. Note that it is the only 8 amongst its peers. This gives us our first rule:

The value of a cell cannot be repeated amongst any of its peers.

All well and good I hear you say, but “how do we solve Sudoku?” (for those instead asking why, see Nietzsche). Since a number cannot appear in any of its peer locations, it leads to a simple process of elimination.

By confirming the 9, we can then eliminate that from all of its peers and continue in that fashion until we’ve finished the board. Easy, yet time consuming and soul destroying for us fleshy mortals, rather a better job for our robot overlords. The first thing I recommend doing when building a robot overlord is to give them a name. A friend suggested that my pathetic, tawdry ramblings were fractionally more tolerable if I personified the subject. Whether this is true or in fact furthers my alienation is a question that remains unanswered. Nevertheless, I dubbed my program Bernard. You may choose a different name, such as Lyra or Hector, but please do not name your program Lance, that would be unnecessarily cruel.

Structuring the Board

You and I know that this is a job a computer can do well. The hard part is convincing the computer of that. We need some system to describe the board to the machine in a way it understands and finds easy to work with. We can start by labelling the rows and columns so we can reference each cell easily:

It might seem logical to use a 2D array as that is what we’re given as humans. However if you think about it, there’s no real reason the board has to be a square, or even that it has rows and columns at all. Those are all visual cues to help humans solve it. Visual cues that a computer without eyes will be insulted by. The important information is that each cell has a number of peers that cannot contain its value and that each unit requires each of the 9 digits. If we were trying to explain this method to a human, it might be useful to have a book from which we could look up the peers for each cell, just as Figure 2 shows all of the peer positions for B2. Bernard can’t read books, but he can read Python dictionaries, so we’ll use one of these to provide this lookup. Similarly, we’ll give him a reference list of all the possible units, which he can use later to validate his claims.

The function below generates these reference objects using some cross products and list comprehensions:

Snippet 1 : Useful reference artefacts for Sudoku
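For readers following along, here is a minimal sketch of what such a function might look like. It assumes rows labelled A to I and columns labelled 1 to 9, as in Figure 2. The names `cross` and `sudoku_reference` are illustrative choices and not necessarily those used in the snippet.

```python
ROWS = "ABCDEFGHI"
COLS = "123456789"


def cross(a, b):
    """Cross product of two strings, e.g. cross("AB", "12") gives A1, A2, B1, B2."""
    return [x + y for x in a for y in b]


def sudoku_reference():
    """Build the lookup objects: every cell, every unit, and each cell's peers."""
    all_cells = cross(ROWS, COLS)
    row_units = [cross(r, COLS) for r in ROWS]
    col_units = [cross(ROWS, c) for c in COLS]
    box_units = [cross(rs, cs)
                 for rs in ("ABC", "DEF", "GHI")
                 for cs in ("123", "456", "789")]
    all_units = row_units + col_units + box_units

    # Peers of a cell: every other cell that shares at least one unit with it.
    peers = {}
    for cell in all_cells:
        shared = set()
        for unit in all_units:
            if cell in unit:
                shared.update(unit)
        peers[cell] = shared - {cell}
    return all_cells, all_units, peers


all_cells, all_units, peers = sudoku_reference()
print(len(all_cells), len(all_units), len(peers["B2"]))  # 81 27 20
```

Each cell ends up with 20 peers: 8 in its row, 8 in its column and the 4 remaining cells of its box.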

The next thing we need to do is describe the input puzzle to the program in a way it can understand. Once again, the computer doesn’t care about the shape of the board, just as long as we assign the right digits to the right positions. I’m going to define the input as a string describing the board from left to right, top to bottom, with 0 or . used to indicate blank cells. We can ignore all other characters and build up our dictionary. The function below cleans up the string and uses the zip function to map the values to each coordinate.

Snippet 2 : Parses string input into a puzzle dictionary
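A rough sketch of such a parser is shown here, under the assumption that cells are keyed A1 to I9 and that blanks are normalised to a `.` character. The function name `parse_puzzle` and the `ValueError` on malformed input are my own additions for illustration.

```python
DIGITS = "123456789"
CELLS = [r + c for r in "ABCDEFGHI" for c in DIGITS]


def parse_puzzle(puzzle):
    """Map a puzzle string onto cell coordinates, ignoring decoration characters."""
    # Keep digits and blank markers only; treat 0 as a blank.
    values = ["." if ch == "0" else ch for ch in puzzle if ch in DIGITS + "0."]
    if len(values) != 81:
        raise ValueError("expected 81 cells, found %d" % len(values))
    return dict(zip(CELLS, values))


example = """
 8  .  . | .  .  . | .  .  .
 .  .  3 | 6  .  . | .  .  .
 .  7  . | .  9  . | 2  .  .
---------+---------+---------
 .  5  . | .  .  7 | .  .  .
 .  .  . | .  4  5 | 7  .  .
 .  .  . | 1  .  . | .  3  .
---------+---------+---------
 .  .  1 | .  .  . | .  6  8
 .  .  8 | 5  .  . | .  1  .
 .  9  . | .  .  . | 4  .  .
"""
grid = parse_puzzle(example)
print(grid["A1"], grid["B3"], grid["I7"])  # 8 3 4
```

Because separators such as `|`, `-` and `+` are filtered out, the same function accepts both pretty-printed grids and a bare 81-character string.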

You can practise parsing the example grid from Figure 1; the string format is in this snippet.

We have the input grid, but the empty cells are not particularly useful as . characters. In implementing our elimination strategy, it is better to think of the unknown cells as a list of possibilities, possibilities we can eliminate when we have confirmed a peer. In Figure 3 the elimination is slow, going one by one until we establish the answer. However, for every digit we’re given in the puzzle, we can straight away eliminate that digit from the 8 other cells in each of its units. Then when we have only one possible digit left for a cell, we can eliminate further. The jargon for this is constraint propagation.

Snippet 3 : Attempts to solve the puzzle using one simple constraint

Note that we’re using strings instead of lists here, which is useful in this case because in Python strings are immutable whereas lists are not. Note also that the grid is a dictionary which is mutable, although it is not problematic in this case. None of this matters for now, but we’ll see later that these choices for data structures come in handy.

If we run this for our puzzle, it does quite well but doesn’t finish it. We can see the remaining possibilities below (using this handy display function):

**Figure 5:** Attempted solution for the easy puzzle, using a single constraint

There are some choices that aren’t immediately clear, but if we take a look at C4 there’s a clue. In the top-middle box there are no other cells that have 5 as a possibility. Similarly with C5 there are no other cells with 3 as a possibility. That means we can confirm those values and eliminate further, giving us another rule:

If a unit has only one possible place for a given digit, that cell must contain that digit.

So we can add that constraint to our eliminate function and see how it performs.

Snippet 4 : Implements the second constraint
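The second rule can be sketched in isolation as follows. This is an illustrative version with assumed names (`ALL_UNITS`, `assign_only_places`); it only confirms digits and leaves further elimination to the first rule.

```python
ROWS, COLS, DIGITS = "ABCDEFGHI", "123456789", "123456789"


def cross(a, b):
    return [x + y for x in a for y in b]


ALL_UNITS = ([cross(r, COLS) for r in ROWS]
             + [cross(ROWS, c) for c in COLS]
             + [cross(rs, cs) for rs in ("ABC", "DEF", "GHI")
                for cs in ("123", "456", "789")])


def assign_only_places(grid, all_units):
    """Rule 2: if a digit fits in exactly one cell of a unit, confirm it there.

    Returns True if any cell was changed.
    """
    changed = False
    for unit in all_units:
        for digit in DIGITS:
            places = [cell for cell in unit if digit in grid[cell]]
            if len(places) == 1 and grid[places[0]] != digit:
                grid[places[0]] = digit
                changed = True
    return changed


# Toy example: in row A, only A1 can still hold a 1.
grid = {cell: DIGITS for cell in cross(ROWS, COLS)}
for col in "23456789":
    grid["A" + col] = "23456789"
assign_only_places(grid, ALL_UNITS)
print(grid["A1"])  # 1
```

In a full solver the two rules feed each other: confirming a digit here triggers more peer elimination, which in turn may leave another digit with a single home.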

With this added knowledge, Bernard can now solve the puzzle:

So that was pathetically easy. This is usually the point where some putrid, fusty mortal exclaims:

Of course it can solve the easy ones, anyone can do that. I’d like to see it try one of those hard ones from the Guardian!

Bernard can solve that one in mockingly trivial fashion as well, which speaks to the efficacy of this seemingly simple strategy. Although they’re not all that easy, a quick search brought up “the most devious brainteaser ever devised”: a Sudoku puzzle by Finnish mathematician Dr. Arto Inkala in 2012. This one is apparently even tougher than the previous two he published in 2006 and 2010. I would make some comment about him wasting his time on meaningless endeavours if the irony weren’t so obvious and palpable.

```python
puzzle = """
 8  .  . | .  .  . | .  .  .
 .  .  3 | 6  .  . | .  .  .
 .  7  . | .  9  . | 2  .  .
---------+---------+---------
 .  5  . | .  .  7 | .  .  .
 .  .  . | .  4  5 | 7  .  .
 .  .  . | 1  .  . | .  3  .
---------+---------+---------
 .  .  1 | .  .  . | .  6  8
 .  .  8 | 5  .  . | .  1  .
 .  9  . | .  .  . | 4  .  .
"""

display_grid(solve_puzzle(puzzle))
```

**Figure 7:** Remaining possibilities for cells of the difficult puzzle after one round of constraint satisfaction.

Now we’re talking. Our simple logic has barely made a dent in this puzzle, with 60 cells still uncertain and many of those with lots of remaining possibilities. When a human is solving a particularly difficult puzzle, they may choose to guess a value and see if the board works out or if it leads to a contradiction. That human may find it difficult to employ this strategy on the puzzle above and still perceive the game to be entertaining, unless they were trapped in a forced labour camp or a Butlins. However, Bernard is not limited by a gnat-like attention span and a penchant for shiny things: he will persevere. It would be cruel of us to send him on a goose chase though, so we should first try to minimise the amount of work.

Looking at the possibilities above, it would make sense to guess either 3 or 9 for H7. This would maximise our probability of getting the right answer, as it’s a 50–50 choice. Whereas if we tried to guess A9, we’d have a 1/7 chance of picking the right number. As such we should always guess the cell with the fewest remaining possibilities. When we have made a choice, constraint propagation ought to significantly reduce the number of possibilities in the remaining cells.
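Picking the cell to guess is a one-liner with `min`. The sketch below assumes the grid is a dictionary of cell name to a string of remaining digits; the function name `pick_cell` is hypothetical.

```python
def pick_cell(grid):
    """Return the unsolved cell with the fewest remaining possibilities, or None."""
    unsolved = [cell for cell in grid if len(grid[cell]) > 1]
    if not unsolved:
        return None  # nothing left to guess
    return min(unsolved, key=lambda cell: len(grid[cell]))


# A three-cell fragment: H7 has two options, A9 has seven, A1 is already solved.
fragment = {"A1": "8", "A9": "1234567", "H7": "39"}
print(pick_cell(fragment))  # H7
```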

The last bit we need to build is the guessing function, along with a means of determining if we made the wrong decision. It stands to reason that if we make the wrong decision and propagate the changes through fully, eventually we will reach a point where we contradict ourselves and we eliminate all possibilities, making a solution impossible. Similarly to our two rules for determining numbers, this can be written as:

If there are no remaining possibilities for a cell, the board is invalid.
If there are no possible positions for a digit in a unit, the board is invalid.
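These two checks translate almost word for word into code. The following is a standalone sketch with assumed names (`is_contradiction`, `ALL_UNITS`); a real solver would more likely perform the checks during elimination than scan the whole board each time.

```python
ROWS, COLS, DIGITS = "ABCDEFGHI", "123456789", "123456789"


def cross(a, b):
    return [x + y for x in a for y in b]


ALL_UNITS = ([cross(r, COLS) for r in ROWS]
             + [cross(ROWS, c) for c in COLS]
             + [cross(rs, cs) for rs in ("ABC", "DEF", "GHI")
                for cs in ("123", "456", "789")])


def is_contradiction(grid, all_units):
    """Return True if the board breaks either invalidity rule."""
    # Rule A: some cell has no remaining possibilities.
    if any(len(value) == 0 for value in grid.values()):
        return True
    # Rule B: some unit has no possible position for a digit.
    for unit in all_units:
        for digit in DIGITS:
            if not any(digit in grid[cell] for cell in unit):
                return True
    return False


open_grid = {cell: DIGITS for cell in cross(ROWS, COLS)}
print(is_contradiction(open_grid, ALL_UNITS))  # False

dead_cell = dict(open_grid, A1="")
print(is_contradiction(dead_cell, ALL_UNITS))  # True
```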

We refactor our code to identify invalid boards based on these rules and then attempt solutions, throwing away bad ones, until we find the right one. This is a brute force method often referred to as backtracking. However, we have vastly limited the number of choices by using constraint propagation each time.

Snippet 5 : Hybrid solver with constraint propagation and backtracking
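For reference, here is a compact sketch of a hybrid solver in the style of Peter Norvig's well-known approach, which this post credits as its inspiration. It is not the snippet itself: `guess_digits` and `solve_puzzle` borrow names mentioned in the text, while `assign`, `eliminate`, `parse` and the module-level constants are assumptions for illustration.

```python
ROWS, COLS, DIGITS = "ABCDEFGHI", "123456789", "123456789"


def cross(a, b):
    return [x + y for x in a for y in b]


CELLS = cross(ROWS, COLS)
ALL_UNITS = ([cross(r, COLS) for r in ROWS]
             + [cross(ROWS, c) for c in COLS]
             + [cross(rs, cs) for rs in ("ABC", "DEF", "GHI")
                for cs in ("123", "456", "789")])
UNITS = {c: [u for u in ALL_UNITS if c in u] for c in CELLS}
PEERS = {c: set(sum(UNITS[c], [])) - {c} for c in CELLS}


def assign(grid, cell, digit):
    """Confirm digit in cell by eliminating every other candidate."""
    for other in grid[cell].replace(digit, ""):
        if eliminate(grid, cell, other) is None:
            return None
    return grid


def eliminate(grid, cell, digit):
    """Remove digit from cell and propagate both rules; None means contradiction."""
    if digit not in grid[cell]:
        return grid
    grid[cell] = grid[cell].replace(digit, "")
    if not grid[cell]:
        return None                      # no possibilities left for this cell
    if len(grid[cell]) == 1:             # rule 1: confirmed, so clear it from peers
        for peer in PEERS[cell]:
            if eliminate(grid, peer, grid[cell]) is None:
                return None
    for unit in UNITS[cell]:             # rule 2: digit may now have a single home
        places = [c for c in unit if digit in grid[c]]
        if not places:
            return None                  # no position left for digit in this unit
        if len(places) == 1 and assign(grid, places[0], digit) is None:
            return None
    return grid


def parse(puzzle):
    """Build a propagated grid from a puzzle string; 0 or . mark blanks."""
    givens = [ch for ch in puzzle if ch in DIGITS + "0."]
    if len(givens) != 81:
        raise ValueError("expected 81 cells")
    grid = {c: DIGITS for c in CELLS}
    for cell, ch in zip(CELLS, givens):
        if ch in DIGITS and assign(grid, cell, ch) is None:
            return None
    return grid


def guess_digits(grid):
    """Backtracking search: guess on the cell with the fewest possibilities."""
    if grid is None:
        return None
    if all(len(grid[c]) == 1 for c in CELLS):
        return grid
    cell = min((c for c in CELLS if len(grid[c]) > 1), key=lambda c: len(grid[c]))
    for digit in grid[cell]:
        solved = guess_digits(assign(grid.copy(), cell, digit))
        if solved is not None:
            return solved
    return None


def solve_puzzle(puzzle):
    return guess_digits(parse(puzzle))


INKALA = ("8........" "..36....." ".7..9.2.." ".5...7..." "....457.."
          "...1...3." "..1....68" "..85...1." ".9....4..")
solution = solve_puzzle(INKALA)
```

In this sketch a contradiction is signalled by `None`, which every caller checks so that a bad guess is abandoned as soon as it is detected.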

We return None whenever we detect a contradiction in the eliminate function. That state is then propagated through all of the functions so they can all exit as quickly as possible. The validate_sudoku function simply checks if we have completed the puzzle. Note that when we recurse guess_digits (line 72), we use grid.copy(). This is because Python dictionaries are mutable and we are attempting different, contradictory permutations. Our earlier decision to use immutable strings for the possibilities, instead of lists or sets, now pays off: with mutable possibilities we would need copy.deepcopy(grid), which is a slower operation.
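The difference between a shallow and a deep copy is easy to demonstrate with a two-cell toy grid. The cell names and values below are made up for illustration.

```python
import copy

# Strings: a shallow copy is enough, because "editing" a string creates a new one.
grid = {"A1": "39", "A2": "1234567"}
branch = grid.copy()
branch["A1"] = branch["A1"].replace("3", "")
print(grid["A1"], branch["A1"])  # 39 9

# Lists: a shallow copy shares the inner lists, so the guess leaks into the original.
list_grid = {"A1": ["3", "9"]}
shallow = list_grid.copy()
shallow["A1"].remove("3")
print(list_grid["A1"])  # ['9']

# With mutable values we would need a deep copy to keep branches independent.
list_grid = {"A1": ["3", "9"]}
deep = copy.deepcopy(list_grid)
deep["A1"].remove("3")
print(list_grid["A1"])  # ['3', '9']
```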

Now we can run this function on the “Everest of numerical games” and see how Bernard does:

**Figure 8:** Arto Inkala’s 2012 difficult Sudoku completed

Boom. Completed in roughly 0.04s. If you’re reading this Arto, sorry mate.

Perhaps then, this algorithm is only good at puzzles designed for humans and there are certain types of puzzle that are resistant to this algorithm. Wikipedia has a page on Sudoku Algorithms and our hybrid method gets a shout out halfway down the article. In the brute force section they give an example of a Sudoku that is intentionally difficult for pure brute force algorithms, but that one is trivial when applying our two constraints and doesn’t even require any guessing, completing in approximately 0.01s.

This was the interesting bit to me, that a near-optimal solution for solving Sudoku with a computer is very similar to how a human would naturally approach the problem. Whilst there are many techniques employed to solve Sudoku, following only the simplest two is sufficient to complete most puzzles. For extremely difficult Sudokus we need to employ some brute force methods, but a human might be forced down a similar path in those situations. The computer is mainly adding speed and rigour.

We now have a robot that can solve any Sudoku, the question is what do we do with it? The misanthrope in me liked the idea of being able to ruin people’s puzzles whilst riding on the tube. However it is impractical to type out one of these puzzles on a laptop. I thought it would be cool if Bernard could “see” and so the next post in this series is about my efforts to try to give Bernard eyes.

Challenge

For the keenest among you, I would suggest trying the following puzzle:

```
 .  .  . | .  .  6 | .  .  .
 .  5  9 | .  .  . | .  .  8
 2  .  . | .  .  8 | .  .  .
---------+---------+---------
 .  4  5 | .  .  . | .  .  .
 .  .  3 | .  .  . | .  .  .
 .  .  6 | .  .  3 | .  5  4
---------+---------+---------
 .  .  . | 3  2  5 | .  .  6
 .  .  . | .  .  . | .  .  .
 .  .  . | .  .  . | .  .  .
```

It was discovered by the original creator of the algorithm from this post and it should be really slow against the code I’ve given. See if you can modify the program to solve it in less than 0.1s.

Acknowledgements

The code and solution I have presented here are inspired by Peter Norvig’s excellent solution for this problem. I was working on the problem and had a similar method when I came across Peter’s work. It was clear that whilst bearing similarity at its core, Peter’s program was far more efficient and elegant. As such I adopted many of those improvements as I thought it better to espouse the best solution I knew of rather than an inferior one of my design. His remains the best solution to this problem I have seen and was a core motivation for continuing this project further.

Whilst writing this blog I also came across this article by Naoki Shibuya that explains Peter’s code in great detail.

P.S.

For the lazy among you who have skipped reading or performing the tutorial yourselves or simply work for Blue Peter, here’s a link to the finished solution.