Sudoku solver redux

Edmund von der Burg

Way back in May and June 2005 I published the following posts on my personal blog (ecclestoad.co.uk). Now that I’ve scrapped that blog I’m republishing it here.


May 25, 2005 — Sudoku solver in four lines

Slightly silly perhaps but I thought it would be fun to write a piece of code to solve these sudoku puzzles that keep popping up. As that appeared to be trivial I decided to see if it could be done in four lines or less. Here it is:

It is slightly arcane in its usage but what do you expect from four lines of Perl? The best way to run it is to save it to a file (eg sudoku.pl). You need to feed it the start grid as well. This is one long line with the top row first, the second row next and so on. Use zeros for the blanks. Do not separate the rows with a space, just push them all together.

For example the following puzzle was given in the Economist (21 May 2005) on page 75:

. . . . 1 . . . .
3 . 1 4 . . 8 6 .
9 . . 5 . . 2 . .
7 . . 1 6 . . . .
. 2 . 8 . 5 . 1 . (dots are used for the blanks)
. . . . 9 7 . . 4
. . 3 . . 4 . . 6
. 4 8 . . 6 9 . 7
. . . . 8 . . . .

This would give you the following line for input:

000010000301400860900500200700160000020805010000097004003004006048006907000080000

If you saved the input in a file called input.txt you could then run the solver like this:

perl sudoku.pl < input.txt

The output is the same format as the input — that is to say all the rows on one line.

It will only work for 9×9 grids containing numbers 1 to 9. Hope you enjoy it.


May 27, 2005 — Obfuscation as a learning tool

A couple of days ago I posted an obfuscation that solves sudoku puzzles in four lines of Perl. To compress the code down to this level required quite a few dirty little tricks which would have no place in production code. Here I am going to postmortem the obfu and point out what there is to learn from it.

First the code as an obfu:

And then tidied up by perltidy:

Now I will run through the code explaining what each bit does and musing about it.

$a = 8;
$G{ int( ++$a / 9 ) . $a % 9 + 1 } = $_ for split //, <>;

This code takes the input and builds a hash. The hash %G is keyed on two numbers, such as ‘23’ or ‘39’. This number is the (y,x) coordinate on the sudoku grid, with the origin in the top left. I used (y,x), rather than (x,y), as it allowed me to print the grid out later on more easily.

The creation of the key is quite interesting as we are numbering the values coming in on the list in a sort of base 9. Think about the change in numbers as you go off the first row and onto the second: 17, 18, 19, 21, 22, 23. As you can see it needs to jump from 19 to 21. The y coordinate is done by throwing away the fractional part of a division, the x coordinate is obtained using a modulus. The increasing value to work with is provided by $a which starts out as ’8′ and then is pre-incremented during the creation of the key.

@A = 1 .. 9;

Simplicity itself. @A is just an array of 1 to 9. If you look in the rest of the code there are many times when I needed to run from 1 to 9 and so as a space saver it was worth creating this array.

sub c {
int( ( $_[0] - 1 ) / 3 ) * 3;
}

This is a helper function that is used later on when finding which numbers are currently in the block that we are looking at. It is passed a single digit from 1 to 9. It returns 0, 3 or 6 for 1..3, 4..6 or 7..9. See later for how the return value is used.

sub G {

Create a sub routine G. This is the meat of the obfu and most action occurs in here. The basic idea is to find the first square on the grid that does not have a value. Then find all the possible numbers that could go in. For each number put it in and recurse. If we reach the end of the grid exit, otherwise try the next value. If there are no more values to try backup to previous squares.

for $y (@A) {
for $x (@A) {

The heart of the routine is two loops, both from 1..9. This gives us the coordinates of the square in the grid that we are working on.

$p = $t = $G{ my $c = $y . $x } && next;

Several things happen in this line. The most important one is to see if there is a number in the square. As all empty squares are set to 0 (false) this can be done with a simple boolean test. If there is a number then $G{‘yx’} is true which causes the next to get called. This is taking advantage of Perl lazy evaluation of the terms, ie. if the first test fails (there is no number) then there is no need to evaluate the second test, the next.

In the interests of cramming code in there are a few other things going on. Firstly a variable $c is set up containing the current (y,x) location. This assignment happens inside the curly braces of the hash. This works because the return value of an assignment is the value assigned. This is also used to give initial false values to $p and $t which are used later. The values in $p and $t are always false as if it was true then the loop would have jumped on.

$t .= $G{ $_ . $x } . $G{ $y . $_ } for @A;

Finally we start to actually determine what numbers could go into this square. The $t string is appended with the numbers that are already taken (t is for taken). This is done for both the current row and the current column in one statement.

for $f ( 1 .. 3 ) { $t .= $G{ c($y) + $f . c($x) + $_ } for 1 .. 3 }

This double loop appends the numbers that are taken in the current block onto $t. There are two variables floating around, $f and $_ which both run from 1..3. These are added to the return value from the c subroutine discussed earlier which results in the squares in the current grid being selected.

The funny use of two for loops in this way is to save space.

G( $G{$c} = $_ ) && return for grep $t !~ m/$_/, @A;

There are several things going on here. There is a loop over the possible number that could go into the square: for grep $t !~ m/$_/, @A. As $t contains the numbers that are taken the grep only returns the numbers from 1 to 9 that are not taken, ie possible values.

For each possible number G( $G{$c} = $_ ) && return is executed. This sets the current square to the possible value. It then recurses into G. If the return value is true then it returns. In fact it returns true as unless specifically given a value return will return whatever is in $_, which as it happens was just set to the return value from G. This will always be true for the return to get called and so true is always returned.

You might wonder why the current square is assigned its value as an argument to the subroutine G? The answer is that it saves one character doing it this way. Compare G( $G{$c} = $_ ) to $G{$c} = $_, G(). As G does nothing with arguments it is passed this is fine.

return $G{$c} = 0;
}
}

If the code did not return above then it should now. If we reach this stage then every possible number for this square was tried and none of them led to a solution of the sudoku. We want to set the current square back to false and the allow the code to try changing some previous numbers to see if they lead to a solution. Again the return value from an assignment is used as a space saver.

die map { $G{$_} } 9 .. 99;
}

This is dirty. I needed to shave off a couple of characters to get into four lines and so I did this. If we reach this point then the two for loops for $x and $y have reached the end. What has has happened is that the final square has been assigned a number. The subroutine has recursed and the two for loops have run and next has been called for every square as they all have numbers in them. Hence we end up here. To save those last few characters I used die instead of print. Horrible.

There is no need in the the map to worry about the values in G for keys such as 10 or 20 that have no values. They simply result in undefs being printed which produces no errors as use warnings has not been specified.

G

So here we are at the end of the code. All we need to do now is run G for the first time.

Rather interestingly I have spotted a space saver writing this post. There is a variable $p that was set to false but never actually gets used. It used to be used for possible values that could go into the square but became unneeded due to changes. However I forgot it in there. Bug fixing by code reading in action.

Another space saver is that as the code dies when a solution is found there is no need to return as the code would already have died. This means that another line can be shortened:

G( $G{$c} = $_ ) && return for grep $t !~ m/$_/, @A; # beforeG( $G{$c} = $_ ) for grep $t !~ m/$_/, @A;           # after

This means that the die can be tidied up a bit and we can even format the output:

die map { $G{$_} } 9 .. 99;             # beforedie map { $G{$_} || "\n" } 9 .. 100;    # after

The new, better, shorter obfu:

I hope that you will agree that even though there are only four lines of code here there is plenty to think about. Not all of it is ‘good’ code, in fact most of it is very bad code. However the artificial constraint of four lines leads to some interesting techniques that take advantages of some of the more esoteric parts of the language.


June 2, 2005 — Sudoku solver in three lines explained

In previous posts I produced a four line sudoku solver and then went on to explain it. After having watched a C programmer’s approach to it I am down to under three lines. Here it is and how it works.

Above the obfu version, below the same code having been fed through perltidy. The code works as before — you feed in the sudoku grid on one line using zeros for the blanks. The code then spits out the solution in the same form.

The core difference is that in my four line version I stored the sudoku grid on a hash, whereas here I use an array. For convenience here is a sudoku grid where the numbers represent the index in the array:

0  1  2    3  4  5    6  7  8
9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26
27 28 29 30 31 32 33 34 35
36 37 38 39 40 41 42 43 44
45 46 47 48 49 50 51 52 53
54 55 56 57 58 59 60 61 62
63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80

I’ll now run through the code explaining the various interesting bits.

use integer;
@A = split //, <>;

It is rare that obfus use modules but in this case it saved a few characters. Later on in the code I do a fair amount of integer mathematics, and without the use integer I would have to wrap up six of the expressions like so: int( $_ / 9 ).

The second line puts the grid from STDIN onto an array @A.

sub R {
for $i ( 0 .. 80 ) {
next if $A[$i];

The subroutine R is so called because it recurses. The main part of it is a for loop that runs over the whole grid. However if there is a value in the grid it skips on to the next position.$i is now the current position in the grid that we are looking to find a value for.

my %t = map {
$_ / 9 == $i / 9
|| $_ % 9 == $i % 9
|| $_ / 27 == $i / 27 && $_ % 9 / 3 == $i % 9 / 3
? $A[$_]
: 0 => 1
} 0 .. 80;

This is a bit hairy. I am creating a hash %t that will have as its keys the numbers that can not be used in this position, ie the ones that have been taken. The map contains abool_expr ? if_true : if_false construct to determine what the key should be.

The first three lines in the map are the boolean expressions to determine if $_ is in the same row, column or grid as $i. $_ / 9 == $i / 9 is for the row (remember that the division is integer so something like 12/9 equals 1). || $_ % 9 == $i % 9 is for the column and || $_ / 27 == $i / 27 && $_ % 9 / 3 == $i % 9 / 3 is for the grid (first part for grid row, second part for grid column). These are quite straight forward once you’ve gotten your head round them.

Really the booleans should have brackets round them to clear things up, however because Perl will stop evaluating them as soon as it finds one that is true the final && is ok.

Remember that the map is assigning a key/value pair to the hash. Well if these tests return true then the key is $A[$_], otherwise it is 0. The value is hard coded to true. I was surprised to find that the code for the key did not need to be in brackets, evidently the ?: is more tightly bound than => which is as you would expect.

R( $A[$i] = $_ ) for grep { !$t{$_} } 1 .. 9;

Three things happen here. The for loop runs for all the numbers that can be used in this position thanks to the grep { !$t{$_} } which filters out the numbers that have been used — they have a true value in the hash %t.

R( $A[$i] = $_ ) sets the current position to one of the possible values and then recurses.

return $A[$i] = 0;

If none of the values in the for loop lead to a solution, or if there were no values to try then we reset the current position to zero and return. This allows other values to be tried in the ‘higher’ calls to R.

}
die @A;
}

If the outermost for loop ever reaches the end (ie $i == 81) then we have our solution. Print it out and stop processing.

R

The only thing left to do is to actually start running the code.

So there we are, sudokus solved in three lines of Perl. In my opinion this is cleaner than my previous offerings as the logic is tighter. Still this obfu is not as short as it can get. There are two superfluous semicolons and the logic in the map could be shortened too, as well as some other bits.


November 5, 2005 — Obfuscation as an earning tool

Several months ago I posted an entry which solved sudokus in just a few lines. As a result I got a job and so have been somewhat distracted of late. The interesting thing is that the obfu (I believe) helped in the interview process, which was unexpected — after all it is not ‘production’ code, whatever that is.

Ho hum. I now work(ed) for 3B.net on a variety of projects.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade