Building A Sudoku Solver In Java With Dancing Links
Dancing Links is a technique suggested by Donald Knuth for solving Sudoku grids quickly.
Sudoku is a puzzle game, defined in 1979 by Howard Garns, that became extremely popular in the early 2000s. Building a program that automatically solves a Sudoku grid is always an interesting exercise for beginners in programming. In this article, I propose you to discover how to develop a Sudoku solver in Java using a highly efficient technique called Dancing Links.
As I just explained, Sudoku is, therefore, a puzzle game in the form of a grid. The goal of the game is to fill this grid with a series of all different numbers that are never found more than once on the same line, in the same column or in the same block. In a classic Sudoku of size 9, these blocks will therefore have sizes of 3 by 3 and the values that can be assigned to the cells will range from 1 to 9. For more complex Sudoku instances, the blocks can be 4 by 4 (Sudoku of size 16 with values from 1 to 16) or 5 by 5 (Sudoku of size 25 with values from 1 to 25).
A First Approach
Reading the Sudoku rules clearly shows that this game is actually a problem of satisfying constraints. A first simple approach for solving Sudoku problem is to apply a recursive BackTracking algorithm in which we will try to solve a grid by assigning to each cell all possible values before moving on to the next cell if the previous assignment respects the 4 constraints of a valid Sudoku grid. When an assignment does not meet these constraints, we go back and try to assign a new value. By doing so, a Sudoku solving algorithm can be easily programmed.
In order to better highlight the power of the Dancing Links method in terms of performance, I will carry out this first implementation for comparison purposes. I therefore start by modeling a Sudoku grid as a 2-dimensional array of integers. Each cell in this table can have a value from 0 to 9 with the 0 indicating that the cell has not yet been assigned with a legal value.
Checking The Constraints
Then, I define the methods to check if a grid is valid or not. The verification of the uniqueness of the values on a line is done by going through the line to which the cell you want to value belongs. If the value being assigned is already present in the line, then true is returned, indicating that the value is already present.
The verification of the uniqueness of the values of a column is done in the same way by going through all the cells of the column to which the cell you want to value belongs. If the value being assigned is already present in the column, then true is returned. This gives the following implementation for the methods isInRow and isInCol :
I then have to check that within the junction box of the cell I want to value, the assigned value is not already present. A small mathematical operation is performed using the modulo operator to position itself at the beginning of the junction box of the current cell. Then I go through the different cells of the box and return true if the value we want to assign is already present. Finally, I finish by defining an isOk method that calls these 3 methods in order to return true only if the assignment of a value to a given cell respects the Sudoku constraints:
People used to solving Sudoku grids will probably have noticed that the fourth constraint related to a Sudoku grid has not yet been verified at this stage. Indeed, for a grid to be valid, it is also necessary that all the cells of the grid are valued. This constraint will be implemented directly in the solving algorithm as you will see.
Once the validation methods of a grid are in place, I will be able to move on to the implementation of my BackTracking algorithm. The latter is based on a recursive approach within a solve method. I will then go through the grid in search of the first empty cell, i. e. valued by 0. Once I have found this cell, I try to assign it a valid value.
When a valid value is found, the solve method is recalled again to move to the next level of my BackTracking algorithm. If the assignment is not valid at the current level, I return false and reset the cell to 0, which has the effect of going up a notch in my BackTracking algorithm in which a new value is tested on the current cell.
This approach allows me to obtain a solution to my grid when all assignments have been performed in a valid manner. Thus, once out of the solve method, the grid property of my Sudoku class will contain the grid in its resolved form. It is worth noting that my algorithm stops as soon as a solution is found. However, this is not a real problem since a correct Sudoku grid is only supposed to have a single solution.
I thus obtain the following code for the solve method which contains the core of my solving algorithm:
Solving A Grid
With this code, I can instantiate my Sudoku class with a grid size 9 to solve and then launch my Java program to get the grid resolution. Figure 1 shows the work done by my program and you can see that the solving is executed instantly since the execution time is about 50 ms on an average computer.
Nevertheless, the solving algorithm I have implemented quickly shows its limits for more complex grids of size 16 or 25 for example. This is where the Dancing Links method comes into play and the famous Algorithm X on which I will rely in the following of this article.
Dancing Links And Algorithm X
In 2000, Donald Knuth, a famous U.S. computer professor, published an article entitled “Dancing Links” in which he described the use of the Algorithm X to solve Sudoku grids. This article focuses first on the problem of exact coverage and the Algorithm X to solve it quickly and efficiently.
In a second step, Donald Knuth shows how to transform the problem of solving a Sudoku grid into an instance of the problem of exact coverage. Once this modeling is done, it is easy to apply the Algorithm X and solve very complex Sudoku grids. All this with excellent performance.
Exact Cover Problem
First of all, it seems important to focus for a moment on the exact cover problem. First stated by the American computer scientist Richard Karp, the exact cover problem can be stated as follows:
“Given a binary matrix, i. e. composed only of 0 and 1, it is necessary to find a set of rows containing exactly one 1 in each column.”
Consider the binary matrix presented in Figure 2. A solution to the exact cover problem for this matrix could be the subset of lines A, D and E surrounded in red.
In his article on Dancing Links, Donald Knuth describes the Algorithm X that solves the exact cover problem. This is again a recursive search algorithm using the BackTracking method. Nevertheless, its application is particularly effective, as you will see later.
Applied to binary matrices, the algorithm can be formulated as follows:
- If matrix A is empty, the problem is solved. Return success;
- Otherwise, choose a column c in matrix A;
- Choose a row r in column c;
- Include line r in the partial solution;
- For each index j such that A[r,j] = 1:
- Remove the index column j from matrix A;
- For each index i such that A[i,j] = 1:
— Remove the index line i from matrix A;
- Repeat the algorithm recursively on the reduced matrix A.
Dancing Links Enter The Dance
The Dancing Links technique described by Donald Knuth will allow me to implement the formulation of the Algorithm X applied to binary matrices in a remarkably ingenious way. At the heart of the Dancing Links, you will thus find a quadrupled chained data structure. Figure 3 thus presents the graphical representation of the binary matrix detailed in Figure 2 above in the form of a quadrupled chained list.
Implementing a binary matrix representing an instance of the exact cover problem via a quadruple-chained list will prove to be very advantageous in several respects. First, this structure allows to gain memory space by taking advantage of the fact that often matrices representing an instance of the exact cover problem are sparse. Then, the manipulations performed at each recursive call apply to the initial representation of the matrix. All manipulations are done in situ, there is no copying and creation of intermediate matrices.
Finally, the operations of removing and reinserting columns on the cover matrices will be carried out in constant time. This is mainly due to the fact that within a double-chained circular list, the deletion of a node x can be easily achieved by playing with the pointers as follows:
x.left.right = x.right;
x.right.left = x.left;
Considering that x.right and x.left have not been modified in the meantime, the reinsertion of node x at the same position in the list will then be done as follows:
x.left.right = x;
x.right.left = x;
It should be noted that the principle remains the same for vertical removals or reinsertions.
Sudoku Grid As A Cover Matrix
Once these theoretical fundamentals are in place, I can move on to the implementation part. Now you will see how to solve a Sudoku grid using the Dancing Links method and the Algorithm X. The first step will be to model a Sudoku grid in the form of a cover matrix.
This modeling will take into account the 4 constraints defined by the Sudoku set for a resolved grid to be considered valid:
- Cell constraint: Each cell can contain only an integer between 1 and SIZE which corresponds to the size of the grid.
- Line Constraint: Each line can contain only SIZE unique integers between 1 and SIZE.
- Constraint Column: Each column can contain only SIZE unique integers between 1 and SIZE.
- Box Constraint: Each box can contain only SIZE unique integers between 1 and SIZE.
Each line of the cover matrix that I will define will contain all these constraints. Thus, considering a SIZE size grid with a number of constraints per cell equal to CONSTRAINTS and with a range of values between 1 and MAX_VALUE for a cell, the coverage matrix will then have the following size:
(SIZE * SIZE * MAX_VALUE) x (SIZE * SIZE * CONSTRAINTS)
In my Sudoku class, I will be able to implement these specific methods in order to transform a Sudoku grid instance into a cover matrix:
At this stage, the execution of my program allows you to see that the creation of the cover matrix for a Sudoku grid instance of size 9 is correct (figure 4).
Modeling The List Items
From this cover matrix, I will now be able to create the quadruple-chained list that will allow me to implement Donald Knuth’s Algorithm X and Dancing Links approach. With this in mind, I define a DancingNode object in order to model a node in the list.
This object will have 4 properties respectively named left, right, top and bottom which will point to other nodes in the list in 4 directions. Finally, a ColumnNode column property is used to link the current node to the belonging column in the cover matrix.
In terms of its methods, the DancingNode class will implement the operations of removing and reinserting nodes within the list that I have detailed above, both horizontally and vertically. As for the ColumnNode class, it is an object inheriting from DancingNode which will propose the cover and uncover methods allowing, as their names indicate, to cover or discover a column in the data structure.
This gives the following code for the DancingNode class and the ColumnNode class:
Creating The Quadruple-Chained List
The DancingNode and ColumnNode objects created, I will be able to define the quadruple-chained list representing the cover matrix previously generated. This is done within a createDLXList method taking the cover matrix as input and returning the ColumnNode node as output. Refer to Figure 3 to see that this is the node at the top left of the data structure.
The creation is carried out in two steps. First, all the columns are created. Then, I will create the nodes by going through the cover matrix to find the cells valued at 1 in order to add the necessary nodes.
All this gives the following code:
Implementation Of The Algorithm X
My Sudoku grid instance to be solved now represented as a quadruple-chained list format cover matrix, I will be able to implement the Algorithm X previously described in pseudo-code.
This gives the following process method within the DLX class:
Conversion Of The Result
The solving of the exact cover problem is started via a call to the process method with parameter 0. Once the solving is completed, the subset solving the modelled problem is represented in the DancingNode result object list. My last work consists in transforming this list in order to obtain the equivalent resolved Sudoku grid.
To do this, I will go through the list of DancingNode objects corresponding to my solution. For each node, the associated value and the position in the Sudoku grid are retrieved. I then only have to assign this value to the position obtained in my grid.
All this is implemented within the following convertDLXListToGrid method:
Dancing Links In Action!
The last step consists in assembling the different parts of the program in order to solve a Sudoku grid using the Algorithm X and the Dancing Links technique. As a reminder, the solve method that I define within the Sudoku class breaks down the work as follows:
- Conversion of the grid into a cover matrix via convertInCoverMatrix call.
- Creation of the DLX object with the passage at the entry of the cover matrix. Within the DLX object constructor, the createDLXList method is called to represent the cover matrix as a quadruple-chained list.
- Call the solve method of the DLX object instance. The latter is in charge of launching the resolution via the Algorithm X and the use of Dancing Links on the quadrupled chained list.
- Retrieving and converting the list of DancingNode, solving the exact cover problem, into a Sudoku grid solved using the convertDLXListToGrid method of the Sudoku class.
This gives the following code for the solve method:
Execution On A Sudoku Of Size 25
Naturally, my program based on Dancing Links and Algorithm X is highly efficient on the Sudoku of size 9 solved at the beginning of the article by the first program based on a simplistic BackTracking. However, my implementation is expected on more complex Sudoku grids. This is how I pass a difficult Sudoku grid of size 25 as an entry.
While my first algorithm still runs after 5 minutes, the Dancing Links-based algorithm easily solves this grid in just 382 ms! (Figure 5). It can therefore be said, without taking too many risks, that the Dancing Links and the Algorithm X are an ultra-effective solution to the Sudoku problem.
The building of a basic Sudoku solver is an interesting task for any beginner in programming. Nevertheless, the implementation of a simplistic BackTracking algorithm very quickly shows its limits as the grids to be solved become more complex and larger. This is when the Dancing Links technique and the Algorithm X come into play.
By proposing a transposition of the Sudoku problem into the exact cover problem, this approach makes it possible to obtain excellent results as you have seen in this article, even though it is still working on BackTracking. Even better, the implementation of such a solver will arouse the curiosity of the most experienced developers, which is never an easy task.