Sudoku & Graph Theory

Understanding graph theory is central to building your own Sudoku solver.


February 01, 2006
URL:http://www.drdobbs.com/cpp/sudoku-graph-theory/184406436

Sudoku is a logic puzzle in which there are 81 cells (vertices) filled with numbers between 1 and 9. In each row, the numbers 1,2,3,..,9 must appear without repetition. Likewise, the numbers 1,2,3,..,9 must appear without repetition in the columns. In addition to the row and column constraints, the numbers 1,2,3,..,9 must appear in the nine nonoverlapping 3×3 subsquares without repetition. So in short, the puzzle board is separated into nine blocks, with nine cells in each block (see Figure 1).

Figure 1: Sudoku puzzle.

There are several possible rules you can use to successfully fill in missing numbers. In this article, we examine two rules — Chain Exclusion and Pile Exclusion — for solving Sudoku puzzles. These rules are at the heart of a Windows-based Sudoku solver that we built using Visual C++. (Executables and the complete source code for this solver are available here). The goal of this logical Sudoku solver is to prove that only one possible number can be assigned to each vertex, and to find that number for each vertex in which the number is not defined. Illogical Sudoku puzzles can also be solved, but require guesses (Implementation, OK button).

We refer to possible numbers that should be assigned to a row, column, or one of the nine 3×3 subsquares as a "Permutation Bipartite Graph" or nodes. A node consists of a vector of n>1,n=2,3,4... vertices and all possible numbers that can be assigned to these vertices, such that there exists at least one possible match between the vertices of the vector and the numbers 1,2,...n.

For example, the following are nodes:

({1,2,3,5},{2,3},{2,3,4},{3,4},{4,5}, n=5
({1,2,3,7},{3,6},{3,4},{1,4},{5,6,7},{4,6},{2,7},{8,9},{8,9}, n=9

A possible match for the first vector is easy:

1 -> {1,2,3,5}
2 -> {2,3}
3 -> {2,3,4}
4 -> {3,4}
5 -> {4,5}

A possible match for the second vector is more tricky:

2 -> {1,2,3,7}
3 -> {3,6}
4 -> {3,4}
1 -> {1,4}
5 -> {5,6,7}
6 -> {4,6}
7 -> {2,7}
8 -> {8,9}
9 -> {8,9}

A number can be only assigned to a vertex that contains the possibility of assigning that number. For instance, only the following possibilities are accepted:

7 -> {2,7} or 2 -> {2,7}.

Pile Exclusion and Chain Exclusion provide the basis of logical elimination rules.

To understand Pile Exclusion, consider the following nodes:

({1,2,3,5},{3,6},{3,4},{5,6},{1,7,8,9},{4,6},{5,7,8,9},{4,6},{6,7,8,9},{1,4}, n=9

The numbers 7,8,9 appear only in three vertices:

{1,7,8,9},{5,7,8,9},{6,7,8,9} 

Because there is at least one possible match in the Permutation Bipartite Graph, one vertex will be matched to 7, one to 8, and one to 9. Thus, you can erase the other numbers from these three vertices to get the following three augmented vertices:

{1,7,8,9} -> {7,8,9} 
{5,7,8,9} -> {7,8,9} 
{6,7,8,9} -> {7,8,9} 

and the entire Permutation Bipartite Graph becomes:

({1,2,3,5},{3,6},{3,4},{5,6},{7,8,9},{7,8,9},{4,6},{7,8,9},{1,4}), n=9

As for Chain Exclusion, consider these nodes:

({1,2,3,7},{3,6},{3,4},{1,4},{5,6,7},{4,6},{2,7},{8,9},{8,9}, n=9

In the second, third, and sixth positions in the vertices vector, you have:

{3,6},{3,4},{4,6}

Only the numbers 3,4,6 can be assigned to these vertices. From this, you infer that 3,4,6 are not a matching option in any of the remaining vertices. Thus, you can erase these numbers from all the other vertices, resulting in a new, more simple graph:

({1,2,7},{3,6},{3,4},{1},{5,7},{4,6},{2,7},{8,9},{8,9}, n=9

You can do the same thing with {1}, so that the resulting graph is:

({2,7},{3,6},{3,4},{1},{5,7},{4,6},{2,7},{8,9},{8,9}, n=9

Algorithms

We now present an algorithm that finds all such chains in polynomial time. The first stage is to find the best bipartite matching using the Ford Fulkerson maximum flow algorithm or any other bipartite matching algorithm (see Introduction to Algorithms, Second Edition, by Thomas H. Cormen et al., MIT Press, 2001).

Let W be the set of n cells in a row, column, or subsquare.

That means that there are n cells or vertices in W when the entire Sudoku puzzle has n×n cells or vertices. Let D be the set of numbers that are matched to W. In an n×n Sudoku, D is the set of numbers 1,2,3,...,n.

Let Sk be a subset of k cells out of the n cells in which the possible numbers are exactly k<n numbers Tk. We say that Sk is a "chain" if it doesn't contain a smaller set Qr such that r<k to which only r numbers can be matched. Moreover, in our model, if the number v in D can be matched to the cell u in W, then we say that our Permutation Bipartite Graph G contains the edge ∈ =(u,v) between the vertex u from W and v from D. In addition, let M denote a maximal match between W and D. A match is a set of edges connecting cells or vertices in W to numbers or vertices in D. Each cell in W is allowed to have only one edge in M that connects it to a number in D.

Each number in D is allowed to have only one edge in M that connects it to a cell or vertex in W. Let E denote the entire set of possible vertices between W and D.

Because the vertices of Sk are only connected to vertices of Tk, the condition |M|=|W| implies that each element s∈Sk is matched to exactly one element t ∈ Tk and vice versa. Start scanning a vertex u1 ∈ W. Suppose u1Sk. We check all the vertices vi1 in D such that the edges (u1,vi1) exist in E. Because Sk is connected only to Tk,vi1 are all in Tk. We mark u1 such that we will not consider u1 again. We remove u1 from the list of vertices the algorithm will visit. Now we look for all the edges (ui1,vi1) in M that are connected to one of these vertices vi1. Obviously, ui1 are in Sk for all indices i1 because all the vertices in Tk are matched by M to vertices in Sk. We continue the process recursively. Now, for each ui1 we look for all vi2 in D that are connected to at least one of the ui1 vertices by edges. We mark ui1 such that these vertices will not be considered again. Again, the vi2 must be in Tk and we continue and look for all the edges (ui2,vi1)∈M. Obviously, ui2 are in Sk.

Because u1 and the vertices ui1 were removed from the list of vertices that the algorithm visits, there are fewer vertices that can be visited by the algorithm. The process is repeated until all the vertices in W it can visit are reached.

Because all the vertices it can reach are in Sk, what is left to prove is that there is no vertex u in Sk that is not visited by the algorithm. Suppose there is such a vertex u. Then, obviously, the vertex v in Tk that is matched to u could not be visited either; otherwise, u could be visited. But then, v is also not connected to any one of the vertices in Sk that were visited. So we can define Qk-1=Sk-{u}, which is connected to |Sk|-1 vertices in D. That is a contradiction to the requirement that Sk is minimal.

A simpler algorithm is the Pile algorithm. Let G be a Permutation Bipartite Graph. We would like to find if there exist vertices vi1,...,vik in D such that they are all connected to the same k vertices in W, uj1,...,ujk. Such vertices are called "Pile" or "Set." The algorithm is trivial. Start traversing the vertices vk in D serially in a loop. Activate a second loop within the first loop and count all the other vertices in D that are connected to the same k vertices uj1,...,ujk in W that vk is connected to. If there are k such vertices vi1,...,vik in D, then you are done. Although this algorithm can be improved, it is efficient and its runtime is |W|2/2.

After finding a chain, the algorithm can erase all the edges that are connected to the vertices in Tk that are not connected to Sk. This operation is called "Chain Exclusion." That is, if ∈=(u,v) and v∈Tk and u∈Sk , then ∈ is removed from the Permutation Bipartite Graph G.

After finding a Pile, remove all the edges that connect uj1,...,ujk to D-{vi1,...,vik}.

Implementation

We've used the Chain Exclusion and Pile Exclusion algorithms described here to build a Windows-based Sudoku solver in Visual Studio C++ 6.00. Executables and the complete source code are available electronically here. You will need to rename the file Sudoku.ex1 to Sudoku.exe. Alternatively, you can compile the entire project Sudoku.dsw or Sudoku.dsp with Visual Studio 6.00. We've included a simple console application, Logic.ex1, that demonstrates Chain Exclusion. This file should be renamed to Logic.exe. In Sudoku.exe, the Logic button calls the class Square3.cpp, and Square3.cpp calls Bipartite.cpp. The Logic button fills as many cells as possible with numbers, by using logic only. You can generate puzzles that usually have more than one solution by clicking on the Test button when the Unique option doesn't have a check mark. If the Unique option is checked, the Sudoku puzzle is solved by logic only. The Very Hard option automatically checks the Unique option and takes 5 seconds. You can also enter your own puzzle and press either the Logic button or the OK button. If the puzzle can't be solved because of contradictions, OK won't fill it and Logic will fill as many cells as possible. When the contradiction is encountered, it inserts a double number in row, column, or square. That can be verified by the Check button.

In addition, Sudoku.exe also solves illogical puzzles via the OK button that calls recursion that in each iteration fills in numbers in locations in which there are less degrees of freedom. For more information on Sudoku puzzles, see the Daily Sudoku) or the Times Online).


Eytan, Raviv, and Eitan are software engineers in Israel. They can be contacted at [email protected], [email protected], and [email protected], respectively.

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.