A puzzle is a grid in which a subset of the squares have been filled with a digit 1 to 9 and the rest are unfilled. A solution to a puzzle is a grid where every square is filled, every unit contains all 9 digits, and the filled squares of the original puzzle remain unchanged in the solution. A well-formed puzzle has exactly one solution. Here is a typical well-formed puzzle and its solution:
4 . . |. . . |8 . 5      4 1 7 |3 6 9 |8 2 5
. 3 . |. . . |. . .      6 3 2 |1 5 8 |9 4 7
. . . |7 . . |. . .      9 5 8 |7 2 4 |3 1 6
------+------+------     ------+------+------
. 2 . |. . . |. 6 .      8 2 5 |4 3 7 |1 6 9
. . . |. 8 . |4 . .      7 9 1 |5 8 6 |4 3 2
. . . |. 1 . |. . .      3 4 6 |9 1 2 |7 5 8
------+------+------     ------+------+------
. . . |6 . 3 |. 7 .      2 8 9 |6 4 3 |5 7 1
5 . . |2 . . |. . .      5 7 3 |2 9 1 |6 8 4
1 . 4 |. . . |. . .      1 6 4 |8 7 5 |2 9 3
Let's start to think about how to write a program to solve Sudoku. Clearly we will need to represent digits, squares and units, and we will see that it is useful (but not necessary) to represent peers. We will number the columns 0 to 8 (left to right) and the rows also 0 to 8 (top to bottom). The squares will be numbered 0 to 80 (left-to-right, top-to-bottom). Below we show the numbers of the squares and the 3 units and 20 peers for square number 19:
00 01 02| 03 04 05| 06 07 08
09 10 11| 12 13 14| 15 16 17
18 19 20| 21 22 23| 24 25 26
---------+---------+---------
27 28 29| 30 31 32| 33 34 35
36 37 38| 39 40 41| 42 43 44
45 46 47| 48 49 50| 51 52 53
---------+---------+---------
54 55 56| 57 58 59| 60 61 62
63 64 65| 66 67 68| 69 70 71
72 73 74| 75 76 77| 78 79 80
The column unit of square 19:

   01    |         |
   10    |         |
   19    |         |
---------+---------+---------
   28    |         |
   37    |         |
   46    |         |
---------+---------+---------
   55    |         |
   64    |         |
   73    |         |

The row unit of square 19:

         |         |
         |         |
18 19 20 | 21 22 23| 24 25 26
---------+---------+---------
         |         |
         |         |
---------+---------+---------
         |         |
         |         |

The box unit of square 19:

00 01 02 |         |
09 10 11 |         |
18 19 20 |         |
---------+---------+---------
         |         |
         |         |
---------+---------+---------
         |         |
         |         |

The 20 peers of square 19 are all the numbered squares shown in the three diagrams, other than 19 itself.
The numbering scheme suggests representing a grid as a one-dimensional array of length 81. One might think that a two-dimensional 9x9 array would be the obvious data structure. But two factors argue against this: First, it will be more common to iterate over all the squares in a grid (and less common to iterate over the rows and then the columns). Second, two-dimensional arrays are a bit more complicated to deal with in the programming languages we will use.
Note: It is also possible to consider puzzles of different sizes. Instead of having boxes of size 3x3, each box could be 2x2 or 4x4 or 5x5, etc. We'll use the variable name Nbox for the box size, and N for the whole puzzle size. So the traditional Sudoku puzzle has Nbox = 3 and N = 9, and therefore N*N = 81 squares; a puzzle with Nbox = 5 has N = 25 and therefore N*N = 625 squares. Of course, each puzzle must have N "digits" as well (we can use letters as digits when N > 9).
We can implement the notions of digits, squares, units, and peers in the programming language Python as follows:
def grouper(n, items):
"Group items n at a time: grouper(3, 'abcdefgh') => ['abc', 'def', 'gh']"
return [items[i:i+n] for i in range(0, len(items), n)]
def cross(Rows, Cols):
"All squares in these rows and columns"
return [N*r+c for r in Rows for c in Cols]
Nbox = 3 # Each box is of size Nbox x Nbox (set Nbox = 2 for the tiny 4x4 puzzles below)
N = Nbox*Nbox # The whole grid is of size N x N
rows = range(N)
cols = rows
squares = range(N*N)
digits = '123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ!'[:N] # E.g., '123456789' for N = 9
digitset = set(digits)
rowunits = [cross([r], cols) for r in rows]
colunits = [cross(rows, [c]) for c in cols]
blocks = grouper(Nbox, rows) # E.g., [[0, 1, 2], [3, 4, 5], [6, 7, 8]] for N = 9
boxunits = [cross(rblock, cblock) for rblock in blocks for cblock in blocks]
allunits = rowunits + colunits + boxunits
units = [[u for u in allunits if s in u] for s in squares]
peers = [set(s2 for u in units[s] for s2 in u if s2!=s) for s in squares]
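As a quick sanity check, here is a standalone restatement of these definitions (my code, in Python 3 syntax, not part of the article's program) confirming that square 19 has the 3 units and 20 peers shown in the diagrams above:

```python
# Standalone Python 3 restatement of the definitions above, with Nbox = 3.
Nbox = 3
N = Nbox * Nbox          # 9

def grouper(n, items):
    "Group items n at a time."
    return [items[i:i+n] for i in range(0, len(items), n)]

def cross(Rows, Cols):
    "All squares in these rows and columns."
    return [N*r + c for r in Rows for c in Cols]

rows = list(range(N))
cols = rows
squares = range(N * N)
rowunits = [cross([r], cols) for r in rows]
colunits = [cross(rows, [c]) for c in cols]
blocks = grouper(Nbox, rows)
boxunits = [cross(rb, cb) for rb in blocks for cb in blocks]
allunits = rowunits + colunits + boxunits
units = [[u for u in allunits if s in u] for s in squares]
peers = [set(s2 for u in units[s] for s2 in u if s2 != s) for s in squares]

print(len(allunits))          # 27 units in total: 9 rows + 9 columns + 9 boxes
print(len(units[19]))         # every square belongs to 3 units
print(len(peers[19]))         # and has 20 peers (8 + 8 + 8, minus 4 overlaps)
print(sorted(peers[19])[:5])  # [0, 1, 2, 9, 10]
```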
The next step is to define the Sudoku playing grid. We will actually define two representations for grids: a gridstring, a linear string of characters used for input and output; and a grid, a list of N*N values used internally.
There is one complication: we will soon start filling in Sudoku grids with digits. If we reach an inconsistency--a grid where the same digit is used twice in a unit--we will use False to indicate the inconsistency. Thus, the grid data type is defined (implicitly) as either an 81-element list or the value False.
"4.....8.5.3..........7......2.....6.....8.4......1.......6.3.7.5..2.....1.4......" """ 400000805 030000000 000700000 020000060 000080400 000010000 000603070 500200000 104000000""" """ 4 . . |. . . |8 . 5 . 3 . |. . . |. . . . . . |7 . . |. . . ------+------+------ . 2 . |. . . |. 6 . . . . |. 8 . |4 . . . . . |. 1 . |. . . ------+------+------ . . . |6 . 3 |. 7 . 5 . . |2 . . |. . . 1 . 4 |. . . |. . . """
def parse(gridstring):
"""Convert a string into a grid: a list of values. Accepts digits (or '.' or '0'
for empty); all other characters ignored. In result, '.' used for empty."""
gridstring = gridstring.replace('0', '.')
grid = [c for c in gridstring if c in digits or c == '.']
assert len(grid) == N*N
return grid
def show(grid):
"Convert a grid to a gridstring laid out in 2D, with lines between boxes."
return "INCONSISTENCY" if grid is False else show_template.format(*grid)
show_line = '\n' + '+'.join(['--'*Nbox]*Nbox) + '\n'
show_template = show_line.join(['\n'.join(['|'.join(['{} '*Nbox]*Nbox)]*Nbox)]*Nbox)
And here is an example of the functions at work:
>>> puzzle1 = "4.....8.5.3..........7......2.....6.....8.4......1.......6.3.7.5..2.....1.4......"
>>> parse(puzzle1)
['4', '.', '.', '.', '.', '.', '8', '.', '5', '.', '3', '.', '.', '.', '.', '.', '.', '.', '.', '.', '.', '7', '.', '.', '.', '.', '.', '.', '2', '.', '.', '.', '.', '.', '6', '.', '.', '.', '.', '.', '8', '.', '4', '.', '.', '.', '.', '.', '.', '1', '.', '.', '.', '.', '.', '.', '.', '6', '.', '3', '.', '7', '.', '5', '.', '.', '2', '.', '.', '.', '.', '.', '1', '.', '4', '.', '.', '.', '.', '.', '.']
>>> print show(parse(puzzle1))
4 . . |. . . |8 . 5
. 3 . |. . . |. . .
. . . |7 . . |. . .
------+------+------
. 2 . |. . . |. 6 .
. . . |. 8 . |4 . .
. . . |. 1 . |. . .
------+------+------
. . . |6 . 3 |. 7 .
5 . . |2 . . |. . .
1 . 4 |. . . |. . .
>>> ''.join(parse(puzzle1))
'4.....8.5.3..........7......2.....6.....8.4......1.......6.3.7.5..2.....1.4......'
>>> ''.join(parse(puzzle1)) == puzzle1
True
Now let's define what it means to be a solution: as we implied before, a solution is a grid where every unit is filled with a permutation of the digits 1 to 9, and the filled squares of the original puzzle remain unchanged in the solution:
def is_solution(grid, puzzle):
"""A grid is a solution to a puzzle if each unit is a permutation
of the digits and if the filled squares remain unchanged."""
filled_squares = [s for s in squares if puzzle[s] in digitset]
def is_permutation(unit): return set(grid[s] for s in unit) == digitset
return (grid is not False
and all(is_permutation(u) for u in allunits)
and all(grid[s]==puzzle[s] for s in filled_squares))
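To exercise is_solution without running a full 9x9 search, here is a sketch of mine (Python 3 syntax, with the definitions scaled down to Nbox = 2) that checks two candidate grids against a 4x4 puzzle. The second candidate is a perfectly valid 4x4 grid, but it changes some of the puzzle's filled squares, so it is rejected:

```python
# Standalone Python 3 sketch: is_solution on a 4x4 (Nbox = 2) puzzle.
Nbox = 2
N = Nbox * Nbox

def grouper(n, items): return [items[i:i+n] for i in range(0, len(items), n)]
def cross(Rows, Cols): return [N*r + c for r in Rows for c in Cols]

rows = list(range(N)); cols = rows; squares = range(N*N)
digits = '123456789'[:N]
digitset = set(digits)
blocks = grouper(Nbox, rows)
allunits = ([cross([r], cols) for r in rows] +
            [cross(rows, [c]) for c in cols] +
            [cross(rb, cb) for rb in blocks for cb in blocks])

def parse(gridstring):
    grid = [c for c in gridstring.replace('0', '.') if c in digits or c == '.']
    assert len(grid) == N*N
    return grid

def is_solution(grid, puzzle):
    filled = [s for s in squares if puzzle[s] in digitset]
    def is_permutation(u): return set(grid[s] for s in u) == digitset
    return (grid is not False
            and all(is_permutation(u) for u in allunits)
            and all(grid[s] == puzzle[s] for s in filled))

tiny = parse('...4 4.3. .4.. .241')
good = parse('2314 4132 1423 3241')
bad  = parse('1234 3412 4321 2143')  # valid rows/cols/boxes, but alters filled squares

print(is_solution(good, tiny))   # True
print(is_solution(bad, tiny))    # False
```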
Humans solve Sudoku problems by using logical deduction to infer that a certain digit is (or is not) the value of a square; such deductions are repeated until all squares are filled with a single known digit. If the inference rules are sound and complete and the puzzle is well-formed, the human will only need a pencil--never an eraser. We could write an algorithm to do deductions like that, but the trick is in knowing when you have enough inference rules to solve every possible puzzle.
Therefore we will take a different route. Technically, it is known as model checking rather than deduction, and it consists of searching for a solution that works by trying out possibilities one by one. We will describe a series of search algorithms, starting with the hopelessly inefficient and ending with one that can solve over 5000 of the hardest Sudoku puzzles per second. However, when implemented by a human, this strategy requires an eraser as well as a pencil.
First fill in all the empty squares with digits. Then check to see if we have a solution. If not, then fill the empty squares with a different combination of digits. Repeat with different combinations until a solution is found.
Sometimes generate and test is the best you can do. Herb Simon, in his classic book The Sciences of the Artificial (page 194), describes the task of opening a safe that has ten dials, each with 100 numbers on it. If the safe is well-designed there is no better approach than trying all 100^10 combinations.
Here is the implementation of generate and test for Sudoku:
def solve(gridstring):
"Generate every combination of digits for the empty squares; test each one."
from itertools import product
puzzle = parse(gridstring)
for candidate in product(*[(v if v in digits else digits) for v in puzzle]):
if is_solution(candidate, puzzle):
return candidate
return None
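Scaled down to a 4x4 grid, generate and test actually terminates in a reasonable time. Here is a standalone Python 3 restatement of mine, enumerating candidates with itertools.product exactly as above:

```python
# Standalone Python 3 sketch of generate and test on a 4x4 (Nbox = 2) puzzle.
from itertools import product

Nbox = 2
N = Nbox * Nbox
def grouper(n, items): return [items[i:i+n] for i in range(0, len(items), n)]
def cross(Rows, Cols): return [N*r + c for r in Rows for c in Cols]
rows = list(range(N)); cols = rows; squares = range(N*N)
digits = '123456789'[:N]; digitset = set(digits)
blocks = grouper(Nbox, rows)
allunits = ([cross([r], cols) for r in rows] +
            [cross(rows, [c]) for c in cols] +
            [cross(rb, cb) for rb in blocks for cb in blocks])

def parse(gridstring):
    grid = [c for c in gridstring.replace('0', '.') if c in digits or c == '.']
    assert len(grid) == N*N
    return grid

def is_solution(grid, puzzle):
    filled = [s for s in squares if puzzle[s] in digitset]
    def is_permutation(u): return set(grid[s] for s in u) == digitset
    return (grid is not False
            and all(is_permutation(u) for u in allunits)
            and all(grid[s] == puzzle[s] for s in filled))

def solve(gridstring):
    "Try every combination of digits for the empty squares; test each one."
    puzzle = parse(gridstring)
    for candidate in product(*[(v if v in digits else digits) for v in puzzle]):
        if is_solution(candidate, puzzle):
            return candidate
    return None

solution = solve('...4 4.3. .4.. .241')  # at most 4**9 = 262,144 candidates
print(''.join(solution))                 # '2314413214233241'
```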
And here it is solving a tiny puzzle. There are nine empty squares,
so generate and test first tries to fill them with the digits 111111111, then
111111112, then 111111113, and so on until a solution is found with 231121233:
>>> tiny = '...4 4.3. .4.. .241'
>>> print show(parse(tiny))
. . |. 4
4 . |3 .
----+----
. 4 |. .
. 2 |4 1
>>> print show(solve(tiny))
2 3 |1 4
4 1 |3 2
----+----
1 4 |2 3
3 2 |4 1
It took 0.8 seconds to solve this tiny 4x4 puzzle. Now consider the 9x9 puzzle defined above as puzzle1. It has 17 filled squares, and thus 64 empty squares. So there are 9^64 possible combinations (about 10^61) to consider. How long will it take to run through them?
Suppose we have access to a secret new custom 1024 core 10 GHz CPU that features special hardware instructions that can generate and test a position in one clock cycle, and let's say we could afford to buy a billion of these, and while we're shopping, let's say we also pick up a time machine and go back 13 billion years to the early days of the universe, and a fleet of starships with which we visit 10 billion galaxies and in each galaxy convince the inhabitants of 10 billion planets to each buy a billion of these computers, and we all start our programs running, managing to distribute the cases perfectly so that no computation is duplicated or wasted. Then we can estimate that we'd be less than 4% done with this one puzzle by now.
Generate and test is not the algorithm you're looking for. Move along.
Fortunately for us, Sudoku is more like the defective safe. To be precise, it is like a safe with 81 dials, each with 9 numbers, and sometimes if you put one of the dials in the wrong position you hear a sound, but sometimes not (depending on the positions of the other dials). For example, in the lower-left corner of the tiny puzzle above, if we try 1, 2, or 4 we would immediately get feedback that those digits won't work, because they already appear elsewhere in the bottom row. Therefore, the lower-left corner must be filled by a 3. There are 262,144 ways to fill in the 9 empty squares, but right away we can eliminate 196,608 of them; we need only consider the ones that start with a 3.
Unfortunately for us, Sudoku does not give up all its secrets so easily. Sometimes, when we consider a square, we can't immediately tell what to put there. For example, in the upper-left of the tiny puzzle, only 4 can be eliminated as a possibility. That's ok--we're allowing ourselves the use of an eraser, so just guess one of the remaining three possibilities; if it doesn't pan out, erase it and try one of the others. This gives us the incremental search algorithm:
Start filling squares with digits, always making sure that the digit we put in a square is not the same as any peer. If we reach a square where every digit conflicts with a peer, back up and consider a different digit for the previous square. Stop when we successfully fill every square, or give up if we tried all possibilities without finding a solution. The progress of this algorithm can be described as a search tree, where each node represents a partially-filled (incremental) state of the puzzle, and a branch corresponds to filling in a square with a digit. (We display the new digits in red.) Here is a search tree for the tiny puzzle:
A few things to note about this search tree:
The code to implement incremental search is straightforward:
def search(grid):
"""Select an unfilled square, try each possible value in order, recursively searching.
When all squares filled: success; when no more digits to try: return False for failure."""
if grid is False: return False
s = select_square(grid)
if s is None: return grid
for d in possible_values(grid, s):
result = search(assign(grid[:], s, d))
if result: return result
return False
def select_square(grid):
"Return the first square that is not filled yet; or None if all squares are filled."
for s in squares:
if grid[s] == '.': return s
return None
def possible_values(grid, s):
    "A square can be any digit that is not already taken by a peer."
    return digitset - set(grid[p] for p in peers[s])
def assign(grid, s, d):
"For now, simply assign grid[s] = d."
grid[s] = d
return grid
def solve(gridstring):
"Parse gridstring and search for a solution."
return search(parse(gridstring))
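These pieces assemble into a runnable whole. Here is my own standalone Python 3 condensation of the incremental search, exercised on the tiny 4x4 puzzle (Nbox = 2) so it finishes instantly:

```python
# Standalone Python 3 condensation of incremental search on the 4x4 puzzle.
Nbox = 2
N = Nbox * Nbox
def grouper(n, items): return [items[i:i+n] for i in range(0, len(items), n)]
def cross(Rows, Cols): return [N*r + c for r in Rows for c in Cols]
rows = list(range(N)); cols = rows; squares = range(N*N)
digits = '123456789'[:N]; digitset = set(digits)
blocks = grouper(Nbox, rows)
allunits = ([cross([r], cols) for r in rows] +
            [cross(rows, [c]) for c in cols] +
            [cross(rb, cb) for rb in blocks for cb in blocks])
units = [[u for u in allunits if s in u] for s in squares]
peers = [set(s2 for u in units[s] for s2 in u if s2 != s) for s in squares]

def parse(gridstring):
    grid = [c for c in gridstring.replace('0', '.') if c in digits or c == '.']
    assert len(grid) == N*N
    return grid

def select_square(grid):
    "First square that is not filled yet; None if all are filled."
    return next((s for s in squares if grid[s] == '.'), None)

def possible_values(grid, s):
    "Any digit not already taken by a peer of s."
    return digitset - set(grid[p] for p in peers[s])

def assign(grid, s, d):
    "For now, simply assign grid[s] = d."
    grid[s] = d
    return grid

def search(grid):
    "Try each possible value for an unfilled square, recursively."
    if grid is False: return False
    s = select_square(grid)
    if s is None: return grid
    for d in possible_values(grid, s):
        result = search(assign(grid[:], s, d))
        if result: return result
    return False

solution = search(parse('...4 4.3. .4.. .241'))
print(''.join(solution))   # '2314413214233241'
```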
The key function is search. It obeys the convention that if
it is passed False (to indicate an invalid grid) it returns
False, meaning that no solution can be found. Otherwise it
selects some unfilled square to work on. If select_square
returns None, that is actually good news: it means that all
the squares are already filled and we are done; we have a solution,
so we just return it. Otherwise, the for loop says that for each possible
digit that could fill square s, we try to assign that digit to
s and search for a solution from there. If some call to
search succeeds, return it; but if a digit assignment causes
the search to fail, just keep going with the next digit assignment.
If all digit assignments fail, then the whole call to search
fails, and we back up to the previous recursive call.
Here we see solve (and therefore search) in action:
>>> print show(solve(puzzle1))
4 1 7 |3 6 9 |8 2 5
6 3 2 |1 5 8 |9 4 7
9 5 8 |7 2 4 |3 1 6
------+------+------
8 2 5 |4 3 7 |1 6 9
7 9 1 |5 8 6 |4 3 2
3 4 6 |9 1 2 |7 5 8
------+------+------
2 8 9 |6 4 3 |5 7 1
5 7 3 |2 9 1 |6 8 4
1 6 4 |8 7 5 |2 9 3

That's all there is to it! The only reason we don't stop this article now is that the program is still too slow for some purposes. We're not talking billions-of-years slow, but it did take 3 minutes and 45 seconds to solve this puzzle (a rather hard one). On a benchmark of 50 easy puzzles the program took a total of 9.5 seconds; a rate of 5 Hz. Not bad. If you're happy with that speed, you can stop reading now.
However, if you wanted to solve a million puzzles, you'd prefer a faster algorithm. Look again at the search tree above, and consider how we can find the solution faster. Here are some general strategies:
Consider this branch of the search tree from above:
As soon as we place a 1 in the upper-left corner, there is an inconsistency: there is no value that can go in the square in the second row, second column (because 1 is in the same box, 2 is in the same column, and 3 and 4 are in the same row). But our incremental search does not discover that until three more levels in the tree, when it gets around to selecting the offending square. The idea of forward checking is that we don't wait: every time we place a digit, we check all the unfilled squares and if one of them has no possible values we fail immediately rather than waiting to select the offending square.
In this case incremental search is not too bad; it generates two nodes in the search tree that could have been pruned. But in other cases this could be millions of nodes--forward checking is definitely worth it for Sudoku.
However, there is a cost to forward checking: we need to figure out
if any non-selected square has no possible values left. To do that
efficiently we will keep a collection of remaining possible values for every square;
we can store the collection right in the grid. That is, grid[s]
will hold a collection of all the remaining possible digits for
square s. The collection could be represented by a
Python set or list, but I chose instead to use a
string of digits. So the value for the upper-left square in the tiny
puzzle (before it is assigned a value) would be '123'.
We'll change solve so that it calls initialize to set all the unfilled squares to their initial value according to this convention. We'll also change assign(grid, s, d) so that it eliminates d from all the peers of s and fails if any peer has no remaining possible values. The eliminate function removes a digit from the possibilities for a square, and also does forward checking--checks if a peer is reduced to no values. (We add an extra level of indirection--the function check calls forward_check--because we will soon introduce other kinds of checks.) With these added complications one function becomes easier: possible_values can now just look up the collection of possible values rather than having to compute it. The functions search and select_square remain unchanged.
def solve(gridstring):
"Parse gridstring, handle initial constraints, and search for a solution."
return search(initialize(parse(gridstring)))
def initialize(puzzle):
    """First set each square in the result grid to have the set of all possible values.
    Then assign each filled square to the specified value."""
    grid = N * N * [digits]
    for s in squares:
        if puzzle[s] in digitset:
            grid = assign(grid, s, puzzle[s])
            if grid is False: return False  # the givens themselves are inconsistent
    return grid
def possible_values(grid, s): return grid[s]
def assign(grid, s, d):
"""Assign grid[s] = d and eliminate d from the peers of s.
Return the updated grid, or False if inconsistency detected."""
if d not in grid[s]: return False # d is not among the possibilities
grid[s] = d
if not all(eliminate(grid, p, d) for p in peers[s]): return False
return grid
def eliminate(grid, s, d):
"Remove d from possibilities for grid[s]. If checking finds an inconsistency return False."
if d not in grid[s]: return grid # Already eliminated d
grid[s] = grid[s].replace(d, '')
return check(grid, s, d)
def check(grid, s, d): return forward_check(grid, s)
def forward_check(grid, s):
"Fail if s has no possible value, or one value which is the same as a peer."
nvals = len(grid[s])
return not (nvals==0 or (nvals==1 and any(grid[p]==grid[s] for p in peers[s])))
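Assembled into one runnable file, the forward-checking solver looks like this (my own Python 3 condensation of the functions above, exercised on the tiny 4x4 puzzle so it finishes instantly):

```python
# Standalone Python 3 condensation of the forward-checking solver, Nbox = 2.
Nbox = 2
N = Nbox * Nbox
def grouper(n, items): return [items[i:i+n] for i in range(0, len(items), n)]
def cross(Rows, Cols): return [N*r + c for r in Rows for c in Cols]
rows = list(range(N)); cols = rows; squares = range(N*N)
digits = '123456789'[:N]; digitset = set(digits)
blocks = grouper(Nbox, rows)
allunits = ([cross([r], cols) for r in rows] +
            [cross(rows, [c]) for c in cols] +
            [cross(rb, cb) for rb in blocks for cb in blocks])
units = [[u for u in allunits if s in u] for s in squares]
peers = [set(s2 for u in units[s] for s2 in u if s2 != s) for s in squares]

def parse(gridstring):
    grid = [c for c in gridstring.replace('0', '.') if c in digits or c == '.']
    assert len(grid) == N*N
    return grid

def initialize(puzzle):
    "Give every square all possible values, then assign the filled squares."
    grid = N*N*[digits]
    for s in squares:
        if puzzle[s] in digitset:
            grid = assign(grid, s, puzzle[s])
            if grid is False: return False
    return grid

def assign(grid, s, d):
    "Assign grid[s] = d and eliminate d from the peers of s."
    if d not in grid[s]: return False
    grid[s] = d
    return grid if all(eliminate(grid, p, d) for p in peers[s]) else False

def eliminate(grid, s, d):
    "Remove d from the possibilities for grid[s]; then forward check."
    if d not in grid[s]: return grid       # already eliminated
    grid[s] = grid[s].replace(d, '')
    return forward_check(grid, s)

def forward_check(grid, s):
    "Fail if s has no value left, or one value that a peer also has."
    nvals = len(grid[s])
    return not (nvals == 0 or
                (nvals == 1 and any(grid[p] == grid[s] for p in peers[s])))

def search(grid):
    if grid is False: return False
    s = next((s2 for s2 in squares if len(grid[s2]) > 1), None)
    if s is None: return grid
    for d in grid[s]:
        result = search(assign(grid[:], s, d))
        if result: return result
    return False

solution = search(initialize(parse('...4 4.3. .4.. .241')))
print(''.join(solution))   # '2314413214233241'
```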
How much does forward checking help? For the easy puzzles, not much. One reason they are called "easy" is that they don't require forward checking. For the harder puzzle1 puzzle the solving speed is improved by a factor of 5. But that's still 42 seconds. Let's keep improving.
This idea is called arc consistency; it gets its name from the practice of describing a constraint satisfaction problem as a graph where the nodes are the variables (in Sudoku, the squares) and the arcs are the constraints. We make the arcs (the peer constraints) consistent one by one, and we can end up traversing multiple links in the graph (whereas the forward checking algorithm only looks ahead one arc in the graph).
The implementation of arc consistency for Sudoku is a simple change to what we already have:
def check(grid, s, d): return arc_consistent(grid, s)
def arc_consistent(grid, s):
"Return true if s has multiple values left, or one that we can consistently assign."
nvals = len(grid[s])
return nvals >= 2 or (nvals == 1 and assign(grid, s, grid[s]))
Compared to forward checking, arc consistency reduces the time to solve puzzle1 from 43 seconds down to 0.16 seconds (a 266-fold speedup). The 50 easy problems are speeded up from a rate of 5 per second to 282 per second, a 56-fold speedup. In fact, 70% of the easy problems have their search tree pruned to a single node with this approach. Clearly, arc consistency is very effective on Sudoku problems (as it is on many constraint satisfaction problems), and one might be tempted to stop here. But let's try just a few more of the strategies that have proven popular for Sudoku (they're easy to implement).
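To make the speedup concrete, here is my own standalone Python 3 condensation of the arc-consistent solver, with the check indirection inlined as arc_consistent, solving puzzle1 (this is a sketch of the code above, not the article's sudoku.py):

```python
# Standalone Python 3 condensation: search + arc consistency, Nbox = 3.
Nbox = 3
N = Nbox * Nbox
def grouper(n, items): return [items[i:i+n] for i in range(0, len(items), n)]
def cross(Rows, Cols): return [N*r + c for r in Rows for c in Cols]
rows = list(range(N)); cols = rows; squares = range(N*N)
digits = '123456789'[:N]; digitset = set(digits)
blocks = grouper(Nbox, rows)
allunits = ([cross([r], cols) for r in rows] +
            [cross(rows, [c]) for c in cols] +
            [cross(rb, cb) for rb in blocks for cb in blocks])
units = [[u for u in allunits if s in u] for s in squares]
peers = [set(s2 for u in units[s] for s2 in u if s2 != s) for s in squares]

def parse(gridstring):
    grid = [c for c in gridstring.replace('0', '.') if c in digits or c == '.']
    assert len(grid) == N*N
    return grid

def initialize(puzzle):
    grid = N*N*[digits]
    for s in squares:
        if puzzle[s] in digitset:
            grid = assign(grid, s, puzzle[s])
            if grid is False: return False
    return grid

def assign(grid, s, d):
    if d not in grid[s]: return False
    grid[s] = d
    return grid if all(eliminate(grid, p, d) for p in peers[s]) else False

def eliminate(grid, s, d):
    if d not in grid[s]: return grid       # already eliminated
    grid[s] = grid[s].replace(d, '')
    return arc_consistent(grid, s)

def arc_consistent(grid, s):
    "If s is down to one value, propagate by assigning it."
    nvals = len(grid[s])
    return nvals >= 2 or (nvals == 1 and assign(grid, s, grid[s]))

def search(grid):
    if grid is False: return False
    s = next((s2 for s2 in squares if len(grid[s2]) > 1), None)
    if s is None: return grid
    for d in grid[s]:
        result = search(assign(grid[:], s, d))
        if result: return result
    return False

puzzle1 = ('4.....8.5' + '.3.......' + '...7.....' +
           '.2.....6.' + '....8.4..' + '....1....' +
           '...6.3.7.' + '5..2.....' + '1.4......')
solution = search(initialize(parse(puzzle1)))
print(''.join(solution))
```

The printed result is the solution shown at the top of the article.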
def dual_consistent(grid, s, d):
"""After eliminating d from grid[s], check each unit of s and make sure there is some
position in the unit for d. If only one possible place left for d, assign it."""
for u in units[s]:
places_for_d = [s2 for s2 in u if d in grid[s2]]
nplaces = len(places_for_d)
if nplaces==0 or (nplaces==1 and not assign(grid, places_for_d[0], d)):
return False
return True
def naked_pairs(grid, s):
"""If two squares s and s2 in a unit have the same two values,
eliminate those values from every other square s3 in the unit."""
vals = grid[s]
if len(vals) != 2: return True
for u in units[s]:
for s2 in u:
if s2 != s and grid[s2] == vals: # Found naked pair (s, s2)
for s3 in u:
if s != s3 != s2:
if not all(eliminate(grid, s3, v) for v in vals):
return False
return True
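The effect of the naked-pairs rule is easy to see on a single unit in isolation. In this little sketch (mine, not the article's code) each cell holds its string of remaining candidate digits; the pair '16' appearing in exactly two cells lets us delete 1 and 6 from every other cell in the unit:

```python
# One row of a 9x9 grid, as candidate strings. Two cells hold exactly '16',
# so no other cell in this row can contain 1 or 6.
unit = ['16', '1268', '16', '4', '3', '168', '7', '5', '9']
pair = '16'
reduced = [c if c == pair else ''.join(d for d in c if d not in pair)
           for c in unit]
print(reduced)   # ['16', '28', '16', '4', '3', '8', '7', '5', '9']
```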
| Filename | N puzzles | Description |
|---|---|---|
| puzzle1.txt | 1 | hard puzzle, called puzzle1 above |
| easy50.txt | 50 | easy puzzles |
| hardest.txt | 11 | puzzles from around the web advertised as "hardest" |
| top95.txt | 95 | hard puzzles |
| top870.txt | 870 | hard puzzles |
| top2365.txt | 2365 | hard puzzles |
| subig20.txt | 17,445 | hard puzzles |
| min17.txt | 49,151 | hard puzzles with 17 filled squares |
| ALL.txt | 69,091 | all of the above (duplicates eliminated) |
The code to gather statistics on the runtime of our algorithms on these puzzle sets can be seen in sudoku.py.
I ran the benchmark code on the five serious algorithms: incremental search (incr), forward checking (FC), and arc consistency by itself (AC), with dual consistency (AC+Dual), and with dual plus naked pairs (AC+D+NP). For each benchmark file we can see the name of the file, the number of puzzles, the average speed of solving (expressed two ways, as a frequency in Hertz and as the average number of seconds), the standard deviation of the results, and then, to give an idea of the distribution of times, the minimum, the first percentile, 25th percentile, median, 75th percentile, 99th percentile, and maximum times. Here are the results:
version name N Hz avg stddev min 1% 25% median 75% 99% max
incr puzzle1 1 0 225.729 0.000 225.729 225.729 225.729 225.729 225.729 225.729 225.729
FC puzzle1 1 0 42.132 0.000 42.132 42.132 42.132 42.132 42.132 42.132 42.132
FC easy50 50 5 0.172 0.272 0.004 0.004 0.010 0.050 0.161 0.987 0.987
AC puzzle1 1 6 0.159 0.000 0.159 0.159 0.159 0.159 0.159 0.159 0.159
AC easy50 50 285 0.004 0.005 0.001 0.001 0.001 0.002 0.003 0.025 0.025
AC hardest 11 22 0.044 0.113 0.001 0.001 0.002 0.010 0.014 0.401 0.401
AC top95 95 0 1.030 2.668 0.001 0.001 0.024 0.072 0.329 16.211 16.211
AC top870 870 4 0.221 1.111 0.001 0.001 0.006 0.020 0.060 6.943 16.197
AC+Dual puzzle1 1 62 0.016 0.000 0.016 0.016 0.016 0.016 0.016 0.016 0.016
AC+Dual easy50 50 188 0.005 0.001 0.005 0.005 0.005 0.005 0.005 0.009 0.009
AC+Dual hardest 11 67 0.015 0.010 0.005 0.005 0.006 0.011 0.018 0.035 0.035
AC+Dual top95 95 15 0.067 0.128 0.005 0.005 0.010 0.021 0.055 0.890 0.890
AC+Dual top870 870 33 0.030 0.074 0.005 0.005 0.008 0.014 0.028 0.298 1.273
AC+D+NP puzzle1 1 147 0.007 0.000 0.007 0.007 0.007 0.007 0.007 0.007 0.007
AC+D+NP easy50 50 183 0.005 0.001 0.005 0.005 0.005 0.005 0.005 0.007 0.007
AC+D+NP hardest 11 83 0.012 0.007 0.005 0.005 0.007 0.009 0.015 0.031 0.031
AC+D+NP top95 95 30 0.033 0.063 0.005 0.005 0.007 0.012 0.025 0.489 0.489
AC+D+NP top870 870 58 0.017 0.033 0.005 0.005 0.006 0.010 0.017 0.137 0.596
Note that for the easy50 file, the arc consistency algorithm
is fastest. That is because it is the simplest algorithm (it has no
unnecessary checks or bookkeeping), yet it does enough to solve all
the easy puzzles. For all the other benchmarks, it is worth the effort
to incorporate arc consistency plus unit consistency plus naked pairs.
Here are the results; the benchmarks uphold the idea that MinRV is a good strategy; it performs the best on all the benchmarks except easy50, where it falls just slightly behind firstV. This is true on the average time to solution, and even more so on the maximum time.
version name    N    Hz   avg  stddev   min    1%   25% median   75%    99%    max
firstV  easy50   50  183 0.005  0.001 0.005 0.005 0.005  0.005 0.005  0.007  0.007
firstV  hardest  11   83 0.012  0.007 0.005 0.005 0.007  0.009 0.015  0.031  0.031
firstV  top95    95   30 0.033  0.063 0.005 0.005 0.007  0.012 0.025  0.489  0.489
firstV  top870  870   58 0.017  0.033 0.005 0.005 0.006  0.010 0.017  0.137  0.596
randomV easy50   50  165 0.006  0.003 0.005 0.005 0.005  0.005 0.006  0.019  0.019
randomV hardest  11  113 0.009  0.004 0.005 0.005 0.006  0.006 0.010  0.018  0.018
randomV top95    95    7 0.127  0.203 0.005 0.005 0.021  0.047 0.120  1.090  1.090
randomV top870  870   25 0.039  0.101 0.005 0.005 0.010  0.018 0.032  0.490  1.762
MinRV   easy50   50  182 0.005  0.001 0.005 0.005 0.005  0.005 0.005  0.008  0.008
MinRV   hardest  11  142 0.007  0.002 0.005 0.005 0.005  0.006 0.010  0.011  0.011
MinRV   top95    95   57 0.017  0.017 0.005 0.005 0.007  0.012 0.021  0.132  0.132
MinRV   top870  870   73 0.014  0.012 0.005 0.005 0.007  0.010 0.016  0.055  0.185
MaxRV   easy50   50  164 0.006  0.002 0.005 0.005 0.005  0.005 0.005  0.017  0.017
MaxRV   hardest  11   25 0.038  0.058 0.005 0.005 0.007  0.013 0.058  0.210  0.210
MaxRV   top95    95    0 1.161  2.647 0.005 0.005 0.075  0.172 0.913 19.633 19.633
MaxRV   top870  870    4 0.223  0.983 0.005 0.005 0.025  0.050 0.109  5.129 19.664

Here is the code to implement the three new variable selection strategies and produce the benchmark results:
import random
def select_square(values):
    "randomV: return a random unfilled square; or None if no unfilled squares."
    def numvalues(s): return len(values[s])
    unfilled = [s for s in squares if numvalues(s) > 1]
    return None if unfilled == [] else random.choice(unfilled)
def select_square(values):
    "MinRV: return an unfilled square with the fewest possible values; or None if no unfilled squares."
    def numvalues(s): return len(values[s])
    unfilled = [s for s in squares if numvalues(s) > 1]
    return None if unfilled == [] else min(unfilled, key=numvalues)
def select_square(values):
    "MaxRV: return an unfilled square with the most possible values; or None if no unfilled squares."
    def numvalues(s): return len(values[s])
    unfilled = [s for s in squares if numvalues(s) > 1]
    return None if unfilled == [] else max(unfilled, key=numvalues)
def constraining_value(values, s, d):
"Count the number of peers of s that would have 0 or 1 value left without d."
return sum((d in values[s2] and len(values[s2]) <= 2) for s2 in peers[s])
def possible(values, s):
    "LeastCV: try the least constraining values first."
    return sorted(values[s], key=lambda d: constraining_value(values, s, d))

def possible(values, s):
    "MostCVl: try the most constraining values first."
    return sorted(values[s], key=lambda d: -constraining_value(values, s, d))

def possible(values, s):
    "RandVal: try the values in random order."
    vals = list(values[s])
    random.shuffle(vals)
    return vals
benchmark([1,2,3,4], "RandVal")
And here are the results. We see that value ordering makes almost no
difference.
version name    N    Hz   avg  stddev   min    1%   25% median   75%   99%   max
MostCVl easy50   50  184 0.005  0.000 0.005 0.005 0.005  0.005 0.005 0.008 0.008
MostCVl hardest  11  113 0.009  0.006 0.005 0.005 0.006  0.006 0.009 0.029 0.029
MostCVl top95    95   57 0.018  0.018 0.005 0.005 0.008  0.012 0.021 0.137 0.137
MostCVl top870  870   72 0.014  0.013 0.005 0.005 0.007  0.011 0.016 0.057 0.193
LeastCV easy50   50  181 0.006  0.001 0.005 0.005 0.005  0.005 0.005 0.009 0.009
LeastCV hardest  11  124 0.008  0.004 0.005 0.005 0.006  0.006 0.010 0.018 0.018
LeastCV top95    95   54 0.018  0.018 0.005 0.005 0.008  0.012 0.022 0.135 0.135
LeastCV top870  870   72 0.014  0.013 0.005 0.005 0.008  0.011 0.016 0.058 0.194
RandVal easy50   50  185 0.005  0.000 0.005 0.005 0.005  0.005 0.005 0.008 0.008
RandVal hardest  11   91 0.011  0.007 0.005 0.005 0.006  0.007 0.014 0.029 0.029
RandVal top95    95   52 0.019  0.018 0.005 0.005 0.008  0.013 0.023 0.123 0.123
RandVal top870  870   75 0.013  0.012 0.005 0.005 0.007  0.010 0.015 0.056 0.209
version name       N     Hz   avg  stddev   min    1%   25% median   75%   99%   max
Normal  top95      95    38 0.026  0.025 0.008 0.008 0.011  0.018 0.032 0.196 0.196
Normal  top870     870   49 0.020  0.018 0.008 0.008 0.011  0.016 0.023 0.082 0.275
Normal  top2365    2365  54 0.018  0.018 0.008 0.008 0.010  0.014 0.021 0.076 0.435
Normal  subig20   17445  83 0.012  0.022 0.007 0.007 0.008  0.008 0.010 0.066 1.583
Normal  min17     49151  49 0.020  0.052 0.007 0.007 0.008  0.010 0.016 0.179 3.385
Normal  200K     200000  47 0.021  0.051 0.007 0.007 0.008  0.011 0.017 0.178 3.732

The slowest puzzle (solution in 3.732 seconds) and its solution:

1 . . |. . . |. 6 4      1 3 2 |8 5 7 |9 6 4
. . . |3 2 . |. . .      9 6 4 |3 2 1 |5 7 8
. 5 . |. . . |. . .      7 5 8 |9 6 4 |1 3 2
------+------+------     ------+------+------
. 2 . |6 4 . |. . .      5 2 1 |6 4 8 |7 9 3
6 . . |. . . |. . 1      6 8 9 |5 7 3 |2 4 1
. . . |. . . |8 . .      3 4 7 |2 1 9 |8 5 6
------+------+------     ------+------+------
. . . |. . 5 |3 2 .      8 1 6 |4 9 5 |3 2 7
4 . . |7 . . |. . .      4 9 3 |7 8 2 |6 1 5
. . . |. . . |. . .      2 7 5 |1 3 6 |4 8 9
version name       N     max   N-ratio max-ratio
Normal  top95      95    0.196
Normal  top870     870   0.275   9.2     1.4
Normal  top2365    2365  0.435   2.7     1.6
Normal  subig20   17445  1.583   7.4     3.6
Normal  min17     49151  3.385   2.8     2.1
Normal  200K     200000  3.732   4.0     1.1
Therefore I decided to port the program to Java, which has good facilities for concurrency. The translation is fairly straightforward, but there are some changes; some because of the nature of Java, and some because I wanted to take advantage of this rewrite opportunity to make some choices that will enhance efficiency. Here are the main differences between the Python version and the Java version:
When we start adding threads, the performance increases almost linearly with the number of threads up to 6 (we reach speedups of 5.66 and 5.36 in two runs with 6 threads). That is what I would expect with 6 CPUs. However, as we add more threads the gains increase at about half the slope up to 10 threads (where we have a speedup of 7.20 or 7.26 in two runs). From there we have a long flat plateau where the speedup is between 7 and 7.5 no matter how many threads we add. (This is 700 times faster than the Python program.) The top measured performance happens to be at 40 threads, with 43,873 puzzles solved per second, but this is more due to random variation than to any special property of the number 40; if we repeated the experiment, the peak might come at 20, or 12. There is variation because the computer is doing other things (I didn't shut down all other processes) and because the Java JIT compiler performs slightly differently each run. I did let the JIT "warm up" by solving a few hundred thousand puzzles before I started the measurements.
Don't let my reluctance stop you: it would be a great project to implement any of these techniques for Sudoku.