sudoku dictionary attack

sub1ime_uk at yahoo.com sub1ime_uk at yahoo.com
Mon Jun 20 12:24:41 EDT 2005


Thought I'd offer a method for solving all possible 9x9 sudoku puzzles
in one go. It'll takes a bit of time to run however (and 9x9 seems to
be about as big as is reasonably possible before combinatorial
explosion completely scuppers this type of program)...

Basic idea:-

Start with a grid initialised with:

123456789
234567891
345678912
456789123
567891234
678912345
789123456
891234567
912345678

Note that all rows and columns contain all 9 digits (but that the
sub-tiles aren't correct for a sudoku solution).

Next write a program which permutes all 9 columns, and then for each of
those permutations permutes the last 8 rows. This will, I believe,
generate all possible grids with no digit repetitions in any row or
column. It will generate 14,631,321,600 (9!*8!) possible sudoku
candidates. Finally, check each candidate to see if any 3x3 subtile has
a repeated digit in it and discard as soon as a repeat is found (sooner
the better). Any that come through unscathed get written as a 82 (81 +
lf) char string to an output file.

I've written a short program (in Python; see below) that tries out this
idea. Indications are that my HP nc6000 laptop can check around 30,000
candidates/sec and that c.0.15% of them are valid sudoku solutions.
That means it would take around 6.5 days to generate the between 20-30
million possible sudoku solutions it will find. That would require
around 2GB in uncompressed disk storage. Gzip does a VERY good job of
compressing files containing this type of data -- so I'd expect well
over 80% compression (might even fit on a CD then).

Once you've generated the solution data then comes the fun of searching
it efficiently which I leave as an exercise for the reader :-)

Regards,  sub1ime_uk (at) yahoo (dot) com

==================================================================
#!python
#
# sudoku.py - generate all valid sudoku solutions
#
# Usage: sudoku <N> <S>
#    eg: sudoku 9 3
# Whare:-
#        N is the grid size (ie 9 for 9x9)
#        S is the sub-tile size (3 for 3x3)
#
# (c) 2005 sub1ime_uk (at) yahoo (dot) com
#
import sys
from gzip import GzipFile
from time import time

def permute(s):
  if len(s)==0: return
  if len(s)==1:
    yield s
    return
  for i in range(len(s)):
    for t in permute(s[:i]+s[i+1:]):
      yield s[i:i+1]+t
  return

def populate(sz, ini):
  tile = []
  for i in range(sz):
    tile.append([])
    for j in range(sz):
      x = chr((i+j)%sz+ord(ini))
      tile[i].append(x)
  return tile

def subtilesok(t, c, r, n, s):
  for x in range(0, n, s):
    for y in range(0, n, s):
      d = {}
      for i in range(s):
        cn = c[x+i]
        for j in range(s):
          rn = r[y+j]
          d[t[cn][rn]] = 1
      if len(d.keys())!=n: return 0
  return 1

def displaytile(t, c, r, lf):
  lfstr=''
  print
  for i in r:
    row = []
    for j in c:
      row.append(t[j][i])
    r=''.join(row)
    lfstr += r
    print "    ",r
  print
  lf.write(lfstr+"\n")

def fact(n):
  if n<2: return 1
  return n*fact(n-1)

if __name__ == '__main__':
  st = time()
  logf = GzipFile("c:\\temp\\sudoku.gz", "w")
  N=int(sys.argv[1])
  S=int(sys.argv[2])
  estimate = fact(N)*fact(N-1)
  if N!=S*S:
    print "Subtile size", S, "not sqrt of tile size", N
    sys.exit(1)
  cols = [x for x in range(N)]
  rows = [x for x in range(1, N)]
  primarytile = populate(N, '1')
  count = 0
  answc = 0
  for colp in permute(cols):
    for rowp in permute(rows):
      count += 1
      if subtilesok(primarytile, colp, [0]+rowp, N, S):
        answc += 1
        ct = time()
        et=ct-st
        if et>0.0:
          print "Found: %d out of %d (%.2f%%) checked" % (answc, count,
(answc*100./count))
          print "Progress: %.2f%%" % ((count*100./estimate))
          print "Elapsed time: %.2f secs, checked: %d/s, found %d/s." %
(et, (count/et), (answc/et))
          print "Estimate time to go: %.2f hours" %
((estimate-count)*et/(count*3600.))
        else:
          print "%d / %d (%5.2f%%)" % (answc, count,
(answc*100./count))
        displaytile(primarytile, colp, [0]+rowp, logf)
  print
  print "Checked", count,"tiles. Found", answc,"answers."
  logf.close()
  sys.exit()
===================================================================




More information about the Python-list mailing list