Faster way to do this?
Skip Montanaro
skip at pobox.com
Wed May 21 09:06:19 EDT 2003
>> Another, quite a bit faster method. Would generating a list of all
>> possible words, then looking for them as dictionary keys be any
>> faster?
Greg> This will work, but be careful if your string length gets very
Greg> long. At 6 letters, you've got 720 permutations which would go
Greg> rather quickly and not each too much memory. At 8 letters, it
Greg> jumps to 40,320 permutations and at 10, it's already above 3.5
Greg> million.
In private email with Freddie (I think he got confused by my cc to him in my
original post) I sent him a slight modification of my original script
(appended below). It records all prefixes of the words it places in the
dictionary, greatly increasing the startup time (from about 1 second to
about 10 seconds), but also pruning the permutation tree quite a bit.
Runtime looks like this on my Powerbook:
% python findwords.py somnambulistexplos
loaded 832166 words into dictionary in 11.02 seconds
found 3749 words in 10.43 seconds
Warning: I wrote this script around 2am. I'm sure it could use further
improvement, and while I believe it accurately computed the hidden words
correctly, I'm not motivated to exhaustively check each of the 3749 words it
found in "somnambulistexplos". ;-)
Skip
import sys
import time
WORD_FILE = "/usr/share/dict/words"
start = time.time()
dictionary = {"p:": True}
wf = open(WORD_FILE, 'r')
for line in wf:
word = line.strip()
dictionary[word] = True
for i in range(1,len(word)):
dictionary["p:"+word[0:i]] = True
print "loaded %d words into dictionary in %.2f seconds" % (len(dictionary),
time.time()-start)
def findwords(dictionary, letters, prefix="", seen={}):
if "p:"+prefix not in dictionary:
return seen
for i in range(len(letters)):
newprefix = prefix+letters[i]
if newprefix in dictionary and newprefix not in seen:
seen[newprefix] = True
findwords(dictionary, letters[0:i]+letters[i+1:], newprefix, seen)
return seen
if __name__ == "__main__":
start = time.time()
seen = findwords(dictionary, sys.argv[1].lower())
print "found %d words in %.2f seconds" % (len(seen), time.time()-start)
More information about the Python-list
mailing list