sorting question
Andrew Henshaw
andrew_dot_henshaw_at_earthling_dot_net
Thu Feb 8 22:46:10 EST 2001
"Andrew Henshaw" <andrew_dot_henshaw_at_earthling_dot_net> wrote in message
news:t86nh2l57s7j43 at corp.supernews.com...
> Depending upon your constraints, another way to approach this problem may
be
> better. If you can get your word (or Word) dictionary, then you can build
a
> compiled form that collapses your search. The compiled form of a word
takes
> the letters of a word and sorts them. Use the compiled form as a
> dictionary key to the original word (or words). For example:
> 'bar' and 'bra' both compile to 'abr'. Thus
...snip..
> This would be blazingly fast, but would depend upon compiling your word
> list. There are lots of pretty complete text lists of English words
> available on the web.
...snip...
There is a word list available at
http://www-personal.umich.edu/~jlawler/wordlist.html
using this list and the code at the end of this note, I get the following
output:
>>> import Anagram
>>> a = Anagram.Anagram('c:/temp/wordlist.txt')
>>> a.unjumble('mujleb')
['jumble']
>>> a.unjumble('redaps')
['spared', 'spread']
>>>
(Interesting. It missed 'parsed', 'drapes', and 'rasped'. You might look
for a better list!)
On my machine, it took eight seconds to load and compile the list.
AH
#################################
import string
class Anagram:
def __init__(self, words_file_name):
self.word_dict = {}
for word in open(words_file_name).readlines():
word = string.strip(word)
if word:
compiled = self.compile(word)
if self.word_dict.has_key(compiled):
self.word_dict[compiled].append(word)
else:
self.word_dict[compiled] = [word]
def compile(self, word):
l = list(word)
l.sort()
return string.join(l, '')
def unjumble(self, word):
compiled = self.compile(word)
if self.word_dict.has_key(compiled):
print self.word_dict[compiled]
else:
print '**not found**'
More information about the Python-list
mailing list