sorting question

Andrew Henshaw andrew_dot_henshaw_at_earthling_dot_net
Thu Feb 8 22:46:10 EST 2001


"Andrew Henshaw" <andrew_dot_henshaw_at_earthling_dot_net> wrote in message
news:t86nh2l57s7j43 at corp.supernews.com...
> Depending upon your constraints, another way to approach this problem may
be
> better.  If you can get your word (or Word) dictionary, then you can build
a
> compiled form that collapses your search.  The compiled form of a word
takes
> the letters of a word and sorts them.   Use the compiled form as a
> dictionary key to the original word (or words).  For example:
> 'bar' and 'bra' both compile to 'abr'.  Thus
...snip..
> This would be blazingly fast, but would depend upon compiling your word
> list.  There are lots of pretty complete text lists of English words
> available on the web.
...snip...

There is a word list available at
http://www-personal.umich.edu/~jlawler/wordlist.html
using this list and the code at the end of this note, I get the following
output:

>>> import Anagram
>>> a = Anagram.Anagram('c:/temp/wordlist.txt')
>>> a.unjumble('mujleb')
['jumble']
>>> a.unjumble('redaps')
['spared', 'spread']
>>>

(Interesting. It missed 'parsed', 'drapes', and 'rasped'.  You might look
for a better list!)
On my machine, it took eight seconds to load and compile the list.

AH

#################################
import string

class Anagram:
    def __init__(self, words_file_name):
        self.word_dict = {}
        for word in open(words_file_name).readlines():
            word = string.strip(word)
            if word:
                compiled = self.compile(word)
                if self.word_dict.has_key(compiled):
                    self.word_dict[compiled].append(word)
                else:
                    self.word_dict[compiled] = [word]

    def compile(self, word):
        l = list(word)
        l.sort()
        return string.join(l, '')

    def unjumble(self, word):
        compiled = self.compile(word)
        if self.word_dict.has_key(compiled):
            print self.word_dict[compiled]
        else:
            print '**not found**'






More information about the Python-list mailing list