Performance issue

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Sat Apr 2 15:42:35 EST 2005


In <mailman.1239.1112456748.1799.python-list at python.org>, Tom Carrick
wrote:

> In my attempted learning of python, I've decided to recode an old
> anagram solving program I made in C++. The C++ version runs in less
> than a second, while the python takes 30 seconds. I'm not willing to
> think it's just python being slow, so I was hoping someone could find
> a faster way of doing this. Also, I was wondering if there was a more
> builtin, or just nicer way of converting a string to a list (or using
> the sort function on a list) than making a function for it.
> 
> The words.txt here is just a copy of FreeBSD's /usr/share/dict/words

Here's my attempt which builds an anagram dictionary ("sorted word" ->
list of anagrams) for fast lookup of anagrams::

#!/usr/bin/env python2.4
from itertools import imap, ifilter

WORDS = '/usr/share/dict/words'

def make_anagram_map(words):
    anagram_map = dict()
    for word in imap(lambda w: w.strip().lower(), words):
        sorted_word = ''.join(sorted(list(word)))
        anagram_map.setdefault(sorted_word, list()).append(word)
    
    return dict(ifilter(lambda x: len(x[1]) > 1, anagram_map.iteritems()))


def main():
    words_file = open(WORDS, 'r')
    anagram_map = make_anagram_map(words_file)
    words_file.close()
    
    while True:
        word = raw_input('Find anagrams of word (just enter to end): ')
        if not word:
            break
        try:
            print anagram_map[''.join(sorted(list(word.strip().lower())))]
        except KeyError:
            print 'No anagrams found for %r' % word

    # # Print all anagrams sorted by number of anagrams.
    # print '\n'.join(map(str, sorted(anagram_map.values(), key=len)))
    # print len(anagram_map)


if __name__ == '__main__':
    main()


Ciao,
	Marc 'BlackJack' Rintsch



More information about the Python-list mailing list