Performance issue
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Sat Apr 2 15:42:35 EST 2005
In <mailman.1239.1112456748.1799.python-list at python.org>, Tom Carrick
wrote:
> In my attempted learning of python, I've decided to recode an old
> anagram solving program I made in C++. The C++ version runs in less
> than a second, while the python takes 30 seconds. I'm not willing to
> think it's just python being slow, so I was hoping someone could find
> a faster way of doing this. Also, I was wondering if there was a more
> builtin, or just nicer way of converting a string to a list (or using
> the sort function on a list) than making a function for it.
>
> The words.txt here is just a copy of FreeBSD's /usr/share/dict/words
Here's my attempt which builds an anagram dictionary ("sorted word" ->
list of anagrams) for fast lookup of anagrams::
#!/usr/bin/env python2.4
from itertools import imap, ifilter
WORDS = '/usr/share/dict/words'
def make_anagram_map(words):
anagram_map = dict()
for word in imap(lambda w: w.strip().lower(), words):
sorted_word = ''.join(sorted(list(word)))
anagram_map.setdefault(sorted_word, list()).append(word)
return dict(ifilter(lambda x: len(x[1]) > 1, anagram_map.iteritems()))
def main():
words_file = open(WORDS, 'r')
anagram_map = make_anagram_map(words_file)
words_file.close()
while True:
word = raw_input('Find anagrams of word (just enter to end): ')
if not word:
break
try:
print anagram_map[''.join(sorted(list(word.strip().lower())))]
except KeyError:
print 'No anagrams found for %r' % word
# # Print all anagrams sorted by number of anagrams.
# print '\n'.join(map(str, sorted(anagram_map.values(), key=len)))
# print len(anagram_map)
if __name__ == '__main__':
main()
Ciao,
Marc 'BlackJack' Rintsch
More information about the Python-list
mailing list