newb: comapring two strings

bearophileHUGS at lycos.com bearophileHUGS at lycos.com
Fri May 19 17:21:38 EDT 2006


Paul Rubin>Still terrible.  Use a better algorithm!<

I agree, it's O(n^2), but if you need to run this program just 1 time,
and the program is in C, and you don't want to use much time to think
and code a better algorithm (or you aren't able to do it) then maybe
that naive solution can be enough, (if the running time is around 30-90
minutes).
Total time to solve a one-shot problem is often the sum of thinking
time + programming time + running time :-)


Paul Rubin>Use this to generate all the variants of all the words in
the dictionary, and write those out into a file, each line containing a
variant plus the original word.  Then use a sorting utility like the
unix "sort" program to sort the file.  Those programs work efficiently
even if the file is too large to fit in memory.  Then read the sorted
file to find equivalence classes on the variants.<

If the words are all 5 character long, then I think you are creating:
(len(lowercase)-1) * 5 * 30000 = 3750000
strings of len 5+5 (plus enter if you save it into a file), this means
a file of about 37 MB. All those trings may even fit in memory if you
have lot of it.
Your solution looks nice :-) (It reminds me a little the word signature
trick used to find anagrams).

Bye,
bearophile




More information about the Python-list mailing list