Help Optimizing Word Search
Mike C. Fletcher
mcfletch at rogers.com
Tue Jan 11 23:38:15 EST 2005
To search for a word which is a jumble of a given set of characters in a
(sizable) lexicon, see this posting:
http://blog.vrplumber.com/427
your alterations would be to check for length == to length -
number-of-wildcards (with the wildcards removed from the translation
table, of course) and then some tweaks to the "expensive" loop to allow
for up to wildcard-count ValueErrors. There's some (informal) analysis
in the comments of that post regarding why its a fairly good mechanism
for searching large sets of words.
HTH,
Mike
Case Nelson wrote:
...
>Basically, the program needs to take in a random list of no more than
>10 letters, and find all possible mutations that match a word in my
>dictionary (80k words). However a wildcard letter '?' is also an
>acceptable character which increases the worst case time significantly.
>So if the letters are ['a','b','c'] check a, b, c, ab, ac, ba, bc, ca,
>cb, abc, acb, bac, bca, cab, cba where only a, ba and cab would be
>added to the dict of words. If the letters are ['?','?'] check a-z, aa,
>ab, ac, ad, ..., az, ba, bb, bc, bd, ..., zz
>
>
...
________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
More information about the Python-list
mailing list