Help Optimizing Word Search

Mike C. Fletcher mcfletch at rogers.com
Tue Jan 11 23:38:15 EST 2005


To search for a word which is a jumble of a given set of characters in a 
(sizable) lexicon, see this posting:

    http://blog.vrplumber.com/427

your alterations would be to check for length == to length - 
number-of-wildcards (with the wildcards removed from the translation 
table, of course) and then some tweaks to the "expensive" loop to allow 
for up to wildcard-count ValueErrors.  There's some (informal) analysis 
in the comments of that post regarding why its a fairly good mechanism 
for searching large sets of words.

HTH,
Mike

Case Nelson wrote:
...

>Basically, the program needs to take in a random list of no more than
>10 letters,  and find all possible mutations that match a word in my
>dictionary (80k words). However a wildcard letter '?' is also an
>acceptable character which increases the worst case time significantly.
>So if the letters are ['a','b','c'] check a, b, c, ab, ac, ba, bc, ca,
>cb, abc, acb, bac, bca, cab, cba where only a, ba and cab would be
>added to the dict of words. If the letters are ['?','?'] check a-z, aa,
>ab, ac, ad, ..., az, ba, bb, bc, bd, ..., zz
>  
>
...

________________________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com




More information about the Python-list mailing list