Help Optimizing Word Search

Scott David Daniels Scott.Daniels at Acm.Org
Wed Jan 12 10:04:38 EST 2005


Paul Rubin wrote:
> "Case  Nelson" <case.nelson at gmail.com> writes:
> 
>>Basically, the program needs to take in a random list of no more than
>>10 letters,  and find all possible mutations that match a word in my
>>dictionary (80k words). However a wildcard letter '?' is also an
>>acceptable character which increases the worst case time significantly.
> 
> 
> For that size pattern and dictionary, simply compiling the pattern to
> a regexp, joining the dictionary together into one big string ("abc
> def ghijk..."), and matching the regexp against the big string, may
> well be faster than using some fancy algorithm coded completely in
> python.

If these queries happen often, build a dictionary of sorted letters to
lists-of-words.  The setup will be slow, but use of such a secondary
dictionary should be quite fast, with relatively easy dealing with ?s
by iterations.

--Scott David Daniels
Scott.Daniels at Acm.Org



More information about the Python-list mailing list