regex-strategy for finding similar words?

Fri Nov 19 05:17:11 EST 2004

>There
>are faster algorithms available than the dynamic-programming one --
>google "Heikki Hyyro". These allow you to preprocess your input word
>and then compare with multiple dictionary candidates in O(kn) instead
>of O(kmn) time where the input word has length m, and you compare with
>k dictionary words each of average length n.

John,

thanks for the rich input!
I found the Hyyro/Navarro algorithm, and this looks interesting. 
First I will give python's own difflib.SequenceMatcher class a try - 
the project here doesn't necessarily need the fastest algorithm, 
coding time is an issue however. I found that a match ratio slightly 
above 0.85 captures most of my misspelling cases without producing 
errors.

best,
Christoph

regex-strategy for finding *similar* words?

regex-strategy for finding similar words?