How fuzzy is get_close_matches() in difflib?

John Henry john106henry at hotmail.com
Fri Nov 17 13:45:36 EST 2006


Learn something new every day.  I always wondered how spell checkers are
done.  Thanks.

John Machin wrote:
> John Henry wrote:
> > I am just wondering what's with get_close_matches() in difflib.  What's
> > the magic?   How fuzzy do I need to get in order to get a match?
>
> Are you desperate to understand the inner workings of difflib, or do
> you want merely to do some fuzzy matching of strings using a well-known
> somewhat-more-understandable zillions-of-implementations metric?
>
> If the latter, google "Levenshtein distance" for the metric, and
> "Python Levenshtein" -- first hit for me is an implementation in a
> Python C-extension. If you don't have the ability to compile a C
> extension, or don't need the speed, there should be a few pure-Python
> versions around; I'm specifically aware of one by Magnus Lie Hetland.
> It's less than a screenful of code; good idea to grab it anyway for
> educational purposes -- a quite Pythonic implementation of the
> traditional algorithm.
> 
> HTH,
> John
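
On the original question of how fuzzy get_close_matches() is: as far as
the difflib documentation goes, it scores each candidate with
SequenceMatcher.ratio(), which is 2*M/T (M = matched characters, T = total
characters in both strings), and keeps candidates scoring at least
`cutoff` (default 0.6), returning at most `n` of them (default 3).  A
minimal sketch, with a made-up word list chosen only for illustration:

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical candidate list, used only for illustration.
words = ["ape", "apple", "peach", "puppy"]

# Default settings: n=3, cutoff=0.6.
default_matches = get_close_matches("appel", words)

# A stricter cutoff keeps only candidates whose ratio is at least 0.9.
strict_matches = get_close_matches("appel", words, cutoff=0.9)

# The underlying similarity score for a single pair, a float in [0, 1].
score = SequenceMatcher(None, "apple", "appel").ratio()

print(default_matches)
print(strict_matches)
print(score)
```

Raising `cutoff` toward 1.0 makes the match stricter; lowering it toward
0.0 lets more distant candidates through.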

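For the Levenshtein metric mentioned above, here is a short pure-Python
sketch of the traditional dynamic-programming algorithm.  It is not the
Magnus Lie Hetland version or the C extension, just an illustration; the
function name `levenshtein` is my own choice.

```python
def levenshtein(a, b):
    """Return the number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in b so each row stays small

    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; row 0 is just 0, 1, 2, ...
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (ca != cb)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Unlike the difflib ratio, this is a distance: 0 means identical strings,
and larger values mean less similar ones.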