fuzzysearch: find not exactly what you're looking for!

Tal Einat taleinat at gmail.com
Thu Feb 12 14:51:17 EST 2015


Hi everyone,

I'd like to introduce a Python library I've been working on for a
while: fuzzysearch. I would love to get as much feedback as possible:
comments, suggestions, bugs and more are all very welcome!

fuzzysearch is useful for searching when you'd like to find
nearly-exact matches. What should be considered a "nearly matching"
sub-string is defined by a maximum allowed Levenshtein distance[1].
This can be further refined by indicating the maximum allowed number
of substitutions, insertions and/or deletions, each separately.

Here is a basic example:

>>> from fuzzysearch import find_near_matches
>>> find_near_matches('PATTERN', 'aaaPATERNaaa', max_l_dist=1)
[Match(start=3, end=9, dist=1)]
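
To make the distance in the example above concrete, here is a minimal
pure-Python sketch of the underlying idea. It is not how fuzzysearch
is implemented (the library uses far faster algorithms); the names
levenshtein and naive_near_matches are hypothetical helpers written
only for illustration. It brute-forces every sub-string and keeps
those within the allowed Levenshtein distance:

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between sequences a and b."""
    # prev[j] is the distance between the previous prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # deletion of ca
                curr[j - 1] + 1,            # insertion of cb
                prev[j - 1] + (ca != cb),   # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def naive_near_matches(pattern, text, max_l_dist):
    """Brute force: list (start, end, dist) for every sub-string of text
    whose Levenshtein distance from pattern is at most max_l_dist."""
    results = []
    # A match can be at most max_l_dist shorter or longer than the pattern.
    min_len = max(len(pattern) - max_l_dist, 0)
    max_len = len(pattern) + max_l_dist
    for start in range(len(text) + 1):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > len(text):
                break
            dist = levenshtein(pattern, text[start:end])
            if dist <= max_l_dist:
                results.append((start, end, dist))
    return results


print(levenshtein('PATTERN', 'PATERN'))
print(naive_near_matches('PATTERN', 'aaaPATERNaaa', max_l_dist=1))
```

For this particular input the brute-force search finds the same
single span, (3, 9) at distance 1, as the example above. In general
it reports every overlapping candidate sub-string, so its output
should not be expected to agree with the library's on other inputs.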

The library supports Python 2.6+ and 3.2+ with a single code base. It
is extensively tested, with 97% code coverage. There are many
optimizations under the hood, including custom algorithms and
extension modules written in C and Cython.

Install as usual:
$ pip install fuzzysearch

The repo is on GitHub:
https://github.com/taleinat/fuzzysearch

Let me know what you think!

- Tal Einat

.. [1]: http://en.wikipedia.org/wiki/Levenshtein_distance


