Fuzzy Lookups

BBands bbands at gmail.com
Fri Feb 3 18:48:25 EST 2006


Diez B. Roggisch wrote:
> I did a levenshtein-fuzzy-search myself, however I enhanced my version by
> normalizing the distance the following way:
>
> def relative(a, b):
>     """
>     Computes a relative distance between two strings. It's in the range
>     (0-1] where 1 means total equality.
>     @type a: string
>     @param a: arg one
<snip>

Hello,

I adapted your approach to my needs and thought you might like to see
the result:

def LevenshteinRelative(a, b):
    """
    Returns the Levenshtein distance between two strings
    as a relative quantity in the range 1 to 0 where
    1.0 is a perfect match.
    """
    # Calculate the Levenshtein distance. (returns an integer)
    dist = LevenshteinDistance(a, b)
    # dist is at most the length of the longer string.
    max_dist = float(max(len(a), len(b)))
    # dist is always at least the difference of the sizes of the two strings.
    min_dist = max_dist - float(min(len(a), len(b)))
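    if max_dist == 0.0: # Both strings are empty, which is a perfect match.
        return 1.0
    try:
        relative = 1.0 - (dist - min_dist) / (max_dist - min_dist)
    except ZeroDivisionError: # max_dist equals min_dist (one string is empty), so use the simple form.
        relative = 1.0 - dist / max_dist
    return relative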
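
The function above relies on LevenshteinDistance, which is not shown in
this message. For anyone who wants to try it standalone, here is a
minimal sketch, assuming the usual dynamic-programming edit distance
with unit costs for insert, delete and substitute. This helper is an
illustration only and not necessarily the implementation actually used:

```python
def LevenshteinDistance(a, b):
    """
    Returns the edit distance between two strings as an integer.
    Illustrative stand-in (an assumption), using unit costs.
    """
    # previous[j] is the distance between the prefix of a handled so far
    # and b[:j]; it starts as the distance from the empty string.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

As a worked example, "kitten" and "sitting" have a distance of 3, with
max_dist = 7.0 and min_dist = 1.0, so the relative score comes out as
1.0 - (3 - 1.0) / (7.0 - 1.0), or about 0.667.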

Thanks,

     jab



