Looking for library to estimate likeness of two strings
Tim Chase
python.list at tim.thechases.com
Wed Feb 6 17:28:37 EST 2008
> Are there any Python libraries implementing measurement of similarity
> of two strings of Latin characters?
It sounds like you're interested in calculating the Levenshtein
distance:
http://en.wikipedia.org/wiki/Levenshtein_distance
which gives you a measure of how different they are. A distance
of 0 means the inputs are identical. The more different the
two strings are, the larger the number the function returns.
Unfortunately, it's an O(MN) algorithm (where M=len(word1) and
N=len(word2)) from my understanding of the code I've seen.
However, it really is the best approximation I've seen of a "how
similar are these two strings" function. Googling for
python levenshtein distance
brings up oodles of hits.
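For what it's worth, here is a minimal sketch of the usual
dynamic-programming approach, so you can see where the O(MN) comes
from. The function name levenshtein() is just my own choice for
illustration, not the API of any particular library, and it assumes
unit cost for each insertion, deletion, and substitution:

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b.

    Illustrative sketch: unit cost per insert/delete/substitute.
    """
    # Make b the shorter string so each row holds min(M, N) + 1 entries.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the empty prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))

    for i, ca in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (free on a match)
            ))
        previous = current

    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

The nested loops visit every (i, j) pair once, which is the M*N
cost mentioned above. Only two rows are kept at a time, so memory
stays proportional to the shorter string.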
-tkc