Comparing 2 similar strings?

nicolas.lehuen at gmail.com
Wed May 25 03:44:48 EDT 2005


William Park wrote:
> How do you compare 2 strings, and determine how much they are "close" to
> each other?  Eg.
>     aqwerty
>     qwertyb
> are similar to each other, except for first/last char.  But, how do I
> quantify that?
>
> I guess you can say for the above 2 strings that
>     - at max, 6 chars out of 7 are same sequence --> 85% max
>
> But, for
>     qawerty
>     qwerbty
> max correlation is
>     - 3 chars out of 7 are the same sequence --> 42% max
>
> (Crossposted to 3 of my favourite newsgroup.)

Hi,

If you want to use phonetic comparison, here are some algorithms that
reportedly work better than Soundex:

Double-Metaphone
NYSIIS
Phonex

Of course, phonetic algorithms have a lot of disadvantages, the main
one being that they know only one way to pronounce words (usually a
rather rigid, Anglo-Saxon way), which may not be the right way (hence
the examples given before for Gaelic surnames). But these are far
"better" than Soundex.
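
If the comparison is about spelling rather than pronunciation, the
measure described in the quoted question (longest run of characters
shared by both strings, divided by the string length) can be sketched
with the standard library's difflib module. This is only a minimal
sketch: the function names longest_shared_run and run_similarity are
made up for this example, and dividing by the longer string's length is
an assumption, since the question only shows strings of equal length.

```python
from difflib import SequenceMatcher


def longest_shared_run(a, b):
    """Return the longest contiguous block of characters found in both strings."""
    # autojunk=False turns off difflib's "popular element" heuristic,
    # which only matters for long inputs but keeps the result predictable.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def run_similarity(a, b):
    """Length of the longest shared run divided by the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return len(longest_shared_run(a, b)) / longest


for left, right in [("aqwerty", "qwertyb"), ("qawerty", "qwerbty")]:
    run = longest_shared_run(left, right)
    print(left, right, repr(run), round(run_similarity(left, right), 2))
# aqwerty qwertyb 'qwerty' 0.86
# qawerty qwerbty 'wer' 0.43
```

These are the same 6-out-of-7 and 3-out-of-7 figures as in the
question. SequenceMatcher.ratio() gives a different score that counts
all matching blocks rather than only the longest one, which may be
closer to what is wanted for strings like the second pair.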

Regards,

Nicolas Lehuen



