Comparing 2 similar strings?

Dan Bishop danb_83 at yahoo.com
Thu May 19 18:03:22 EDT 2005


Steven D'Aprano wrote:
> On Thu, 19 May 2005 14:09:32 +1000, John Machin wrote:
>
> > None of the other approaches make the mistake of preserving the first
> > letter -- this alone is almost enough reason for jettisoning soundex.
>
> Off-topic now, but you've made me curious.
>
> Why is this a bad idea?

Because of situations like, for example, my mother's last name, which
originally started with "Y" but got anglicized to a name beginning with
"E".  Same name, different Soundex codes, and the problem occurs only
because of Soundex's preservation of the exact initial letter.

> How would you handle the case of "barow" and "marow"? (Barrow and
> marrow, naturally.) Without the first letter, they sound identical.
> Why is throwing that information away a good thing?

No one's suggesting throwing away the first letter's information, just
removing the special treatment for it.  "Barow" becomes 1600 and
"Marow" becomes 5600.
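
To make that concrete, here is a rough sketch of the idea, not a
reference implementation.  The encode() function and its
keep_first_letter flag are names I made up for illustration, and it
assumes the usual American Soundex digit table and the rule that "h"
and "w" don't separate two letters with the same code.  "Yelin" and
"Elin" are just stand-in names for the Y/E situation.

```python
# Usual Soundex digit groups; vowels and h, w, y get no digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def encode(name, keep_first_letter):
    """Return a 4-character Soundex-style code (hypothetical helper).

    keep_first_letter=True  -> classic Soundex (letter + 3 digits)
    keep_first_letter=False -> the first letter is coded like any other
    """
    letters = [ch for ch in name.lower() if ch.isalpha()]
    out = []
    prev = ""
    for i, ch in enumerate(letters):
        code = CODES.get(ch, "")
        if i == 0 and keep_first_letter:
            # Classic Soundex: keep the letter itself, but remember its
            # digit so an immediately repeated code is still collapsed.
            out.append(ch.upper())
        elif code and code != prev:
            out.append(code)
        if ch not in "hw":
            # h and w do not break up a run of identical codes.
            prev = code
    # Truncate or zero-pad to exactly four characters.
    return "".join(out)[:4].ljust(4, "0")


for name in ("Barow", "Marow", "Yelin", "Elin"):
    print(name, encode(name, True), encode(name, False))
```

With the first letter kept, "Barow" and "Marow" come out as B600 and
M600, and the Y/E pair gets two different codes.  With the special
treatment removed, "Barow" and "Marow" are still distinct (1600 and
5600), while the Y/E pair collapses to the same code.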



