Comparing 2 similar strings?

Chris Croughton chris at keristor.net
Thu May 19 17:34:14 EDT 2005


On Fri, 20 May 2005 01:47:15 +1000, Steven D'Aprano 
   <steve at REMOVETHIScyber.com.au> wrote:

> On Thu, 19 May 2005 14:09:32 +1000, John Machin wrote:
> 
>> None of the other approaches make the mistake of preserving the first
>> letter -- this alone is almost enough reason for jettisoning soundex.
> 
> Off-topic now, but you've made me curious.
> 
> Why is this a bad idea?

Why is the first letter any more important than any other?

> How would you handle the case of "barow" and "marow"? (Barrow and
> marrow, naturally.) Without the first letter, they sound identical. Why is
> throwing that information away a good thing?

Well, Soundex will quite possibly throw the information away anyway;
it certainly regards several letters as the same.  But why is the
difference between barrow and marrow more important than that between
help and held?  Or between hatter and hammer?
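
To make the comparison concrete, here is a rough sketch of the usual
American Soundex rules in Python. It is an illustration only, not code
posted in this thread. The function name soundex and the keep_first
flag are made up for the example, and keep_first=False is a
hypothetical variant that codes the first letter as a digit like any
other letter instead of keeping it verbatim.

```python
# Letter groups of American Soundex: each group shares one digit.
CODES = {}
for digit, group in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(word, keep_first=True):
    """Return a 4-character Soundex-style code for word.

    keep_first=True follows the usual rule of keeping the first letter.
    keep_first=False is a hypothetical variant (an assumption for this
    example) that codes the first letter as a digit too.
    """
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""

    digits = []
    prev = ""
    for ch in letters:
        if ch in "hw":
            continue  # h and w are ignored and do not separate duplicates
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again

    if keep_first:
        # If the first letter has a code, it produced digits[0]; drop it
        # because the letter itself stands in that position.
        tail = digits[1:] if letters[0] in CODES else digits
        head = letters[0].upper()
    else:
        tail = digits
        head = ""

    # Pad with zeros and cut to four characters.
    return (head + "".join(tail) + "000")[:4]


if __name__ == "__main__":
    for a, b in (("barrow", "marrow"), ("help", "held"), ("hatter", "hammer")):
        print(a, soundex(a), b, soundex(b))
```

Under these rules barrow and marrow come out as B600 and M600, help and
held as H410 and H430, and hatter and hammer as H360 and H560. In the
keep_first=False variant barrow and marrow become 1600 and 5600, since
b and m fall in different letter groups.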

Regarding 'agains' as similar to 'iguanas' and 'Utahns', but not to
'again' or 'against', is silly...

Chris C


