newb: comparing two strings

Diez B. Roggisch deets at nospam.web.de
Thu May 18 18:59:01 EDT 2006


> Is there a clever way to see if two strings of the same length vary by
> only one character, and what the character is in both strings.
> 
> E.g. str1=yaqtil str2=yaqtel
> 
> they differ at str1[4] and the difference is ('i','e')
> 
> But if there was str1=yiqtol and str2=yaqtel, I am not interested.
> 
> can anyone suggest a simple way to do this?

Use the Levenshtein distance. For two strings of the same length, a
distance of 1 means they differ by exactly one character.
http://en.wikisource.org/wiki/Levenshtein_distance
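
Since the strings have the same length, a plain position-by-position
comparison is probably enough for this case, and it also reports where the
strings differ. A minimal sketch, not a full Levenshtein implementation; the
function name single_difference is made up for illustration:

```python
def single_difference(s1, s2):
    """Return (index, (c1, c2)) if the two strings have the same length and
    differ in exactly one position; otherwise return None."""
    if len(s1) != len(s2):
        return None
    # Collect every position where the characters disagree.
    diffs = [(i, (a, b)) for i, (a, b) in enumerate(zip(s1, s2)) if a != b]
    return diffs[0] if len(diffs) == 1 else None


print(single_difference("yaqtil", "yaqtel"))  # (4, ('i', 'e'))
print(single_difference("yiqtol", "yaqtel"))  # None (two differences)
```

Identical strings also give None, because they have no difference to report.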


> My next problem is, I have a list of 300,000+ words and I want to find
> every pair of such strings. I thought I would first sort on length of
> string, but how do I iterate through the following:
> 
> str1
> str2
> str3
> str4
> str5
> 
> so that I compare str1 & str2, str1 & str3, str 1 & str4, str1 & str5,
> str2 & str3, str3 & str4, str3 & str5, str4 & str5.

Decorate-sort-undecorate is the idiom for sorting by length:

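l = ["yaqtil", "yaqtel", "yiqtol"]  # placeholder; use your list of strings

l = [(len(w), w) for w in l]  # decorate: pair each word with its length
l.sort()                      # sort by length, then alphabetically
l = [w for _, w in l]         # undecorate: drop the lengths again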
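
For the pairwise iteration itself (str1 & str2, str1 & str3, ...), one
possible approach is itertools.combinations, applied separately to each group
of equal-length words. The sketch below assumes a Python version that has
itertools.combinations and the key= argument to sorted(); the names
single_difference and one_char_pairs are made up for illustration:

```python
from itertools import combinations, groupby


def single_difference(s1, s2):
    """Return (index, (c1, c2)) if the equal-length strings differ in exactly
    one position; otherwise return None."""
    if len(s1) != len(s2):
        return None
    diffs = [(i, (a, b)) for i, (a, b) in enumerate(zip(s1, s2)) if a != b]
    return diffs[0] if len(diffs) == 1 else None


def one_char_pairs(words):
    """Yield (w1, w2, index, (c1, c2)) for every pair of equal-length words
    that differ in exactly one position."""
    # Drop duplicates, then sort by length (and alphabetically within a
    # length) so that groupby sees each length as one contiguous run.
    ordered = sorted(set(words), key=lambda w: (len(w), w))
    for _, group in groupby(ordered, key=len):
        # combinations() produces each unordered pair exactly once.
        for w1, w2 in combinations(list(group), 2):
            diff = single_difference(w1, w2)
            if diff is not None:
                yield (w1, w2) + diff


for match in one_char_pairs(["yaqtil", "yaqtel", "yiqtol", "qatal"]):
    print(match)  # ('yaqtel', 'yaqtil', 4, ('e', 'i'))
```

This is still quadratic within each length group, so with 300,000+ words it
may be slow if many words share the same length.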


Diez


