locating strings approximately

Irmen de Jong irmen.NOSPAM at xs4all.nl
Wed Jun 28 19:50:01 EDT 2006


BBands wrote:
> I'd like to see if a string exists, even approximately, in another. For
> example if "black" exists in "blakbird" or if "beatles" exists in
> "beatlemania". The application is to look through a long list of songs
> and return any approximate matches along with a confidence factor. I
> have looked at edit distance, but that isn't a good choice for finding
> a short string in a longer one. I have also explored
> difflib.SequenceMatcher and .get_close_matches, but what I'd really
> like is something like:
> 
> a = FindApprox("beatles", "beatlemania")
> print a
> 0.857
> 
> Any ideas?
> 
>     jab
> 

I collected a few pointers in this article:

http://www.razorvine.net/frog/user/irmen/article/2005-05-28/53

It lists one or two methods in addition to the ones you mentioned.
Sorry, it's in Dutch, but you should be able to find the links without
much trouble, I guess.
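
For illustration only, here is a minimal sketch of the kind of function
asked for above. It is not one of the methods from the article; it simply
slides a window the length of the short string along the longer one and
keeps the best difflib.SequenceMatcher ratio. The names find_approx and
rank_matches, the lower-casing, and the 0.6 cutoff are assumptions made
up for this example.

```python
from difflib import SequenceMatcher


def find_approx(needle, haystack):
    """Return the best similarity (0.0 to 1.0) between needle and any
    window of haystack that is as long as needle."""
    if not needle or not haystack:
        return 0.0
    # If needle is longer than haystack, compare against the whole haystack.
    width = min(len(needle), len(haystack))
    best = 0.0
    for start in range(len(haystack) - width + 1):
        window = haystack[start:start + width]
        score = SequenceMatcher(None, needle, window).ratio()
        best = max(best, score)
    return best


def rank_matches(needle, titles, cutoff=0.6):
    """Return (score, title) pairs scoring at least cutoff, best first.
    The cutoff of 0.6 is an arbitrary choice for this sketch."""
    scored = [(find_approx(needle.lower(), title.lower()), title)
              for title in titles]
    return sorted((pair for pair in scored if pair[0] >= cutoff),
                  reverse=True)


print(round(find_approx("beatles", "beatlemania"), 3))  # 0.857
print(rank_matches("black", ["Blakbird", "Yesterday"]))
```

The score for "beatles" comes from the window "beatlem", which shares six
of seven characters in order. For a long song list this brute-force scan
may be slow, so treat it as a starting point rather than a finished tool.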


--Irmen


