Counting elements in a list wildcard
Edward Elliott
nobody at 127.0.0.1
Tue Apr 25 01:15:31 EDT 2006
Dave Hughes wrote:
> Another algorithm that might interest you isn't based on "sounds-like" but
> instead computes the number of transforms necessary to get from one
> word to another: the Levenshtein distance. A C based implementation
> (with Python interface) is available:
I don't know what algorithm it uses, but the difflib module looks similar.
I've had good results using the get_close_matches function to locate
similarly-named mp3 files.
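For reference, a minimal sketch of that kind of lookup. The file names and the find_similar helper are made up for illustration; only get_close_matches and its n and cutoff parameters come from difflib.

```python
from difflib import get_close_matches

# Hypothetical file names, invented for this example.
filenames = [
    "beatles - hey jude.mp3",
    "beatles - let it be.mp3",
    "rolling stones - angie.mp3",
]

def find_similar(name, candidates, cutoff=0.6):
    """Return up to three candidates resembling name, best match first."""
    # cutoff is a similarity ratio in [0, 1]; 0.6 is difflib's default.
    return get_close_matches(name, candidates, n=3, cutoff=cutoff)

matches = find_similar("beatles - hey jud.mp3", filenames)
print(matches)
```

Raising the cutoff makes the match stricter, which probably matters more as the strings get shorter.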
However, I don't think "close enough" matching is well suited for this application.
The sequences are short and non-distinct. Difference matching needs longer
sequences to be effective. Phoneme matching seems overly complex and might
grab things like Tsu-zi. I'd just use a list of alternate spellings like
Ben suggested.
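Something along these lines is what I have in mind; treat it as a rough sketch. The names, the ALTERNATES table and the count_matches helper are all hypothetical, and the real list of spellings would have to come from the actual data.

```python
# Hypothetical table mapping a canonical name to its accepted spellings.
ALTERNATES = {
    "catherine": {"catherine", "katherine", "kathryn"},
}

def count_matches(items, canonical, alternates=ALTERNATES):
    """Count the items that equal any known spelling of canonical."""
    # Fall back to the canonical name alone if no alternates are listed.
    variants = alternates.get(canonical, {canonical})
    # Compare case-insensitively and ignore surrounding whitespace.
    return sum(1 for item in items if item.strip().lower() in variants)

names = ["Katherine", "kathryn ", "Karen", "Catherine"]
print(count_matches(names, "catherine"))
```

Being an exact lookup, it never picks up a near miss that wasn't explicitly listed, at the cost of maintaining the list by hand.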