String matching based on sound?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Mon Jan 29 18:04:34 EST 2018
On Mon, 29 Jan 2018 13:28:32 -0900, Israel Brewster wrote:
> In initial searching, I did find the "fuzzy" library, which at first
> glance appeared to be what I was looking for, but it, apparently,
> ignores numbers, with the result that "all 4 one" gave the same output
> as "all in", but NOT the same output as "all 4 1" - even though "all 4
> 1" sounds EXACTLY the same, while "all in" is only similar if you ignore
> the 4.
Before passing the string to the fuzzy matcher, do a simple text
replacement of numbers with their spelled-out versions: "4" -> "four".
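
A minimal sketch of that pre-processing step, assuming digit-by-digit
replacement is good enough. The names DIGIT_WORDS and spell_out_digits
are made up for illustration and are not part of the "fuzzy" library;
multi-digit numbers such as "42" would come out as "four two" here.

```python
import re

# Hypothetical lookup table: each ASCII digit maps to its English word.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def spell_out_digits(text):
    """Replace every ASCII digit in text with its spelled-out word."""
    # Pad each word with spaces so "4ever" becomes "four ever",
    # then collapse the extra whitespace that the padding introduces.
    padded = re.sub(r"[0-9]", lambda m: " " + DIGIT_WORDS[m.group()] + " ", text)
    return " ".join(padded.split())


print(spell_out_digits("all 4 one"))
print(spell_out_digits("all 4 1"))
```

The result of spell_out_digits() is what you would then hand to the
phonetic encoder, so that "all 4 one" and "all 4 1" start from the same
text.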
You may want to do other text replacements too, based on sound or visual
resemblance, for example to deal with Kei$ha a.k.a. Keisha, etc.
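
The same idea works for look-alike symbols. This is only a sketch: the
LOOKALIKES table and the normalize_lookalikes name are assumptions for
illustration, and which substitutions make sense depends on your data.

```python
# Hypothetical table of symbols that are commonly used in place of letters.
LOOKALIKES = str.maketrans({
    "$": "s",
    "@": "a",
    "!": "i",
})


def normalize_lookalikes(text):
    """Swap look-alike symbols for the letters they stand in for."""
    return text.translate(LOOKALIKES)


print(normalize_lookalikes("Kei$ha"))
```

Run this before the digit replacement if you also want to map digits
that stand in for letters (e.g. "3" for "e"), since the two steps would
otherwise compete for the same characters.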
--
Steve