String matching based on sound?

Israel Brewster israel at ravnalaska.net
Mon Jan 29 17:28:32 EST 2018


I am working on a Python program that, at one step, takes an input string and matches it against the songs/artists in a user's library. I'm having some difficulty, however, figuring out how to match when the input or the library entry contains numbers or special characters. For example, take the group "All-4-One". In my library it might be listed exactly like that. I need to match this to ANY of the following inputs:

	• all-4-one (of course)
	• all 4 one (no dashes)
	• all 4 1 (all numbers)
	• all four one (all spelled out)
	• all for one

Or, really, any other combination that sounds the same. The reasoning for this is that the input comes from a speech recognition system, so the user speaking, for example, "4", could be recognized as "for", "four" or "4". I'd imagine that Alexa/Siri/Google all do things like this (since you can ask them to play songs/artists), but I want to implement this in Python.

In initial searching, I did find the "fuzzy" library, which at first glance appeared to be what I was looking for, but it, apparently, ignores numbers, with the result that "all 4 one" gave the same output as "all in", but NOT the same output as "all 4 1" - even though "all 4 1" sounds EXACTLY the same, while "all in" is only similar if you ignore the 4.
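The result above is what a Soundex-style encoder would produce if it discards everything that is not a letter. The following is a plain-Python sketch of classic American Soundex, written only to illustrate that effect. It is not the "fuzzy" library's code, and that the library works this way internally is an assumption.

```python
import string

# Classic Soundex digit groups; vowels and h, w, y have no code.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text, length=4):
    """Return a Soundex code, silently dropping digits, spaces and punctuation."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:length]


print(soundex("all 4 one"))  # A450
print(soundex("all in"))     # A450 (same as above, because the 4 was dropped)
print(soundex("all 4 1"))    # A400 (different, because both digits were dropped)
```

Because the digits never reach the encoder, "all 4 one" collapses to "allone" and "all 4 1" collapses to "all", which is why the two strings that sound identical end up with different codes.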

So is there something similar that works with strings containing numbers, and that would only give me a match if the two strings sound identical? That is, even ignoring the numbers, I should NOT get a match between "all one" and "all in" - they are similar, but not identical - while "all one" and "all 1" should match, since they sound identical.
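To make the desired behaviour concrete, here is a minimal sketch that normalizes each string to a tuple of canonical tokens and then compares the tuples exactly. The HOMOPHONES table and the names sound_key and build_index are hypothetical and hand-written for this example. They do not come from any library, and the table covers only the few words listed.

```python
import re

# Hypothetical, hand-written table: each spelling a speech recognizer might
# produce for the same sound maps to one canonical token.
HOMOPHONES = {
    "1": "one", "won": "one",
    "2": "two", "to": "two", "too": "two",
    "4": "four", "for": "four", "fore": "four",
    "8": "eight", "ate": "eight",
}


def sound_key(text):
    """Return a tuple of canonical tokens used for exact, sound-based matching."""
    # Split into runs of letters or runs of digits; dashes and other
    # punctuation simply act as separators.
    tokens = re.findall(r"[a-z]+|[0-9]+", text.lower())
    return tuple(HOMOPHONES.get(tok, tok) for tok in tokens)


def build_index(library):
    """Map each sound key to the library entries that share it."""
    index = {}
    for name in library:
        index.setdefault(sound_key(name), []).append(name)
    return index


library = ["All-4-One", "All In"]
index = build_index(library)

for spoken in ["all-4-one", "all 4 one", "all 4 1", "all four one", "all for one"]:
    print(spoken, "->", index.get(sound_key(spoken), []))
```

With this approach "all one" and "all 1" share a key while "all one" and "all in" do not, because the comparison is exact rather than fuzzy. The obvious limitation is that it only recognizes homophones that appear in the table, so it is a starting point rather than a general phonetic matcher.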







