Percentage matching of text
Tim Peters
tim.peters at gmail.com
Fri Jul 30 13:01:28 EDT 2004
[Bruce Eckel]
...
> What I'd like to do is find an algorithm that produces the results of
> a text comparison as a percentage-match. Thus I would be able to
> assert that my test samples must match the control sample by at least
> (for example) 83% for the test to pass.
>>> from difflib import SequenceMatcher as sm
>>> sm(None, 'abc', 'xyz').ratio()
0.0
>>> sm(None, 'abcd', 'abcd').ratio()
1.0
>>> sm(None, 'abcd', 'uvwx').ratio()
0.0
>>> sm(None, 'abcd', 'axyd').ratio()
0.5
>>>
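To turn this into the kind of threshold check asked about, here is a minimal sketch. It relies on ratio() returning a float in [0, 1], computed as 2.0*M/T, where M is the number of matching elements and T is the total number of elements in both sequences. The helper names percent_match and assert_matches are made up for illustration and are not part of difflib; the 83% default is just the figure from the question.

```python
from difflib import SequenceMatcher


def percent_match(control, sample):
    """Return the similarity of two sequences as a percentage (0.0 to 100.0).

    Hypothetical helper: it only scales SequenceMatcher.ratio() by 100.
    """
    return 100.0 * SequenceMatcher(None, control, sample).ratio()


def assert_matches(control, sample, minimum=83.0):
    """Raise AssertionError if sample matches control by less than minimum percent.

    Hypothetical helper; the default threshold is the example value from the question.
    """
    score = percent_match(control, sample)
    assert score >= minimum, "only %.1f%% match, need %.1f%%" % (score, minimum)


assert_matches('abcd', 'abcd')             # identical strings: passes
assert_matches('abcd', 'axyd', minimum=50)  # 'a' and 'd' match: 2*2/8 = 50%
```

With the default threshold, assert_matches('abcd', 'axyd') would fail, since only half of the characters match.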
SequenceMatcher works on sequences of hashable elements. Above, it's
working on sequences of characters (aka "strings" <wink>). Other
possibilities include sequences of lines ("files") and lists of
integers.
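A small sketch of those other possibilities, using made-up sample data. With lists of lines, each whole line is one element, so a line either matches exactly or not at all. For real files, something like open(path).readlines() would supply the lists.

```python
from difflib import SequenceMatcher

# Sequences of lines: the elements are whole lines (illustrative data only).
control_lines = ["alpha\n", "beta\n", "gamma\n", "delta\n"]
sample_lines = ["alpha\n", "beta\n", "GAMMA\n", "delta\n"]
line_ratio = SequenceMatcher(None, control_lines, sample_lines).ratio()
print(line_ratio)  # 3 of 4 lines match on each side: 2*3/8 = 0.75

# Lists of integers work the same way, since ints are hashable.
int_ratio = SequenceMatcher(None, [1, 2, 3, 4], [1, 2, 9, 4]).ratio()
print(int_ratio)   # 1, 2 and 4 match: 2*3/8 = 0.75
```

Note that comparing by lines is coarser than comparing by characters: "gamma\n" and "GAMMA\n" count as a complete mismatch here, even though the lines differ only in case.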