Percentage matching of text

Fri Jul 30 10:00:26 EDT 2004

Bruce Eckel wrote:
> Background: for the 4th edition of Thinking in Java, I'm trying to
> once again improve the testing scheme for the examples in the book. I
> want to verify that the output I show in the book is "reasonably
> correct." I say "reasonably" because a number of examples produce
> random numbers or text or the time of day or in general things that do
> not repeat themselves from one execution to the next. So, much of the
> text will be the same between the "control sample" and the "test
> sample," but some of it will be different.
> 
> I will be using Python or Jython for the test framework.
> 
> What I'd like to do is find an algorithm that produces the results of
> a text comparison as a percentage-match. Thus I would be able to
> assert that my test samples must match the control sample by at least
> (for example) 83% for the test to pass. Clearly, this wouldn't be a
> perfect test but it would help flag problems, which is primarily what
> I need.
> 
> Does anyone know of an algorithm or library that would do this? Thanks
> in advance.
> 

Sorry, I don't know of one in Python, only in Perl. I think
ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-module/String/String-Approx-3.23.tar.gz
can be tweaked to do that.
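
For a Python-only route, the standard library's difflib module may
already be close enough. The following is a minimal sketch, not a port
of String::Approx: it assumes that SequenceMatcher.ratio() (twice the
number of matching characters divided by the combined length of both
strings) is an acceptable definition of "percentage match." The names
percent_match and output_matches are made up for illustration, and the
83% default is just the example figure from the question.

```python
import difflib


def percent_match(control, sample):
    """Return the similarity of two strings as a percentage (0.0 to 100.0).

    Assumption: "percentage match" means difflib's ratio(), i.e.
    2 * (matching characters) / (len(control) + len(sample)).
    """
    # autojunk=False turns off difflib's "popular element" heuristic,
    # which can distort the ratio on long inputs.
    matcher = difflib.SequenceMatcher(None, control, sample, autojunk=False)
    return 100.0 * matcher.ratio()


def output_matches(control, sample, threshold=83.0):
    """Hypothetical helper: True if sample is at least threshold% similar."""
    return percent_match(control, sample) >= threshold


if __name__ == "__main__":
    control = "Time: 10:00:26\nDone\n"
    sample = "Time: 10:00:27\nDone\n"
    print(percent_match(control, sample))
    print(output_matches(control, sample))
```

Comparing character by character is quadratic in the worst case, so
for long outputs it may be more practical to pass lists of lines
(control.splitlines() and sample.splitlines()) to SequenceMatcher
instead. The percentage then counts matching lines rather than
matching characters.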

-- 
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany