The "junk" parameter in difflib.SequenceMatcher

shuhsien shuhsien0404 at yahoo.com
Fri Oct 17 10:56:53 EDT 2003


Hi,

I am confused by the junk parameter of difflib.SequenceMatcher. I
thought it would simply ignore every element for which the junk
function returns true. However, I get the following results:

>>> from difflib import SequenceMatcher
>>> SequenceMatcher(lambda x: x == ' ', "lion", "li on").ratio()
0.88888888888888884
>>> SequenceMatcher("lion", "li on").ratio()
0.0
>>> SequenceMatcher(lambda x: x == ' ', "lion", " lion ").ratio()
0.80000000000000004

It's not ignoring the blanks, and when comparing "lion" and "li on",
when nothing is considered junk, the similarity ratio is 0!
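
A minimal sketch for reproducing these numbers, assuming the documented
signature SequenceMatcher(isjunk=None, a='', b=''). Under that assumption,
the two-argument call above binds "lion" to isjunk and "li on" to a, and
leaves b empty, which may account for the 0.0. The helper name is_space is
only illustrative:

```python
from difflib import SequenceMatcher

def is_space(ch):
    # Illustrative junk predicate: treat a single blank as junk.
    return ch == " "

# Junk callback passed explicitly as the first argument.
# ratio() is documented as 2.0 * M / T, where T counts every element
# of both sequences, so the blank still contributes to T.
with_junk = SequenceMatcher(is_space, "lion", "li on").ratio()

# None as the first argument means "no junk function".
no_junk = SequenceMatcher(None, "lion", "li on").ratio()

# Only two positional arguments: "lion" lands in isjunk, "li on" in a,
# and b keeps its default of ''.
shifted = SequenceMatcher("lion", "li on")

print(with_junk, no_junk)
print(repr(shifted.a), repr(shifted.b), shifted.ratio())
```

Here both of the first two ratios work out to 2 * 4 / 9, since "li" and
"on" match (M = 4) and the two strings have 9 characters in total.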

What am I missing here?

-shuhsien



