SequenceMatcher bug?

eliben eliben at gmail.com
Tue Dec 9 02:46:01 EST 2008


Hello,

This is about Python 2.5.2; I don't know whether this module was
fixed in 2.6/3.0.

I think I ran into a bug in the difflib.SequenceMatcher class,
specifically in its ratio() method. The following:

SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()

returns 0.0

The same call with 500 replaced by 100 returns 0.99-something.
Looking at the code of SequenceMatcher, there's some caching going on
when the sequences are longer than 200 elements (and indeed, I can
reproduce the bug above 200 but not below). Can anyone confirm that
this misbehaves and suggest a workaround?

P.S. quick_ratio() works fine, it seems.
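
To make this easier to check, here is a minimal sketch of a script that
runs the same comparison at lengths on both sides of the 200 mark and
prints ratio() next to quick_ratio(). The helper name compare and the
particular lengths are just illustrative choices, not anything taken
from difflib itself.

```python
from difflib import SequenceMatcher


def compare(n):
    """Return (ratio, quick_ratio) for the two lists from this report,
    using n repeated 10s in each list."""
    a = [4] + [10] * n + [5]
    b = [10] * n + [5]
    matcher = SequenceMatcher(None, a, b)
    return matcher.ratio(), matcher.quick_ratio()


# Lengths below and above the 200-element mark (arbitrary sample values).
for n in (100, 199, 200, 500):
    ratio, quick = compare(n)
    print("n=%d ratio=%r quick_ratio=%r" % (n, ratio, quick))
```

The values are printed rather than asserted, since what ratio() returns
for the longer lists is exactly the thing in question and may differ
between Python versions.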

Thanks
Eli



