SequenceMatcher bug?

rdmurray at bitdance.com
Tue Dec 9 21:12:04 EST 2008


On Mon, 8 Dec 2008 at 23:46, eliben wrote:
> This is about Python 2.5.2. I don't know if there were fixes to this
> module in 2.6/3.0.
>
> I think I ran into a bug in the difflib.SequenceMatcher class,
> specifically in its ratio() method. The following:
>
> SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()
>
> returns 0.0
>
> With 500 replaced by 100, the same call returns 0.99... something.
> Looking at the code of SequenceMatcher, there's some caching going on
> when the sequences are longer than 200 elements (and indeed, I can
> reproduce the bug above 200 but not below). Can anyone confirm that
> this misbehaves and suggest a workaround?

Python 2.5.2 (r252:60911, Sep 29 2008, 20:34:04) 
[GCC 4.3.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from difflib import SequenceMatcher
>>> SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()
0.99900299102691925
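
For anyone rechecking this on a newer interpreter, here is a minimal sketch of
how the 0.999 figure can be verified by hand. It assumes Python 3.2 or later,
where SequenceMatcher accepts an autojunk argument (not available in the 2.5.2
session shown above); passing autojunk=False switches off the heuristic that
treats very frequent items as junk once the second sequence has 200 or more
items. The variable names are illustrative only.

```python
from difflib import SequenceMatcher

# Same inputs as in the report.
a = [4] + [10] * 500 + [5]
b = [10] * 500 + [5]

# Assumption: Python 3.2+, where the autojunk keyword exists.
# autojunk=False disables the "popular element" heuristic entirely.
sm = SequenceMatcher(None, a, b, autojunk=False)

# ratio() is documented as 2.0 * M / T, where M is the number of matched
# elements and T is the combined length of both sequences.
matched = sum(block.size for block in sm.get_matching_blocks())
total = len(a) + len(b)
by_hand = 2.0 * matched / total

print(matched, total)   # matched elements and combined length
print(sm.ratio())       # should agree with by_hand
print(by_hand)
```

Here the only unmatched element is the leading 4, so the result should sit
just below 1.0, in line with the value in the session above.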

--RDM


