SequenceMatcher bug ?

eliben eliben at gmail.com
Wed Dec 10 01:15:06 EST 2008


On Dec 10, 4:12 am, rdmur... at bitdance.com wrote:
> On Mon, 8 Dec 2008 at 23:46, eliben wrote:
> > This is about Python 2.5.2 - I don't know if there were fixes to this
> > module in 2.6/3.0
>
> > I think I ran into a bug with the difflib.SequenceMatcher class.
> > Specifically, its ratio() method. The following:
>
> > SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()
>
> > returns 0.0
>
> > While the same with 500 replaced by 100 returns .99... something.
> > Looking at the code of SequenceMatcher, there's some caching going on
> > when the sequences are longer than 200 elements (and indeed, I can
> > reproduce the bug above 200 but not below). Can anyone confirm that
> > this misbehaves and suggest a workaround?
>
> Python 2.5.2 (r252:60911, Sep 29 2008, 20:34:04)
> [GCC 4.3.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from difflib import SequenceMatcher
> >>> SequenceMatcher(None, [4] + [10] * 500 + [5], [10] * 500 + [5]).ratio()
> 0.99900299102691925
>

Strange. I could reproduce the problem both on ActiveState Python
2.5.2 for Windows, and in the online Try Python evaluator:

http://try-python.mired.org/
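
To make it easier to compare results across builds, here is a small
self-contained script that could be run on each interpreter. It is only a
sketch: the helper names ratio_for and expected_ratio are mine, and the
"expected" value simply assumes the documented definition of ratio() as
2.0*M/T, with all n+1 trailing elements counted as matches. Whatever the
installed difflib actually returns is printed next to that value.

```python
import sys
from difflib import SequenceMatcher


def ratio_for(n):
    # Same sequences as in the report, with the repeat count as a parameter.
    a = [4] + [10] * n + [5]
    b = [10] * n + [5]
    return SequenceMatcher(None, a, b).ratio()


def expected_ratio(n):
    # ratio() is documented as 2.0*M/T. Assuming the n+1 trailing elements
    # all match, M = n + 1 and T = len(a) + len(b) = (n + 2) + (n + 1).
    return 2.0 * (n + 1) / (2 * n + 3)


print(sys.version)
for n in (100, 500):
    # Columns: repeat count, value from the installed difflib, 2.0*M/T value.
    print("%d %r %r" % (n, ratio_for(n), expected_ratio(n)))
```

For n = 500 the expected column works out to 1002/1003, which is the
0.99900299102691925 shown in the quoted session, so a build that prints 0.0
in the middle column is the one misbehaving.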

Eli




