A bug in difflib module? (find_longest_match)

n00m n00m at narod.ru
Thu May 1 05:21:22 EDT 2008



Gabriel Genellina:
> On Thu, 01 May 2008 04:35:17 -0300, n00m <n00m at narod.ru> wrote:
>
> > from random import randint
> >
> > s1 = ''
> > s2 = ''
> >
> > for i in xrange(1000):
> >     s1 += chr(randint(97,122))
> >     s2 += chr(randint(97,122))
> >
> > print s1[:25]
> > print s2[:25]
> >
> > import difflib
> >
> > s = difflib.SequenceMatcher(None, s1, s2)
> >
> > print s.find_longest_match(0, len(s1), 0, len(s2))
> >
> >
> >
> >>>> ============== RESTART ====================
> >>>>
> > yymgzldocfaafcborxbpqyade
> > urvwtnkwfmcduybjqmrleflqx
> > (0, 0, 0)
> >>>>
> >
> > I think line #314 in difflib is "who's to blame" --
>
> Me too. Could you think of some alternative? Simply disabling that
> "popularity check" would slow down the algorithm, according to the
> comments.
>
> --
> Gabriel Genellina

No idea :)
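
One possible way around it, as a rough sketch only: Python releases from
2.7.1 and 3.2 onward let SequenceMatcher take an autojunk flag, and passing
autojunk=False switches the popularity check off. The snippet below is
written in Python 3 syntax and assumes such a version; the seed and the
variable names are made up for illustration and are not from the original
script.

```python
import random
import string
from difflib import SequenceMatcher

random.seed(2008)  # arbitrary seed, only so that runs are repeatable

# Two random 1000-character lowercase strings, as in the quoted script.
s1 = ''.join(random.choice(string.ascii_lowercase) for _ in range(1000))
s2 = ''.join(random.choice(string.ascii_lowercase) for _ in range(1000))

# Default behaviour: once the second sequence has 200 or more items,
# elements making up more than about 1% of it are treated as "popular"
# and dropped from the index used for matching.
default_matcher = SequenceMatcher(None, s1, s2)
default_match = default_matcher.find_longest_match(0, len(s1), 0, len(s2))

# Same comparison with the popularity heuristic switched off
# (assumes a Python version that accepts the autojunk argument).
full_matcher = SequenceMatcher(None, s1, s2, autojunk=False)
full_match = full_matcher.find_longest_match(0, len(s1), 0, len(s2))

print(len(default_matcher.bpopular), "letters flagged as popular")
print("autojunk on: ", default_match)
print("autojunk off:", full_match)
print("common substring:", s1[full_match.a:full_match.a + full_match.size])
```

With only 26 distinct letters spread over 1000 characters, every letter is
likely to land above the threshold, which would explain the (0, 0, 0)
result. As the difflib comments warn, turning the check off can make the
matcher slower on long inputs.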


