Don't understand SequenceMatcher from difflib

Antoon Pardon Antoon.Pardon at rece.vub.ac.be
Wed Jun 22 03:27:44 EDT 2011


On Tue, Jun 21, 2011 at 03:02:57PM -0400, Terry Reedy wrote:
> On 6/21/2011 9:43 AM, Antoon Pardon wrote:
> 
> >   matcher = SequenceMatcher(ls1, ls2)
> ...
> >What am I doing wrong?
> 
> Read the doc, in particular, the really stupid signature of the class:
> 
> "class difflib.SequenceMatcher(isjunk=None, a='', b='', autojunk=True)"
> You are passing isjunk = ls1, a = ls2, and by default, b=''. So
> there are no matches, len(a) = 36, len(b) = 0, and the dummy match
> is (36,0,0) as you got.

Yes, the penny dropped an hour after I sent the question.

But reading the doc in itself didn't help. I had read and reread it
a number of times before I finally posted the question.

Somehow this signature reminded me of the range function where if you
only provide one argument, this is considered to be the second parameter
with the first parameter taking on a default value. So somehow I assumed
that if you only provided two parameters, these would be shifted as in
range and the first parameter would default to None.

I know that the documentation, if you read it carefully, contradicts this,
but my assumption blinded me to it.
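
To make the difference concrete, here is a minimal sketch. The lists ls1
and ls2 below are short stand-ins invented for illustration, not the data
from my original post. range shifts a single argument into the second
slot, whereas SequenceMatcher binds positional arguments strictly left to
right, so the first list silently becomes isjunk:

```python
from difflib import SequenceMatcher

# range shifts a lone argument: range(5) means range(0, 5).
assert list(range(5)) == list(range(0, 5))

# Illustrative data only (assumed for this sketch).
ls1 = ["a", "b", "c"]
ls2 = ["a", "x", "c"]

# No such shifting here: this binds isjunk=ls1, a=ls2, b=''.
wrong = SequenceMatcher(ls1, ls2)
print(wrong.a is ls2, repr(wrong.b))   # True ''
print(wrong.get_matching_blocks())     # only the dummy match (len(a), 0, 0)

# Either pass None explicitly for isjunk, or use keywords.
right = SequenceMatcher(None, ls1, ls2)
also_right = SequenceMatcher(a=ls1, b=ls2)
print(right.get_matching_blocks())
```

With an empty b the isjunk argument is never actually called, which is
why passing a list there raises no error and the mistake goes unnoticed.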

> There are also several examples in the doc, all like
> >>> s = SequenceMatcher(None, " abcd", "abcd abcd") # or
> >>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
> 
> So you will get better results with
> matcher = SequenceMatcher(None, ls1, ls2) # or
> matcher = SequenceMatcher(a=ls1, b=ls2)
> 
> In the future, please try to simplify examples before posting for help.

Yes, my bad here. I should have prepared the question better. Frustration
got the better of me.

Thanks for responding anyway.

-- 
Antoon Pardon


