modifying standard library functionality (difflib)

Vlastimil Brom vlastimil.brom at gmail.com
Thu Jun 24 11:05:02 EDT 2010


2010/6/24 Bruno Desthuilliers <bruno.42.desthuilliers at websiteburo.invalid>:
> Vlastimil Brom wrote:
>>
>> Hi all,
>> I'd like to ask about the most reasonable/recommended/... way to
>> modify the functionality of the standard library module (if it is
>> recommended at all).
>
> ...
>> - I guess, it wouldn't be recommended to directly replace
>> difflib.SequenceMatcher._SequenceMatcher__chain_b ...
>
> For which definition of "directly replace"? If you mean patching the
> standard library's source code in place, then it's definitely not
> something I'd do. Monkeypatching, OTOH, is sometimes the simplest
> solution, especially for temporary fixes or evolutions.
>
> Anyway - which solution (forking, subclassing or monkeypatching) is the most
> appropriate really depends on the context so only you can decide. If it's
> for personal use only and not mission-critical, go for the simplest working
> solution. If it's going to be publicly released, you may want to consider
> contacting the difflib maintainer and submitting a patch, and rely on a
> monkeypatch in the meantime. If you think you'll have a need for more
> modifications / specialisations / evolution to difflib, then just fork.
>
> My 2 cents.
> --
>

Many thanks for your insights!
Just now, I am almost the only user of this script, hence the
consequences of version mismatches etc. shouldn't (directly) affect
anyone else, fortunately.
However, I'd like to ask for some clarification about monkeypatching.
By "directly replace" I meant something like the following scenario:

import difflib
....
def tweaked__chain_b(self):
    # modified code of the function __chain_b, copied from Lib\difflib.py
    ...

difflib.SequenceMatcher._SequenceMatcher__chain_b = tweaked__chain_b

This way I can only change the functionality unconditionally, as the
signature of SequenceMatcher (which is then used in my script) remains
unchanged.
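
To illustrate what "unconditionally" means here, the following is a minimal sketch, not the actual modified function: the stand-in tweaked__chain_b only records the call and delegates to the stock method, and the patched_calls list is a hypothetical helper added for illustration. It shows that the replacement is picked up both by direct use and by difflib's own helpers (here get_close_matches) that build a SequenceMatcher internally.

```python
import difflib

# Keep a reference to the original private method (name-mangled form).
_original_chain_b = difflib.SequenceMatcher._SequenceMatcher__chain_b

patched_calls = []  # hypothetical bookkeeping, for illustration only


def tweaked__chain_b(self):
    # Stand-in for the modified copy: record the call, then delegate.
    patched_calls.append(self.b)
    return _original_chain_b(self)


# The replacement applies to every SequenceMatcher in this process.
difflib.SequenceMatcher._SequenceMatcher__chain_b = tweaked__chain_b

# Direct use goes through the patched method ...
difflib.SequenceMatcher(None, "abcd", "bcde")
# ... and so does library code that creates its own SequenceMatcher.
matches = difflib.get_close_matches("appel", ["apple", "ape"])
```

Because the class attribute itself is replaced, there is no per-instance switch: any code in the same process that touches difflib sees the tweak.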

I thought this would qualify as monkeypatching, but I am apparently
missing some distinction between "patching the ... code in place" and
"monkeypatching".
Does it maybe make a difference if one makes "backups" of the original
objects and reactivates them after the patched code has been used?

By subclassing (which I am using just now in the code) the behaviour
can be parametrised:

class my_difflib_SequenceMatcher(difflib.SequenceMatcher):
    # checkpopular is the parameter added to the signature
    def __init__(self, isjunk=None, a='', b='', checkpopular=True):
        self.checkpopular = checkpopular
        ...

    # spelled with the mangled name, so that it really replaces the
    # private method called inside difflib.SequenceMatcher
    def _SequenceMatcher__chain_b(self):
        # modified copy from Lib\difflib.py - reacting to the value of
        # self.checkpopular
        ...
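
One detail of the subclassing route is worth a runnable sketch: a method written as __chain_b inside a different class is name-mangled to that class's own prefix, so the base class never calls it. The classes NaiveMatcher and ParametrisedMatcher and the attribute chain_b_mode below are hypothetical names for illustration, and the override body is a placeholder, not the real modified algorithm.

```python
import difflib


class NaiveMatcher(difflib.SequenceMatcher):
    def __chain_b(self):
        # Mangled to _NaiveMatcher__chain_b, so SequenceMatcher, which calls
        # self._SequenceMatcher__chain_b(), never reaches this method.
        self.overridden = True


class ParametrisedMatcher(difflib.SequenceMatcher):
    def __init__(self, isjunk=None, a='', b='', checkpopular=True):
        # Set the flag first: the base __init__ already runs __chain_b.
        self.checkpopular = checkpopular
        difflib.SequenceMatcher.__init__(self, isjunk, a, b)

    def _SequenceMatcher__chain_b(self):
        # Run the stock indexing, then note which variant was requested.
        # A real version would contain the modified copy instead.
        difflib.SequenceMatcher._SequenceMatcher__chain_b(self)
        if self.checkpopular:
            self.chain_b_mode = "stock"
        else:
            self.chain_b_mode = "no popularity check"


naive = NaiveMatcher(None, "abc", "abd")
fine = ParametrisedMatcher(None, "abc", "abd", checkpopular=False)
```

Unlike the monkeypatch, only code that explicitly instantiates the subclass sees the changed behaviour, and each instance can choose its own setting.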

An "official" update of the source in the standard library is probably
not viable (at least not in a way that would currently help me, as my
code only supports Python 2.x due to the relevant dependencies:
wxPython, ...).
Otherwise, it would depend on other users' needs (e.g. a finer diff at
the cost of much slower code in some cases).

Thanks again for your thoughts.
   vbr


