[Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken

Steven D'Aprano steve at pearwood.info
Wed Jul 14 14:38:37 CEST 2010


On Wed, 14 Jul 2010 11:45:25 am Terry Reedy wrote:
> Summary: adding an autojunk heuristic to difflib without also adding
> a way to turn it off was a bug because it disabled running code.
>
> 2.6 and 3.1 each have, most likely, one final version each. Don't fix
> for these but add something to the docs explaining the problem and
> future fix.
>
> 2.7 will have several more versions over several years and will be
> used by newcomers who might encounter the problem but not know to
> diagnose it and patch a private copy of the module. So it should have
> a fix. Solutions thought of so far.
>
> 1. Modify the heuristic to somewhat fix the problem. Bad
> (unacceptable) because this would silently change behavior and could
> break tests.
>
> 2. Add a parameter that defaults to using the heuristic but allows
> turning it off. Perhaps better, but code that used the new API would
> crash if run on 2.7.0
>
> 3. Tim Peters
[... snip crazy scheme...]


4. I normally dislike global flags, but this is one time it might be 
less-worse than the alternatives.

Modify SequenceMatcher to test for the value of a global flag, 
defaulting to False if it doesn't exist.

try:
    # module-level name; only defined if the caller created it
    disable = disable_heuristic
except NameError:
    disable = False
if disable:
    pass  # use the new, correct behaviour
else:
    pass  # stick to the old, backwards-compatible but wrong behaviour



The flag will only exist if the caller explicitly creates it:

import difflib
difflib.disable_heuristic = True

On 2.7.0 and older versions, creating the flag won't do anything useful, 
but nor will it cause an exception. It will be harmless.
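
To make the idea concrete, here is a rough, self-contained sketch of the lookup. It does not touch the real difflib: "fakediff" and heuristic_enabled() are made-up names for illustration only, and the stand-in module is built with types.ModuleType so the example runs on its own.

```python
import types

# Source of a hypothetical stand-in module. The function looks up the
# global flag and falls back to False when the name was never created.
SOURCE = '''
def heuristic_enabled():
    try:
        disable = disable_heuristic
    except NameError:
        disable = False
    return not disable
'''

# Build the stand-in module ("fakediff" is an illustrative name only).
fakediff = types.ModuleType("fakediff")
exec(SOURCE, fakediff.__dict__)

# Flag absent: the old behaviour (heuristic on) is kept.
before = fakediff.heuristic_enabled()

# The caller opts in by creating the flag on the module.
fakediff.disable_heuristic = True
after = fakediff.heuristic_enabled()

print(before, after)
```

Because the flag is an ordinary module attribute, assigning it on a version that never reads it just creates an unused name, which is why the same caller code would run unchanged there.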

I think an explicit flag is better than relying on magic behaviour 
triggered by "unlikely" input.





-- 
Steven D'Aprano

