[issue43473] Junks in difflib

Terry J. Reedy report at bugs.python.org
Sat Mar 13 02:27:23 EST 2021


Terry J. Reedy <tjreedy at udel.edu> added the comment:

Currently, a returned tuple (i, j, n) means that a[i:i+n] == b[j:j+n], so both matching blocks have the same length n.
https://docs.python.org/3/library/difflib.html#difflib.SequenceMatcher.get_matching_blocks
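
As a minimal sketch of that invariant, the following uses the sample strings from the difflib documentation and checks that every returned block has equal slices in both sequences. The variable names are only illustrative.

```python
from difflib import SequenceMatcher

a = "abxcd"
b = "abcd"

# No junk function (None), so every character takes part in matching.
matcher = SequenceMatcher(None, a, b)
blocks = matcher.get_matching_blocks()

for i, j, n in blocks:
    # Each triple describes one run of n equal elements in a and b.
    assert a[i:i + n] == b[j:j + n]
    print(i, j, n, repr(a[i:i + n]))
```

The last block is always a dummy of length 0 positioned at (len(a), len(b)), so the loop also covers an empty slice.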

The equal-length guarantee would not hold if a has an ignored space and b does not. Changing the current definition would break existing code, and the function would have to return quadruples carrying two different lengths. That would require either a new parameter to select the behavior or a new function with a new name.

Either option would need to be justified by actual use cases, and I cannot see what they might be. One way to have junk chars completely ignored is to strip them from both strings before calling SequenceMatcher.
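
A rough sketch of that workaround, assuming the junk characters are a space and a tab. The helper strip_junk is hypothetical and not part of difflib.

```python
from difflib import SequenceMatcher


def strip_junk(text, junk=" \t"):
    """Hypothetical helper: drop every character listed in junk."""
    return "".join(ch for ch in text if ch not in junk)


a = "ab cd"
b = "abcd"

# Remove the junk characters from both inputs before comparing.
a_clean = strip_junk(a)
b_clean = strip_junk(b)

matcher = SequenceMatcher(None, a_clean, b_clean)
blocks = matcher.get_matching_blocks()
print(blocks)
print(matcher.ratio())
```

Note that the indices in the returned blocks refer to the stripped strings, not to the original a and b, so code that needs original positions would have to keep its own mapping.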

----------
nosy: +terry.reedy

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43473>
_______________________________________

