[issue43473] Junks in difflib
Terry J. Reedy
report at bugs.python.org
Sat Mar 13 02:27:23 EST 2021
Terry J. Reedy <tjreedy at udel.edu> added the comment:
Currently, each returned triple (i, j, n) means that a[i:i+n] == b[j:j+n], so the two matching blocks always have the same length.
https://docs.python.org/3/library/difflib.html#difflib.SequenceMatcher.get_matching_blocks
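As a minimal sketch of that invariant (the sample strings are arbitrary illustrations, not taken from the original report), each triple can be checked directly against the slices it describes:

```python
from difflib import SequenceMatcher

# Arbitrary sample inputs chosen for illustration only.
a = "abxcd"
b = "abcd"

matcher = SequenceMatcher(None, a, b)
blocks = matcher.get_matching_blocks()

for i, j, n in blocks:
    # Every block refers to slices of equal length n in both sequences.
    assert a[i:i + n] == b[j:j + n]
    print((i, j, n), repr(a[i:i + n]))
```

The final triple is the documented dummy block (len(a), len(b), 0), so it satisfies the same equality trivially.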
This would not be the case if a has an ignored space and b does not. Changing the current definition would break existing code, and the function would have to return quadruples in order to report two different lengths. This would require either a new parameter for the function to select the behavior or a new function with a new name.
Either option would require justification by actual use cases. I cannot see what they might be. One way to have junk chars completely ignored is to strip them from both strings before calling SequenceMatcher.
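The stripping approach might look roughly like this. It is only a sketch: strip_junk is a hypothetical helper, not part of difflib, and treating the space character as the junk set is an assumption made for the example.

```python
from difflib import SequenceMatcher


def strip_junk(s, junk=" "):
    """Hypothetical helper: drop every character of s that appears in junk."""
    return "".join(ch for ch in s if ch not in junk)


# Assumed example inputs: a contains spaces that b lacks.
a = "a b c"
b = "abc"

# Compare the stripped strings, so junk cannot affect block lengths.
matcher = SequenceMatcher(None, strip_junk(a), strip_junk(b))

for i, j, n in matcher.get_matching_blocks():
    print(i, j, n)
```

Note that the indices reported this way refer to the stripped strings, not to the original a and b, so a caller that needs original positions would have to map them back itself.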
----------
nosy: +terry.reedy
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43473>
_______________________________________