[issue43689] difflib: mention other "problematic" characters in documentation

Terry J. Reedy report at bugs.python.org
Mon Apr 5 23:19:39 EDT 2021


Terry J. Reedy <tjreedy at udel.edu> added the comment:

I have an alternate replacement:  "These lines can be confusing if the sequences contain tab characters or other characters that result in the indicator symbols in these lines being mislocated."

Or leave the current sentence as is.

Explanation with the details omitted from the above:
In 3.x, strings are Unicode.  Even if one uses a fixed-pitch font for the ASCII subset, a majority of characters will be rendered either in a different fixed pitch or with variable pitch.  And on a graphics screen that is not simulating a fixed-pitch text terminal (as the Windows console does), the so-called double-wide East Asian characters are not really double wide but more like 1.6 times as wide.  The details depend on the OS, the font, and perhaps the font size.  One can explore this in the font sample box on the Font tab of the IDLE settings dialog.  The problem characters include ones narrower than 'one space', down to zero width.  For general Unicode, ^ marking does not work.  Syntax error marking has the same problem, and there is no general solution.
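
To make the mismatch concrete, here is a minimal sketch.  It assumes a display that draws East Asian wide and fullwidth characters in two cells and combining marks in zero, which real fonts only approximate.  The helper cell_width and the sample strings are made up for illustration; they are not part of difflib.

```python
import difflib
import unicodedata


def cell_width(text):
    """Estimate display cells (assumption: wide/fullwidth = 2, combining = 0)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


# Hypothetical input: two lines that differ only in their last letter.
a = "日本語 abc\n"
b = "日本語 abd\n"
diff = list(difflib.ndiff([a], [b]))
print("".join(diff), end="")

# The first "? " line carries the caret for line a.
hint = next(line for line in diff if line.startswith("? "))
caret_offset = hint.index("^") - 2  # drop the two-character "? " prefix
print("caret offset in characters:", caret_offset)
print("estimated cells before the change:", cell_width(a[:caret_offset]))
```

Under that assumption the caret sits 6 characters into the marker line, while the changed character is drawn about 9 cells in, so the caret appears under the wrong character.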

Tab ('\t') is an example of a character that is displayed either as a variable-width space or as a fixed space two or more columns wide.  If we were to make a change, we should mention, as above, that many non-ASCII characters are just as confusing as tabs.
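
A small sketch of the tab case, assuming the display expands tabs to 8-column tab stops (the helper tab_columns is hypothetical).  A marker line is built from character offsets, but the same single tab can push the following text by anywhere from one to eight columns:

```python
def tab_columns(prefix, tabsize=8):
    """Return (character index, displayed column) of 'X' in prefix + TAB + 'X'."""
    line = prefix + "\tX"
    shown = line.expandtabs(tabsize)  # what a tab-stop display would draw
    return line.index("X"), shown.index("X")


for prefix in ("a", "abc", "abcdefg"):
    index, column = tab_columns(prefix)
    print(f"{prefix!r}: character index {index}, displayed column {column}")
```

The character index of X changes with the prefix, yet all three lines display X in column 8, so an offset counted in characters says little about where the text is drawn.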

In your example above, the caret at least points to the right space.  It correctly indicates some difference beyond the visible end - a non-visible whitespace difference.
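
The example referred to is not reproduced in this message.  As a hypothetical stand-in, the following sketch diffs two lines that differ only by one trailing space and prints each output line with repr() so the whitespace is visible:

```python
import difflib

old = "abc\n"
new = "abc \n"  # hypothetical input: one trailing space added

diff = list(difflib.ndiff([old], [new]))
for line in diff:
    print(repr(line))
```

The '+' marker falls just past the visible 'abc', which is the position of the added space.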

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43689>
_______________________________________

