[issue13391] string.strip Does Not Remove Zero-Width-Space (ZWSP)

Dave Mankoff report at bugs.python.org
Tue Nov 15 00:32:18 CET 2011


Dave Mankoff <mankyd at gmail.com> added the comment:

So I contacted the Unicode Technical Committee about the issue and promptly received a response. They pointed out that the ZWSP was once considered white space, but that this was changed in Unicode 4.0.1:

http://www.unicode.org/review/resolved-pri.html#pri21

One particular comment worth noting: "... for historical reasons the general category is still Zs (Space Separator)".

Perhaps this ticket can be changed to a feature request? In addition to stripping out whitespace, it is useful to remove any non-printable characters from a string (or know if a string contains any non-printable characters).

Perhaps a boolean keyword parameter, "control_chars", could be added to isspace and strip? Thus:

>>> u' \t\r\n\u200B'.isspace(control_chars=True)
True
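
Until something like that exists, the proposed behaviour can be approximated in user code. The following is only a rough sketch: the helper names (isspace_cc, strip_cc) are made up for illustration, "control_chars" is not an existing keyword of str.isspace or str.strip, and treating the Unicode general categories Cc (control) and Cf (format, which is where unicodedata reports U+200B) as strippable is an assumption about what the parameter would cover.

```python
import unicodedata

# Hypothetical helpers; "control_chars" is not a real str keyword.
# Assumed scope: control (Cc) and format (Cf) characters count as strippable.
_EXTRA_CATEGORIES = ("Cc", "Cf")


def _strippable(ch):
    """True for ordinary whitespace or a character in the extra categories."""
    return ch.isspace() or unicodedata.category(ch) in _EXTRA_CATEGORIES


def isspace_cc(s):
    """Like str.isspace(), but also accepts Cc/Cf characters such as ZWSP."""
    return bool(s) and all(_strippable(ch) for ch in s)


def strip_cc(s):
    """Like str.strip(), but also removes leading/trailing Cc/Cf characters."""
    start, end = 0, len(s)
    while start < end and _strippable(s[start]):
        start += 1
    while end > start and _strippable(s[end - 1]):
        end -= 1
    return s[start:end]


print(isspace_cc(" \t\r\n\u200b"))           # True
print(repr(strip_cc("\u200b hello \u200b")))  # 'hello'
```

Characters in the middle of the string are left alone, the same way str.strip() leaves interior whitespace alone.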

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13391>
_______________________________________

