[issue6988] shlex.split() converts unicode input to UCS-4 output with varying byte order
Amaury Forgeot d'Arc
report at bugs.python.org
Thu Sep 24 18:12:17 CEST 2009
Amaury Forgeot d'Arc <amauryfa at gmail.com> added the comment:
I'll take the opposite point of view:
the bad behavior was introduced with 2.5.1 (issue1548891, r52302) and
reverted for 2.5.2 because "it broke backwards compatibility with
arbitrary read buffers" (issue1730114, r53831).
The difference is in cStringIO:
>>> from cStringIO import StringIO
>>> StringIO(u"Hello, World!").read()
'H\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00,\x00\x00\x00 \x00\x00\x00W\x00\x00\x00o\x00\x00\x00r\x00\x00\x00l\x00\x00\x00d\x00\x00\x00!\x00\x00\x00'
The byte order does not actually differ between the two strings.
Instead, u" " becomes " \x00\x00\x00": the split happens at the space
byte, and the three zeros that follow it are copied into the start of
the second item.
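
A minimal sketch of the effect described above, not the actual shlex or
cStringIO code path. It assumes a little-endian UCS-4 build, which is
imitated here by encoding to UTF-32-LE, and it uses a plain split on the
space byte as a stand-in for the tokenizer. The names text, raw, tokens,
first and second are only for illustration.

```python
# Assumption: imitate the in-memory buffer of a little-endian UCS-4 build
# by encoding the text as UTF-32-LE (4 bytes per character, no BOM).
text = u"Hello, World!"
raw = text.encode("utf-32-le")

# Stand-in for the tokenizer: split on the single space byte (0x20).
# Only the space character contains that byte in this input.
tokens = raw.split(b" ")
first, second = tokens

# The three zero bytes that followed the space now lead the second item.
print(repr(first))
print(repr(second))

# Read in 4-byte groups from its start, the second item looks big-endian,
# although nothing was byte-swapped; the data is merely shifted by 3 bytes.
print(first.decode("utf-32-le"))
print(second[:24].decode("utf-32-be"))
```

The first item still decodes as little-endian, while the second only
decodes from its start as big-endian, which would explain the "varying
byte order" in the issue title.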
----------
nosy: +amaury.forgeotdarc
resolution: -> wont fix
status: open -> pending
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6988>
_______________________________________