[issue35883] Change invalid unicode characters to replacement characters in argv

Johannes Berg report at bugs.python.org
Sun May 24 13:37:55 EDT 2020


Johannes Berg <johannes at sipsolutions.net> added the comment:

A simple test case is something like this:

  ./python -c 'import sys; print(sys.argv[1].encode(sys.getfilesystemencoding(), "surrogateescape"))' "$(echo -ne '\xfa\xbd\x83\x96\x80')"


You would probably expect that to print

  b'\xfa\xbd\x83\x96\x80'

i.e. the same bytes that were passed in, but currently that fails.
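
For reference, the round trip that the test case relies on can be sketched in pure Python, without going through argv. This is only an illustration of the expected behaviour, not a reproduction of the failure: it assumes UTF-8 as the filesystem encoding, and the helper name argv_roundtrip is made up for this example.

```python
# Sketch of the surrogateescape round trip (PEP 383), assuming UTF-8
# as the filesystem encoding. argv_roundtrip is a hypothetical helper,
# not part of CPython.

def argv_roundtrip(raw: bytes, encoding: str = "utf-8") -> bytes:
    # Undecodable bytes are mapped to lone surrogates U+DC80..U+DCFF.
    text = raw.decode(encoding, "surrogateescape")
    # Encoding with the same handler turns those surrogates back into
    # the original bytes.
    return text.encode(encoding, "surrogateescape")


raw = b"\xfa\xbd\x83\x96\x80"  # the bytes from the test case above
decoded = raw.decode("utf-8", "surrogateescape")

print(ascii(decoded))       # each invalid byte becomes one surrogate
print(argv_roundtrip(raw))  # b'\xfa\xbd\x83\x96\x80'
```

The test case above expects the interpreter to apply the same mapping when it decodes the command line into sys.argv.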

----------
versions: +Python 3.10, Python 3.5, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35883>
_______________________________________

