[issue6697] Python 3.1 segfaults when invalid UTF-8 characters are passed from command line

Marc-Andre Lemburg report at bugs.python.org
Wed Aug 19 16:02:16 CEST 2009


Marc-Andre Lemburg <mal at egenix.com> added the comment:

Amaury Forgeot d'Arc wrote:
> 
> Amaury Forgeot d'Arc <amauryfa at gmail.com> added the comment:
> 
> Do you suggest to remove all usages of _PyUnicode_AsString() and
> _PyUnicode_AsStringAndSize()?

In the short term, I suggest that all uses which do not check the
return value be replaced with a new API that implements a failsafe
return value strategy.
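
As a rough Python-level analogy of such a failsafe strategy (only a
sketch; the real change would be made in C, and the helper name and
placeholder value below are hypothetical):

```python
def as_utf8_failsafe(text, fallback=b"<invalid>"):
    """Hypothetical helper: never raises, returns a fallback on failure."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates cannot be encoded as strict UTF-8.
        return fallback

# An undecodable command line byte ends up as a lone surrogate (U+DCFF).
arg = b"\xff".decode("utf-8", "surrogateescape")

print(as_utf8_failsafe("abc"))
print(as_utf8_failsafe(arg))
```

The caller always gets a usable value back, so code that forgets to
check for failure degrades to a placeholder instead of crashing.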

In the mid- to long-term, the APIs should probably be removed
altogether.

They look a lot like the PyString APIs with the same names, but unlike
those APIs, they can fail. A straightforward conversion of code from
the PyString APIs to the above APIs therefore gives developers the
wrong impression that no error checking is needed.

For error messages, I'd use the repr() of the objects - lone
surrogates passed through as-is will not work, since they cause issues
further down the line with debugging tools or even stderr terminal
displays.
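
A small sketch of the repr() approach (the message text is made up for
illustration, and the argument is assumed to have been decoded with the
surrogateescape handler):

```python
# Undecodable byte 0xFF becomes the lone surrogate U+DCFF.
arg = b"caf\xff".decode("utf-8", "surrogateescape")

# repr() escapes the surrogate, so the message stays plain ASCII.
message = "invalid argument: %r" % (arg,)
print(message)

# The raw string, by contrast, cannot be encoded for output.
try:
    arg.encode("utf-8")
    raw_ok = True
except UnicodeEncodeError:
    raw_ok = False
```

The escaped form can be written to stderr or a log file without
triggering a second encoding error.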

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6697>
_______________________________________

