[issue28426] PyUnicode_AsDecodedObject can only return unicode now

Marc-Andre Lemburg report at bugs.python.org
Thu Oct 13 04:02:15 EDT 2016


Marc-Andre Lemburg added the comment:

PyUnicode_AsDecodedObject() and PyUnicode_AsEncodedObject() were meant as C API implementations of the unicode.decode() and unicode.encode() methods in Python 2. Not having PyUnicode_AsDecodedObject() documented was likely an oversight on my part.

In Python 2, unicode.decode() and unicode.encode() were more or less direct interfaces to the codec registry. In Python 2.7, they were changed to issue a warning to help with porting to Python 3.

In Python 3, the methods were changed to only return unicode objects. To reflect this change without breaking the C API, the new PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode() functions were added.

I guess the more recent changes simply overlooked this difference and put restrictions on the output of PyUnicode_AsDecodedObject() and PyUnicode_AsEncodedObject() which were not originally intended. Hence the crash you are seeing, Serhiy.

Going forward, C extensions in Python 3 could indeed use the PyCodec_*() APIs directly.
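
To illustrate the difference, here is a rough Python-level sketch rather than C code. It assumes Python 3.4 or later and uses codecs.encode() and codecs.decode(), which are the Python-level counterparts of PyCodec_Encode() and PyCodec_Decode(). The variable names are only for illustration.

```python
import codecs

# codecs.encode()/codecs.decode() talk to the codec registry directly
# and place no restriction on the type of the result.
rot13 = codecs.encode("hello", "rot_13")          # str in, str out
raw = codecs.decode(b"68656c6c6f", "hex_codec")   # bytes in, bytes out

# The str.encode() method only accepts text encodings, so a
# str-to-str codec such as rot_13 is rejected with a LookupError.
try:
    "hello".encode("rot_13")
except LookupError as exc:
    restricted = str(exc)

print(rot13, raw, restricted)
```

The registry-level functions return whatever the codec produces, while the method on str restricts both the codec and the result type.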

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28426>
_______________________________________

