[issue1943] improved allocation of PyUnicode objects

Mon Feb 1 17:12:15 CET 2010

Marc-Andre Lemburg <mal at egenix.com> added the comment:

Amaury Forgeot d'Arc wrote:
> 
> Amaury Forgeot d'Arc <amauryfa at gmail.com> added the comment:
> 
>> Base type Unicode buffers end with a null-Py_UNICODE termination,
>> but this is not used anywhere, AFAIK
> On Windows, code like 
>    CreateDirectoryW(PyUnicode_AS_UNICODE(po), NULL)
> is very common, at least in posixmodule.c.

The above usage is clearly wrong. PyUnicode_AS_UNICODE() should
only be used to access Py_UNICODE data directly when
working on Python Unicode objects. The macro is not meant
to be used directly in external APIs.

For such uses, the Unicode conversion APIs need to be used,
e.g. the PyUnicode_AsWideChar() API. These will then also
apply any 0-termination as necessary.

Note that Python is free to change the meaning of Py_UNICODE
(e.g. to use UCS4 on all platforms) or Unicode implementation
details (such as e.g. the 0-termination) and this would then break
any use such as the one you quoted.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1943>
_______________________________________