[Python-ideas] UCS2 vs UCS4 ABIs

Daniel Stutzbach daniel at stutzbachenterprises.com
Mon Nov 2 20:50:26 CET 2009


On Mon, Nov 2, 2009 at 11:57 AM, Guido van Rossum <guido at python.org> wrote:

> We'd also have to hide the macros that can be used to access the
> internals of a PyUnicodeObject, in order for that approach to be safe.
> Basically, an extension would have to include a second header file to
> use those macros and it would have to somehow indicate to the linker
> that it is using UCS2 or UCS4 internals as well.
>

I don't know of a portable way to indicate that to the linker simply by
including a header file.  I wish I did.

Here is one idea that will cause a linker error if there's a mismatch and
one of the macros is used.  It does cost the macro an extra CPU
instruction or two, though.

In unicodeobject.h:

/* Require the macro to reference a global variable that will only be
present if the Unicode ABI matches correctly.  Arrange for the global
variable to always have the value zero, and add it to the return value of
the macro. */

#if Py_UNICODE_SIZE == 4
extern const int Py_UnicodeZero_UCS4;
#define Py_UNICODE_ZERO (Py_UnicodeZero_UCS4)
#else
extern const int Py_UnicodeZero_UCS2;
#define Py_UNICODE_ZERO (Py_UnicodeZero_UCS2)
#endif

#define PyUnicode_AS_UNICODE(op) \
        (Py_UNICODE_ZERO + (((PyUnicodeObject *)(op))->str))

In unicodeobject.c:

/* Expands to the ABI-specific name, so only the matching symbol exists. */
const int Py_UNICODE_ZERO = 0;
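
To make the idea concrete, here is a minimal single-file sketch of the same
trick.  Every Demo_* name is a hypothetical stand-in, not a real CPython
definition, and DEMO_UNICODE_SIZE is an assumed build setting.  In a real
build the definition of the zero symbol would live in the interpreter and
the macro would be expanded in the extension, so an extension built for the
other width should fail to link with an undefined reference to the other
symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed build setting; a UCS2 build would use 2 here. */
#define DEMO_UNICODE_SIZE 4

#if DEMO_UNICODE_SIZE == 4
typedef unsigned int Demo_UNICODE;
extern const int Demo_UnicodeZero_UCS4;
#define DEMO_UNICODE_ZERO (Demo_UnicodeZero_UCS4)
#else
typedef unsigned short Demo_UNICODE;
extern const int Demo_UnicodeZero_UCS2;
#define DEMO_UNICODE_ZERO (Demo_UnicodeZero_UCS2)
#endif

/* Hypothetical, simplified stand-in for PyUnicodeObject. */
typedef struct {
    size_t length;
    Demo_UNICODE *str;
} DemoUnicodeObject;

/* Adding the always-zero global leaves the pointer unchanged, but it
   forces every user of the macro to reference the ABI-specific symbol. */
#define Demo_AS_UNICODE(op) \
        (DEMO_UNICODE_ZERO + (((DemoUnicodeObject *)(op))->str))

/* "Interpreter side": define only the symbol that matches this build. */
const int DEMO_UNICODE_ZERO = 0;

/* "Extension side": returns 1 when the macro yields the stored buffer. */
int demo_macro_is_transparent(void)
{
    Demo_UNICODE buf[3] = { 'a', 'b', 0 };
    DemoUnicodeObject obj = { 2, buf };

    assert(DEMO_UNICODE_ZERO == 0);
    return Demo_AS_UNICODE(&obj) == buf;
}
```

Because everything sits in one file here, the link always succeeds.  The
mismatch case only shows up when the definition and the macro use are
compiled separately with different DEMO_UNICODE_SIZE values.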


> I would want to err on the safe side here -- if it was at all easy to
> create an extension that *seems* to be ABI-neutral but *actually*
> relies on knowledge about the UCS2 or UCS4 representation, we'd be
> creating a worse problem. Users don't like stuff not working, but they
> *really* don't like stuff crashing with random core dumps -- if it has
> to be broken, let it break very loudly and explicitly. The current
> approach satisfies that requirement -- it probably just errs too far
> on the "never assume it might work" side.
>

Agreed.

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>