[Python-Dev] New Py_UNICODE doc

"Martin v. Löwis" martin at v.loewis.de
Sat May 7 02:25:55 CEST 2005


Nicholas Bastin wrote:
> Yes.  Not only in my mind, but in the Python source code.  If 
> Py_UNICODE is 4 bytes wide, then the encoding is UTF-32 (UCS-4), 
> otherwise the encoding is UTF-16 (*not* UCS-2).

I see. Some people equate "encoding" with "encoding scheme";
neither UTF-32 nor UTF-16 is an encoding scheme. You were
apparently talking about encoding forms.
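
To make the distinction concrete, here is a minimal sketch, assuming
Python 3 (not the interpreter under discussion in this thread). The
helper name utf16_code_units is hypothetical. The encoding form is the
sequence of 16-bit code units; an encoding scheme additionally fixes a
byte order when those units are serialized.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units (the encoding form) for text.

    Hypothetical helper: it serializes little-endian and then unpacks
    the bytes back into 16-bit integers, so the result is independent
    of byte order.
    """
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


ch = "\U00010000"  # first code point outside the BMP

# Encoding form: two code units, a surrogate pair.
units = utf16_code_units(ch)

# Encoding schemes: the same code units as bytes, in two byte orders.
le_bytes = ch.encode("utf-16-le")
be_bytes = ch.encode("utf-16-be")

print([hex(u) for u in units])
print(le_bytes, be_bytes)
```

The code units are the same in both cases; only the byte serialization
differs between the two schemes.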

> What I mean by 'variable' is that you can't make any assumption as to 
> what the size will be in any given python when you're writing (and 
> building) an extension module.  This breaks binary compatibility of 
> extension modules on the same platform and same version of python 
> across interpreters which may have been built with different configure 
> options.

True. The breakage will be quite obvious, in most cases: the module
fails to load because not only sizeof(Py_UNICODE) changes, but also
the names of all symbols change.
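
From Python code, the width an interpreter was built with can be
inferred from sys.maxunicode. This is only a sketch: the function name
py_unicode_width is hypothetical, and the check is meaningful only on
interpreters older than Python 3.3. From 3.3 on, sys.maxunicode is
always 0x10FFFF, so the function always reports 4 there.

```python
import sys


def py_unicode_width():
    """Infer the Py_UNICODE width in bytes from sys.maxunicode.

    Narrow builds report 0xFFFF (2-byte code units, UTF-16);
    wide builds report 0x10FFFF (4-byte code units, UTF-32).
    """
    return 2 if sys.maxunicode == 0xFFFF else 4


width = py_unicode_width()
print("Py_UNICODE width: %d bytes" % width)
```

An extension built against one width would have to be rebuilt to load
in an interpreter configured with the other.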

Regards,
Martin

