[capi-sig] Unicode compatibility

M.-A. Lemburg mal at egenix.com
Wed May 26 12:45:32 CEST 2010


Daniel Stutzbach wrote:
> Robert, Stefan, thank you for your feedback.
> 
> How about the following variation, which I believe will address your
> concerns:
> 
> By default, Py_UNICODE will be a fully-specified type.  In a nutshell, the
> default will behave just like Python 2 or 3.1, except that trying to load a
> mismatched module will raise an ImportError with a more helpful error
> message (much friendlier to novice programmers).  Cython would continue to
> use this mode.
>
> Extension authors who want a Unicode-agnostic build can specify an option in
> their setup.py that will instruct distutils to pass a -D_Py_UNICODE_AGNOSTIC
> compiler flag to ensure that all of their .c files are built in
> Unicode-independent mode.  That way, the whole extension is compiled in the
> same mode.

That would be our (eGenix) preferred implementation variant as well.

Building Unicode agnostic extensions should be a feature that the
extension writers turn on explicitly, rather than being the default
that has to be turned off.

However, rather than using a distutils option to enable the
agnostic mode, I would presume that extension writers simply
write:

#define _Py_UNICODE_AGNOSTIC 1
#include "Python.h"

in their code and then add

[build_ext]
unicode-agnostic=1

to their setup.cfg.

> It would indeed be great if package managers included the Unicode setting as
> part of the platform type.  PJE's proposed implementation of that feature (
> http://bit.ly/1bO62) would allow eggs to specify UCS2, UCS4, or "Don't
> Care". My patch greatly increases the number of eggs that could label
> themselves "Don't Care", reducing maintenance work for package maintainers
> who like to distribute binary eggs [1].  In other words, they are
> complementary solutions.

Rather than waiting for package managers to include support
for this (I've been trying to get some awareness for this problem
for years, without much success), it's probably better to just fix
distutils to include a UCS2/UCS4 marker in the platform string.
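
To make the idea concrete, here is a rough sketch of what such a
marker could look like. This is only an illustration, not a patch
against distutils: the helper names unicode_marker() and
platform_with_unicode_marker() and the "-ucs2"/"-ucs4" suffix format
are assumptions made up for this example. Only sys.maxunicode and
sysconfig.get_platform() are existing APIs.

```python
import sys
import sysconfig


def unicode_marker(maxunicode=None):
    """Return 'ucs2' or 'ucs4' depending on the interpreter's build.

    Hypothetical helper: a narrow build reports sys.maxunicode == 0xFFFF,
    a wide build reports 0x10FFFF.  (Python 3.3 and later always report
    0x10FFFF, so they would always be labelled 'ucs4' here.)
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "ucs4" if maxunicode > 0xFFFF else "ucs2"


def platform_with_unicode_marker(platform=None, maxunicode=None):
    """Append the Unicode marker to a platform string.

    Hypothetical helper; the '-ucs2'/'-ucs4' suffix is an assumed
    format, not something distutils produces today.
    """
    if platform is None:
        platform = sysconfig.get_platform()
    return "%s-%s" % (platform, unicode_marker(maxunicode))


if __name__ == "__main__":
    # Prints the current platform string with the marker appended.
    print(platform_with_unicode_marker())
```

With a marker like this in the platform string, a binary built on a
UCS2 interpreter would get a different file name than one built on a
UCS4 interpreter, so the two could no longer be confused at install
time.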

> [1] A quick Google search of PyPI reveals many packages offering Linux
> binary eggs.

-- 
Marc-Andre Lemburg
eGenix.com


