[Python-Dev] Unicode <--> UTF-8 in CPython extension modules

M.-A. Lemburg mal at egenix.com
Sat Feb 23 01:06:25 CET 2008


On 2008-02-23 00:46, Colin Walters wrote:
> On Fri, Feb 22, 2008 at 4:23 PM, John Dennis <jdennis at redhat.com> wrote:
> 
>>  Python programs which use Unicode string objects for their i18n and
>>  which "link" to C libraries expecting UTF-8, but whose CPython
>>  binding only uses the 's' or 's#' formats, often seem to fail with
>>  encoding errors.
> 
> One thing to be aware of is that PyGTK actually sets Python's
> default encoding to UTF-8.
> 
> http://bugzilla.gnome.org/show_bug.cgi?id=132040
> 
> I mention this because PyGTK is a very popular library related to
> Python and Linux.  So currently if you "import gtk", then libraries
> which are using UTF-8 (as you say, the vast majority) will work with
> Python unicode objects unmodified.

Are you suggesting that John should rely on a bug in some third-party
extension instead of fixing the Python extension to use "es#" where
needed?

There's a good reason why we don't allow setting the default
encoding outside site.py.

Trying to play tricks to change the default encoding later on
will only cause problems, e.g. the cached default-encoded versions
of Unicode objects will then use different encodings: those cached
earlier use the encoding set in site.py, and those cached later use
the new one. As a result, all kinds of weird things can happen.

Using the Python Unicode C API really isn't all that hard and it's
well documented too, so please use it instead of trying to design
software based on workarounds.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


