Need advice regarding the string types (str, unicode, encoding) used as the interface for an external library.

jmfauth wxjmfauth at gmail.com
Mon Nov 22 15:25:12 EST 2010


I'm planning to build an external lib. This lib will exchange
a lot of strings between the lib and the "core Python code"
of applications.

I want this lib to be modern and 100% unicode compliant. It will
be developed for Python 2.7 and for Python 3. In the early
phase it will be developed on Python 2.7 first, then on
Python 3, probably 3.2.

I see two options for the string interface.

a) Pure unicode, that means only type 'unicode' in Python 2.7
and only type 'str' in Python 3.

Similar to the Python io module.

b) Like a), plus ascii and utf-8 encoded type 'str' to keep
some kind of backward compatibility. The lib will work in
a "unicode mode" internally anyway, so the ascii and utf-8
encoded str's have to be converted to unicode on input.
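
To make the difference concrete, here is a minimal sketch of what
the entry point of each option could look like. The names
require_text and to_text are hypothetical, chosen only for this
illustration, and the sketch assumes that in b) a byte string is
always meant to be utf-8 (ascii being a subset of utf-8).

```python
import sys

# Text type: 'unicode' on Python 2, 'str' on Python 3.
# The name 'unicode' is only evaluated when running on Python 2.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def require_text(value):
    """Option a) (hypothetical helper): accept text only."""
    if not isinstance(value, text_type):
        raise TypeError("expected %s, got %s"
                        % (text_type.__name__, type(value).__name__))
    return value


def to_text(value):
    """Option b) (hypothetical helper): also accept ascii/utf-8 bytes."""
    if isinstance(value, text_type):
        return value
    # 'bytes' is an alias of 'str' on Python 2.7.
    if isinstance(value, bytes):
        # ascii is a subset of utf-8, so one decode covers both cases.
        return value.decode("utf-8")
    raise TypeError("expected text or bytes, got %s"
                    % type(value).__name__)
```

With a), passing a byte string fails immediately with a TypeError.
With b), a byte string that is not valid utf-8 fails later, inside
decode(), with a UnicodeDecodeError, so the lib has to decide how
to report that to the caller.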

I'm very comfortable with all this encoding stuff
and aware of the pros and cons of each solution.

My favourite solution is clearly on the a) side.

Advice and comments are welcome. Thanks in advance.


