Conversion of Unicode to ASCII NIGHTMARE
Serge Orlov
Serge.Orlov at gmail.com
Wed Apr 5 23:48:55 EDT 2006
Roger Binns wrote:
> "Fredrik Lundh" <fredrik at pythonware.com> wrote in message news:mailman.4102.1144215505.27775.python-list at python.org...
> > Roger Binns wrote:
> >
> >> SQLite only accepts Unicode so a Unicode string has to be supplied.
> >
> > fact or FUD? let's see:
>
> Note I said SQLite. For APIs that take/give strings, you can either
> supply/get a UTF-8 encoded sequence of bytes, or two bytes per character
> host byte order sequence. Any wrapper of SQLite that doesn't do
> Unicode in/out is seriously breaking things.
>
> I ended up using the UTF-8 versions of the API as Python can't quite
> make its mind up how to represent Unicode strings at the C api level.
> You can have two bytes per char or four, and the handling/production
> of byte order markers isn't that clear either.

My impression is that the handling/production of byte order marks is
pretty clear: they are produced/consumed by only two codecs, utf-16 and
utf-8-sig. What is not clear?
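
A minimal sketch of that behaviour, assuming a Python 3 interpreter (the
variable names are illustrative only). It covers just the codecs named in
this thread: utf-16 and utf-8-sig write a byte order mark when encoding and
strip it when decoding, while utf-16-le and plain utf-8 do neither.

```python
import codecs

text = "hi"

# Codecs that produce a BOM on encode.
utf16 = text.encode("utf-16")          # BOM in host byte order, then data
utf8_sig = text.encode("utf-8-sig")    # EF BB BF, then data

# Codecs that do not produce a BOM.
utf16_le = text.encode("utf-16-le")
utf8 = text.encode("utf-8")

utf16_has_bom = utf16[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
utf8_sig_has_bom = utf8_sig.startswith(codecs.BOM_UTF8)
utf16_le_has_bom = utf16_le[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
utf8_has_bom = utf8.startswith(codecs.BOM_UTF8)

# The same two codecs consume the BOM on decode.
roundtrip_utf16 = utf16.decode("utf-16")
roundtrip_utf8_sig = utf8_sig.decode("utf-8-sig")

# Plain utf-8 does not consume it: the BOM survives as U+FEFF.
leftover = utf8_sig.decode("utf-8")

print(utf16_has_bom, utf8_sig_has_bom, utf16_le_has_bom, utf8_has_bom)
print(repr(roundtrip_utf16), repr(roundtrip_utf8_sig), repr(leftover))
```

Decoding BOM-prefixed bytes with a codec that does not consume the mark
leaves a stray U+FEFF at the start of the string, which is the usual way
this goes wrong in practice.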
Serge