Psycopg and queries with UTF-8 data
Diez B. Roggisch
deetsNOSPAM at web.de
Thu Oct 14 07:55:34 EDT 2004
> Ah, I see now. I _thought_ it was odd that unicode('string') resulted in
> a unicode object and 'string'.encode('utf-8') did not. I understand now
> that 'unicode' is data that is actual unicode data, while 'utf-8'
> _encoded_ data is really a string, but with special characters rewritten
> to specify utf-8 escape sequences instead of the actual unicode bytes.
Exactly.
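To make the distinction concrete, here is a minimal sketch. It assumes a current Python 3 interpreter, where `str` plays the role of the old `unicode` type and `bytes` the role of the old byte string; the sample word is just an illustration.

```python
# Text (code points) versus encoded data (bytes).
text = "Gr\u00fc\u00dfe"                 # five characters: G r ü ß e

utf8_bytes = text.encode("utf-8")        # ü and ß take two bytes each
latin1_bytes = text.encode("latin-1")    # one byte per character here

print(type(text).__name__, len(text))
print(type(utf8_bytes).__name__, len(utf8_bytes))
print(type(latin1_bytes).__name__, len(latin1_bytes))

# Decoding with the matching codec gives the original text back.
round_trip = utf8_bytes.decode("utf-8")
```

The same text yields different byte sequences depending on the encoding, which is why "unicode" and "utf-8" are not interchangeable terms.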
>
> Thanks for clearing out my confusion.
You're welcome.
> while confused():
>     print "unicode is not utf-8!!!"
Let's hope confused() is True only for a short time, otherwise you'll end up
with quite a lot of output...
>> Do encode the unicode object in utf-8, and pass that to psycopg. If
>> you set client_encoding to latin1, you have to encode the unicode object
>> to latin1 instead.
>
> I suppose I won't notice much of that until I read from the DB (which is
> done in PHP mostly), as the data inserted is already an ascii string by
> itself (with escaped utf-8 characters, though). I'll worry about that
> later ;)
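As a rough sketch of the advice quoted above (encode to whatever client_encoding is set to), assuming Python 3: the helper `encode_for_client` and the mapping table are made up for illustration and are not part of psycopg. Note that the encoded result is not ASCII as soon as it contains a non-ASCII character.

```python
# Hypothetical mapping from PostgreSQL client_encoding names to Python codecs.
PG_TO_PYTHON_CODEC = {
    "UTF8": "utf-8",
    "UNICODE": "utf-8",   # older name for UTF8 in PostgreSQL
    "LATIN1": "latin-1",
}

def encode_for_client(text, client_encoding):
    """Encode text to the byte form matching the given client_encoding."""
    codec = PG_TO_PYTHON_CODEC[client_encoding.upper()]
    return text.encode(codec)

name = "J\u00fcrgen"                              # "Jürgen"
as_utf8 = encode_for_client(name, "utf8")
as_latin1 = encode_for_client(name, "latin1")

# The utf-8 bytes are not valid ASCII once a non-ASCII character is present.
try:
    as_utf8.decode("ascii")
    is_ascii = True
except UnicodeDecodeError:
    is_ascii = False
```

The point is only that the bytes handed to the database have to match the encoding the connection announces.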
Well, AFAIK PHP doesn't care about unicode - all it knows are strings as
byte sequences, plain old C-style. So if you read from the DB, things should
work if you set your HTTP headers correctly _and_ the other parts of your
html page aren't in a different encoding - so make sure that whatever you
type in your editor of choice is saved as utf-8.
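Here is a small sketch of what goes wrong when the declared charset and the actual bytes disagree. It assumes Python 3 and uses a made-up sample word; the decode step stands in for what a browser does with the charset from the HTTP header.

```python
# The bytes of a page are the same either way; only the interpretation differs.
page_text = "sch\u00f6n"                # "schön"
body = page_text.encode("utf-8")        # the bytes actually sent

right = body.decode("utf-8")            # header says utf-8: text is intact
wrong = body.decode("latin-1")          # header says latin-1: garbage appears

print(right)
print(wrong)
```

With the wrong charset, each two-byte utf-8 sequence shows up as two separate latin-1 characters, which is the familiar garbled-umlaut effect.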
--
Regards,
Diez B. Roggisch