[Tutor] Python and unicode

Ferry Dave Jäckel dave.jaeckel at arcor.de
Fri Mar 10 08:55:35 CET 2006


Hello list,

I am trying hard to understand Python's unicode support, but I don't really
get it.

What I thought about this until yesterday :)
If I save my script in a unicode encoding and put the magic
# -*- coding: utf-8 -*- line at its start, I can just use unicode everywhere
without problems.
When reading strings in other encodings, I have to decode them, specifying
their source encoding, and when writing them in other encodings I have to
encode them, giving the target encoding.
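
A minimal sketch of that decode-on-input, encode-on-output model. It is
written in Python 3 syntax (on Python 2.4 the text type is called unicode and
the byte type str), and the sample bytes and the latin-1 source encoding are
assumptions chosen only for illustration:

```python
# Bytes as they might arrive from a latin-1 encoded source (assumed sample data).
raw = b"J\xe4ckel"

# Input boundary: decode bytes into text, naming the source encoding.
text = raw.decode("latin-1")

# Inside the program, work only with the decoded text.
assert len(text) == 6

# Output boundary: encode text into bytes, naming the target encoding.
out = text.encode("utf-8")
```

The coding line at the top of a script only tells the interpreter how to read
the source file itself; it does not change how data read or written at run
time is decoded or encoded.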

But I have problems printing my strings with print >> sys.stderr,
mystring. I get "ASCII codec encoding errors". I'm on Linux with Python 2.4.
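
One likely cause (an assumption, since no traceback is shown here) is that the
text gets encoded with the ASCII codec on its way to the stream. The sketch
below reproduces that kind of failure and shows one possible workaround:
wrapping the byte stream in a writer with an explicit encoding. It is written
for Python 3, with an in-memory buffer standing in for the real stream; the
codecs.getwriter call itself also exists in Python 2.

```python
import codecs
import io

text = "J\u00e4ckel"  # assumed sample string containing a non-ASCII character

# Encoding to ASCII fails as soon as a non-ASCII character appears.
try:
    text.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# Workaround sketch: wrap a byte stream in a writer that encodes as UTF-8.
# The BytesIO buffer is a stand-in for a real byte stream such as stderr.
buf = io.BytesIO()
writer = codecs.getwriter("utf-8")(buf)
writer.write(text + "\n")
written = buf.getvalue()
```

Whether UTF-8 is the right choice for debug output depends on what the
terminal expects, so the encoding name here is a guess, not a recommendation.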

The programming problem where I stumbled over this:
I have an XML file from OO.org Writer (encoded in utf-8), and I parse it
with sax, getting some values from it. This data should go into a MySQL db
(as utf-8, too). I think this works quite well, but debug printing gives
these errors.
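
A reduced sketch of that pipeline, in Python 3 syntax. The handler class name,
the tiny XML document, and the final encode step (a stand-in for whatever the
database layer does) are all hypothetical and only illustrate where text and
bytes meet:

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that collects all character data as text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser hands over decoded text, not raw UTF-8 bytes.
        self.chunks.append(content)


# Assumed sample input: a small UTF-8 encoded XML document.
xml_bytes = '<?xml version="1.0" encoding="utf-8"?><p>J\u00e4ckel</p>'.encode("utf-8")

handler = TextCollector()
xml.sax.parseString(xml_bytes, handler)
value = "".join(handler.chunks)

# Stand-in for the output boundary: encode explicitly when bytes are needed.
db_bytes = value.encode("utf-8")
```

The parser may deliver character data in several pieces, which is why the
handler joins the chunks instead of keeping only the last one.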

What is the right way to handle unicode and possibly different encodings in
Python?
What encoding should be put into the header of the file, and when should I
use the string encode and decode methods? Are there modules (maybe sax) which
require special treatment because they lack full unicode support?
In general I'd like to keep all strings as unicode in utf-8, and just
convert strings from/to other encodings upon input/output.

Regards,
	Dave

-- 
If you're using anything besides US-ASCII, I *stringly* suggest Python 2.0.
      -- Uche Ogbuji (A fortuitous typo?), 29 Jan 2001