codecs, Swedish characters, and XML...don't mix? (repost)

Andrew Kuchling akuchlin at mems-exchange.org
Thu May 10 17:14:44 EDT 2001


Michael Hammill <mike at pdc.kth.se> writes:
>      gg = open(outout, 'w')
>      gg.write(dom.toxml())

Is the error produced inside dom.toxml(), or by the .write()?  Judging
by the traceback, it comes from the write(), which is reasonable:
dom.toxml() returns a Unicode string, and you can't write that
straight out to a regular file when it contains characters above 127.
You could verify this by splitting that line in two:

    s = dom.toxml()
    gg.write(s)
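
A minimal sketch of that check, written for current Python 3 rather
than the Python of this thread.  The sample document and the temporary
path are made up for illustration, and the file is opened with
encoding="ascii" as an assumption, to stand in for the implicit ASCII
conversion a regular file applied to Unicode strings.  If the
diagnosis is right, toxml() succeeds and only write() raises.

```python
import os
import tempfile
from xml.dom.minidom import parseString

# Hypothetical sample document; \u00f6 is the Swedish letter o-umlaut.
dom = parseString("<stad>G\u00f6teborg</stad>")

# Step 1: serialise.  This succeeds and returns a Unicode string.
s = dom.toxml()

# Step 2: write.  The ASCII-only file is an assumption that mimics a
# "regular file"; any encoding error surfaces here, not in toxml().
path = os.path.join(tempfile.mkdtemp(), "out.xml")
gg = open(path, "w", encoding="ascii")
try:
    gg.write(s)
    failed_in_write = False
except UnicodeEncodeError:
    failed_in_write = True
finally:
    gg.close()

print("toxml() ok, write() failed:", failed_in_write)
```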

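Why not just do 'gg = UTF8_streamwriter( open(outout, 'w') )' instead
of using Python's built-in open() on its own?  (UTF8_streamwriter here
is the stream writer class you get back from codecs.lookup('utf-8').)
The wrapper encodes the Unicode string to UTF-8 as it is written.
Possible bug here: dom.toxml() doesn't specify an encoding in the
<?xml?> declaration, and you may want to put one in.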
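
As a rough sketch of the stream writer idea in current Python 3:
codecs.getwriter('utf-8') is one standard-library way to obtain the
UTF-8 stream writer class (it stands in for UTF8_streamwriter here),
wrapped around a file opened in binary mode.  The sample document and
path are invented.  The second half assumes a minidom version whose
toxml() accepts an encoding argument, which also puts the encoding
into the <?xml?> declaration.

```python
import codecs
import os
import tempfile
from xml.dom.minidom import parseString

# Hypothetical sample document; \u00f6 is the Swedish letter o-umlaut.
dom = parseString("<stad>G\u00f6teborg</stad>")
path = os.path.join(tempfile.mkdtemp(), "out.xml")

# Wrap a binary file in a UTF-8 stream writer: Unicode strings passed
# to gg.write() are encoded to UTF-8 bytes before reaching the file.
gg = codecs.getwriter("utf-8")(open(path, "wb"))
gg.write(dom.toxml())
gg.close()

# Alternative (assumes toxml() supports the encoding argument): get
# encoded bytes whose XML declaration names the encoding.
xml_bytes = dom.toxml(encoding="utf-8")

with open(path, "rb") as f:
    written = f.read()
```

Note that the first variant writes UTF-8 bytes but leaves the
declaration without an encoding, which is still valid since UTF-8 is
the XML default; the second variant states it explicitly.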

--amk



