[XML-SIG] saxutils.XMLGenerator: Output encoding

Carsten Oberscheid oberscheid@doctronic.de
Wed, 25 Sep 2002 09:13:04 +0200


Hello everybody,

I have not followed this list for some time, so this may have been
discussed before: an output encoding can be passed to the
XMLGenerator. All output is then written through saxutils.escape()
using this encoding. As a result, any character in the document that
cannot be represented in the output encoding raises a UnicodeError.
So a single special character in a file can force me to produce
UTF-8 output, although ISO 8859-1 or even ASCII would be much
handier for further processing.
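To illustrate the underlying problem outside of XMLGenerator, here is
a minimal sketch using plain string encoding. The helper name
fits_encoding and the sample text are made up for this example; it
only shows that one character outside the target encoding is enough
to make a strict encode fail.

```python
def fits_encoding(text, encoding):
    """Return True if every character of text can be encoded strictly."""
    try:
        text.encode(encoding)
    except UnicodeError:
        # Raised as soon as one character has no representation
        # in the requested encoding.
        return False
    return True


# Hypothetical sample: ASCII text plus one left curly quote (U+201C).
sample = "plain text with one \u201c curly quote"

print(fits_encoding(sample, "iso-8859-1"))
print(fits_encoding(sample, "utf-8"))
```

The first check fails only because of the single curly quote; the
rest of the text would fit into ISO 8859-1 or even ASCII.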

An alternative would be to catch the exception and, in response,
encode the offending characters as numeric character references
(e.g. "&#8220;"). Shouldn't this be the XML way to do it?
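A rough sketch of the fallback idea follows. This is not the patch
for saxutils.py; the function name encode_with_charrefs is
hypothetical, and the character-by-character approach assumes a
simple stateless encoding such as ASCII or ISO 8859-1.

```python
def encode_with_charrefs(text, encoding="iso-8859-1"):
    """Encode text, writing unencodable characters as &#N; references.

    Sketch only: encodes one character at a time, so it is meant for
    stateless encodings such as ASCII or ISO 8859-1.
    """
    chunks = []
    for ch in text:
        try:
            chunks.append(ch.encode(encoding))
        except UnicodeError:
            # Fall back to a decimal numeric character reference.
            chunks.append(("&#%d;" % ord(ch)).encode("ascii"))
    return b"".join(chunks)


print(encode_with_charrefs("\u201cquoted\u201d", "ascii"))
```

Newer Python versions ship an "xmlcharrefreplace" error handler for
str.encode() that should give the same result for cases like this,
e.g. text.encode("ascii", "xmlcharrefreplace").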

I can provide a very primitive patch for saxutils.py if anybody is
interested. I would even try to make it less primitive if there are
no objections to taking this fix into the distribution :^)

Thanks for your feedback

.co.

-- 
carsten oberscheid                  d  o  c  t  r  o  n  i  c
email oberscheid@doctronic.de       information publishing + retrieval
phone +49 2222 9292 90              http://www.doctronic.de