[XML-SIG] "encoding" argument to xml.dom.minidom.toxml()?

Stefan Behnel stefan_ml at behnel.de
Tue Jun 10 21:33:31 CEST 2008


Hi,

Bill Janssen wrote:
> I figured that the only point of having an encoding argument would be
> to allow the user to control the output character set encoding, but it
> turns out that specifying an encoding of, say, "ASCII", doesn't do
> that.  It just raises encoding exceptions when you attempt to encode a
> non-ASCII character.

Well, what did you expect? That it magically transmogrifies your non-ASCII
data into plain ASCII data?


> What's the point of having an encoding argument
> when it always has to be "UTF-8"?

Did you try any other encoding besides "ASCII"?
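
A minimal sketch of that experiment, assuming a current Python 3 interpreter (where toxml() with an encoding returns bytes); the sample document is made up for illustration:

```python
from xml.dom import minidom

# Hypothetical sample document containing one non-ASCII character (U+00E9).
doc = minidom.parseString("<p>caf\u00e9</p>")

# ISO-8859-1 can represent U+00E9 directly, so this serialises without error
# and the XML declaration names the requested encoding.
data = doc.toxml(encoding="iso-8859-1")
print(data)
```

Any encoding that covers the characters in the document should behave the same way; only characters outside the target encoding are a problem.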


> Especially since it seems that this could be made useful by changing
> one line of code.  In xml/dom/minidom.py, in the class Node, in the
> method "toprettyxml", change the line
> 
>     writer = codecs.lookup(encoding)[3](writer)
> 
> to
> 
>     writer = codecs.lookup(encoding)[3](writer, "xmlcharrefreplace")

Could be done, yes. ElementTree and lxml do it that way. It's not required,
though. If you say you want to serialise plain ASCII data, nothing keeps an
XML serialiser from shouting at you when it finds non-ASCII data. Same for
Latin-1 data or Cyrillic data, or ...
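
To illustrate the difference between the two behaviours, here is a rough sketch using the codecs module directly rather than minidom itself. It assumes Python 3; the helper name encode_with and the sample text are made up for the example, and codecs.getwriter(encoding) is the same stream writer class that codecs.lookup(encoding)[3] returns.

```python
import codecs
import io

text = "caf\u00e9"  # hypothetical sample text with one non-ASCII character


def encode_with(encoding, errors):
    """Write `text` through a codec stream writer and return the bytes."""
    buf = io.BytesIO()
    writer = codecs.getwriter(encoding)(buf, errors)
    writer.write(text)
    return buf.getvalue()


# Default ("strict") error handling: the writer refuses the character.
try:
    encode_with("ascii", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# "xmlcharrefreplace": unencodable characters become numeric character
# references, which an XML parser reads back as the original characters.
replaced = encode_with("ascii", "xmlcharrefreplace")

print(strict_failed)
print(replaced)
```

Both results are legitimate choices for a serialiser; the second is simply more forgiving.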

Stefan


