[XML-SIG] Newbie : Identifying characters that will choke XML parser

Thomas B. Passin tpassin@comcast.net
Wed, 07 May 2003 02:10:51 -0400


Ian,

You need to put a unicode character in there to begin with:

docNode.setAttributeNS(None,'a',unicode('\xb4','iso-8859-1'))

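chr(xxx) does not do this for you; it returns a plain 8-bit string, not a
unicode object. unichr(180) would give you the same unicode character.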
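
For anyone trying this on a current interpreter, here is a minimal sketch of
the same idea. It assumes Python 3, which postdates this thread and has no
unicode() builtin, so the byte is decoded explicitly instead. The names doc,
node and acute are only illustrative.

```python
from xml.dom.minidom import parseString

doc = parseString('<test/>')
node = doc.documentElement

# Decode the Latin-1 byte 0xB4 into a one-character text string first;
# this plays the role of unicode('\xb4', 'iso-8859-1') in Python 2.
acute = b'\xb4'.decode('iso-8859-1')
node.setAttributeNS(None, 'a', acute)

# With an encoding argument, toxml() returns encoded bytes on Python 3.
source = doc.toxml('iso-8859-1')
print(source)
```

Because iso-8859-1 can represent the character directly, the serializer
writes it as the single byte 0xB4 rather than as a numeric reference. A
parser treats the two forms as the same attribute value.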

Cheers,

Tom P

[Ian Sparks]

Thank you, James & John. Your solutions allow me to filter out what should be
marked as "bad" characters.

However, I'm having real problems with character conversions. I'm building
an XML document using minidom and setAttributeNS().

I want to be able to do something like:

from xml.dom.minidom import parseString

doc1 = parseString('<test/>')
docNode = doc1.childNodes[0]
docNode.setAttributeNS(None,'a',chr(180))
source = doc1.toxml('iso-8859-1')

and have source contain:

<?xml version="1.0" encoding="iso-8859-1" ?>
<test a="&#180;"/>

without getting UnicodeErrors from codecs.py on toxml() and without ending
up with:

<?xml version="1.0" encoding="iso-8859-1" ?>
<test a="&amp;#180;"/>
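
The doubly escaped form above is presumably what comes out when the
reference is typed into the attribute value as literal text, since the
serializer always escapes '&'. The sketch below contrasts that with passing
the real character. It assumes Python 3, where toxml() with an encoding
replaces unencodable characters by numeric references; that behaviour is not
claimed for the 2003 libraries discussed in this thread. serialize() is a
hypothetical helper written for this illustration.

```python
from xml.dom.minidom import parseString

def serialize(value, encoding):
    # Hypothetical helper: build <test a="..."/> and serialize it to bytes.
    doc = parseString('<test/>')
    doc.documentElement.setAttributeNS(None, 'a', value)
    return doc.toxml(encoding)

# The reference typed as text: '&' is escaped, giving &amp;#180;
escaped = serialize('&#180;', 'iso-8859-1')

# The real character with a target encoding that cannot represent it:
# on Python 3 the serializer falls back to the reference &#180;
ref = serialize('\xb4', 'us-ascii')

print(escaped)
print(ref)
```

The point of the contrast is that the DOM should hold the character itself;
whether it appears as a raw byte or as a reference is then a matter for the
serializer and the chosen output encoding.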