[XML-SIG] Removing Whitespace

Alexandre Fayolle Alexandre.Fayolle@logilab.fr
Wed, 23 Apr 2003 13:55:18 +0200


On Wed, Apr 23, 2003 at 02:40:57PM +0300, Tonguç Yumruk wrote:
> Hi,
> 
> I'm new to Python and XML processing, and I'm having trouble with
> whitespace. I use xml.dom to process XML, with Sax2.Reader() from
> xml.dom.ext.reader. I don't want whitespace to be interpreted as text
> nodes. Although the Reader() class has some options such as keepAllWs,
> I don't think they really do what I need.

To have SAX handle whitespace for you, you need:
 * a validating parser
 * a DTD for your document

On Debian, install the python-xml package to get a validating parser
(python-xmlbase does not include one).

With this package, you can get a validating parser with the following
code:

from xml.sax.sax2exts import XMLValParserFactory

# make_parser() returns a validating SAX2 parser
parser = XMLValParserFactory.make_parser()

Then attach your handlers as usual. Ignorable whitespace should then be
reported to the ContentHandler.ignorableWhitespace() method, if you
define one, rather than to the characters() method.
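
To make the split between the two callbacks concrete, here is a minimal
sketch of such a handler. The class name WhitespaceAwareHandler and the
SAMPLE document are made up for illustration. The sketch is driven by the
standard library parser (expat) only so that it runs without extra
packages; that parser does not validate, so it is not a substitute for the
validating parser described above.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class WhitespaceAwareHandler(ContentHandler):
    """Hypothetical handler that keeps text and ignorable whitespace apart."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.text = []        # chunks reported through characters()
        self.ignorable = []   # chunks reported through ignorableWhitespace()

    def characters(self, content):
        self.text.append(content)

    def ignorableWhitespace(self, whitespace):
        # A validating parser that has read the DTD reports whitespace
        # found in element content here instead of in characters().
        self.ignorable.append(whitespace)


# Illustrative document (an assumption, not taken from the original post).
SAMPLE = b"<doc>\n  <item>hello</item>\n</doc>"

handler = WhitespaceAwareHandler()
# Assumption: the stdlib expat parser is used here for convenience. It does
# not validate, so the indentation still arrives through characters().
xml.sax.parseString(SAMPLE, handler)
```

After this run, handler.ignorable stays empty and the indentation is mixed
into handler.text. With a validating parser and a DTD that declares doc as
containing only elements, the same handler should receive the indentation
through ignorableWhitespace() instead.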

-- 
Alexandre Fayolle
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Advanced software development - Artificial Intelligence - Training