[XML-SIG] Re: [4S-0.10.1beta2] problem with validating parser

Martin v. Loewis martin@mira.cs.tu-berlin.de
Sun, 14 Jan 2001 01:40:23 +0100


> I'm not sure if this is a 4Suite bug or an xmlproc bug. Attempting to
> generate a DOM with validate set to 1 fails.

It's a bug in 4DOM, although xmlproc could be more robust (as Lars
Marius already admitted).

The problem is indeed that the XmlDomGenerator produces an index error
in the line

        old_nss, del_nss = self._namespaceStack[-1]

At that point, nothing is on the namespace stack. The reason is that
xmlproc uses the namespace interface of the content handler by
default, i.e. it calls startElementNS and endElementNS.

Now, while startElement of the XmlDomGenerator extends
_namespaceStack, startElementNS doesn't. However, endElementNS invokes
endElement, which tries to remove things from the namespace stack.
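
The mismatch can be reproduced in isolation. The following is only a
minimal sketch: StackSketch and replay_namespace_events are
hypothetical names, and the method bodies are reduced to the stack
handling described above, not copied from 4DOM.

```python
class StackSketch:
    """Hypothetical stand-in for the handler; not 4DOM's actual code."""

    def __init__(self):
        self._namespaceStack = []

    def startElement(self, name, attrs):
        # The non-namespace callback pushes one frame per element.
        self._namespaceStack.append(({}, []))

    def endElement(self, name):
        # Expects the frame that startElement pushed.
        old_nss, del_nss = self._namespaceStack[-1]
        del self._namespaceStack[-1]

    def startElementNS(self, name, qname, attrs):
        # Pushes nothing, mirroring the behaviour described above.
        pass

    def endElementNS(self, name, qname):
        # Delegates to endElement, which then reads an empty stack.
        self.endElement(qname)


def replay_namespace_events(handler):
    """Replay the calls a namespace-aware parser would make for <doc/>."""
    try:
        handler.startElementNS((None, "doc"), "doc", {})
        handler.endElementNS((None, "doc"), "doc")
    except IndexError:
        return "IndexError"
    return "ok"
```

Driving the same object through startElement/endElement instead leaves
the stack balanced, which is why the failure only shows up with a
parser that uses the namespace callbacks.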

If the XmlDomGenerator was designed to always do its own namespace
processing, I suggest that this is explicitly requested from the SAX
parser, by setting xml.sax.handler.feature_namespaces to 0. Then, the
SAX parser *should* never invoke startElementNS or endElementNS; those
methods might be implemented to raise AssertionError just to make sure
they aren't called.
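
As a rough illustration of that suggestion, here is a sketch using the
standard library's default SAX parser rather than xmlproc (an
assumption made so the example is self-contained). NonNamespaceHandler
and parse_without_namespaces are hypothetical names, not part of 4DOM.

```python
import io
import xml.sax
from xml.sax import handler


class NonNamespaceHandler(handler.ContentHandler):
    """Hypothetical handler that only supports the non-namespace API."""

    def __init__(self):
        handler.ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def startElementNS(self, name, qname, attrs):
        # Should be unreachable once feature_namespaces is off.
        raise AssertionError("startElementNS called with namespaces off")

    def endElementNS(self, name, qname):
        raise AssertionError("endElementNS called with namespaces off")


def parse_without_namespaces(data):
    """Parse a bytes document with namespace processing switched off."""
    parser = xml.sax.make_parser()
    # Explicitly request the non-namespace callbacks from the parser.
    parser.setFeature(handler.feature_namespaces, 0)
    content_handler = NonNamespaceHandler()
    parser.setContentHandler(content_handler)
    parser.parse(io.BytesIO(data))
    return content_handler.events
```

With the feature set to 0, a conforming parser reports elements only
through startElement and endElement, so the assertions act purely as a
safety net.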

IOW, the quick fix for this bug is to patch

--- Sax2.py.orig	Sun Jan 14 01:07:31 2001
+++ Sax2.py	Sun Jan 14 01:08:08 2001
@@ -264,6 +264,7 @@
     def __init__(self, validate=0, keepAllWs=0, catName=None,
                  saxHandlerClass=XmlDomGenerator, parser=None):
         self.parser = parser or (validate and sax2exts.XMLValParserFactory.make_parser()) or sax2exts.XMLParserFactory.make_parser()
+        self.parser.setFeature(handler.feature_namespaces, 0)
         if catName:
             #set up the catalog, if there is one
             from xml.parsers.xmlproc import catalog

into 4DOM.

Regards,
Martin

P.S. As for xmlproc catching IndexErrors, it appears that the only
possible cause for an index error inside do_parse is the assignment to
t.

So why would it hurt to write 

                    try:
                        t = self.data[self.pos+1] # Optimization
                    except IndexError:
                        raise OutOfDataException()

and to remove the outer IndexError handler? AFAICT, it only costs a
SETUP_EXCEPT/POP_BLOCK pair, which are quite cheap (a function call,
and storing a few variables, no memory allocation).
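
For reference, the narrow try/except can be sketched outside xmlproc
as follows. ScannerSketch and peek_next are hypothetical, and
OutOfDataException is redefined locally as a stand-in for xmlproc's
exception of the same name; only the lookup itself is guarded, so an
IndexError raised anywhere else would no longer be translated.

```python
class OutOfDataException(Exception):
    """Local stand-in for xmlproc's exception of the same name."""


class ScannerSketch:
    """Hypothetical holder of a data buffer and a read position."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def peek_next(self):
        # Guard only the one lookup that can run past the buffer end.
        try:
            return self.data[self.pos + 1]
        except IndexError:
            raise OutOfDataException()
```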