[XML-SIG] Error using xml.sax.{xmlreader,expatreader}

Mike Orr iron@mso.oz.net
Mon, 27 Nov 2000 21:53:45 -0800


On Mon, Nov 27, 2000 at 10:44:16AM +0100, Martin v. Loewis wrote:
> This is a bug in your code:
> 
> > 	def characters(self, ch, start, length):
> 
> In SAX2, the characters method only has a single argument (besides self),
> so it should be
> 
>     def characters(self, chars):
>         if not self.in_field:
>             raise DataError("characters outside a field: %s" % `chars`)
>         self.chars.append(chars)

That was it.
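
For the archive, here is a minimal sketch of a handler using the one-argument
SAX2 signature.  It is written against the xml.sax module in a current
Python 3 standard library rather than the 2000-era code discussed in this
thread.  FieldHandler, the "field" element name and the fields list are
made up for illustration; DataError, in_field and chars follow the quoted
snippet.  Skipping whitespace-only text outside a field is an added
assumption, not something the quoted code does.

```python
import xml.sax


class DataError(Exception):
    """Raised when character data appears outside a <field> element."""


class FieldHandler(xml.sax.ContentHandler):
    """Hypothetical handler that collects the text of each <field>."""

    def __init__(self):
        super().__init__()
        self.in_field = False
        self.chars = []      # text pieces of the field being read
        self.fields = []     # completed field values

    def startElement(self, name, attrs):
        if name == "field":
            self.in_field = True
            self.chars = []

    def endElement(self, name):
        if name == "field":
            self.fields.append("".join(self.chars))
            self.in_field = False

    # SAX2: one argument besides self, not (ch, start, length) as in SAX1.
    def characters(self, content):
        if not self.in_field:
            # Assumption: whitespace between elements is harmless.
            if content.strip():
                raise DataError("characters outside a field: %r" % content)
            return
        # The parser may deliver one text run in several calls.
        self.chars.append(content)


handler = FieldHandler()
xml.sax.parseString(
    b"<record><field>abc</field> <field>d&amp;e</field></record>", handler)
print(handler.fields)
```

The pieces are joined in endElement because a parser is allowed to split a
single run of text across several characters() calls.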

THANKS for your help.  Out of five or six questions I've posted to various
mailing lists in the past year, this is the only one that has received a
response.  You have renewed my faith in technical mailing lists.

* * * * *
I know the Python XML documentation is being worked on, but it's really confusing
right now.  As has been mentioned, the Python 2.0 Library reference says nothing
about DOM even though it exists.  The XML HOWTO was apparently written for SAX1;
I had to guess several method names from the source now that parseFile and
setDocumentHandler don't exist (replaced by parse and setContentHandler, 
apparently), and use grep to find others (e.g., ErrorRaiser).  The Builder class was
dropped from DOM for no reason I could find, and the only other XML writer,
xml.sax.saxutils.XMLGenerator, is undocumented.  

Are there plans to document XMLGenerator?  I was going to use it, but I would
have to subclass it for comments and indentation, so I just decided to write
the tags out myself.  One reason I used DOM instead of SAX for a previous
project was that DOM supported both reading and writing (via Builder), whereas
SAX (seemingly) supported only reading XML.
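
For anyone finding this in the archive: a rough sketch of writing XML with
XMLGenerator by calling its ContentHandler methods directly, assuming the
xml.sax.saxutils that ships with a current Python 3 (the 2000-era class may
differ).  write_record and the record/field element names are invented for
the example.  Indentation is done here by passing whitespace through
ignorableWhitespace(), which is one possible approach, not necessarily the
intended one.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def write_record(fields):
    """Serialize (name, value) pairs as a small XML document (hypothetical)."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()                      # writes the XML declaration
    gen.startElement("record", AttributesImpl({}))
    for name, value in fields:
        gen.ignorableWhitespace("\n  ")      # manual indentation
        gen.startElement("field", AttributesImpl({"name": name}))
        gen.characters(value)                # &, < and > are escaped here
        gen.endElement("field")
    gen.ignorableWhitespace("\n")
    gen.endElement("record")
    gen.endDocument()
    return out.getvalue()


print(write_record([("title", "a < b"), ("lang", "eo")]))
```

Because XMLGenerator is itself a content handler, it can also be handed to a
parser to echo a document back out.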

* * * * *
> There is a problem with the traceback, though - since characters is
> called from C code, and since that call happens to fail, the topmost
> Python stack frame is the one where expatreader calls into the C code.
> 
> Any hints for improving the error reporting are
> appreciated. Delegating the calls from pyexpat to a method in
> expatreader first is not acceptable, though, due to the expected
> overhead of an additional call in the normal case.

Since the last function shown in the traceback clearly expected (and got) two
arguments, I figured the actual error was inside a C function it called, one
that wasn't being printed.  

Is there a place to wrap the call (or even the entire parse method) in a
try...except block?  Then you could add an additional line to the exception
message (whatever it is), saying:

(The error may be in your callback method "MyContentHandler.characters",
which is called from "pyexpat.whatever".  Since the latter is written in C,
it cannot be represented in the traceback.)

Of course, just use the class name (self.doc_handler.__class__.__name__) if
it's too inconvenient to figure out the exact method, and say "If the latter is
written in C" if it's too inconvenient to determine that fact.

Since try...except doesn't slow anything down when there's no error, this
should not have a performance impact in the normal case.
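
A rough sketch of the idea, done from the caller's side rather than inside
expatreader, and assuming the xml.sax of a current Python 3.  parse_with_hint
and BrokenHandler are hypothetical names, and wrapping the error in
RuntimeError is just one way to attach the extra message.

```python
import xml.sax


def parse_with_hint(data, handler):
    """Parse data, adding a hint when a non-SAX error escapes the parser."""
    try:
        xml.sax.parseString(data, handler)
    except xml.sax.SAXException:
        raise                     # genuine XML/SAX errors pass through as-is
    except Exception as exc:
        hint = (
            "%s: %s\n"
            "(The error may be in a callback method of %s, which is called "
            "from the parser.  If the latter is written in C, it cannot be "
            "represented in the traceback.)"
            % (type(exc).__name__, exc, handler.__class__.__name__)
        )
        # Chain the original exception so its traceback is still available.
        raise RuntimeError(hint) from exc


class BrokenHandler(xml.sax.ContentHandler):
    # SAX1-style signature: the parser calls this with a single argument.
    def characters(self, ch, start, length):
        pass


try:
    parse_with_hint(b"<a>text</a>", BrokenHandler())
except RuntimeError as err:
    print(err)
```

The try block costs essentially nothing when no exception is raised, so the
normal parsing path is unaffected.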

* * * * *
Hmm, now I see xml.sax.saxlib.Parser.setDocumentHandler().  So why did my
parser require setContentHandler() instead of setDocumentHandler()?

-- 
-Mike (Iron) Orr, iron@mso.oz.net  (if mail problems: mso@jimpick.com)
   http://mso.oz.net/     English * Esperanto * Russkiy * Deutsch * Espan~ol