[XML-SIG] Error handling in PyExpat

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Wed, 21 Mar 2001 17:24:41 +0100


> Sorry - I missed it somehow.  ErrorLineNumber gave me numbers
> outside the document - probably because I called it only after
> parsing, but ErrorByteIndex has the right value, at least before I
> raise the exception.  The values will be incorrect in the exception
> handler because the parser continues, I guess.  Probably the parsing
> will continue, but my handlers will not be called anymore because
> PyExpat (not Expat itself) knows about the exception?

All correct, AFAICT.

> OK so there is no way to stop Expat when things go south in the C level
> handler (they could have defined handlers int instead of void and stopped
> parsing when somebody returned -1 ...).

It looks that way. You may want to report that as a bug at
sourceforge.net/projects/expat.

> PS: if you would stop calling handlers after a handler has raised an
> exception, you could freeze ErrorLine, ErrorColumn and ErrorByteIndex
> to the values they had when the (Python) handler returned to you.
> But it seems you don't stop calling handlers.

All handlers are cleared in case of an error, so expat should not call
anything anymore. It will still continue to operate until it runs out
of data, or gets to the end of the document, or finds an XML error.

Freezing the error location would be an option, but might not do what
you expect - it would freeze the location of the last error that expat
found, which is not necessarily related to what the application
considers an error. 

If the real problem was a division by zero, or a NameError because of
a typo in the callback - should that propagate into the state of the
expat object?

What you should do is record the current position in the exception
object. It appears that pyexpat does not support retrieving the
*Current* information - any patch in that respect would be appreciated
(*).
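
To illustrate recording the position in the exception object, here is
a minimal sketch. It assumes a pyexpat that exposes CurrentLineNumber,
CurrentColumnNumber and CurrentByteIndex on the parser object (later
releases do; the version discussed here apparently does not). The
names HandlerError and parse_and_catch, and the "bad" element, are
made up for the example.

```python
import xml.parsers.expat


class HandlerError(Exception):
    """Hypothetical application error that remembers where parsing was."""

    def __init__(self, message, line, column, byte_index):
        super().__init__(message)
        self.line = line
        self.column = column
        self.byte_index = byte_index


def parse_and_catch(data):
    """Parse data; return the HandlerError raised by a handler, or None."""
    parser = xml.parsers.expat.ParserCreate()

    def start_element(name, attrs):
        if name == "bad":  # stand-in for whatever the application rejects
            # Snapshot the position inside the handler; by the time the
            # exception is caught, the parser's values may have moved on.
            raise HandlerError(
                "unexpected element %r" % name,
                parser.CurrentLineNumber,
                parser.CurrentColumnNumber,
                parser.CurrentByteIndex,
            )

    parser.StartElementHandler = start_element
    try:
        parser.Parse(data, True)
    except HandlerError as exc:
        return exc
    return None


if __name__ == "__main__":
    err = parse_and_catch("<root>\n  <bad/>\n</root>")
    print(err, err.line, err.column, err.byte_index)
```

The point of the design is that the location travels with the
application's own exception, so it does not depend on what the expat
object reports after the handler has returned.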

Regards,
Martin

(*) I don't know *why* it does not expose XML_GetCurrentLineNumber
etc.; perhaps earlier versions did not support it? That might need some
investigation.