[XML-SIG] parsing xml schema

Alan Kennedy pyxml@xhaus.com
Fri, 23 Nov 2001 18:30:19 +0000


"Martin v. Loewis" wrote:

> There is another option: parse the document only once using
> expat. More precisely, register a set of handlers with expat that
> feeds both the trex parsing, and a DOM builder
> (i.e. xml.dom.ext.readers.PyExpat); alternatively, feed both pytrex
> and expatreader, and use the resulting SAX events to build a DOM tree.

Martin,

Of course! Multiple handlers for expat.
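
A rough sketch of the fan-out idea, assuming each consumer exposes start/end/text methods. Those method names and the two toy consumer classes are made up for illustration; they are not the pytrex or PyExpat reader interfaces.

```python
import xml.parsers.expat


def parse_with_consumers(data, consumers):
    """Parse `data` once, passing every expat callback to each consumer."""
    parser = xml.parsers.expat.ParserCreate()

    def fan_out(method_name):
        # Build one expat handler that forwards its arguments to all consumers.
        def handler(*args):
            for consumer in consumers:
                getattr(consumer, method_name)(*args)
        return handler

    parser.StartElementHandler = fan_out("start")
    parser.EndElementHandler = fan_out("end")
    parser.CharacterDataHandler = fan_out("text")
    parser.Parse(data, True)


class ElementCounter:
    """Toy consumer (hypothetical): counts start tags."""

    def __init__(self):
        self.count = 0

    def start(self, name, attrs):
        self.count += 1

    def end(self, name):
        pass

    def text(self, data):
        pass


class TextCollector:
    """Toy consumer (hypothetical): gathers character data."""

    def __init__(self):
        self.chunks = []

    def start(self, name, attrs):
        pass

    def end(self, name):
        pass

    def text(self, data):
        self.chunks.append(data)
```

In the real case one consumer would be the validator and the other the DOM builder; both see the same single pass over the input.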

I might also look into buffering the second SAX event stream, so that DOM
construction can be deferred until the input is confirmed valid. The
overhead of filling the buffer should be less than that of constructing
a DOM which might then be discarded. Whether that is a net win really
depends on how frequently I expect to receive invalid documents.
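
Roughly what I mean by deferring, as a sketch only. RootNameChecker is a made-up stand-in for a real validator such as pytrex, and ValidationError, build_dom and parse_deferred are names invented here; the DOM is built with plain xml.dom.minidom rather than the PyXML readers.

```python
import xml.parsers.expat
from xml.dom import minidom


class ValidationError(Exception):
    pass


class RootNameChecker:
    """Hypothetical validator: only checks the name of the root element."""

    def __init__(self, expected_root):
        self.expected_root = expected_root
        self.depth = 0

    def start(self, name, attrs):
        if self.depth == 0 and name != self.expected_root:
            raise ValidationError("unexpected root element: %s" % name)
        self.depth += 1

    def end(self, name):
        self.depth -= 1


def build_dom(events):
    """Replay buffered events into a minidom Document."""
    doc = minidom.Document()
    node = doc
    for event in events:
        if event[0] == "start":
            element = doc.createElement(event[1])
            for key, value in event[2].items():
                element.setAttribute(key, value)
            node.appendChild(element)
            node = element
        elif event[0] == "end":
            node = node.parentNode
        else:
            node.appendChild(doc.createTextNode(event[1]))
    return doc


def parse_deferred(data, checker):
    """Validate and buffer in one pass; build the DOM only on success."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in one call

    def on_start(name, attrs):
        checker.start(name, attrs)  # raises before anything is built
        events.append(("start", name, dict(attrs)))

    def on_end(name):
        checker.end(name)
        events.append(("end", name))

    def on_text(data):
        events.append(("text", data))

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(data, True)  # validator exceptions propagate from here
    return build_dom(events)
```

If the checker raises, the only thing thrown away is a list of tuples, not a half-built tree. Whether the list is really cheaper than the DOM is exactly what needs measuring.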

For my current requirement, where the submitted XML documents will be
written by people, either by hand or with an XML editor, receiving and
checking submissions of XML files will be a (relatively) infrequent
occurrence.

But I can imagine that in a web services SOAP/WSDL situation, where the
XML documents might be coming thick and fast, such "deferred DOM
construction" might result in a considerable speedup.

I must do some timings.

Excellent solution.

Thanks,

Alan.