[Expat-discuss] Document boundaries (was Re: Text data handler)

Thomás Inskip tinskip at widevine.com
Fri May 21 21:27:44 EDT 2004


>>
>
> Right, you need to call XML_ParserReset and then re-register your
> handlers before calling XML_Parse again. You can call XML_Parse as
> many times as you want on a single document for the same parser but
> must re-initialise the parser before starting a new document.
>
>
The thing is that I am implementing a pretty generic 
transaction-oriented communications protocol; requests go in one 
direction, and responses are sent back.  Those transactions are encoded 
as XML.  The transactions go in each direction in blocks of data, which 
may contain multiple transactions, or portions of a transaction.  I'd 
rather not have to pre-parse the stream to figure out where each 
transaction (document) starts and ends before I pass it on to the 
parser.

Is it possible to call XML_ParserReset from within a handler (such as 
an end element handler)?  Probably not a good idea, huh?  If I could, 
then I would just call it when I reach the end of the top-level element 
(document).

What I've done for now is just prime the parser with "<Document>" so 
that all of the transactions are considered to be subelements of 
"Document".  What I worry about is this: if there is some screwy XML in 
the stream, the parser may never recover and I won't be able to parse 
past the error point, rendering any further transactions binary waste.  
How good is Expat at recovering from errors?  I couldn't find any info 
on that.
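
The priming approach described above can be sketched roughly as follows. This is only an illustration, written against Python's standard xml.parsers.expat binding rather than the C API, and the TransactionStream class, its method names, and the sample element names are all hypothetical. It primes the parser with "<Document>", tracks element depth so that each closed child of the wrapper counts as one finished transaction, and treats the first well-formedness error as fatal for that parser object, which is the failure mode in question.

```python
import xml.parsers.expat


class TransactionStream:
    """Hypothetical helper: feed blocks of a multi-transaction stream
    to a single Expat parser primed with a synthetic <Document> root."""

    def __init__(self):
        self.depth = 0        # current element nesting depth
        self.completed = []   # names of fully parsed transactions
        self.failed = False   # set once the parser reports an error
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        # Prime the parser so every transaction becomes a subelement.
        self.parser.Parse("<Document>", False)

    def _start(self, name, attrs):
        self.depth += 1

    def _end(self, name):
        self.depth -= 1
        # Back at depth 1: a direct child of <Document> just closed,
        # so one whole transaction has been seen.
        if self.depth == 1:
            self.completed.append(name)

    def feed(self, block):
        """Parse one block; return False once the stream is unusable."""
        if self.failed:
            return False
        try:
            self.parser.Parse(block, False)
        except xml.parsers.expat.ExpatError:
            # This sketch gives up on the parser at the first error.
            self.failed = True
            return False
        return True

    def finish(self):
        """Close the synthetic root and end the parse."""
        if not self.failed:
            self.parser.Parse("</Document>", True)


stream = TransactionStream()
stream.feed("<Request id='1'><Item/></Request><Resp")  # block ends mid-tag
stream.feed("onse/>")                                   # next block completes it
stream.finish()
print(stream.completed)
```

After finish() the completed list should hold Request followed by Response, even though the second transaction was split across two blocks.  Note that once feed() returns False the sketch makes no attempt to resynchronise; getting past bad input would presumably mean starting a fresh parser and locating the next transaction boundary by some other means.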



