[Expat-discuss] Stopping the parse -- anybody home?
David Crowley
dcrowley@scitegic.com
Wed, 18 Apr 2001 11:27:29 -0700
At 10:38 AM 4/18/2001, Fred L. Drake, Jr. wrote:
>Michael Roberts writes:
> > It did indeed make it to the list and I was kind of hoping somebody would
> > answer it.
>
> Looks like our responses crossed in the mail!
>
> > You might just keep a flag attached to the parse, and skip out of all
> > handlers when it gets set. That's the approach I'd try first.
>
> Here's a (slightly) better approach that we use in the Python
>bindings for Expat: when a Python handler raises an exception, we
>clear all the handlers registered with the parser instance being used.
>This avoids having to check a flag for each callback (which gives us
>more maintainable application code), and can be just a little faster.
I actually tried to respond last weekend, but my mail bounced and I didn't
get back to it. My situation is that I need to break out of a parse
and then continue at a later time. So I set up a wrapper class around my
file that reads it and returns "tokens", where a "token" is everything up to
and including the next ">" character. My loop looks like this:
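bool stopParse = false;   // set by the handler, checked by the read loop
tokenizer t("myfile.xml");
while (1)
{
    // Ask Expat for a buffer and read the next token straight into it.
    void *buffer = XML_GetBuffer(parser, 1024);
    if (buffer == NULL)
        break;            // Expat could not allocate the buffer
    int read = t.readToken(buffer, 1024);
    // A zero-length read means end of file, so that call is the final one.
    if (XML_ParseBuffer(parser, read, read == 0) == XML_STATUS_ERROR)
        break;            // parse error
    if (stopParse || read == 0)
        break;            // handler asked to stop, or end of file
}

void
endElementHandler(...)
{
    // needToStop stands for whatever condition the application checks.
    if (needToStop)
        stopParse = true;
}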
The tokens returned for an XML file of "<foo><bar>data</bar></foo>" are
"<foo>", "<bar>", "data</bar>", and "</foo>". I guess you could also write
the tokenizer to split things up a little more, so that the "data</bar>"
token gets broken apart. But that's the general idea. The Xerces parser kind of does
something like that with the tokens, but I MUCH prefer Expat.
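
For anyone who wants to try this, here is a rough sketch of what such a tokenizer could look like. It is not my actual class: the name Tokenizer, the helper allTokens, and the choice to read from an in-memory string instead of a file are all assumptions made to keep the example self-contained. Only the readToken(buffer, size) shape comes from the loop above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the tokenizer wrapper. It reads from an
// in-memory string (an assumption) rather than from a file.
class Tokenizer {
public:
    explicit Tokenizer(std::string input) : input_(std::move(input)), pos_(0) {}

    // Copies the next token into buffer and returns its length in bytes.
    // A token runs up to and including the next '>' and is cut short if it
    // would not fit in capacity bytes. Returns 0 once the input is used up.
    int readToken(void *buffer, int capacity) {
        assert(buffer != nullptr && capacity > 0);
        const std::size_t remaining = input_.size() - pos_;
        if (remaining == 0)
            return 0;
        const std::size_t end = input_.find('>', pos_);
        std::size_t len = (end == std::string::npos) ? remaining : end - pos_ + 1;
        if (len > static_cast<std::size_t>(capacity))
            len = static_cast<std::size_t>(capacity);
        std::memcpy(buffer, input_.data() + pos_, len);
        pos_ += len;
        return static_cast<int>(len);
    }

private:
    std::string input_;
    std::size_t pos_;
};

// Hypothetical helper: collects every token, mimicking the read side of the
// loop above without involving the parser.
std::vector<std::string> allTokens(const std::string &xml) {
    const int kBufferSize = 1024;
    Tokenizer t(xml);
    std::vector<std::string> tokens;
    char buffer[kBufferSize];
    int read;
    while ((read = t.readToken(buffer, kBufferSize)) > 0)
        tokens.emplace_back(buffer, static_cast<std::size_t>(read));
    return tokens;
}
```

Calling allTokens("<foo><bar>data</bar></foo>") gives the four tokens listed above. A ">" inside an attribute value or a comment would end a token early in this sketch, but that should be harmless, since Expat accepts input split at arbitrary points; it only changes how much gets parsed before the stop flag is checked.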
David