[XML-SIG] Pull Parsing

Lars Marius Garshol larsga@garshol.priv.no
15 May 2000 21:57:26 +0200


* Paul Prescod
|
| Fredrik showed how to turn incremental push parsers into pull parsers.
| Neat.

Agreed. :-)
 
| I must admit that I always presumed that the conversion would be
| done at the SAX level so I couldn't think of a way to do it. The
| incremental API makes all the difference (maybe we should propose an
| incremental extension to SAX).

Python SAX 1.0 already has this, and I've now put this into SAX 2.0 as
an interface IncrementalParser (with feed, close and reset methods),
which parsers _may_ support.
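
As a rough illustration of how feed and close make the push-to-pull
conversion possible, here is a minimal sketch. It assumes present-day
Python, where the standard library's default xml.sax parser supports the
incremental interface. The names TokenCollector and pull_tokens, and the
(kind, data) tuple layout, are made up for this example and are not
part of SAX.

```python
import xml.sax
from collections import deque
from xml.sax.handler import ContentHandler


class TokenCollector(ContentHandler):
    """Queues each SAX event as a (kind, data) tuple instead of acting on it."""

    def __init__(self):
        super().__init__()
        self.tokens = deque()

    def startElement(self, name, attrs):
        self.tokens.append(("start", name))

    def endElement(self, name):
        self.tokens.append(("end", name))

    def characters(self, content):
        self.tokens.append(("text", content))


def pull_tokens(chunks):
    """Feed the parser one chunk at a time and hand queued tokens to the caller."""
    parser = xml.sax.make_parser()
    collector = TokenCollector()
    parser.setContentHandler(collector)
    for chunk in chunks:
        parser.feed(chunk)              # push events into the queue
        while collector.tokens:
            yield collector.tokens.popleft()
    parser.close()                      # signal end of input
    while collector.tokens:
        yield collector.tokens.popleft()


# The application now pulls tokens at its own pace.
tokens = list(pull_tokens(["<doc><item>", "hi</item>", "</doc>"]))
```

The application decides when to ask for the next token, while the
parser underneath is still an ordinary push parser.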
 
| I wonder if a pull style interface is intrinsically easier for the
| average Python programmer to get their heads around. 

I think it is.
 
|  * Pull parsers can always be turned into push parsers trivially

I wouldn't say trivially, but they can be.
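
To make the direction of that conversion concrete, here is a small
sketch that drives a push-style handler from a pull-style event stream.
It assumes present-day Python and uses the standard library's
xml.dom.pulldom as the pull parser. RecordingHandler and push_from_pull
are hypothetical names, and the handler's method names are illustrative
rather than the SAX ones.

```python
from xml.dom import pulldom


class RecordingHandler:
    """Hypothetical push-style handler that just records what it is told."""

    def __init__(self):
        self.log = []

    def start(self, name, attrs):
        self.log.append(("start", name, attrs))

    def end(self, name):
        self.log.append(("end", name))

    def text(self, data):
        self.log.append(("text", data))


def push_from_pull(events, handler):
    """Pull each (event, node) pair and forward it to the matching callback."""
    for event, node in events:
        if event == pulldom.START_ELEMENT:
            handler.start(node.tagName, dict(node.attributes.items()))
        elif event == pulldom.END_ELEMENT:
            handler.end(node.tagName)
        elif event == pulldom.CHARACTERS:
            handler.text(node.data)
        # Other event kinds (document start/end, comments, ...) are ignored here.


handler = RecordingHandler()
push_from_pull(pulldom.parseString('<doc id="1"><item>hi</item></doc>'), handler)
```

The loop itself is short, but a full adapter would also have to map
attributes, namespaces and the remaining event kinds onto the handler
interface, which is where the "not trivially" comes in.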
 

My opinion is that the pull approach is definitely more intuitive for
people to understand, but it is more awkward to use, since it forces
you to do your own dispatch of tokens to token handlers. (I call the
things the parser returns structured tokens.)

For the same reason it has a performance cost: the parser knows what
kind of token it has, and puts this information into the token object,
from where the application must extract it again to do dispatch. With
an event-based approach one jumps directly from the parser to the
application, with no need for special dispatch on token type.
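
The double classification can be seen in a toy sketch. This is not an
XML parser: the hypothetical push_parse and pull_parse functions below
split a string into numbers and words, purely to show where the type
checks happen in each style.

```python
def push_parse(text, on_number, on_word):
    """Push style: classify each item once and call the handler directly."""
    for item in text.split():
        if item.isdigit():
            on_number(int(item))
        else:
            on_word(item)


def pull_parse(text):
    """Pull style: classify each item and record the kind in a token."""
    for item in text.split():
        if item.isdigit():
            yield ("number", int(item))
        else:
            yield ("word", item)


# Push: one decision per item, made inside the parser.
numbers, words = [], []
push_parse("a 1 b 22", numbers.append, words.append)

# Pull: the application must inspect the token kind again to dispatch.
pulled_numbers, pulled_words = [], []
for kind, value in pull_parse("a 1 b 22"):
    if kind == "number":
        pulled_numbers.append(value)
    elif kind == "word":
        pulled_words.append(value)
```

Both versions collect the same data, but the pull version decides the
kind of each item twice: once in the parser and once in the
application's own dispatch.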

I think the approach does have some merit, but not to the extent that
I will personally sit down and implement a framework for it. If
someone else will, I'll be happy to provide input to the extent that
I'm able to.

--Lars M.