[XML-SIG] Pull Parsing

Paul Prescod paul@prescod.net
Wed, 10 May 2000 09:36:56 -0700


Fredrik showed how to turn incremental push parsers into pull parsers.
Neat.

I must admit that I always presumed that the conversion would be done at
the SAX level so I couldn't think of a way to do it. The incremental API
makes all the difference (maybe we should propose an incremental
extension to SAX).
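
To make the trick concrete, here is a minimal sketch of the idea (not Fredrik's actual code). It assumes expat's incremental interface from xml.parsers.expat as the push parser; the Puller class, its get() method, and the shape of the event tuples are names invented for illustration.

```python
from collections import deque
from xml.parsers import expat


class Puller:
    """Wrap an incremental (push) expat parser behind a pull-style get()."""

    def __init__(self, fileobj, chunk_size=1024):
        self._file = fileobj
        self._chunk_size = chunk_size
        self._events = deque()   # events pushed by expat, waiting to be pulled
        self._done = False
        self._parser = expat.ParserCreate()
        # Each callback only queues an event; no application logic runs here.
        self._parser.StartElementHandler = (
            lambda name, attrs: self._events.append(("start", name, attrs)))
        self._parser.EndElementHandler = (
            lambda name: self._events.append(("end", name)))
        self._parser.CharacterDataHandler = (
            lambda data: self._events.append(("text", data)))

    def get(self):
        """Return the next event tuple, or None at the end of the document."""
        # Feed the push parser one chunk at a time until it has produced
        # at least one event or the input is exhausted.
        while not self._events and not self._done:
            chunk = self._file.read(self._chunk_size)
            if chunk:
                self._parser.Parse(chunk, False)
            else:
                self._parser.Parse(chunk, True)   # empty chunk: end of input
                self._done = True
        return self._events.popleft() if self._events else None
```

The caller keeps flow control: it asks for one event with get(), and the parser is fed only as much input as is needed to answer. Note that expat may deliver one run of text as several "text" events.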

I wonder if a pull-style interface is intrinsically easier for the
average Python programmer to get their head around.

 * People tend to get uncomfortable when you take flow control out of
their hands (as push parsers do). 

 * There are people out there who are against inheritance and other
trappings of object orientation (except struct-like field access, it
seems).

 * Push parsers require a standardization of the parser *and* the
handler. Pull parsers do away with the concept of a handler (and filter)
altogether.

 * Pull parsers allow a very basic form of cooperative multithreading
where you could read from several files and check other event queues
in between.

 * Pull parsers can always be turned into push parsers trivially.
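
The last two points can be sketched in a few lines. This is only an illustration under stated assumptions: a pull source is any zero-argument callable that returns an event tuple such as ("start", name, attrs), or None when exhausted, and the handler method names (start, end, text) are hypothetical rather than SAX's.

```python
def source(events):
    """Build a toy pull source from a list of event tuples (for demos)."""
    it = iter(events)
    return lambda: next(it, None)


def push_from_pull(get_event, handler):
    """Drive a push-style handler from a pull-style get_event() callable."""
    event = get_event()
    while event is not None:
        kind, *args = event
        # ("start", name, attrs) becomes handler.start(name, attrs), etc.
        getattr(handler, kind)(*args)
        event = get_event()


def round_robin(*getters):
    """Interleave several pull sources, taking one event from each in turn."""
    active = list(getters)
    while active:
        for get_event in list(active):
            event = get_event()
            if event is None:
                active.remove(get_event)   # this source is exhausted
            else:
                yield get_event, event
```

The round_robin loop is where a program could also poll other event queues between parse events, since nothing runs until the caller asks for the next item.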

A very simple API is forming in my head:

domnode = puller.get()

while domnode:
    if domnode.nodeType == TEXT_NODE:
        ...
    elif domnode.nodeType == ELEMENT_NODE:
        if domnode.tagName == "Foo":
            puller.expandTree( domnode )
            ...  # walk around the tree
        elif domnode.tagName == "Bar":
            ...
    elif domnode.nodeType == ...:
        ...
    # fetch the next node; get() returns None at the end of the document
    domnode = puller.get()
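
For a runnable version of roughly the same loop, one option is xml.dom.pulldom from the Python standard library, used here as a stand-in for the hypothetical puller. Its expandNode() plays the role of expandTree(), and it reports an event name alongside each node instead of relying on nodeType alone. The scan() function and its result tuples are made up for this sketch.

```python
from xml.dom import pulldom


def scan(xml_text):
    """Pull events from xml_text and summarize what the loop saw."""
    seen = []
    doc = pulldom.parseString(xml_text)
    for event, node in doc:
        if event == pulldom.CHARACTERS:
            seen.append(("text", node.data))
        elif event == pulldom.START_ELEMENT:
            if node.tagName == "Foo":
                # Pull the rest of <Foo> into a DOM subtree under node;
                # iteration then resumes after the matching end tag.
                doc.expandNode(node)
                seen.append(("tree", node.toxml()))
            elif node.tagName == "Bar":
                seen.append(("element", "Bar"))
    return seen


if __name__ == "__main__":
    for item in scan("<doc>intro<Foo><child/>more</Foo><Bar/></doc>"):
        print(item)
```

Only the Foo element is ever built as a tree; everything else goes by as individual events, which is the point of the design.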

If you want to mix in XPaths:

domnode = puller.get()

while domnode:
    if xpath.matches( "some/xpath", domnode ):
        ...
    elif xpath.matches( "some/other/xpath", domnode ):
        ...
    elif xpath.matches( "another/xpath", domnode ):
        ...
    domnode = puller.get()
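
As a rough illustration of how such matching could work in a pull loop, here is a sketch that handles only plain child-step paths like "some/xpath" by comparing them against the stack of open elements. It is not a real XPath engine; matches() and find() are hypothetical helpers, and xml.dom.pulldom from the standard library again stands in for the puller.

```python
from xml.dom import pulldom


def matches(path, stack):
    """True if the open-element stack ends with the steps of path."""
    steps = path.split("/")
    return stack[-len(steps):] == steps


def find(xml_text, paths):
    """Yield (path, tagName) for each start tag matching one of paths."""
    stack = []   # names of the currently open elements, outermost first
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            stack.append(node.tagName)
            for path in paths:
                if matches(path, stack):
                    yield path, node.tagName
                    break   # first matching path wins, like an if/elif chain
        elif event == pulldom.END_ELEMENT:
            stack.pop()
```

Because the caller owns the loop, the stack of open elements is just a local variable; no handler object is needed to carry it between callbacks.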

Opinions?

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Art is always at peril in universities, where there are so many people, 
young and old, who love art less than argument, and dote upon a text 
that provides the nutritious pemmican on which scholars love to chew. 
				-- Robertson Davies in "The Cunning Man"