[XML-SIG] DOM Walker -> SAX

Dave Kuhlman dkuhlman@enterpriselink.com
Fri, 20 Nov 1998 15:35:46 -0800


Those of you who are interested in tree walking might want to look
at PCCTS.  PCCTS (the Purdue Compiler Construction Tool Set, now
called ANTLR; see http://www.ANTLR.org/) is intended as a
replacement for yacc/lex, the UNIX parser generators.  The PCCTS
distribution also contains Sorcerer.  PCCTS is used to generate a
parser that builds a parse tree.  Sorcerer is used to generate a
"tree parser" that can be used to walk the parse tree and produce an
abstract syntax tree with annotated nodes.  The idea is to use
Sorcerer to produce tree transformations.

I can see a use for a similar tool when processing XML: Use the DOM
parser to build a DOM tree, which is application neutral. Then use
the tree walker to transform the DOM tree into a new tree that is
application specific and is tailored for use by the application
code.  The tree walker is actually a set of rules that describe how
to recognize a node (or branch) in the DOM and how to transform that
node or branch into an application-specific node or branch.
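A minimal sketch of that idea in Python, assuming the standard
xml.dom.minidom module.  The Page and InputItem classes, the RULES
table, and the sample document are all hypothetical names made up for
illustration; this is not how Sorcerer itself works, only a plain
dictionary of "element name -> builder" rules applied bottom-up.

```python
from xml.dom import minidom

# Hypothetical application classes (illustrative only).
class Page:
    def __init__(self, title, children):
        self.title = title
        self.children = children

class InputItem:
    def __init__(self, name, style):
        self.name = name
        self.style = style

def child_elements(node):
    """Return only the element children, skipping text and comments."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]

# Each rule recognizes an element by tag name and builds an
# application-specific node from it and its transformed children.
RULES = {
    "page": lambda node, kids: Page(node.getAttribute("title"), kids),
    "input": lambda node, kids: InputItem(node.getAttribute("name"),
                                          node.getAttribute("style")),
}

def transform(node):
    """Walk the DOM depth-first and apply the matching rule, if any."""
    kids = [transform(c) for c in child_elements(node)]
    kids = [k for k in kids if k is not None]
    rule = RULES.get(node.tagName)
    # Elements with no rule are dropped, along with their subtree.
    return rule(node, kids) if rule else None

doc = minidom.parseString(
    '<page title="Login">'
    '<input name="user" style="bold"/>'
    '<input name="password" style="plain"/>'
    '</page>'
)
page = transform(doc.documentElement)
```

Adding support for a new element type then means adding one entry to
the rules table rather than touching the walker.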

As an example, I recently wrote a Java XML SAX-based parser built
using Aelfred that creates a tree structure of instances of Java
classes that I have defined and implemented.  The tree represents a
Web page which contains input items which contain style information,
etc.  In this parser application I had to create each object or node
in the tree, fill in member variables (e.g. from the attributes of
the object's XML element), and insert it into the tree.  For a future
project I can dream about being able to define a transformation on
the nodes in a DOM that would produce the nodes/objects in my tree
structure.
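For comparison, here is roughly what the hand-written SAX approach
looks like, sketched in Python with the standard xml.sax module
rather than Java and Aelfred.  The Node class and the sample document
are hypothetical stand-ins, not the classes from my parser; the point
is only that every create/fill-in/insert step is spelled out by hand.

```python
import xml.sax

class Node:
    """Hypothetical application node (illustrative only)."""
    def __init__(self, kind, attrs):
        self.kind = kind
        self.attrs = attrs
        self.children = []

class TreeBuilder(xml.sax.ContentHandler):
    """Build a tree of Node objects from SAX events."""
    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []   # open elements, innermost last

    def startElement(self, name, attrs):
        # Create the object and fill in its members from the attributes.
        node = Node(name, dict(attrs.items()))
        # Insert it into the tree under the currently open element.
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def endElement(self, name):
        self.stack.pop()

builder = TreeBuilder()
xml.sax.parseString(
    b'<page title="Login">'
    b'<input name="user"><style weight="bold"/></input>'
    b'</page>',
    builder,
)
tree = builder.root
```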

Admittedly, this task would have been much easier in Python than in
Java.  But, it might be easier still and also more orderly using a
tree match and transformation tool.  Maybe this is why Uche is "at a
loss".  Python makes this kind of work too easy.  But, put yourself
in the shoes of someone struggling with a low-level language like
Java ...

  -- Dave

uche.ogbuji@fourthought.com wrote:
> 
> > This reminds me: the Java people have made a DOM walker that fires SAX
> > events, called DOMParser. Is this something we want?
> 
> It sound interesting, but I'm at a loss to think up a serious need.  All I can
> think of is if a user had invested a lot of effort in an app that was
> originally designed to parse XML, that now needs to be plugged into the output
> of another app that manipulates DOM-objects.  But is this a significant enough
> need to provide more than the obvious solution of walking the DOM tree to
> print out the doc, and then feeding this to the SAX app?
> 
> Perhaps I'm missing something.
> 
> --
> Uche Ogbuji
> uche.ogbuji@fourthought.com     (970)481-0805
> Consulting Member, FourThought LLC (Open Enterprise Architects)
> Software engineering, project management, Intranets and Extranets
> http://FourThought.com          http://OpenTechnology.org
> 

-- 
Dave Kuhlman
EnterpriseLink Technology Corp
http://www.enterpriselink.com
2542 S. Bascom Ave., Suite #203
Campbell, CA 95008
dkuhlman@EnterpriseLink.com
408-558-2011