[XML-SIG] dumping an XML parser skeleton from DTD input

Thomas B. Passin tpassin@home.com
Sat, 10 Mar 2001 12:53:34 -0500


<Eugene.Leitl@lrz.uni-muenchen.de> wrote -

>
> Before I try selling them on the DOM thing, I'd rather know what I'm
> doing. It cost them three days to whip up their object tree XML parser
> in Java.
>
Yes, it's easy to make a basic XML parser, and it's easy to come up with a
tree structure.  Lots of us have done something like this.  But XML has a lot
of specialized wrinkles.  If you are only ever going to work with your own
XML, they may not matter.  But XML produced by others may use features that
depend on them, and a home-grown parser and tree structure likely won't handle
them all.  External entities, namespaces, whitespace normalization, character
encodings, and CDATA sections are some of the wrinkles that can get tricky.
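
To make a few of these wrinkles concrete, here is a minimal sketch in Python
using the standard library's xml.dom.minidom.  The document, the namespace URI
and the text_of() helper are made up for illustration.  It shows an existing
parser dealing with a namespace prefix, a CDATA section and a numeric
character reference, three things a quick home-grown parser tends to miss.

```python
from xml.dom import Node, minidom

# Hypothetical sample document: a namespace prefix, a CDATA section,
# and a numeric character reference (&#8364; is the euro sign).
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<inv:order xmlns:inv="http://example.com/inventory">
  <inv:note><![CDATA[5 < 6 & 7 > 2]]></inv:note>
  <inv:price currency="EUR">12 &#8364;</inv:price>
</inv:order>"""

NS = "http://example.com/inventory"

dom = minidom.parseString(DOC.encode("utf-8"))

# Look elements up by namespace URI and local name, not by the "inv:" prefix,
# so the code keeps working if the producer picks a different prefix.
note = dom.getElementsByTagNameNS(NS, "note")[0]
price = dom.getElementsByTagNameNS(NS, "price")[0]


def text_of(node):
    """Join the text and CDATA children of a node (illustrative helper)."""
    return "".join(
        child.data
        for child in node.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


note_text = text_of(note)     # markup characters inside CDATA come back literally
price_text = text_of(price)   # the character reference is already resolved
currency = price.getAttribute("currency")
```

External entities and whitespace normalization are not covered by this sketch;
they are usually where the home-grown approach gets most painful.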

Also, if you use your own tree API, you won't be able to make use of other
software that works on the DOM, like XSLT, XPath, XPointer, etc. (I'm not sure
how many of these are out yet in C++, but they will be coming).

> > 2) Creating a tree-like structure to represent the structure of the xml
> > document.
> > The DOM is an API for a tree-like representation.  Most major parsers out
> > there either include a DOM api or can work with another DOM API.  (SAX is a
> > non-DOM api, but the output of a sax processor can be used to build a tree,
> > too).  The DOM is an object oriented api.
>
> They (said cow-orkers) insist on an object tree based approach.
>

Oh, yes, a tree approach is fine for a lot of things.  It does take a lot of
memory if you have a large chunk of XML, since the whole document has to be
held as objects at once.  It isn't so much the tree as the API for it that you
probably want to concentrate on first.
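
As a rough illustration of the memory trade-off, here is a small Python sketch
that gets the same answer two ways: once with a SAX handler that keeps only a
counter, and once by building a full DOM tree with xml.dom.minidom.  The sample
document and the TagCounter class are invented for the example; it is not a
benchmark, just a picture of the two styles.

```python
import xml.sax
from collections import Counter
from xml.dom import minidom

# Hypothetical input; imagine this being many megabytes long.
DOC = b"<log><entry/><entry/><entry><detail/></entry></log>"


class TagCounter(xml.sax.ContentHandler):
    """Counts element names as they stream past; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElement(self, name, attrs):
        self.counts[name] += 1


# Streaming: memory use is bounded by the counter, not by document size.
handler = TagCounter()
xml.sax.parseString(DOC, handler)
sax_counts = dict(handler.counts)

# Tree: the whole document is built as node objects first, then queried.
dom = minidom.parseString(DOC)
dom_counts = dict(Counter(el.tagName for el in dom.getElementsByTagName("*")))
```

The tree version is the more convenient one to query afterwards, which is the
point about the API: the code after parseString() is what your co-workers will
live with, whichever parser sits underneath.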

Cheers,

Tom P