[XML-SIG] dumping an XML parser skeleton from DTD input

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Sat, 10 Mar 2001 08:00:41 +0100


> > Hard to say, I don't even understand the question. What is a "DOM XML
> > parser skeleton"? And how would you like to "dump" it? If you are
> 
> It is a program that parses XML files in a certain fashion, by creating
> a tree of objects (so it has to be an OO language it dumps) representing 
> the structure of the XML file. 

I get the feeling of being dumb here, since I still cannot understand
what you are asking for. Let me interpret it word-by-word.

You want a program that parses XML files: Well, there are plenty of
XML parsers; I can recommend PyXML. It shall create a tree of objects
...  I recommend using a parser that creates a DOM tree: That is a
tree of objects. 

... representing the structure of the XML file. That I cannot
understand: Do you want the content of the XML file to be represented
by the tree of objects (i.e. the tag names of the elements, their
attributes and attribute values, and strings for the text fragments in
the elements)? That is what the DOM does. If this is not what you
want, what is it about the "structure of the XML file" that you want
to be represented? E.g. given

<foo><bar/></foo>

what is the tree of objects that you want to get?
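
For comparison, here is a minimal sketch of the tree a DOM parser
builds for that input. It assumes Python 3 and the standard-library
xml.dom.minidom module rather than the PyXML reader; the describe()
helper is a made-up name for illustration only.

```python
from xml.dom import minidom

def describe(node, depth=0):
    """Return one indented line per element node, in document order."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        depth += 1
    # Document and element nodes both expose their children this way.
    for child in node.childNodes:
        lines.extend(describe(child, depth))
    return lines

doc = minidom.parseString("<foo><bar/></foo>")
tree_lines = describe(doc)
print("\n".join(tree_lines))
```

The result is a Document node holding one element "foo", which in turn
holds one empty element "bar".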

> It is a skeleton because it just does that, as lacking true
> understanding of my further intentions it has no clue as what I'm
> going to do with the data created from the parsing of the document,
> so it has to leave the action field blank, to be filled out by me.

The DOM tree is good for that - it has no understanding of your plans
to process the document.
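
One way to picture the "action field left blank" is a generic walk
over the DOM tree that calls a function you supply. This is only a
sketch, assuming Python 3 and the standard-library xml.dom.minidom
module; walk() and the sample document are hypothetical.

```python
from xml.dom import minidom

def walk(node, action):
    """Call action(element) for every element node, in document order."""
    if node.nodeType == node.ELEMENT_NODE:
        action(node)
    for child in node.childNodes:
        walk(child, action)

# The "action" is the part the DOM knows nothing about; here it merely
# records each tag name together with its attributes.
seen = []
doc = minidom.parseString('<foo id="1"><bar/>text</foo>')
walk(doc, lambda el: seen.append((el.tagName, dict(el.attributes.items()))))
print(seen)
```

The walk itself never changes; only the function passed in as action
depends on what you plan to do with the data.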

> It is dumped because I'm asking for a program that will dump a program
> (see above), when supplied with a DTD of the XML it is supposed to be
> able to parse.

So you want to generate a program? Given a DTD? How about this program?

print "from xml.dom.ext.reader import Sax2"
print "import sys"
print "doc = Sax2.FromXmlFile(sys.argv[1])"

When executed, it will always generate the same program:

from xml.dom.ext.reader import Sax2
import sys
doc = Sax2.FromXmlFile(sys.argv[1])

This is a program that can read an XML document and build a tree of
objects. The tree of objects is stored in a variable named doc. You
can give a DTD to the first program, but it is ignored as it is not
needed.

> No, I'm asking for a program that will dump a (skeleton of a, to be
> filled in at earliest convenience) parser program, when supplied
> with the DTD of the XML document.

The nice thing about XML is that you can parse it without a DTD, and
that you can furthermore use the same parser for all XML documents.
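
To illustrate, the sketch below runs the same parsing code on two
unrelated documents, neither of which comes with a DTD. It assumes
Python 3 and the standard-library xml.dom.minidom module; the
element_names() helper and both sample documents are made up.

```python
from xml.dom import minidom

def element_names(xml_text):
    """Parse any well-formed document and list its element names."""
    doc = minidom.parseString(xml_text)
    # "*" matches every element; results come back in document order.
    return [el.tagName for el in doc.getElementsByTagName("*")]

order_names = element_names("<order><item/><item/></order>")
page_names = element_names("<html><body><p>hi</p></body></html>")
print(order_names)
print(page_names)
```

Nothing in element_names() is specific to either vocabulary, which is
why no per-DTD parser has to be generated.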

Regards,
Martin