[XML-SIG] Simple DOM question (writing DOM)

Martin v. Loewis martin@v.loewis.de
Wed, 12 Dec 2001 21:27:03 +0100


> > Is that a custom data structure, or a DOM tree? By "create an
> > Element", do you mean you want to create some data structure, or an
> > XML file (i.e. a sequence of bytes in a stream, or on disk)?
> 
> I want my classes to be able to represent themselves as DOM element
> nodes as well as reconstruct themselves from such an element node.

That is quite a challenge, if you want your classes to support all of
the DOM operations. I recommend you inherit from the classes of a DOM
implementation of your choice, and override the factory methods to
create instances of your classes appropriately.
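
A minimal sketch of that inheritance approach, assuming xml.dom.minidom
as the chosen DOM implementation. PersonElement, AppDocument and
display_name are hypothetical names used only for illustration; whether
nodes built by a parser go through the same factory depends on the
implementation.

```python
from xml.dom import minidom


class PersonElement(minidom.Element):
    """Hypothetical application class that is also a DOM element."""

    def display_name(self):
        # Application logic living directly on the DOM node.
        return self.getAttribute("name").title()


class AppDocument(minidom.Document):
    """Document whose factory method returns application classes."""

    def createElement(self, tagName):
        if tagName != "person":
            # Everything else falls back to the stock factory.
            return super().createElement(tagName)
        element = PersonElement(tagName)
        element.ownerDocument = self
        return element


doc = AppDocument()
person = doc.createElement("person")
person.setAttribute("name", "ann")
doc.appendChild(person)
```

Here person is a full DOM element (it can be appended, serialized, and
so on) that also carries the application method display_name().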

Actually, I recommend that you drop the idea of having your classes
represent themselves as DOM elements, and instead try to merely
initialize your data structure from a DOM tree. If you think you
absolutely need it, you may want to provide the reverse operation as
well, but I doubt this will ever be necessary.
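
The simpler alternative might look like the following sketch, again
assuming xml.dom.minidom. The Person class, its from_element method,
the _text_of helper and the sample document are all made up for
illustration; the point is only that the DOM is read once and then
discarded.

```python
from xml.dom import minidom


def _text_of(node):
    # Concatenate the direct text children of a node.
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


class Person:
    """Hypothetical plain data class; it knows nothing about DOM nodes
    beyond how to initialize itself from one."""

    def __init__(self, name, email):
        self.name = name
        self.email = email

    @classmethod
    def from_element(cls, element):
        # Pull out the values we need; do not keep a reference to the node.
        name = element.getAttribute("name")
        email = _text_of(element.getElementsByTagName("email")[0])
        return cls(name, email)


doc = minidom.parseString(
    '<person name="Ann"><email>ann@example.org</email></person>')
person = Person.from_element(doc.documentElement)
```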

> > The best approach to do that is print/write:
> 
> If I had proposed that at my former job, I guess I would have been
> killed. Seriously? 

Definitely. If you have a data structure that you want to serialize
into XML, provide a traversal procedure (using the visitor pattern, if
you want to make it sound good :-) where each node contributes its
portion to the output. A StringIO object might be appropriate.
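
As a rough sketch of such a traversal: each object writes its own
fragment to a shared StringIO and delegates to its children. The Order
and Item classes, the write_xml method name and the element names are
hypothetical.

```python
from io import StringIO
from xml.sax.saxutils import escape, quoteattr


class Item:
    def __init__(self, sku, label):
        self.sku = sku
        self.label = label

    def write_xml(self, out):
        # Each node contributes only its own portion of the output.
        out.write("  <item sku=%s>%s</item>\n"
                  % (quoteattr(self.sku), escape(self.label)))


class Order:
    def __init__(self, items):
        self.items = items

    def write_xml(self, out):
        out.write("<order>\n")
        for item in self.items:
            item.write_xml(out)  # recurse into the children
        out.write("</order>\n")


def to_xml(obj):
    # Collect the output of the whole traversal in one string buffer.
    out = StringIO()
    obj.write_xml(out)
    return out.getvalue()


xml_text = to_xml(Order([Item("a1", "Nuts & bolts")]))
```

Because write_xml takes any file-like object, the same traversal can
write to a file or socket instead of a StringIO.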

XML got its reputation because it is supposedly easy to process.
People who actually deal with it find out that there is a pitfall:
parsing XML is not that easy, which is why all the XML parsing
libraries were designed. However, there is no reason to give up the
simplicity of producing XML - you only have to make sure that no
undesired markup occurs in the output, i.e. that characters such as
"&" and "<" in your data are escaped.
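
One way to keep undesired markup out of the output, sketched with the
standard library's xml.sax.saxutils: escape() for character data and
quoteattr() for attribute values. The sample text and the note element
are invented; parsing the result back is only there to check the round
trip.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape, quoteattr

text = 'a < b & "c"'

# escape() replaces &, < and > so the text cannot be mistaken for markup.
element = "<note>%s</note>" % escape(text)

# quoteattr() escapes the value and adds suitable surrounding quotes.
attribute = "<note title=%s/>" % quoteattr(text)

# Parsing the output back should give the original string again.
root = minidom.parseString(element).documentElement
parsed_text = "".join(child.data for child in root.childNodes)
parsed_attr = minidom.parseString(attribute).documentElement.getAttribute("title")
```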

> Or is it just that I did my little XML work in Java and the Java
> culture shows?

Dunno. People tend to over-design things. I doubt that the "Java
culture" really tells you that you must not System.out.print to
produce XML.

> This sounds a little bit too unstructured to me. Btw. I've meanwhile
> got the transformation from/to XML working. I'm quite happy with it
> and it's not too much work in comparison with print/write. Any
> comments?

Two of them:

1. Sometimes, people have specific ideas of what the XML should look
   like on a lexical level (what encoding to use, where to insert
   whitespace, etc). Producing a DOM tree and invoking .xml() does not
   give you that flexibility. It may not be needed in your
   application, so this might not be an issue.

2. Keep Unicode strings as long as you can. I see that you use
   .encode("utf-8") in fromxml, and put back the UTF-8 strings in
   toxml. Strictly speaking, this is breaking the DOM contract: All
   strings in DOM are Unicode; byte strings are only accepted for
   backwards compatibility. You should really carry over the Unicode
   objects, and keep them until you need to print them into a byte
   stream.
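
The second point can be sketched as follows, assuming a current Python
where str is the Unicode string type and bytes is the byte string type.
The sample data and variable names are made up; the only encode() call
sits at the point where the text enters a byte stream.

```python
import io
from xml.dom import minidom
from xml.sax.saxutils import escape

# The parser decodes the bytes; the DOM hands back Unicode strings.
doc = minidom.parseString("<name>Löwis</name>".encode("utf-8"))
name = doc.documentElement.firstChild.data

# Keep working with Unicode text; no .encode() in the data model.
greeting = "<greeting>Hello, %s</greeting>" % escape(name)

# Encode exactly once, when writing to the byte stream.
stream = io.BytesIO()
stream.write(b'<?xml version="1.0" encoding="utf-8"?>\n')
stream.write(greeting.encode("utf-8"))
payload = stream.getvalue()
```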

Regards,
Martin