[XML-SIG] New Reader Architecture

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Mon, 6 Nov 2000 08:42:33 +0100


> We have rewritten most of the code used for creating text from DOMs.
> I've cc'ed xml-sig because the check-ins of 4DOM I'll be making
> today reflect these changes.

Very interesting. Are you following the DOM Level 3 discussions on
load-and-save interfaces? [I can't access the draft right now, so
I can't check whether it is related to your work.]

> Using one of the new reader classes is also simple.  You create an
> instance passing in to the constructor any parameters relevant to the
> state of that class.

While support for customization is a good thing, I think many users
won't need it, or might get confused by it. So I'd prefer to have some
guidelines on what the "good for most uses" way of getting a DOM is.

> Once you have the reader instance, you use the fromStream or fromUri
> method to create each DOM.  The equivalents to the other common utility
> reader functions (say fromString or fromFile) have been eliminated for
> simplicity since it is trivial to turn text or a filename into a
> stream.

Can you please bring the fromString interface back? In interactive
mode, it is a pain to type StringIO.StringIO.
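
For illustration, such a convenience method could be a one-line wrapper
around fromStream. The sketch below is hypothetical: SketchReader is a
made-up stand-in built on the standard library's xml.dom.minidom, not the
4DOM reader, and it uses io.StringIO, the current spelling of
StringIO.StringIO.

```python
import io
from xml.dom import minidom


class SketchReader:
    """Hypothetical stand-in for a reader class; not the 4DOM API."""

    def fromStream(self, stream):
        # Build a DOM from any file-like object.
        return minidom.parse(stream)

    def fromString(self, text):
        # Convenience wrapper: turn the text into a stream and delegate.
        return self.fromStream(io.StringIO(text))


doc = SketchReader().fromString("<greeting>hello</greeting>")
root_name = doc.documentElement.tagName
```

With a wrapper like this, interactive use needs no stream handling at all.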

Also, what is the complication that makes urllib not work for fromUri?
In the Python 2 SAX2 interfaces, you can pass a string to parse, and
it will then consider that as a system identifier. In turn, it will
pass it to urllib, which will open either a local file or the URL.
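
As a sketch of that behaviour, using the xml.sax package from today's
standard library (the NameCollector handler and the element_names helper
are names invented for this example): a plain string handed to
xml.sax.parse is treated as a system identifier, so a local path needs no
explicit open().

```python
import os
import tempfile
import xml.sax


class NameCollector(xml.sax.ContentHandler):
    """Illustrative handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def element_names(system_id):
    # system_id is a plain string: a local file name or a URL.
    handler = NameCollector()
    xml.sax.parse(system_id, handler)
    return handler.names


# Write a small document to a temporary file and parse it by name only.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write("<doc><item/><item/></doc>")
    path = f.name
try:
    names = element_names(path)
finally:
    os.remove(path)
```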

> [Note that the Domlette readers also have an argument to fromStream,
> stripElements, for specifying elements from which white-space is to be
> stripped while building the DOM.  This is merely to support some
> internal XSLT optimizations until a better way can be found.  Using
> these arguments is deprecated and they may be removed from the method
> signatures in any future 4Suite release.]

Isn't a validating parser supposed to indicate which elements can have
their whitespace stripped?

> Python 1.x users can break circular dependencies by calling the
> releaseNode method on the reader that was used to create the DOM:
> 
> reader.releaseNode(xml_doc)

What kind of circularity does that break? The one in the tree? Does
that mean I have to keep the reader until I release the tree?
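
To make the question concrete, here is the kind of cycle a DOM tree
typically contains, shown with the standard library's xml.dom.minidom
rather than 4DOM (so this is an assumption about what releaseNode
addresses, not a description of it). In minidom, parents reference
children and children reference parents, and the cleanup call lives on
the node itself, so no reader has to be kept around.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><child/></root>")
root = doc.documentElement
child = root.firstChild

# Parent and child point at each other: a reference cycle.
has_cycle = child.parentNode is root and root.firstChild is child

# unlink() clears the parent/child links throughout the tree.
doc.unlink()
cycle_broken = root.parentNode is None and child.parentNode is None
```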

Regards,
Martin