From fdrake@acm.org  Mon Nov  2 15:03:31 1998
From: fdrake@acm.org (Fred L. Drake)
Date: Mon,  2 Nov 1998 10:03:31 -0500 (EST)
Subject: [XML-SIG] Grail site problem fixed
Message-ID: <13885.51565.232437.530513@weyr.cnri.reston.va.us>

  The problem with the Grail web site has now been fixed.  Please
accept my apologies for a premature announcement.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From Stuart.Hungerford@cmis.CSIRO.AU  Tue Nov 10 23:44:03 1998
From: Stuart.Hungerford@cmis.CSIRO.AU (Stuart Hungerford)
Date: Wed, 11 Nov 1998 10:44:03 +1100 (EST)
Subject: [XML-SIG] Looking for PyDOM examples...
Message-ID: <199811102344.KAA12919@aquarius.act.cmis.CSIRO.AU>

Folks,

My project is using Python to take data
out of a SQL Server database (via the ODBC
stuff on Windows) and using PyDOM to
generate XML output.

I suspect strongly that there are idioms
and techniques for using PyDOM that I'm
not aware of.  Does anyone have any good
examples of PyDOM in use (beyond the 
demos in the 0.5 package)?
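
A minimal sketch of the rows-to-XML step described above, assuming the
standard-library `xml.dom.minidom` as a stand-in for PyDOM 0.5 (whose own API
is not shown here). The element names and sample rows are hypothetical; in
practice the rows would come from the ODBC cursor.

```python
from xml.dom.minidom import getDOMImplementation


def rows_to_xml(columns, rows, root_name="table", row_name="row"):
    """Build a DOM document with one <row> element per database row."""
    doc = getDOMImplementation().createDocument(None, root_name, None)
    root = doc.documentElement
    for row in rows:
        row_el = doc.createElement(row_name)
        for column, value in zip(columns, row):
            # Column names are used as element names, so they must be
            # valid XML names (an assumption of this sketch).
            cell = doc.createElement(column)
            # Text nodes are escaped on output, so "&" and "<" are safe.
            cell.appendChild(doc.createTextNode(str(value)))
            row_el.appendChild(cell)
        root.appendChild(row_el)
    return doc


# Hypothetical data standing in for the result of cursor.fetchall().
doc = rows_to_xml(["id", "name"], [(1, "Smith & Co"), (2, "Jones")])
xml_text = doc.documentElement.toxml()
```

Building the tree through the DOM rather than concatenating strings leaves
escaping to the serializer.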

Thanks,


From Jack.Jansen@cwi.nl  Wed Nov 11 15:36:46 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Wed, 11 Nov 1998 16:36:46 +0100
Subject: [XML-SIG] Dom in XML 0.5 package
In-Reply-To: Message by Jeff.Johnson@stn.siemens.com ,
 Thu, 5 Nov 1998 11:42:35 -0500 , <852566B3.005BBCC4.00@BI01.boca.ssc.siemens.com>
Message-ID: <UTC199811111536.QAA04264.jack@snelboot.cwi.nl>

Just today I switched to the XML CVS tree, but I'm getting more and more the 
idea that this may not have been such a bright move.

I'm especially having problems with the dom stuff. Various things (like 
DcBuilder) have disappeared, even though the example scripts still try to use 
them. That is easily fixed, but what is more of a problem is that various 
modules use bits of various other modules that have disappeared. For instance, 
transformer.Transformer uses DOMFactory(), but DOMFactory() is nowhere to be 
found anymore...

Also, with the way stuff is organized in the XML CVS tree it has become (to 
me, at least) unclear who is responsible for what, otherwise I could have 
mailed this message straight to the author.

Is there a newer version of the dom stuff available? If so, where can I get it?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From Jack.Jansen@cwi.nl  Wed Nov 11 15:59:10 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Wed, 11 Nov 1998 16:59:10 +0100
Subject: [XML-SIG] Dom in XML 0.5 package
In-Reply-To: Message by Jack Jansen <Jack.Jansen@cwi.nl> ,
 Wed, 11 Nov 1998 16:36:46 +0100 , <UTC199811111536.QAA04264.jack@snelboot.cwi.nl>
Message-ID: <UTC199811111559.QAA04413.jack@snelboot.cwi.nl>

> I'm especially having problems with the dom stuff. Various things (like 
> DcBuilder) have disappeared, even though the example scripts still try to use 
> them. That is easily fixed, but what is more of a problem is that various 
> modules use bits of various other modules that have disappeared. For instance, 
> transformer.Transformer uses DOMFactory(), but DOMFactory() is nowhere to be 
> found anymore...


Following up on my own message: the whole Transformer class appears to be 
broken. It clearly seems to expect the old version of xml.dom.core...

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From Jeff.Johnson@stn.siemens.com  Thu Nov 12 16:42:17 1998
From: Jeff.Johnson@stn.siemens.com (Jeff.Johnson@stn.siemens.com)
Date: Thu, 12 Nov 1998 11:42:17 -0500
Subject: [XML-SIG] DOM ProcessingInstruction problem
Message-ID: <852566BA.005B93A9.00@BI01.boca.ssc.siemens.com>

I came across an exception (see below) when calling core.Document.toxml().
I made a temporary fix to core.py which is at the bottom of this email.  I
don't know if the line I added is the way to fix the problem or if all
references to "_node.target" should be changed to "_node.data".  And please
don't forget about adding support for comments (or tell me why you won't).

Traceback (innermost last):
  File "cml.py", line 64, in ?
    test('cmlexc01.xml','cml.htm')
  File "cml.py", line 59, in test
    c.readSgml(fileNameIn)
  File "cml.py", line 23, in readSgml
    print "HI JEFF",self.sgml.toxml()
  File "C:\Python\xml\dom\core.py", line 931, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 901, in toxml
    return "<? " + self._node.name + ' ' +self._node.target + "?>"
AttributeError: target

    def createProcessingInstruction(self, target, data):
        "Return a new ProcessingInstruction object."
        d = _nodeData(PROCESSING_INSTRUCTION_NODE)
        d.name = target
        d.value = data
        d.target = data       # HAD TO ADD THIS LINE
        return ProcessingInstruction(d, None, self)
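
For comparison, a processing instruction in the DOM carries a target and a
data string and serializes as `<?target data?>`. A minimal sketch, assuming
the standard-library `xml.dom.minidom` rather than the `core.py` discussed
here; the stylesheet target and data are made up for illustration.

```python
from xml.dom.minidom import getDOMImplementation

doc = getDOMImplementation().createDocument(None, "doc", None)
pi = doc.createProcessingInstruction("xml-stylesheet", 'href="cml.css"')

# `target` is the PI name; `data` is everything after the target.
pi_text = pi.toxml()
```

Under this reading, `d.name` above plays the role of the target and `d.value`
the role of the data, so serializing name plus value may be closer to the
intent than adding a separate `target` attribute. Note also that the target
has to follow `<?` directly, with no space in between.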

Hoping-the-conference-ends-soon-so-the-cvs-tree-gets-updated-ly yours,
Jeff


From akuchlin@cnri.reston.va.us  Sun Nov 15 21:44:48 1998
From: akuchlin@cnri.reston.va.us (A.M. Kuchling)
Date: Sun, 15 Nov 1998 16:44:48 -0500
Subject: [XML-SIG] IPC7 results
Message-ID: <199811152144.QAA00557@mira.erols.com>

The XML-SIG session at IPC7 produced good results, giving us a future
direction to move in. The issues I wanted to discuss were:

	1) Anything need to be dropped from the package before 1.0?
	2) Anything need to be added to the package before we can call 
it 1.0?
	3) What to do about Unicode?
	4) What do we do after 1.0?

The near-term actions will be:

	* Lobby for adding sgmlop.c to 1.5.2, because it's generally
useful and will remove some redundancy from the XML package.

	* Two critical issues for version 1.0 of the XML package are
namespaces and Unicode. For Unicode support, we're going to include a
Unicode type in the XML package, probably Martin von Löwis's wstring
module. Namespace support will probably be added as an extension to
SAX and the DOM interface; we'll have to discuss what this should look
like.

	* More demos should be added that aren't small toy applications. 

	* For interoperability, people want to be able to marshal
Python data structures into XML. My misgivings about which DTD to
support were answered by the response: "Support them all."

	* The XML package will be divided into a base and extension
package.  What we currently have will go into the base package; it'll
be the minimal requirement for XML processing. The extension package
will include higher-level stuff, such as code for any DTD we deem
significant, XSL, XLink, and other things.
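
The marshalling item in the list above can be sketched as follows. This is
only an illustration with a made-up element vocabulary (`dict`, `item`,
`list`, `int`, `string`), not one of the DTDs under discussion, and it uses
the modern standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def marshal(value):
    """Convert nested dicts, lists, ints and strings into an Element tree."""
    if isinstance(value, dict):
        el = ET.Element("dict")
        for key, item in value.items():
            entry = ET.SubElement(el, "item", {"key": str(key)})
            entry.append(marshal(item))
    elif isinstance(value, (list, tuple)):
        el = ET.Element("list")
        for item in value:
            el.append(marshal(item))
    elif isinstance(value, int):
        el = ET.Element("int")
        el.text = str(value)
    elif isinstance(value, str):
        el = ET.Element("string")
        el.text = value
    else:
        raise TypeError("cannot marshal %r" % type(value))
    return el


marshalled = ET.tostring(
    marshal({"name": "spam", "sizes": [1, 2]}), encoding="unicode"
)
```

Supporting several DTDs would then mostly be a matter of swapping the element
vocabulary while keeping the same recursive walk.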

The discussion about post-1.0 didn't produce any definite results, so
we'll worry about that when the time comes.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
A pig can learn more tricks than a dog, but has too much sense to want to do
it.
    -- Robertson Davies, _The Table Talk of Samuel Marchbanks_


From larsga@ifi.uio.no  Sun Nov 15 22:03:54 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 15 Nov 1998 23:03:54 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: <199811152144.QAA00557@mira.erols.com>
References: <199811152144.QAA00557@mira.erols.com>
Message-ID: <wkyapcy5o5.fsf@ifi.uio.no>

* A. M. Kuchling
| 
| 	* Two critical issues for version 1.0 of the XML package are
| namespaces and Unicode. 

I'm not so sure that we need to worry about namespaces. From what I
hear enthusiasm about them in the W3C is waning, nor does there seem
to be all that much enthusiasm among implementors.

| Namespace support will probably be added as an extension to SAX and
| the DOM interface; we'll have to discuss what this should look like.

The trouble is that it will be very hard (if at all possible) to do
this without doing damage to backwards compatibility. 

In other words, we should wait and see what happens with SAX and DOM
and then follow up on it. I think we can go ahead and do 1.0 without
namespaces.

Other than that everything looked good to me. I'll take a look at the
wstring module you mentioned. 

--Lars M. (who wishes he could have been there...)
 

From Jack.Jansen@cwi.nl  Mon Nov 16 13:18:26 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 16 Nov 1998 14:18:26 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: Message by Lars Marius Garshol <larsga@ifi.uio.no> ,
 15 Nov 1998 23:03:54 +0100 , <wkyapcy5o5.fsf@ifi.uio.no>
Message-ID: <UTC199811161318.OAA25187.jack@snelboot.cwi.nl>

> I'm not so sure that we need to worry about namespaces. From what I
> hear enthusiasm about them in the W3C is waning, nor does there seem
> to be all that much enthusiasm among implementors.

Oh? I know that _I_ am pretty enthusiastic about them, and envision using them 
for various things...

> The trouble is that it will be very hard (if at all possible) to do
> this without doing damage to backwards compatibility. 

This, I think, may not be so difficult if we specify a couple of things in 
advance. For instance (and this is just an example) I can envision that we 
specify that in DOM you should always check nodes for being of a type you 
understand before processing them. Then we could add namespaces to a later 
release of DOM by adding an API to tell which namespaces your app understands 
and hiding elements and attributes of other namespaces as different nodetypes.
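
A rough sketch of that idea, assuming the standard-library
`xml.dom.minidom` (which already exposes `namespaceURI`) and hypothetical
namespace URIs. It simply skips foreign elements instead of exposing them as
a different node type, which is only one way to read the proposal.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

UNDERSTOOD = {"http://example.org/main"}  # hypothetical namespace URI


def understood_elements(node, understood=UNDERSTOOD):
    """Yield descendant elements in understood namespaces, hiding the rest."""
    for child in node.childNodes:
        # Check the node type first, before looking at anything else.
        if child.nodeType != Node.ELEMENT_NODE:
            continue
        if child.namespaceURI not in understood:
            continue  # foreign element: hidden together with its subtree
        yield child
        yield from understood_elements(child, understood)


doc = parseString(
    '<m:doc xmlns:m="http://example.org/main" xmlns:x="http://example.org/ext">'
    "<m:par>text<x:note><m:par/></x:note></m:par><x:extra/></m:doc>"
)
names = [el.localName for el in understood_elements(doc)]
```
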
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From larsga@ifi.uio.no  Mon Nov 16 13:37:16 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: Mon, 16 Nov 1998 14:37:16 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: <UTC199811161318.OAA25187.jack@snelboot.cwi.nl>
References: <Message by Lars Marius Garshol <larsga@ifi.uio.no>
 <wkyapcy5o5.fsf@ifi.uio.no>
Message-ID: <3.0.1.32.19981116143716.00de6400@ifi.uio.no>

* Lars Marius Garshol
>
> I'm not so sure that we need to worry about namespaces. From what I
> hear enthusiasm about them in the W3C is waning, nor does there seem
> to be all that much enthusiasm among implementors.

* Jack Jansen
>
> Oh? I know that _I_ am pretty enthusiastic about them, and envision using
> them for various things...

What kinds of things? And why are you enthusiastic about them?

I envision a lot of pain in implementing them (if we are to do it properly)
so I'd like to know why I have to suffer if I have to. :)

* Lars Marius Garshol
>
> The trouble is that it will be very hard (if at all possible) to do
> this without doing damage to backwards compatibility. 

* Jack Jansen
>
> This, I think, may not be so difficult if we specify a couple of things in 
> advance. For instance (and this is just an example) I can envision that we 
> specify that in DOM you should always check nodes for being of a type you 
> understand before processing them. Then we could add namespaces to a later 
> release of DOM by adding an API to tell which namespaces your app
> understands and hiding elements and attributes of other namespaces as
> different nodetypes.

This sounds like a viable alternative, even if it is just a limited form of
support. However, you can do exactly the same (and much more) with
architectural forms, which we already have support for via Geir Ove's xmlarch
module. Why do you want to use namespaces instead?

Also, perhaps we should add to the DOM implementations some standard way of
inserting a SAX ParserFilter (something we should perhaps also work on)
between the parser and the DOM.

This would enable us to automate things like removing whitespace, joining
blocks of PCDATA that were separated by buffer boundaries in the parser, doing
architectural processing, (for those who want it) doing namespace filtering,
filtering out XLinks for special processing, etc.
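
A filter of this kind can be sketched with `xml.sax.saxutils.XMLFilterBase`
from the modern standard library, which postdates this discussion and is used
here only as an assumption about what a ParserFilter could look like. The
`TextJoiningFilter` and `Recorder` classes are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class TextJoiningFilter(XMLFilterBase):
    """Join adjacent character chunks and drop whitespace-only text."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._chunks = []

    def _flush(self):
        text = "".join(self._chunks)
        self._chunks = []
        if text.strip():
            super().characters(text)

    def characters(self, content):
        self._chunks.append(content)  # hold back until the next tag

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)


class Recorder(ContentHandler):
    """Toy downstream handler that records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


recorder = Recorder()
reader = TextJoiningFilter(xml.sax.make_parser())
reader.setContentHandler(recorder)
reader.parse(io.BytesIO(b"<a>\n  <b>x &amp; y</b>\n</a>"))
```

The downstream handler (a DOM builder, say) sees one text event per run of
character data. Dropping whitespace-only text is lossy for mixed content, and
a fuller filter would also flush on processing instructions, so this is a
simplification.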

--Lars M.


From gstein@lyra.org  Mon Nov 16 13:46:22 1998
From: gstein@lyra.org (Greg Stein)
Date: Mon, 16 Nov 1998 05:46:22 -0800
Subject: [XML-SIG] IPC7 results
References: <UTC199811161318.OAA25187.jack@snelboot.cwi.nl>
Message-ID: <36502CAE.31711751@lyra.org>

Jack Jansen wrote:
> 
> > I'm not so sure that we need to worry about namespaces. From what I
> > hear enthusiasm about them in the W3C is waning, nor does there seem
> > to be all that much enthusiasm among implementors.
> 
> Oh? I know that _I_ am pretty enthusiastic about them, and envision using them
> for various things...

I very much agree. At IPC7, we noted that the WebDAV protocol *requires*
namespaces, and that SMIL also requires namespaces. Since there are
several Python projects that are based on these protocols, then it is
quite a necessity to have namespace support.

Further, I haven't seen anything about the W3C interest waning. Please
corroborate that with a reference. When the WebDAV protocol was being
processed for final call in the IETF, they made the WG update to the
latest XML Namespaces proposal (WebDAV was still using the PI notation).
I don't think they'd be so hard-core about the change if they felt
namespaces were "on the out."

> > The trouble is that it will be very hard (if at all possible) to do
> > this without doing damage to backwards compatibility.
> 
> This, I think, may not be so difficult if we specify a couple of things in
> advance. For instance (and this is just an example) I can envision that we
> specify that in DOM you should always check nodes for being of a type you
> understand before processing them. Then we could add namespaces to a later
> release of DOM by adding an API to tell which namespaces your app understands
> and hiding elements and attributes of other namespaces as different nodetypes.

Well, just a quick note: nobody suggested changing the SAX interface (if
people seem to have received that impression from Andrew's email). It is
very easy to have a teeny layer over SAX to process element and
attribute names into name/namespace pairs. I have done this quite
successfully within the callback from the Expat parser (see
dav_xmlparse.c in my mod_dav distribution).
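
The core of such a layer is small. A minimal sketch in Python, not taken
from `dav_xmlparse.c`: the function name and the prefix table are
hypothetical, and the default-namespace handling shown applies to element
names.

```python
def split_qname(qname, prefix_map):
    """Return (namespace_uri, local_name) for a qualified element name.

    `prefix_map` maps prefixes to URIs; the key "" holds the default
    namespace, if one is in scope.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No colon: the whole name is local, in the default namespace.
        return prefix_map.get(""), qname
    if prefix not in prefix_map:
        raise ValueError("undeclared namespace prefix: %r" % prefix)
    return prefix_map[prefix], local


prefixes = {"D": "DAV:", "": "http://example.org/default"}
pair = split_qname("D:propfind", prefixes)
```

The prefix table itself has to be maintained as declarations come into and
go out of scope; that bookkeeping is the rest of the layer.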

Regarding the DOM: it should be possible to just attach a namespace URI
attribute to each node and attribute object. Since just having the
information available doesn't immediately imply the client will check
it, the possibility of hiding nodes/attrs is quite interesting... 

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Mon Nov 16 13:49:05 1998
From: gstein@lyra.org (Greg Stein)
Date: Mon, 16 Nov 1998 05:49:05 -0800
Subject: [XML-SIG] IPC7 results
References: <Message by Lars Marius Garshol <larsga@ifi.uio.no>
 <wkyapcy5o5.fsf@ifi.uio.no> <3.0.1.32.19981116143716.00de6400@ifi.uio.no>
Message-ID: <36502D51.3C5E713F@lyra.org>

Lars Marius Garshol wrote:
> 
> * Lars Marius Garshol
> >
> > I'm not so sure that we need to worry about namespaces. From what I
> > hear enthusiasm about them in the W3C is waning, nor does there seem
> > to be all that much enthusiasm among implementors.
> 
> * Jack Jansen
> >
> > Oh? I know that _I_ am pretty enthusiastic about them, and envision using
> > them for various things...
> 
> What kinds of things? And why are you enthusiastic about them?
> 
> I envision a lot of pain in implementing them (if we are to do it properly)
> so I'd like to know why I have to suffer if I have to. :)

Per my other email, SMIL is a protocol defined to use XML namespaces.
CWI has been working on applications, for a long while now, that use
SMIL (note that Jack works at CWI, and I'd guess *on* that project).

WebDAV is no small potatoes either :-)

-g

--
Greg Stein, http://www.lyra.org/


From akuchlin@cnri.reston.va.us  Mon Nov 16 14:31:18 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 16 Nov 1998 09:31:18 -0500 (EST)
Subject: [XML-SIG] IPC7 results
In-Reply-To: <wkyapcy5o5.fsf@ifi.uio.no>
References: <199811152144.QAA00557@mira.erols.com>
 <wkyapcy5o5.fsf@ifi.uio.no>
Message-ID: <13904.13841.271323.414543@amarok.cnri.reston.va.us>

Lars Marius Garshol writes:
>I'm not so sure that we need to worry about namespaces. From what I
>hear enthusiasm about them in the W3C is waning, nor does there seem
>to be all that much enthusiasm among implementors.

	I'm surprised to hear that, since various standards are using 
namespaces.  Sjoerd and Greg have already pointed out SMIL and DAV;
I'll add RDF.

>The trouble is that it will be very hard (if at all possible) to do
>this without doing damage to backwards compatibility. 

	Since it's already possible to do namespace handling "by hand"
-- look for attributes like xmlns:??? in your elements, and keep track
of them -- I was thinking of simply providing a new
NamespaceAwareSAXHandler class that came with the namespace handling
built-in.
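
A sketch of what the "by hand" bookkeeping amounts to, written against the
modern standard-library `xml.sax` with its namespace feature left off, so
that `xmlns` attributes arrive as ordinary attributes. The class name
`NamespaceTrackingHandler` is hypothetical and is not the proposed
NamespaceAwareSAXHandler.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class NamespaceTrackingHandler(ContentHandler):
    """Resolve element names by tracking xmlns attributes by hand."""

    def __init__(self):
        super().__init__()
        self._scopes = [{}]   # stack of prefix -> URI mappings
        self.elements = []    # (uri, local name) pairs, in document order

    def startElement(self, name, attrs):
        scope = dict(self._scopes[-1])  # inherit enclosing declarations
        for attr_name, value in attrs.items():
            if attr_name == "xmlns":
                scope[""] = value
            elif attr_name.startswith("xmlns:"):
                scope[attr_name[len("xmlns:"):]] = value
        self._scopes.append(scope)
        prefix, sep, local = name.partition(":")
        if not sep:
            prefix, local = "", name
        self.elements.append((scope.get(prefix), local))

    def endElement(self, name):
        self._scopes.pop()  # declarations go out of scope with the element


handler = NamespaceTrackingHandler()
xml.sax.parseString(
    b'<D:multistatus xmlns:D="DAV:"><D:response><note xmlns="urn:x-example"/>'
    b"</D:response><plain/></D:multistatus>",
    handler,
)
```

Applications would subclass something like this and receive resolved names,
while existing handlers keep working unchanged.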

>Other than that everything looked good to me. I'll take a look at the
>wstring module you mentioned. 

	I added it to the CVS tree on Sunday evening.  The module is
simply built and installed when you compile the package, but nothing
else has been modified to make use of it.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
    "Not even Kit Marlowe will be able to gainsay that."
    "You have not heard? Marlowe is dead, Will. He died in Deptford, three
weeks back, of a knife wound to the head."
    -- Shakespeare and Dream, in SANDMAN #19: "A Midsummer Night's Dream"


From Jack.Jansen@cwi.nl  Mon Nov 16 15:13:27 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 16 Nov 1998 16:13:27 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: Message by "Andrew M. Kuchling" <akuchlin@cnri.reston.va.us> ,
 Mon, 16 Nov 1998 09:31:18 -0500 (EST) , <13904.13841.271323.414543@amarok.cnri.reston.va.us>
Message-ID: <UTC199811161513.QAA26962.jack@snelboot.cwi.nl>

SMIL is indeed one of the reasons I want namespaces. SMIL doesn't require 
namespaces (as someone suggested), but we definitely want them to be able to 
incorporate our cmif-specific features in a SMIL document.

And to answer Lars' question "why I don't use architectural forms": because 
I'm not familiar enough with them, I guess. Namespaces seem like a nice 
lightweight mechanism to allow easy reuse of standards.


What I would like to do (i.e. what I would like us, as python-xml sig to 
do:-), before we go off and implement namespaces in the various python modules 
is to determine how people would want to use namespaces and how this would be 
facilitated in the API. (Or, perhaps better, to find out how other groups such 
as the DOM people envision doing this).

I can think of two ways in which I might want to treat unknown namespaces, 
and each would require a slightly different API in DOM (SAX probably isn't as 
much of a problem):
- Pretend that stuff in unrecognized namespaces isn't there at all,
- Treat stuff in unrecognized namespaces as opaque (i.e. leave it in the tree,
  but during transforms and such treat it as you would PCDATA)

For known namespaces there are again various issues. I might want to treat one 
of the namespaces as "primary", where the tag/element names would be simple 
strings (backward compatible) and names from other namespaces are returned as 
"ns:elemname" or ("ns", "elemname"). But, for other applications I might want 
the namespaces to be treated pretty much separately. And, of course, there are 
probably quite a few applications that are happy enough if we just treat ":" 
as part of the identifier... (half a :-)

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From jtauber@jtauber.com  Mon Nov 16 15:27:12 1998
From: jtauber@jtauber.com (James Tauber)
Date: Mon, 16 Nov 1998 23:27:12 +0800
Subject: [XML-SIG] IPC7 results
Message-ID: <005201be1175$9709cbc0$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: Jack Jansen <Jack.Jansen@cwi.nl>


>And to answer Lars' question "why I don't use architectural forms": because
>I'm not familiar enough with them, I guess. Namespaces seem like a nice
>lightweight mechanism to allow easy reuse of standards.

The whole point of namespaces is to enable me to distinguish my FOO from
your (or SMIL's) FOO.
That's all. If you want to do any more (like saying my FOO is the same as
your BAR), then architectural forms are great. If you don't need any more,
they are overkill.

People who have a problem with namespaces seem to expect them to do more
than they are actually intended for.

James


From wes@rishel.com  Mon Nov 16 16:45:08 1998
From: wes@rishel.com (Wes Rishel)
Date: Mon, 16 Nov 1998 08:45:08 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <36502D51.3C5E713F@lyra.org>
Message-ID: <000601be1180$78b58280$f98f2499@Wes>

I am part of a team that is working on representing the Health Level-7
protocol in XML. This protocol is used by 90% of the hospitals in the US and
in several countries in Europe and the Pacific Rim.

The essence of the protocol is messages (clumps of data) that are
transmitted among various systems in response to a trigger event, such as
"the physician ordered a chest x-ray for the patient".

We derive the clumps of data from an O-O model. The methodology, which
predates our interest in XML, has always assumed a naming scope similar to
one used in most programming languages, where this is not a problem.

	Patient data
		Person data
			name
			religion
			date of birth
			id number

	Physician data
		Person data
			name
			id number
			pager number

This has presented a problem because using XML we can have only a single
content model for Person data. Name spaces would have presented a clear and
elegant solution.

Surely we are not alone in this matter?

Thanks,
W


From jtauber@jtauber.com  Mon Nov 16 17:29:22 1998
From: jtauber@jtauber.com (James Tauber)
Date: Tue, 17 Nov 1998 01:29:22 +0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
Message-ID: <00c201be1186$a87851e0$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: Wes Rishel <wes@rishel.com>
[...]

>   Patient data
>     Person data
>       name
>       religion
>       date of birth
>       id number
>
>   Physician data
>     Person data
>       name
>       id number
>       pager number
>
>This has presented a problem because using XML we can have only a single
>content model for Person data. Name spaces would have presented a clear and
>elegant solution.
>
>Surely we are not alone in this matter?

Namespaces are not the solution for context sensitive content models.
Why not just have two separate element types?

<!ELEMENT patient-person-data (name,religion...)>
<!ELEMENT physician-person-data (name, idnum, pager...)>

What do namespaces give you that this doesn't?

If you want to associate the two types of person-data (so you can have an
application that does things with both), just use a FIXED attribute (a la
architectural forms):

<!ATTLIST patient-person-data
    data-class CDATA #FIXED "person-data">
<!ATTLIST physician-person-data
    data-class CDATA #FIXED "person-data">
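
A sketch of how an application might pick up the shared class, assuming the
declarations sit in the document's internal DTD subset (a non-validating
parser is not obliged to read an external DTD) and using the standard-library
`xml.parsers.expat`. The sample document is made up.

```python
import xml.parsers.expat

DOCUMENT = b"""<!DOCTYPE records [
<!ATTLIST patient-person-data data-class CDATA #FIXED "person-data">
<!ATTLIST physician-person-data data-class CDATA #FIXED "person-data">
]>
<records>
  <patient-person-data><name>A. Patient</name></patient-person-data>
  <physician-person-data><name>B. Doctor</name></physician-person-data>
</records>"""

person_elements = []


def start_element(name, attrs):
    # The fixed data-class value is supplied from the ATTLIST declaration,
    # even though no start tag in the document spells it out.
    if attrs.get("data-class") == "person-data":
        person_elements.append(name)


parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = start_element
parser.Parse(DOCUMENT, True)
```

The application dispatches on `data-class` and never needs to know the
individual element type names.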

James


From fleck@informatik.uni-bonn.de  Mon Nov 16 18:26:17 1998
From: fleck@informatik.uni-bonn.de (Markus Fleck)
Date: Mon, 16 Nov 1998 19:26:17 +0100
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
References: <00c201be1186$a87851e0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <36506E49.7F5D@informatik.uni-bonn.de>

James Tauber wrote:
> <!ELEMENT patient-person-data (name,religion...)>
> <!ELEMENT physician-person-data (name, idnum, pager...)>
> 
> What do namespaces give you that this doesn't?

Let me quote Tim Berners-Lee again (from the WWW7 Conference):

   "You need to build a system that is futureproof;
   it's no good just making a modular system," he said.
   "You need to realize that your system is just going
   to be a module in some bigger system to come, and so
   you have to be part of something else, and it's a
   bit of a way of life."

In other words, namespaces allow you to use globally
unique identifiers without needing to revert to
non-descriptive and ugly numerical UUIDs.

So with namespaces, it would be possible to exchange
or convert data from other hospitals that use 
differently defined "*-person-data" structures.

Yours,
Markus.

-- 
////////////////////////////////////////////////////////////////////////////
   Markus B Fleck - University of Bonn - CS Department IV - WHOIS MF5079
          UNIX Administrator - comp.lang.python.announce Moderator
   "GNU Gather" Free Internet Groupware Project - http://cscw.net/gather/
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\


From akuchlin@cnri.reston.va.us  Mon Nov 16 21:22:24 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 16 Nov 1998 16:22:24 -0500 (EST)
Subject: [XML-SIG] Dom in XML 0.5 package
In-Reply-To: <UTC199811111536.QAA04264.jack@snelboot.cwi.nl>
References: <852566B3.005BBCC4.00@BI01.boca.ssc.siemens.com>
 <UTC199811111536.QAA04264.jack@snelboot.cwi.nl>
Message-ID: <13904.36750.540685.658864@amarok.cnri.reston.va.us>

Jack Jansen writes:
>Just today I switched to the XML CVS tree, but I'm getting more and more the 
>idea that this may not have been such a bright move.
>
>I'm especially having problems with the dom stuff. Various things (like 
>DcBuilder) have disappeared, even though the example scripts still try to use 
>them. That is easily fixed, but what is more of a problem is that various 
>modules use bits of various other modules that have disappeared. For instance,
>transformer.Transformer uses DOMFactory(), but DOMFactory() is nowhere to be 
>found anymore...

	That doesn't surprise me; I've been almost exclusively working
on core.py and ignoring the other things like walker.py and
transformer.py and the demo scripts, only fixing them when people
reported problems.  Last night I checked in some changes that may have
fixed some of the problems, but they haven't really been tested.

>Also, with the way stuff is organized in the XML CVS tree it has become (to 
>me, at least) unclear who is responsible for what, otherwise I could have 
>mailed this message straight to the author.

	It's me; Stefane isn't responsible for it any more because
I've mercilessly hacked up his code, making it practically a complete
rewrite.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Today I live in the gray, muffled, smelless, puffy, tasteless half-world of
those who have colds.
    -- Robertson Davies, _The Diary of Samuel Marchbanks_


From jtauber@jtauber.com  Tue Nov 17 01:42:55 1998
From: jtauber@jtauber.com (James Tauber)
Date: Tue, 17 Nov 1998 09:42:55 +0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
Message-ID: <001a01be11cb$99990c10$4a850786@ecn08.curtin.edu.au>

>So with namespaces, it would be possible to exchange
>or convert data from other hospitals that use
>differently defined "*-person-data" structures.


Yes, but this is not what the original poster said he was using namespaces
for. His example had a context sensitive content model within the one DTD.

James


From wes@rishel.com  Tue Nov 17 03:24:34 1998
From: wes@rishel.com (Wes Rishel)
Date: Mon, 16 Nov 1998 19:24:34 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <001a01be11cb$99990c10$4a850786@ecn08.curtin.edu.au>
Message-ID: <002201be11d9$cc342f20$643cfad0@Wes>


> -----Original Message-----
> >So with namespaces, it would be possible to exchange
> >or convert data from other hospitals that use
> >differently defined "*-person-data" structures.
>
>
> Yes, but this is not what the original poster said he was using namespaces
> for. His example had a context sensitive content model within the one DTD.
>

Actually it is a large set of DTDs that will continue to evolve over the
years. There are about 110 classes in the information model; the various
permutations of them that would constitute the information structure for a
single DTD run to a much higher number.

W


From wunder@infoseek.com  Tue Nov 17 17:01:05 1998
From: wunder@infoseek.com (Walter Underwood)
Date: Tue, 17 Nov 1998 09:01:05 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7
 results)
In-Reply-To: <00c201be1186$a87851e0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <3.0.5.32.19981117090105.00af8c40@corp>

>From: Wes Rishel <wes@rishel.com>
>[...]
>
>>   Patient data
>>     Person data
>>       name
>>       religion
>>       date of birth
>>       id number
>>
>>   Physician data
>>     Person data
>>       name
>>       id number
>>       pager number
>>
>>This has presented a problem because using XML we can have only a single
>>content model for Person data. Name spaces would have presented a clear and
>>elegant solution.
>>
>>Surely we are not alone in this matter?

You are not. This may be similar to the model in SNMP MIBs. Those
are somewhat different from the usual object model. Basically, if
a slot is used, it should mean the same thing, but you don't have
to use all the slots. Sort of a cross between a data dictionary
and an object model. And really hard to represent in existing
object models!

A different comment -- it sounds like you are trying to get the 
DTD to enforce the model, rather than just making something that
can be parsed. There are lots and lots of constraints that cannot
be expressed in a DTD (age is positive, these references form a
tree), so enforcing the exact sub-elements of each element is 
just one more thing that can't be enforced in the DTD. Even if
it was possible to specify what was legal, you couldn't specify
that all elements must be there.

Since you've got to do post-parsing checking anyway, trying to 
express too much stuff in the DTD is probably wasted effort.
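
A small sketch of what such post-parse checking might look like, using the
standard-library `xml.etree.ElementTree`. The `person`, `name` and `age`
element names and the rules themselves are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_person(person):
    """Return a list of problems a DTD alone could not have caught."""
    problems = []
    age = person.findtext("age")
    if age is None:
        problems.append("missing age")
    elif not age.strip().isdigit() or int(age) <= 0:
        problems.append("age must be a positive integer: %r" % age)
    if not (person.findtext("name") or "").strip():
        problems.append("empty name")
    return problems


good = ET.fromstring("<person><name>Ann</name><age>34</age></person>")
bad = ET.fromstring("<person><name> </name><age>-3</age></person>")
```

Calling `check_person(bad)` reports both the negative age and the blank name,
neither of which a DTD content model can express.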

wunder
 
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://www.best.com/~wunder/
1-408-543-6946


From larsga@ifi.uio.no  Tue Nov 17 17:52:24 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 17 Nov 1998 18:52:24 +0100
Subject: [XML-SIG] Support for Validating Parsers
In-Reply-To: <3.0.6.32.19981106092115.00927690@gpo.iol.ie>
References: <3.0.6.32.19981106092115.00927690@gpo.iol.ie>
Message-ID: <wk7lwunr53.fsf@ifi.uio.no>

* Sean Mc Grath
| 
| 1) Run NSGMLS with os.system or os.popen() and pick up the ESIS.
| This can serve as input to PyDOM. (Has anyone done a SAX driver
| for ESIS yet? If no, then I will offer to write one. ((
| Dublin->Chicago->Houston should be plenty of flight time
| for this!)).
 
I have one that can read ESIS from files, SP and SP -wxml, but it
can't handle error messages properly on Win32. I think the problem is
caused by SP doing something strange to emulate stderr on Win32 where
this doesn't exist at all.
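
For readers who have not seen ESIS, a minimal sketch of turning
nsgmls-style output lines into events. This is not the driver described
above: the function names are hypothetical, only a few record types are
handled, and only the record-end and backslash escapes are undone.

```python
import re

_ESCAPE = re.compile(r"\\(n|\\)")


def unescape(text):
    """Undo the record-end and backslash escapes; leave others alone."""
    return _ESCAPE.sub(lambda m: "\n" if m.group(1) == "n" else "\\", text)


def parse_esis(lines):
    """Yield event tuples from an iterable of ESIS output lines."""
    pending_attrs = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # "Aname type value": attribute lines precede their start tag.
            parts = rest.split(" ", 2)
            if len(parts) == 3 and parts[1] != "IMPLIED":
                pending_attrs[parts[0]] = unescape(parts[2])
        elif code == "(":
            yield ("start", rest, pending_attrs)
            pending_attrs = {}
        elif code == ")":
            yield ("end", rest)
        elif code == "-":
            yield ("data", unescape(rest))
        elif code == "C":
            yield ("conforming",)
        # Other record types (?, &, N, ...) are ignored in this sketch.


SAMPLE = ["AID TOKEN X1", "(DOC", "(P", "-Hello\\nworld", ")P", ")DOC", "C"]
events = list(parse_esis(SAMPLE))
```

Because the function takes any iterable of lines, it works the same whether
the ESIS comes from a file or from a pipe to the parser.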

Once I can handle the error messages I will include this in the saxlib
driver package. So far I've thought of these possible avenues on
Win32:

 - redirect error msgs to a temporary file with the -f option, ignore
   the file and delete it afterwards

 - redirect error msgs to a temporary file with the -f option and check
   it between events for new errors

 - some secret ritual involving dead bats, black candles and a Bill
   Gates doll


If anyone has any better ideas or has any details about the SP C++
source I'd be glad to know about them. Also, if people are impatient I
can release it as it is.

--Lars M.


From larsga@ifi.uio.no  Tue Nov 17 18:03:15 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 17 Nov 1998 19:03:15 +0100
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <002201be11d9$cc342f20$643cfad0@Wes>
References: <002201be11d9$cc342f20$643cfad0@Wes>
Message-ID: <wk67cenqn0.fsf@ifi.uio.no>

* Wes Rishel
| 
| Actually it is a large set of DTDs that will continue to evolve over
| the years. There are about 110 classes in the information model; the
| various permutations of them that would constitute the information
| structure for a single DTD run to a much higher number.

In this case I would recommend that you take a close look at
architectural forms, which lend something that is somewhat reminiscent
of OO inheritance to DTDs. Architectural forms also do many other
things that may be useful to you.

I would recommend anyone planning to do serious work with XML to take
a look at architectural forms. (Just as I would recommend anyone doing
serious programming to look at Python instead of stopping at Perl.)

--Lars M.


From larsga@ifi.uio.no  Tue Nov 17 18:15:36 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 17 Nov 1998 19:15:36 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: <36502CAE.31711751@lyra.org>
References: <UTC199811161318.OAA25187.jack@snelboot.cwi.nl> <36502CAE.31711751@lyra.org>
Message-ID: <wk4srynq2f.fsf@ifi.uio.no>

* Greg Stein
| 
| I very much agree. At IPC7, we noted that the WebDAV protocol
| *requires* namespaces, [...] Since there are several Python projects
| that are based on these protocols, then it is quite a necessity to
| have namespace support.

If it is a necessity then I guess we'll just have to go ahead and do
it. 

I think the best way would be to make a SAX ParserFilter that does
the namespace processing. When I say ParserFilter I'm thinking of
something like the ParserFilters John Cowan made for Java-SAX. It
would simply be a SAX DocumentHandler that rode on the back of other
SAX parsers pretending to be a SAX parser to its clients.

I already have code for doing some of these things in xmlproc (it's
not used, but it's there). I can move it out into a filter and add a
sketch of what's missing as well as making a sketch of the filters.
 
| Regarding the DOM: it should be possible to just attach a namespace
| URI attribute to each node and attribute object. Since just having
| the information available doesn't immediately imply the client will
| check it, the possibility of hiding nodes/attrs is quite
| interesting...

One way to do this might be to have a DOM extension module that used
the factories to sneak in objects with the extra namespace attribute
and extra methods for handling the objects.  This module could also
extend the builders to work with the filter.

FYI: One can do an equivalent of this already with xmlarch. Just use
xmlarch as a set of filters (one for each of your architectures) and
you can build DOM trees from the filtered events for each architecture.
This requires no programming beyond setting up the filters, just some
PIs and #FIXED attributes in your document and DTD.

--Lars M.


From larsga@ifi.uio.no  Tue Nov 17 23:30:51 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 18 Nov 1998 00:30:51 +0100
Subject: [XML-SIG] ParserFilter proposal
Message-ID: <wklnl9nbh0.fsf@ifi.uio.no>

OK, I've now hacked together a proposal for a general SAX ParserFilter
API, with implementations of two filters: 'keep character data
together' and namespaces. (The latter is just a rough sketch riddled
with 'FIXME' comments.) The whole thing is just a proposal, and
consists of readable source with two simple demos with sample
documents.
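
For readers without the zip file, the 'keep character data together'
idea can be sketched roughly as below. This is not the proposed API:
it assumes the XMLFilterBase class from today's xml.sax.saxutils, and
the TextJoiner, Recorder and parse_with_joiner names are invented for
the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class TextJoiner(XMLFilterBase):
    """Buffer character data and pass it on as one characters() call."""

    def __init__(self, parent=None):
        XMLFilterBase.__init__(self, parent)
        self._buffer = []

    def _flush(self):
        # Forward any buffered text downstream as a single event.
        if self._buffer:
            XMLFilterBase.characters(self, "".join(self._buffer))
            self._buffer = []

    def characters(self, content):
        self._buffer.append(content)

    def startElement(self, name, attrs):
        self._flush()
        XMLFilterBase.startElement(self, name, attrs)

    def endElement(self, name):
        self._flush()
        XMLFilterBase.endElement(self, name)

    def processingInstruction(self, target, data):
        self._flush()
        XMLFilterBase.processingInstruction(self, target, data)


class Recorder(ContentHandler):
    """Client handler that simply records the events it receives."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


def parse_with_joiner(xml_text):
    recorder = Recorder()
    joiner = TextJoiner(xml.sax.make_parser())  # filter sits on a real parser
    joiner.setContentHandler(recorder)          # and looks like a parser to clients
    joiner.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.events
```

The client only ever talks to the filter, so it cannot tell whether
the text arrived from the underlying parser in one piece or in
several.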


You can download it as a 5k zip file from:

<URL:http://birk105.studby.uio.no/tmp/filters.zip>


Comments, anyone? Is this the way to do the SAX side of this?

And, Geir Ove, what do you think? Could xmlarch be fitted into this as
a ParserFilter? (Didn't have time to look at it.)

--Lars M.


From grove@infotek.no  Wed Nov 18 08:27:56 1998
From: grove@infotek.no (Geir Ove Gronmo)
Date: Wed, 18 Nov 1998 09:27:56 +0100
Subject: [XML-SIG] ParserFilter proposal
Message-ID: <199811180827.JAA24079@mail.infotek.no>

At 00:30 18.11.98 +0100, you wrote:
>OK, I've now hacked together a proposal for a general SAX ParserFilter
>API, with implementations of two filters: 'keep character data
>together' and namespaces. (The latter is just a rough sketch riddled
>with 'FIXME' comments.) The whole thing is just a proposal, and
>consists of readable source with two simple demos with sample
>documents.

>Comments, anyone? Is this the way to do the SAX side of this?

Is this how it's done in Java by John Cowan? I've not been able to check
that out yet, but I will soon.

>And, Geir Ove, what do you think? Could xmlarch be fitted into this as
>a ParserFilter? (Didn't have time to look at it.)

xmlarch could definitely be fitted into a ParserFilter.

I don't see any problems with this at all. Since xmlarch is written as a
DocumentHandler, only minor modifications would probably have to be done. 

I originally wrote xmlarch as a wrapper around a Parser object, but soon
realized that that was overkill. Only XML events from a DocumentHandler are
needed to write an architectural forms processor.

The next release of xmlarch is probably going to be independent of SAX.
I've been thinking of removing the direct connection to SAX, and instead
make it a more general module. Wrappers/plugins could then be written for
SAX (both DocumentHandler and ParserFilter), DOM and other kinds of input.

Geir O.


From larsga@ifi.uio.no  Wed Nov 18 08:45:08 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 18 Nov 1998 09:45:08 +0100
Subject: [XML-SIG] ParserFilter proposal
In-Reply-To: <199811180827.JAA24079@mail.infotek.no>
References: <199811180827.JAA24079@mail.infotek.no>
Message-ID: <wkemr1o0dn.fsf@ifi.uio.no>

* Geir Ove Gronmo
| 
| Is this how it's done in Java by John Cowan? 

No, it's not. I've separated the filter and the DocumentHandler, while
he has the filter implement DocumentHandler, AttributeList and the
other handlers.

He also lacks the factory stuff I did, but has at least settled on a
policy with the other handlers.

See: <URL:http://www.ccil.org/~cowan/XML/ParserFilter.java>


Should we align ourselves with his proposal? It's not turned into a
standard, whether de facto (OK, Simon St.Laurent uses it) or de jure.

| >And, Geir Ove, what do you think? Could xmlarch be fitted into this as
| >a ParserFilter? (Didn't have time to look at it.)
| 
| xmlarch could definitely be fitted into a ParserFilter.
| 
| I don't see any problems with this at all. Since xmlarch is written as a
| DocumentHandler, only minor modifications would probably have to be done. 
| 
| I originally wrote xmlarch as a wrapper around a Parser object, but soon
| realized that that was overkill. Only XML events from a DocumentHandler are
| needed to write an architectural forms processor.
| 
| The next release of xmlarch is probably going to be independent of SAX.
| I've been thinking of removing the direct connection to SAX, and instead
| make it a more general module. Wrappers/plugins could then be written for
| SAX (both DocumentHandler and ParserFilter), DOM and other kinds of input.
| 
| Geir O.


From larsga@ifi.uio.no  Wed Nov 18 08:51:34 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 18 Nov 1998 09:51:34 +0100
Subject: [XML-SIG] ParserFilter proposal
In-Reply-To: <199811180827.JAA24079@mail.infotek.no>
References: <199811180827.JAA24079@mail.infotek.no>
Message-ID: <wkd86lo02x.fsf@ifi.uio.no>

(Pardon the last email. It turns out that not only C-c C-c, but also
C-c C-s sends emails in Gnus. That is, it did. I turned it off now.)

* Geir Ove Gronmo
| 
| Is this how it's done in Java by John Cowan? 

No, it's not. I've separated the filter and the DocumentHandler, while
he has the filter implement DocumentHandler, Locator, AttributeList
and the other handlers. I'm not sure I really like that approach.

He also lacks the factory stuff I did, but has at least settled on a
policy with the other handlers.

See: <URL:http://www.ccil.org/~cowan/XML/ParserFilter.java>


Should we align ourselves with his proposal? It's not turned into a
standard, whether de facto (OK, Simon St.Laurent uses it) or de jure.

* Lars Marius Garshol
|
| And, Geir Ove, what do you think? Could xmlarch be fitted into this as
| a ParserFilter?
 
* Geir Ove Gronmo
|
| xmlarch could definitely be fitted into a ParserFilter.

That's a good sign, at least. :)
 
| The next release of xmlarch is probably going to be independent of
| SAX.  I've been thinking of removing the direct connection to SAX,
| and instead make it a more general module. Wrappers/plugins could
| then be written for SAX (both DocumentHandler and ParserFilter), DOM
| and other kinds of input.

This reminds me: the Java people have made a DOM walker that fires SAX
events, called DOMParser. Is this something we want?
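
To make the question concrete, here is a rough sketch of what such a
walker might look like. It assumes the xml.dom.minidom and xml.sax
modules of the current standard library rather than PyDOM or saxlib,
and the walk(), EventLog and dom_to_events() names are invented for
the example. It is not a port of the Java DOMParser.

```python
from xml.dom import minidom, Node
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def walk(node, handler):
    """Fire SAX ContentHandler events for a DOM node and its subtree."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            walk(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        attrs = AttributesImpl(dict(node.attributes.items()))
        handler.startElement(node.tagName, attrs)
        for child in node.childNodes:
            walk(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        handler.characters(node.data)
    elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
        handler.processingInstruction(node.target, node.data)
    # Comments and the doctype have no ContentHandler callback; skipped.


class EventLog(ContentHandler):
    """Stand-in for an existing SAX application."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


def dom_to_events(xml_text):
    log = EventLog()
    walk(minidom.parseString(xml_text), log)
    return log.events
```

The handler sees the same kind of events it would get from a parser,
without the tree first being written out as text and parsed again.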

--Lars M.


From grove@infotek.no  Wed Nov 18 08:54:37 1998
From: grove@infotek.no (Geir Ove Gronmo)
Date: Wed, 18 Nov 1998 09:54:37 +0100
Subject: [XML-SIG] ParserFilter proposal
In-Reply-To: <wkemr1o0dn.fsf@ifi.uio.no>
References: <199811180827.JAA24079@mail.infotek.no>
Message-ID: <199811180854.JAA24275@mail.infotek.no>

At 09:45 18.11.98 +0100, you wrote:
>| Is this how it's done in Java by John Cowan? 
>
>No, it's not. I've separated the filter and the DocumentHandler, while
>he has the filter implement DocumentHandler, AttributeList and the
>other handlers.

Yes, I think that's a good thing to do.

>He also lacks the factory stuff I did, but has at least settled on a
>policy with the other handlers.
>
>See: <URL:http://www.ccil.org/~cowan/XML/ParserFilter.java>
>
>
>Should we align ourselves with his proposal? It's not turned into a
>standard, whether de facto (OK, Simon St.Laurent uses it) or de jure.

I don't see a need to do that. Your proposal seems to be superior to the
one written in Java. Perhaps someone should write a Java version of the
Python ParserFilter? :-)

Geir O.


From jdnier@execpc.com  Wed Nov 18 15:00:04 1998
From: jdnier@execpc.com (David Niergarth)
Date: Wed, 18 Nov 1998 09:00:04 -0600 (CST)
Subject: [XML-SIG] Support for Validating Parsers
In-Reply-To: <wk7lwunr53.fsf@ifi.uio.no>
Message-ID: <Pine.GSO.3.95.981118081618.13485C-100000@earth>

On 17 Nov 1998, Lars Marius Garshol wrote:

> I have one that can read ESIS from files, SP and SP -wxml, but it
> can't handle error messages properly on Win32. I think the problem is
> caused by SP doing something strange to emulate stderr on Win32 where
> this doesn't exist at all.

Are you "handling" the errors or ignoring the errors? If you have parsing
errors (as opposed to warnings) seems like you'd usually need/want to
fix them first, then make esis again.

>  - redirect error msgs to a temporary file with the -f option, ignore
>    the file and delete it afterwards
>  - redirect error msgs to a temporary file with the -f option and check
>    it between events for new errors

I don't know a way to suppress errors (like -s). You can limit the
maximum number of errors reported with -E. Unfortunately, -E0 means "no
limit"; however, -E1 at least keeps the error output to a minimum.

You can play with redirection, in WinNT, e.g.,

     nsgmls file 2>&0 > esis_file

2 is error and I think 0 is stdin, which makes this twisted, but the error
"stream" effectively disappears. This only works on NT (cmd.exe); in 95/98
you'll end up with a file called &0! If there were only a /dev/null....

--David Niergarth


From larsga@ifi.uio.no  Wed Nov 18 15:23:45 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 18 Nov 1998 16:23:45 +0100
Subject: [XML-SIG] Support for Validating Parsers
In-Reply-To: <Pine.GSO.3.95.981118081618.13485C-100000@earth>
References: <Pine.GSO.3.95.981118081618.13485C-100000@earth>
Message-ID: <wkr9v1m3cu.fsf@ifi.uio.no>

* Lars Marius Garshol
|
| I have one that can read ESIS from files, SP and SP -wxml, but it
| can't handle error messages properly on Win32. I think the problem
| is caused by SP doing something strange to emulate stderr on Win32
| where this doesn't exist at all.

* David Niergarth
| 
| Are you "handling" the errors or ignoring the errors? 

Both, in a sense. I want to detect them and report them to the SAX
ErrorHandler. The problem is that I need to interleave them with the
other callback events. Ignoring them is only a last resort on Win32 if
all else fails (or possibly reporting them after all the data events).

| If you have parsing errors (as opposed to warnings) seems like you'd
| usually need/want to fix them first, then make esis again.

Of course, but this is something the user must deal with after getting
the error messages from his/her application through the SAX driver.
 
| I don't know a way to suppress errors (like -s).

Well, I don't really want to suppress the errors, since the
ErrorHandler should be told about them.
 
What I want is for them to appear interleaved in the normal ESIS
stream as they do when nsgmls is run from the command line. However,
os.popen fails to accomplish this, instead the errors are still
printed to the console, while everything else is forwarded to my SAX
driver.
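
For comparison, this is roughly the behaviour I am after, sketched
with the subprocess module from today's standard library (it is not
available in the Python this thread is using). The run_merged() and
classify() names are invented, and treating every line that starts
with "nsgmls:" as an error message is an assumption, not something
the ESIS format guarantees.

```python
import subprocess


def run_merged(argv):
    """Run a command and return its stdout and stderr as one line list."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # errors go down the same pipe as the data
        universal_newlines=True,
    )
    lines = [line.rstrip("\n") for line in proc.stdout]
    proc.stdout.close()
    proc.wait()
    return lines


def classify(lines, error_prefix="nsgmls:"):
    """Tag each line as an error message or an ordinary ESIS event."""
    tagged = []
    for line in lines:
        kind = "error" if line.startswith(error_prefix) else "event"
        tagged.append((kind, line))
    return tagged
```

Whether the two streams really come out interleaved in the right
order still depends on how the child process buffers its output, so
this only moves the problem to a place where it can be observed.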

--Lars M.


From wes@rishel.com  Wed Nov 18 16:12:41 1998
From: wes@rishel.com (Wes Rishel)
Date: Wed, 18 Nov 1998 08:12:41 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <3.0.5.32.19981117090105.00af8c40@corp>
Message-ID: <000901be130e$455c4f60$28862499@Wes>


> -----Original Message-----
> From: xml-sig-admin@python.org [mailto:xml-sig-admin@python.org]On
> Behalf Of Walter Underwood
> Sent: Tuesday, November 17, 1998 9:01 AM
> To: xml-sig@python.org
> Subject: Re: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7
> results)
>
> A different comment -- it sounds like you are trying to get the
> DTD to enforce the model, rather than just making something that
> can be parsed. There are lots and lots of constraints that cannot
> be expressed in a DTD (age is positive, these references form a
> tree), so enforcing the exact sub-elements of each element is
> just one more thing that can't be enforced in the DTD. Even if
> it was possible to specify what was legal, you couldn't specify
> that all elements must be there.
>
> Since you've got to do post-parsing checking anyway, trying to
> express too much stuff in the DTD is probably wasted effort.

We agree with you completely.

Indeed, the debate that keeps resurfacing in our discussions is whether
there is any substantial benefit to using a validating parser. At least in
our context, which includes XML and other representations of the same
content, we have to have a metamodel that essentially duplicates the content
model anyway.


From akuchlin@cnri.reston.va.us  Wed Nov 18 17:22:12 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Wed, 18 Nov 1998 12:22:12 -0500 (EST)
Subject: [XML-SIG] Value of namespaces
In-Reply-To: <000901be130e$455c4f60$28862499@Wes>
References: <3.0.5.32.19981117090105.00af8c40@corp>
 <000901be130e$455c4f60$28862499@Wes>
Message-ID: <13906.62735.313166.784251@amarok.cnri.reston.va.us>

Wes Rishel writes:
>Indeed, the debate that keeps resurfacing in our discussions is whether
>there is any substantial benefit to using a validating parser. At least in
>our context, which includes XML and other representations of the same
>content, we have to have a metamodel that essentially duplicates the content
>model anyway.

	And that's perfectly all right.  There's no rule that says you
must have a DTD for your XML documents, and for some applications you
may only care about well-formedness.  You lose something, in that the
only thing that can verify the correctness of your XML documents is
custom-written code, and it may not be obvious what the code accepts,
and XML editors can't use the DTD to assist the author, but those
factors might not be important in some cases.  For example, we have an
XML format for representing process steps, and no effort has been made
to write a DTD yet, because the current structuring is preliminary at
best.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
My life is strobed like lightning by a follow-spot, and looking backwards I can
only see the corpses of the animals and birds who strutted with me on the
darkened stage and helped me fool them all.
    -- Zatara, in BOOKS OF MAGIC #1


From stuart.hungerford@cmis.csiro.au  Wed Nov 18 23:02:50 1998
From: stuart.hungerford@cmis.csiro.au (Stuart Hungerford)
Date: Thu, 19 Nov 1998 10:02:50 +1100
Subject: [XML-SIG] Looking for substantial PyDOM examples...
Message-ID: <4.1.19981119095926.00a89e40@mailhost.act.cmis.csiro.au>

Hi all,

I've been experimenting with PyDOM in the 0.5
XML stuff release, and I'm starting to feel 
that I may not be making the best use of the
DOM in this Python-flavoured implementation.

Can some kind person share some examples of 
PyDOM being used for some non-trivial chores
with me?


Stu

-----------------------------------------------------------------------
Stuart.Hungerford@cmis.csiro.au                  Voice : +61 2 62167061
CSIRO Mathematical and Information Sciences      Fax   : +61 2 62167111
Canberra, AUSTRALIA                     GPO Box 664, Canberra, ACT 2601


From fdrake@acm.org  Wed Nov 18 23:21:15 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 18 Nov 1998 18:21:15 -0500 (EST)
Subject: [XML-SIG] Support for Validating Parsers
In-Reply-To: <Pine.GSO.3.95.981118081618.13485C-100000@earth>
References: <wk7lwunr53.fsf@ifi.uio.no>
 <Pine.GSO.3.95.981118081618.13485C-100000@earth>
Message-ID: <13907.22123.832713.317168@weyr.cnri.reston.va.us>

David Niergarth writes:
 > You can play with redirection, in WinNT, e.g.,
 > 
 >      nsgmls file 2>&0 > esis_file

  You should be able to use 2>NUL to send errors to the equivalent of
/dev/null.  Haven't tested it, though: that would require at least a 90
degree chair rotation!  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From jdnier@execpc.com  Thu Nov 19 17:03:42 1998
From: jdnier@execpc.com (David Niergarth)
Date: Thu, 19 Nov 1998 11:03:42 -0600
Subject: [XML-SIG] Support for Validating Parsers
Message-ID: <002801be13de$90b3ea10$c56ccfa9@ep3>

Lars Marius Garshol:
>What I want is for them to appear interleaved in the normal ESIS
>stream as they do when nsgmls is run from the command line. However,
>os.popen fails to accomplish this, instead the errors are still
>printed to the console, while everything else is forwarded to my SAX
>driver.

I couldn't get the interleaved behavior seen when running from the command
line but the following prepends the error messages to the esis: (Useful?
Not sure if you'll get same behavior in W95/98.)

>>> s = os.popen("nsgmls -wxml com_err.xml 2>>&1", "r")
>>> print s.read()[0:300]
nsgmls:com_sm.xml:6:17:E: end tag for element "PP" which is not open
nsgmls:com_sm.xml:7:5:E: end tag for "P" omitted, but OMITTAG NO was
specified
nsgmls:com_sm.xml:6:2: start tag was here
?xml version="1.0"
(PLAY
(TITLE
-The Comedy of Errors
)TITLE
(FM
(P
-FM Text.\n\012\011
)P
)FM
(PERSONAE
(TITL
>>>

Fred Drake:
> You should be able to use 2>NUL to send errors to the equivalent of
> /dev/null.  Haven't tested it, though: that would require at least a 90
> degree chair rotation!  ;-)

No need to get up; it works as advertised. (Wow, never knew you could do
that!-)

--David Niergarth


From fdrake@acm.org  Thu Nov 19 17:08:35 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 19 Nov 1998 12:08:35 -0500 (EST)
Subject: [XML-SIG] Support for Validating Parsers
In-Reply-To: <002801be13de$90b3ea10$c56ccfa9@ep3>
References: <002801be13de$90b3ea10$c56ccfa9@ep3>
Message-ID: <13908.20627.117681.313512@weyr.cnri.reston.va.us>

I wrote:
 > /dev/null.  Haven't tested it, though: that would require at least a 90
 > degree chair rotation!  ;-)

David Niergarth writes:
 > No need to get up; it works as advertised. (Wow, never knew you could do

  Get up??  No, this is a swivel chair... I'm still tightly bound to
my mailer, recovering from the conference.  Swivelling is simply too
much of a distraction!  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From uche.ogbuji@fourthought.com  Fri Nov 20 09:32:58 1998
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Fri, 20 Nov 1998 02:32:58 -0700
Subject: [XML-SIG] DOM Walker -> SAX
In-Reply-To: Your message of "18 Nov 1998 09:51:34 +0100."
 <wkd86lo02x.fsf@ifi.uio.no>
Message-ID: <199811200933.CAA00568@malatesta.local>

> This reminds me: the Java people have made a DOM walker that fires SAX
> events, called DOMParser. Is this something we want?

It sounds interesting, but I'm at a loss to think up a serious need.  All I can 
think of is if a user had invested a lot of effort in an app that was 
originally designed to parse XML, that now needs to be plugged into the output 
of another app that manipulates DOM-objects.  But is this a significant enough 
need to provide more than the obvious solution of walking the DOM tree to 
print out the doc, and then feeding this to the SAX app?

Perhaps I'm missing something.

-- 
Uche Ogbuji
uche.ogbuji@fourthought.com	(970)481-0805
Consulting Member, FourThought LLC (Open Enterprise Architects)
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From fdrake@acm.org  Fri Nov 20 16:01:43 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 20 Nov 1998 11:01:43 -0500 (EST)
Subject: [XML-SIG] xml.dom.esis_builder -- "e" events
Message-ID: <13909.37479.243860.245412@weyr.cnri.reston.va.us>

  nsgmls can output "e" events if the next element has a declared
content type of EMPTY when given the "-oempty" option.  This is really
only interesting if we're generating an SGML output document, but I
found it useful to generate them from a LaTeX->ESIS conversion tool
I've been playing with.  Feeding them to EsisBuilder should not cause
an exception; they can be safely ignored.
  The patch below implements this.
  Andrew, please merge this with the CVS tree.
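
  For anyone unfamiliar with the event codes, here is a small
stand-alone sketch of the same dispatch idea.  It is not the
EsisBuilder code; summarize_esis() is a made-up helper that only
collects start tags, written to show where the "e" case fits.

```python
def summarize_esis(lines):
    """Return (start_tags, unknown_lines) for an iterable of ESIS lines."""
    start_tags = []
    unknown = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        event, data = line[0], line[1:]
        if event == "(":
            start_tags.append(data)
        elif event in (")", "-", "A", "?"):
            # End tags, data, attributes and processing instructions
            # are valid events, but this summary does not need them.
            pass
        elif event == "C":
            # The document was conforming; nothing follows this line.
            break
        elif event == "e":
            # The next element is declared EMPTY; only produced by
            # nsgmls for -oempty.  Safe to ignore.
            pass
        else:
            unknown.append(line)
    return start_tags, unknown
```

  Without the "e" branch, those lines would fall through to the
unknown-event case even though they carry no error.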


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


Index: esis_builder.py
===================================================================
RCS file: /projects/cvsroot/xml/dom/esis_builder.py,v
retrieving revision 1.2
diff -c -c -r1.2 esis_builder.py
*** esis_builder.py	1998/11/18 23:57:12	1.2
--- esis_builder.py	1998/11/20 15:53:51
***************
*** 60,67 ****
  			elif event == 'C':
  				return
  
  			else:
! 				sys.stderr.write('Unknow event: ' + `line` + '\n')
  
  
  backslash = r"\\"
--- 60,73 ----
  			elif event == 'C':
  				return
  
+ 			elif event == 'e':
+ 				# Indicates that this is an empty element;
+ 				# only produced by nsgmls for -oempty.  We
+ 				# can safely ignore it.
+ 				pass
+ 
  			else:
! 				sys.stderr.write('Unknown event: ' + `line` + '\n')
  
  
  backslash = r"\\"


From fdrake@acm.org  Fri Nov 20 22:51:41 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 20 Nov 1998 17:51:41 -0500 (EST)
Subject: [XML-SIG] xml.dom.core update
Message-ID: <13909.62077.404729.856342@weyr.cnri.reston.va.us>

  This patch fixes the Document.createElement() interface to accept
a dictionary of attribute name/value pairs, keyword arguments in the
call, or both.  This fixes problems with the EsisBuilder as well
as making the interface more flexible.
  Document.toxml() now uses all its children to generate the XML
representation; this ensures that processing instructions and comments 
will be included exactly as they are included in the tree.
  Andrew, please integrate these with the CVS repo.  This is just what 
we talked about earlier today.
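
  The merging rule can be tried out in isolation with a sketch like
the one below.  collect_attributes() is a made-up stand-alone
function, not part of xml.dom.core; it only mirrors the
kwdict.update(dict) step from the patch, so dictionary entries win
when a name is given both ways.

```python
def collect_attributes(attrs=None, **kwattrs):
    """Merge a dictionary of attributes with keyword arguments."""
    merged = dict(kwattrs)
    if attrs:
        # Same precedence as kwdict.update(dict) in the patch:
        # a name present in both places takes the dictionary's value.
        merged.update(attrs)
    return merged
```

  The dictionary form is what makes names such as "class" or
"xml:lang" usable, since neither can be written as a Python keyword
argument.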


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


Index: core.py
===================================================================
RCS file: /projects/cvsroot/xml/dom/core.py,v
retrieving revision 1.30
diff -c -c -r1.30 core.py
*** core.py	1998/11/16 03:52:02	1.30
--- core.py	1998/11/20 22:45:11
***************
*** 931,943 ****
          s = '<?xml version="1.0"?>\n'
          if self.documentType:
              s = s + self.documentType
!         if len(self._node.children):
!             n = self._node.children[0]
              n =  NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
  
!     def createElement(self, tagName, **kwdict):
          "Return a new Element object."
  
          d = _nodeData(ELEMENT_NODE)
--- 931,942 ----
          s = '<?xml version="1.0"?>\n'
          if self.documentType:
              s = s + self.documentType
!         for n in self._node.children:
              n =  NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
  
!     def createElement(self, tagName, dict={}, **kwdict):
          "Return a new Element object."
  
          d = _nodeData(ELEMENT_NODE)
***************
*** 945,950 ****
--- 944,950 ----
          d.value = None
          d.attributes = {}
          elem = Element(d, None, self)
+         kwdict.update(dict)
          for name, value in kwdict.items():
              elem.setAttribute(name, value)
          return elem


From dkuhlman@enterpriselink.com  Fri Nov 20 23:35:46 1998
From: dkuhlman@enterpriselink.com (Dave Kuhlman)
Date: Fri, 20 Nov 1998 15:35:46 -0800
Subject: [XML-SIG] DOM Walker -> SAX
References: <199811200933.CAA00568@malatesta.local>
Message-ID: <3655FCD1.5384C0C2@EnterpriseLink.com>

Those of you who are interested in tree walking might want to look
at PCCTS.  PCCTS (Purdue Compiler Construction Tool Set, but now
called ANTLR, see http://www.ANTLR.org/) is intended as a
replacement for yacc/lex, the UNIX parser generators.  The PCCTS
distribution also contains Sorcerer.  PCCTS is used to generate a
parser that builds a parse tree.  Sorcerer is used to generate a
"tree parser" that can be used to walk the parse tree and produce an
abstract syntax tree with annotated nodes.  The idea is to use
Sorcerer to produce tree transformations.

I can see a use for a similar tool when processing XML: Use the DOM
parser to build a DOM tree, which is application neutral. Then use
the tree walker to transform the DOM tree into a new tree that is
application specific and is tailored for use by the application
code.  The tree walker is actually a set of rules that describe how
to recognize nodes (branches ?) in the DOM and how to transform that
node or branch into an application specific node or branch.
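
A very small sketch of what I mean, written in Python against the
xml.dom.minidom module of the current standard library.  The
transform() function and the rule table are invented for the
example, and the <page> and <input> element names are hypothetical;
a real tool would need a richer way to match branches than a lookup
by tag name.

```python
from xml.dom import minidom, Node


def transform(element, rules):
    """Turn a DOM element into an application object, bottom-up."""
    children = [
        transform(child, rules)
        for child in element.childNodes
        if child.nodeType == Node.ELEMENT_NODE
    ]
    attrs = dict(element.attributes.items())
    rule = rules[element.tagName]  # KeyError if no rule recognizes the node
    return rule(attrs, children)


# Each rule says how to build the application-specific node from the
# element's attributes and its already-transformed children.
RULES = {
    "page": lambda attrs, kids: {"title": attrs.get("title", ""), "inputs": kids},
    "input": lambda attrs, kids: (attrs["name"], attrs.get("style", "plain")),
}


def build_page(xml_text):
    document = minidom.parseString(xml_text)
    return transform(document.documentElement, RULES)
```

The DOM stays application neutral, and everything application
specific is confined to the rule table.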

As an example, I recently wrote a Java XML SAX-based parser built
using Aelfred that creates a tree structure of instances of Java
classes that I have defined and implemented.  The tree represents a
Web page which contains input items which contain style information,
etc.  In this parser application I had to create each object or node
in the tree, fill in member variables (e.g. from attributes in XML
element for the object), and insert it into the tree.  For a future
project I can dream about being able to define a transformation on
the nodes in a DOM that would produce the nodes/objects in my tree
structure.

Admittedly, this task would have been much easier in Python than in
Java.  But, it might be easier still and also more orderly using a
tree match and transformation tool.  Maybe this is why Uche is "at a
loss".  Python makes this kind of work too easy.  But, put yourself
in the shoes of someone struggling with a low level language like
Java ...

  -- Dave

uche.ogbuji@fourthought.com wrote:
> 
> > This reminds me: the Java people have made a DOM walker that fires SAX
> > events, called DOMParser. Is this something we want?
> 
> It sounds interesting, but I'm at a loss to think up a serious need.  All I can
> think of is if a user had invested a lot of effort in an app that was
> originally designed to parse XML, that now needs to be plugged into the output
> of another app that manipulates DOM-objects.  But is this a significant enough
> need to provide more than the obvious solution of walking the DOM tree to
> print out the doc, and then feeding this to the SAX app?
> 
> Perhaps I'm missing something.
> 
> --
> Uche Ogbuji
> uche.ogbuji@fourthought.com     (970)481-0805
> Consulting Member, FourThought LLC (Open Enterprise Architects)
> Software engineering, project management, Intranets and Extranets
> http://FourThought.com          http://OpenTechnology.org

-- 
Dave Kuhlman
EnterpriseLink Technology Corp
http://www.enterpriselink.com
2542 S. Bascom Ave., Suite #203
Campbell, CA 95008
dkuhlman@EnterpriseLink.com
408-558-2011


From fdrake@acm.org  Sat Nov 21 03:18:42 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 20 Nov 1998 22:18:42 -0500 (EST)
Subject: [XML-SIG] patch to xml.dom.core
Message-ID: <13910.12562.933837.421428@weyr.cnri.reston.va.us>

  The patch below didn't appear to make it into the update; this is
what allows nodes other than the documentElement to get written by the 
Document.toxml() method.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


Index: core.py
===================================================================
RCS file: /projects/cvsroot/xml/dom/core.py,v
retrieving revision 1.31
diff -c -c -r1.31 core.py
*** core.py	1998/11/21 02:48:12	1.31
--- core.py	1998/11/21 03:13:43
***************
*** 931,938 ****
          s = '<?xml version="1.0"?>\n'
          if self.documentType:
              s = s + self.documentType
!         if len(self._node.children):
!             n = self._node.children[0]
              n =  NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
--- 931,937 ----
          s = '<?xml version="1.0"?>\n'
          if self.documentType:
              s = s + self.documentType
!         for n in self._node.children:
              n =  NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s


From uche.ogbuji@fourthought.com  Sat Nov 21 04:17:45 1998
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Fri, 20 Nov 1998 21:17:45 -0700
Subject: [XML-SIG] Announcement: 4DOM 0.6.1, an implementation of the W3C DOM Spec in
 Python
Message-ID: <199811210417.VAA15364@malatesta.local>

FourThought LLC (http://FourThought.com) announces the release of


                             4DOM 0.6.1
                      -----------------------
              A CORBA-aware implementation of the W3C's
		       Document Object Model
                            in Python


4DOM is a close implementation of the DOM, including DOM Core
level 1, DOM HTML level 1, and a few utility and helper components.
4DOM was designed from the start to work in a CORBA environment.
Currently, two ORB environments are supported, both open-source:
Fnorb and ILU.  One or the other is required.

4DOM is designed to allow developers to rapidly design applications
that read, write or manipulate HTML and XML.

Changes since 0.6.0
===================

- added ILU support with a series of kludges
  (all designed to minimize effect on existing DOM code):

        o Use ILU's python-stubber in makefile rather than fnidl
        o python-stubber generates *IF__skel rather than fnidl's
          *IF_skel, so copy the files so both names are available.
        o add config modules for DOM core and HTML, globally imported,
          which creates dummy INTERFACENAME_skel classes because ILU
          does not append "_skel" to skeleton class names as Fnorb
          does: it uses module-scoping for the distinction.
        o Add variables using Fnorb-style constant naming
          (INTERFACENAME.CONSTANTNAME) to refer to the ILU-style
          constants (INTERFACENAME_CONSTANTNAME)
        o Brutally hack all 4DOM source files during make to change
          Fnorb-style invocations for DOMException
          (raise DOMException(EXCEPTNAME))
          into ILU-style
          (raise DOMException, DOMException__omgidl_exctype(EXCEPTNAME))

- added the #pragma prefix "fourthought.com" to all IDL files
- Document.repr() now includes the DOCTYPE

More info and Obtaining 4DOM
============================

Please see

	http://OpenTechnology.org/projects/4DOM


4DOM is distributed under the terms of the GNU Library General Public
License (LGPL).

	http://www.gnu.org/copyleft/lgpl.html


-- 
Uche Ogbuji
uche.ogbuji@fourthought.com
Consulting Member, FourThought LLC (Open Enterprise Architects)
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From kajiyama@etl.go.jp  Sun Nov 22 04:57:54 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Sun, 22 Nov 98 04:57:54 JST
Subject: [XML-SIG] [Q] Namespace
Message-ID: <9811211957.AA19395@etlibs2.etl.go.jp>

Hi.

I'd like to do some experiments about RDF, so I'm writing a limited RDF
parser with the required XML namespace support.  I have questions about
XML namespace.

First, according to the specification (Subsection 5.2), "If the URI in a
default namespace declaration is empty, then unprefixed elements in the
scope of the declaration are not considered to be in any namespace."
I cannot understand what this means.  What should a parser do on such
unprefixed elements?  Really nothing to do?

Second, what is the initial value of the default namespace before the
first declaration of the default namespace appears?  Is the default
namespace undefined at first?  I think it is reasonable that the
namespace associated to the DTD of the XML document, if any, would be
the first default namespace.

Thank you,

-- 
KAJIYAMA, Tamito <kajiyama@etl.go.jp>


From larsga@ifi.uio.no  Sat Nov 21 20:10:11 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 21 Nov 1998 21:10:11 +0100
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <9811211957.AA19395@etlibs2.etl.go.jp>
References: <9811211957.AA19395@etlibs2.etl.go.jp>
Message-ID: <wkg1bclsd8.fsf@ifi.uio.no>

* Tamito Kajiyama
| 
| First, according to the specification (Subsection 5.2), "If the URI
| in a default namespace declaration is empty, then unprefixed
| elements in the scope of the declaration are not considered to be in
| any namespace."  I cannot understand what this means.  What should a
| parser do on such unprefixed elements?  Really nothing to do?
 
This is used for turning off a default namespace:

<root xmlns="http://default.namespace.uri/">
  <foo/>  <!-- The expanded name is here ns='http://default.namespace.uri/'
               and type='foo' -->
  <bar xmlns=""/> <!-- The expanded name is here ns="" and type="bar" -->
</root>
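
A minimal sketch of the same point in present-day Python, assuming the
standard-library xml.etree.ElementTree module (it postdates this thread
and is not part of the XML package discussed here).  ElementTree spells
an expanded name as {uri}local, so the effect of xmlns="" can be read
straight off the element tags:

```python
import xml.etree.ElementTree as ET

# Same document as above: a default namespace on the root,
# switched off again on <bar> with xmlns="".
doc = '''<root xmlns="http://default.namespace.uri/">
  <foo/>
  <bar xmlns=""/>
</root>'''

root = ET.fromstring(doc)

# Collect the expanded name of every element in document order.
names = [elem.tag for elem in root.iter()]
for name in names:
    print(name)
```

The root and foo elements should come back with the
{http://default.namespace.uri/} part attached, while bar should come
back as a bare local name, i.e. in no namespace at all.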

| Second, what is the initial value of the default namespace before
| the first declaration of the default namespace appears?  Is the
| default namespace undefined at first? 

So it is. The namespace is not defined until an explicit declaration
appears, which is entirely reasonable to me, at least.

| I think it is reasonable that the namespace associated to the DTD of
| the XML document, if any, would be the first default namespace.

What do you mean by this? I can't follow you here.

--Lars M.


From kajiyama@etl.go.jp  Sun Nov 22 06:10:25 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Sun, 22 Nov 98 06:10:25 JST
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <wkg1bclsd8.fsf@ifi.uio.no> (message from Lars Marius Garshol on 21 Nov 1998 21:10:11 +0100)
Message-ID: <9811212110.AA19430@etlibs2.etl.go.jp>

Lars Marius Garshol <larsga@ifi.uio.no> writes:
| 
| * Tamito Kajiyama
| | 
| | First, according to the specification (Subsection 5.2), "If the URI
| | in a default namespace declaration is empty, then unprefixed
| | elements in the scope of the declaration are not considered to be in
| | any namespace."  I cannot understand what this means.  What should a
| | parser do on such unprefixed elements?  Really nothing to do?
|  
| This is used for turning off a default namespace:
| 
| <root xmlns="http://default.namespace.uri/">
|   <foo/>  <!-- The expanded name is here ns='http://default.namespace.uri/'
|                and type='foo' -->
|   <bar xmlns=""/> <!-- The expanded name is here ns="" and type="bar" -->
| </root>

What should a validating parser do on the unprefixed element `bar'?

| | Second, what is the initial value of the default namespace before
| | the first declaration of the default namespace appears?  Is the
| | default namespace undefined at first? 
| 
| So it is. The namespace is not defined until an explicit declaration
| appears, which is entirely reasonable to me, at least.

When the default namespace is undefined, what should a validating parser
do for unprefixed elements?

| | I think it is reasonable that the namespace associated to the DTD of
| | the XML document, if any, would be the first default namespace.
| 
| What do you mean by this? I can't follow you here.

I understand that a namespace has an associated schema (e.g. DTD, RDF
schema), and a parser validates a prefixed element by referring to the
schema associated with the namespace prefix.  So, if an XML document has
its DTD specified by <!DOCTYPE ...>, there is a namespace associated with
the DTD.  Is my understanding correct?

Thank you,

-- 
KAJIYAMA, Tamito <kajiyama@etl.go.jp>


From larsga@ifi.uio.no  Sat Nov 21 21:28:29 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 21 Nov 1998 22:28:29 +0100
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <9811212110.AA19430@etlibs2.etl.go.jp>
References: <9811212110.AA19430@etlibs2.etl.go.jp>
Message-ID: <wkbtm0loqq.fsf@ifi.uio.no>

* Tamito Kajiyama
| 
| What should a validating parser do on the unprefixed element `bar'?
| 
| [...]
| 
| When the default namespace is undefined, what should a validating
| parser do for unprefixed elements?

Nothing. :)

If asked, it should reply that the namespace is undefined. Remember,
the namespace is just that, an identifier, and nothing more. It's just
used to be able to uniquely identify elements and attributes.
 
| I understand that a namespace has an associated schema (e.g. DTD,
| RDF schema), 

It does not. A namespace is just a URI used to distinguish a set of
names from all other names, globally.

Of course, in our minds there is usually an association between the
namespace and a schema/DTD, but the parser knows nothing of this.

| and a parser validates a prefixed element by referring to the schema
| associated with the namespace prefix.

In fact, validation as defined in XML 1.0 does not work with
namespaces, which is a point against them. A prefixed element is
invalid if it was not declared with the prefix...

| So, If an XML document has its DTD specified by <!DOCTYPE ...>,
| there is a namespace associated to the DTD.  Is my understanding
| correct?

No. :) There is no requirement that there be any namespaces at all in
XML documents, and like I said namespaces and DTDs don't work very
well together.

What namespaces do is e.g to allow you to use the TITLE element from
both HTML and DocBook in the same DTD, and still be able to tell them
apart, by associating instances of the TITLE element type with different 
namespaces.
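
To make the TITLE case concrete, here is a small sketch, again assuming
the standard-library xml.etree.ElementTree module.  The two namespace
URIs are hypothetical placeholders invented for the example; they are
not the real HTML or DocBook identifiers.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only to tell the two vocabularies apart.
HTML_NS = "http://example.org/ns/html"
DOCBOOK_NS = "http://example.org/ns/docbook"

doc = f'''<doc xmlns:h="{HTML_NS}" xmlns:d="{DOCBOOK_NS}">
  <h:TITLE>Page title</h:TITLE>
  <d:TITLE>Book title</d:TITLE>
</doc>'''

root = ET.fromstring(doc)

# Both elements have the local name TITLE; only the namespace URI differs.
html_titles = [e.text for e in root.iter("{%s}TITLE" % HTML_NS)]
docbook_titles = [e.text for e in root.iter("{%s}TITLE" % DOCBOOK_NS)]

print(html_titles)
print(docbook_titles)
```

The prefixes h and d are local to the document; it is the URI they are
bound to that identifies which TITLE is meant.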

--Lars M.


From kajiyama@etl.go.jp  Sun Nov 22 07:46:14 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Sun, 22 Nov 98 07:46:14 JST
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <wkbtm0loqq.fsf@ifi.uio.no> (message from Lars Marius Garshol on 21 Nov 1998 22:28:29 +0100)
Message-ID: <9811212246.AA19493@etlibs2.etl.go.jp>

Lars Marius Garshol <larsga@ifi.uio.no> writes:
| 
| A namespace is just a URI used to distinguish a set of names from all
| other names, globally.

I see.

| Of course, in our minds there is usually an association between the
| namespace and a schema/DTD, but the parser knows nothing of this.

What I want to do is some experiments about RDF.  In an RDF instance,
RDF schemas are specified using namespace.  So, I'm writing an RDF
parser that, for each namespace declaration, retrieves the RDF file
specified by the URI, parses it, and constructs an internal
representation of the RDF schema for further validation.  Is this good
practice?  It seems that my parser knows something about namespace...
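
A rough sketch of that design, written against the xml.sax module of
present-day Python rather than the 0.4 package (so the class and feature
names below are an assumption about a later API).  The load_schema
callback is hypothetical: it stands in for "retrieve the RDF file and
build an internal representation", and here it is never asked to touch
the network.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceCollector(ContentHandler):
    """Call load_schema once for every namespace URI declared in a document."""

    def __init__(self, load_schema):
        ContentHandler.__init__(self)
        self.load_schema = load_schema  # hypothetical callback: uri -> schema
        self.schemas = {}

    def startPrefixMapping(self, prefix, uri):
        # Reported for each xmlns / xmlns:prefix declaration when
        # namespace processing is switched on.
        if uri not in self.schemas:
            self.schemas[uri] = self.load_schema(uri)


def collect_schemas(xml_bytes, load_schema):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NamespaceCollector(load_schema)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.schemas


# Illustrative document; both URIs are made up for the example.
doc = (b'<r:RDF xmlns:r="http://example.org/rdf-syntax#"'
       b' xmlns:ex="http://example.org/schema#">'
       b'<r:Description><ex:name>x</ex:name></r:Description></r:RDF>')

schemas = collect_schemas(doc, lambda uri: "schema for " + uri)
print(sorted(schemas))
```

Keeping the loader as a separate callback leaves the XML layer unaware
of what a namespace URI "means"; only the RDF layer on top interprets it.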

| In fact, validation as defined in XML 1.0 does not work with
| namespaces, which is a point against them. A prefixed element is
| invalid if it was not declared with the prefix...

Hmm, it's surprising.  What will happen to XML 1.0 when the namespace
specification becomes a W3C recommendation?

Thank you,

-- 
KAJIYAMA, Tamito <kajiyama@etl.go.jp>


From larsga@ifi.uio.no  Sat Nov 21 23:02:42 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 22 Nov 1998 00:02:42 +0100
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <9811212246.AA19493@etlibs2.etl.go.jp>
References: <9811212246.AA19493@etlibs2.etl.go.jp>
Message-ID: <wk7lwolkdp.fsf@ifi.uio.no>

* Tamito Kajiyama
| 
| What I want to do is some experiments about RDF.  In an RDF
| instance, RDF schemas are specified using namespace.  So, I'm
| writing an RDF parser that, for each namespace declaration,
| retrieves the RDF file specified by the URI, parses it, and
| constructs an internal representation of the RDF schema for further
| validation.

Hmmm. Do you mean that you're writing your own XML parser, or are you
building on top of SAX, DOM or some parser API?

| Is this good practice?  

It sounds good to me, at least.

| It seems that my parser knows something about namespace...

Nothing wrong with that, you just have to keep the different layers of
the different specs separate in your mind (and parser :).
 
| What will happen to XML 1.0 when the namespace specification becomes
| a W3C recommendation?

Good question. I don't really know. A reasonable guess would be that
the SGML DTD syntax is ditched in favour of an XML-based syntax that
is namespace-aware. Or that both are retained. Of course, this means
that XML will have two different schema languages, only one of them
SGML-compatible. But, like I say, this is just a guess.

--Lars M.


From kajiyama@etl.go.jp  Sun Nov 22 08:24:57 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Sun, 22 Nov 98 08:24:57 JST
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <wk7lwolkdp.fsf@ifi.uio.no> (message from Lars Marius Garshol on 22 Nov 1998 00:02:42 +0100)
Message-ID: <9811212324.AA19540@etlibs2.etl.go.jp>

Lars Marius Garshol <larsga@ifi.uio.no> writes:
| 
| * Tamito Kajiyama
| | 
| | What I want to do is some experiments about RDF.  In an RDF
| | instance, RDF schemas are specified using namespace.  So, I'm
| | writing an RDF parser that, for each namespace declaration,
| | retrieves the RDF file specified by the URI, parses it, and
| | constructs an internal representation of the RDF schema for further
| | validation.
| 
| Hmmm. Do you mean that you're writing your own XML parser, or are you
| building on top of SAX, DOM or some parser API?

I'm building my RDF parser on the top of SAX, using the Python XML
Package (version 0.4).

| | Is this good practice?  
| 
| It sounds good to me, at least.

Thank you for the kind replies, in spite of midnight ;-)

-- 
KAJIYAMA, Tamito <kajiyama@etl.go.jp>


From kajiyama@etl.go.jp  Mon Nov 23 01:44:35 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Mon, 23 Nov 98 01:44:35 JST
Subject: [XML-SIG] [Q] SAX Exception
Message-ID: <9811221644.AA19959@etlibs2.etl.go.jp>

Hi.

I'm writing a parser using the SAX module, and want to raise exceptions
in the methods of saxlib.DocumentHandler (e.g. startElement) so that the
exceptions are handled by saxlib.ErrorHandler's error and fatalError
methods.  How can I achieve it?

-- 
KAJIYAMA, Tamito <kajiyama@etl.go.jp>


From larsga@ifi.uio.no  Sun Nov 22 17:09:08 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 22 Nov 1998 18:09:08 +0100
Subject: [XML-SIG] [Q] SAX Exception
In-Reply-To: <9811221644.AA19959@etlibs2.etl.go.jp>
References: <9811221644.AA19959@etlibs2.etl.go.jp>
Message-ID: <wkemqvk62z.fsf@ifi.uio.no>

* Tamito Kajiyama
| 
| I'm writing a parser using the SAX module, and want to raise
| exceptions in the methods of saxlib.DocumentHandler
| (e.g. startElement) so that the exceptions are handled by
| saxlib.ErrorHandler's error and fatalError methods.  How can I
| achieve it?

You can't achieve it that way. The SAX exception classes are for the
cases where the parser throws an exception instead of reporting the
error via a method. When the parser throws an exception it loses its
internal state and in these cases the parse is aborted.  (This also
applies when you throw an exception from inside a callback method.)

In other words, what you need to do is to call those methods directly,
which requires you to have a reference to the ErrorHandler yourself.
I'd recommend that you simply wrap the SAX driver completely so that
your clients have no direct access to it. That way you can keep track
of the ErrorHandler object.
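
A minimal sketch of that wrapping idea, using the xml.sax module that
ships with present-day Python instead of saxlib (an assumption; the
names differ from the 0.4/0.5 package).  The element name "bad" and the
helper parse_checked() are invented for the illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Record recoverable errors instead of aborting the parse."""

    def __init__(self):
        self.errors = []

    def error(self, exception):
        self.errors.append(exception.getMessage())

    def fatalError(self, exception):
        raise exception

    def warning(self, exception):
        pass


class CheckingHandler(ContentHandler):
    """Document handler that reports its own errors through an ErrorHandler."""

    def __init__(self, error_handler):
        ContentHandler.__init__(self)
        self._errors = error_handler  # the reference we keep for ourselves
        self._locator = None

    def setDocumentLocator(self, locator):
        self._locator = locator

    def startElement(self, name, attrs):
        if name == "bad":  # hypothetical application-level rule
            exc = xml.sax.SAXParseException(
                "unexpected element 'bad'", None, self._locator)
            # Call the handler directly; raising here would abort the parse.
            self._errors.error(exc)


def parse_checked(xml_bytes):
    """Wrap the driver so callers never see it, only the collected errors."""
    errors = CollectingErrorHandler()
    xml.sax.parseString(xml_bytes, CheckingHandler(errors), errors)
    return errors.errors


found = parse_checked(b"<a><bad/><ok/><bad/></a>")
print(found)
```

Because the handler calls error() itself rather than raising, the parse
carries on and both offending elements are reported.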

--Lars M.


From H.Jansen@math.tudelft.nl  Mon Nov 23 08:32:01 1998
From: H.Jansen@math.tudelft.nl (Henk Jansen)
Date: Mon, 23 Nov 1998 09:32:01 +0100 (MET)
Subject: [XML-SIG] PCCTS in python: yapps.
Message-ID: <199811230832.JAA02994@dutita4.twi.tudelft.nl>

Amit Patel has built a Python recursive-descent parser generator modeled after PCCTS
(http://theory.stanford.edu/~amitp/Yapps/). I'm currently using this tool for a
simulation modeling language, and it is a very nice and easy-to-understand tool indeed
(mainly because it's all Python: no segmentation faults, bus errors etc. -- by the
way, I found ANTLR very slow in creating the grammar).

Personally, I would like to see more PCCTS-like features added to Yapps:

- LL(k), k>1
- semantic predicates 
- ...

and maybe some critical parts written as compiled modules (which maybe
could be borrowed from PCCTS...?)

Hope this will help in finding a suitable XML/DOM parser-generator.

Henk.


---------
Alcibiades, when at the dinner table of Agathon: """When we listen to anyone else
talking, however eloquent he is, we don't really care a damn what he says ... I
have heard Pericles and all the other great orators ... but they never ... turned
my soul upside down. But this [Socrates] has often left me in such a state of mind
that I have felt that I simply could not go on living the way I did. He makes me
admit that while I am spending my time on politics I am neglecting all the things
that are crying for attention in myself."""

-- 
  -----------------------------------------------------------------------
 | Henk Jansen   http://dutita0.twi.tudelft.nl/WAGM/people/H.Jansen.html |
 | Delft University of Technology             |  hjansen@math.tudelft.nl |
 | > Information Technology and Systems (ITS) |      Mekelweg 4, 2628 CD |
 | >> Mathematics (TWI)                       |   Delft, The Netherlands |
 | >>> Applied Analysis (TA)                  | phone: +31(0)15-278-7295 |
 | >>>> Analysis of Large Scale Models (WAGM) | fax:   +31(0)15-278-7209 |
  -----------------------------------------------------------------------


From fdrake@acm.org  Mon Nov 23 14:55:57 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 23 Nov 1998 09:55:57 -0500 (EST)
Subject: [XML-SIG] patch to core.py
Message-ID: <13913.30589.414085.745874@weyr.cnri.reston.va.us>

  I've attached a patch below that includes the last patch I sent
Friday evening (since it's not in yet; apply this to the CVS version), 
and fixes the childNodes attribute of the Document object.
  The Node.get_childNodes() method creates a NodeList which has
self.get_ownerDocument() as the owner.  When used from the Document
class, the owner is None, but the owner for the children is self.
Without the fix, using nodes accessed from
<Document instance>.childNodes could easily cause
WrongDocumentException to be raised.  This may not be a problem in a
typical application, where (I expect) the Document instance is mostly
used as node factory and source for Document.documentElement, but for
some particularly weird conversion scripts I'm working on where I'm
starting with multi-rooted documents, this can be a real problem.
  Yes, I know XML only allows a single root; that's one reason a
conversion script is needed!  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From fdrake@acm.org  Mon Nov 23 15:08:42 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 23 Nov 1998 10:08:42 -0500 (EST)
Subject: [XML-SIG] Extended DOM interface proposal.
Message-ID: <13913.31354.802344.207085@weyr.cnri.reston.va.us>

  The toxml() method in the DOM implementation is convenient, but not
always what's needed.  There are two specific problems:  it creates a
string in memory with the entire document representation, and it can
only produce the XML form of the document.
  I'd also like to be able to generate an ESIS stream or SGML from the 
DOM, and I don't need the entire representation to be in memory.
  I propose the addition of three methods; these could be functions in 
a utility module just as easily.  Each method should accept a
file-like object that supports at least the write() method.  The
methods are:

	def write_esis(self, file):
	    """Write an ESIS stream on file."""

	def write_sgml(self, file, knownempties=[]):
	    """Write an SGML instance on file.  `knownempties' should
	    be a list of GIs of element types declared to be empty."""

	def write_xml(self, file):
	    """Write an XML instance on file."""

  Does anyone have any opinions as to whether these should be methods
or utility functions?  As I think about it, using functions may make
more sense, esp. since different functions may be needed for different 
SGML declarations.
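
  To make the utility-function variant concrete, here is a rough sketch
of write_xml() only.  It is written against xml.dom.minidom from
present-day Python, not against the DOM classes in this package, so the
node attributes used below are an assumption about that later API.  The
point is just that output goes to file.write() piece by piece and no
complete string is ever built.

```python
import io
from xml.dom import minidom, Node
from xml.sax.saxutils import escape, quoteattr


def write_xml(node, file):
    """Write an XML instance for `node` on `file`, one piece at a time."""
    if node.nodeType == Node.ELEMENT_NODE:
        file.write("<" + node.tagName)
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            file.write(" %s=%s" % (attr.name, quoteattr(attr.value)))
        if node.hasChildNodes():
            file.write(">")
            for child in node.childNodes:
                write_xml(child, file)
            file.write("</%s>" % node.tagName)
        else:
            file.write("/>")
    elif node.nodeType == Node.TEXT_NODE:
        file.write(escape(node.data))
    elif node.nodeType == Node.COMMENT_NODE:
        file.write("<!--%s-->" % node.data)
    else:
        # Document and other container nodes: just descend.
        for child in node.childNodes:
            write_xml(child, file)


doc = minidom.parseString('<a x="1"><!--c--><b>t &amp; u</b><e/></a>')
out = io.StringIO()  # any object with a write() method will do
write_xml(doc, out)
print(out.getvalue())
```

  An ESIS or SGML writer would be a sibling function with the same
signature, which is one argument for functions over methods.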


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From fredrik@pythonware.com  Mon Nov 23 15:30:59 1998
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 23 Nov 1998 16:30:59 +0100
Subject: [XML-SIG] Extended DOM interface proposal.
Message-ID: <000d01be16f6$46b33c70$f29b12c2@pythonware.com>

>  I propose the addition of three methods; these could be functions in 
>a utility module just as easily.  Each method should accept a
>file-like object that supports at least the write() method.  The
>methods are:
>
> def write_esis(self, file):
>     """Write an ESIS stream on file."""
>
> def write_sgml(self, file, knownempties=[]):
>     """Write an SGML instance on file.  `knownempties' should
>     be a list of GIs of element types declared to be empty."""
>
> def write_xml(self, file):
>     """Write an XML instance on file."""
>
>  Does anyone have any opinions as to whether these should be methods
>or utility functions?  As I think about it, using functions may make
>more sense, esp. since different functions may be needed for different 
>SGML declarations.

IMHO, using objects makes even more sense -- so why
not use the Visitor pattern?

1. add an accept method which takes any object
implementing the DOMVisitor class as its single
argument, and calls the appropriate methods on
that object.

2. the Visitor interface is probably identical to
the SAX API...

3. which leads us back to the DOM->SAX question...
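
a rough sketch of points 1 and 2, written against xml.dom.minidom
from present-day Python.  the DOMVisitor interface and its method
names are hypothetical, and accept() is a free function here only
because the sketch cannot add a method to minidom's own node classes.

```python
from xml.dom import minidom, Node


class DOMVisitor:
    """Hypothetical visitor interface; every callback is a no-op by default."""

    def start_element(self, name, attrs):
        pass

    def end_element(self, name):
        pass

    def characters(self, data):
        pass

    def comment(self, data):
        pass


def accept(node, visitor):
    """Walk `node` and call the matching visitor method for each node."""
    if node.nodeType == Node.ELEMENT_NODE:
        visitor.start_element(node.tagName, dict(node.attributes.items()))
        for child in node.childNodes:
            accept(child, visitor)
        visitor.end_element(node.tagName)
    elif node.nodeType == Node.TEXT_NODE:
        visitor.characters(node.data)
    elif node.nodeType == Node.COMMENT_NODE:
        visitor.comment(node.data)
    else:
        # Document and other container nodes: just descend.
        for child in node.childNodes:
            accept(child, visitor)


class EventLog(DOMVisitor):
    """Example visitor that records the events it receives."""

    def __init__(self):
        self.events = []

    def start_element(self, name, attrs):
        self.events.append(("start", name))

    def end_element(self, name):
        self.events.append(("end", name))

    def characters(self, data):
        self.events.append(("chars", data))

    def comment(self, data):
        self.events.append(("comment", data))


log = EventLog()
accept(minidom.parseString("<a><!--note--><b>hi</b></a>"), log)
print(log.events)
```

apart from comment(), the callbacks line up with SAX document
events, so an XML, SGML or ESIS writer would each be one visitor.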

Cheers /F
fredrik@pythonware.com
http://www.pythonware.com


From fdrake@acm.org  Mon Nov 23 15:33:45 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 23 Nov 1998 10:33:45 -0500 (EST)
Subject: [XML-SIG] Extended DOM interface proposal.
In-Reply-To: <000d01be16f6$46b33c70$f29b12c2@pythonware.com>
References: <000d01be16f6$46b33c70$f29b12c2@pythonware.com>
Message-ID: <13913.32857.93445.571115@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > IMHO, using objects makes even more sense -- so why
 > not use the Visitor pattern?

  This would be fine for me.

 > 1. add an accept method which takes any object
 > implementing the DOMVisitor class as its single

  The DOMVisitor *class*?  Shouldn't that be interface?  Or was it
protocol?  ;-)

 > 2. the Visitor interface is probably identical to
 > the SAX API...

  SAX is insufficient; I'd at least like to preserve comments.

 > 3. which leads us back to the DOM->SAX question...

  Yes.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From fredrik@pythonware.com  Mon Nov 23 15:48:33 1998
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 23 Nov 1998 16:48:33 +0100
Subject: [XML-SIG] Extended DOM interface proposal.
Message-ID: <002101be16f8$bb7f5fa0$f29b12c2@pythonware.com>

> > 1. add an accept method which takes any object
> > implementing the DOMVisitor class as its single
>
>  The DOMVisitor *class*?  Shouldn't that be interface?  Or was it
>protocol?  ;-)

behaviour!?

nah. I think I prefer interface. so here's the fix:

message = string.replace(
    message, "DOMVisitor class",
    "DOMVisitor interface"
    )

> > 2. the Visitor interface is probably identical to
> > the SAX API...
>
>  SAX is insufficient; I'd at least like to preserve comments.

extended SAX, anyone?

Cheers /F
fredrik@pythonware.com
http://www.pythonware.com


From akuchlin@cnri.reston.va.us  Mon Nov 23 19:24:53 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 23 Nov 1998 14:24:53 -0500 (EST)
Subject: [XML-SIG] Extended DOM interface proposal.
In-Reply-To: <13913.31354.802344.207085@weyr.cnri.reston.va.us>
References: <13913.31354.802344.207085@weyr.cnri.reston.va.us>
Message-ID: <13913.46595.766607.862556@amarok.cnri.reston.va.us>

Fred L. Drake writes:
>  I propose the addition of three methods; these could be functions in 
>a utility module just as easily.  Each method should accept a
>file-like object that supports at least the write() method.  The
 ...
>  Does anyone have any opinions as to whether these should be methods
>or utility functions?  As I think about it, using functions may make
>more sense, esp. since different functions may be needed for different 
>SGML declarations.

	IMHO functions would be preferable, though methods might be
workable if only a very small number of them were required.  A small
number of methods doesn't appear likely, though, because of the
many possible variations on output: SGML, XML, ESIS?  Pretty-printed
SGML/XML or not?  Etc...

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Your son's head is valuable to you, and I am attached to mine. Indeed,
hitherto we have been inseparable.
    -- Lady Johanna Constantine, in SANDMAN #29: "Thermidor"


From fdrake@acm.org  Mon Nov 23 20:33:35 1998
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 23 Nov 1998 15:33:35 -0500 (EST)
Subject: [XML-SIG] patch to core.py
In-Reply-To: <13913.47103.230377.539181@amarok.cnri.reston.va.us>
References: <13913.30589.414085.745874@weyr.cnri.reston.va.us>
 <13913.47103.230377.539181@amarok.cnri.reston.va.us>
Message-ID: <13913.50847.502345.58043@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > 	You didn't attach the patch...  On the other hand, I can
 > probably fix it myself from your description.

  Oops!  It's below.  Yeah, I know you could, but some people like
patches because they know there've been no typos introduced between
testing and the change information.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


Index: core.py
===================================================================
RCS file: /projects/cvsroot/xml/dom/core.py,v
retrieving revision 1.31
diff -c -c -r1.31 core.py
*** core.py	1998/11/21 02:48:12	1.31
--- core.py	1998/11/23 20:29:12
***************
*** 931,938 ****
          s = '<?xml version="1.0"?>\n'
          if self.documentType:
              s = s + self.documentType
!         if len(self._node.children):
!             n = self._node.children[0]
              n =  NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
--- 931,937 ----
          s = '<?xml version="1.0"?>\n'
          if self.documentType:
              s = s + self.documentType
!         for n in self._node.children:
              n =  NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
***************
*** 1038,1043 ****
--- 1037,1045 ----
          return self.documentType
      def get_implementation(self):
          return self.DOMImplementation
+ 
+     def get_childNodes(self):
+         return NodeList(self._node.children, self, self)
  
      def get_documentElement(self):
          """Return the root element of the Document object, or None


From larsga@ifi.uio.no  Tue Nov 24 17:52:41 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Nov 1998 18:52:41 +0100
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
Message-ID: <wk67c5ugeu.fsf@ifi.uio.no>

Hi!

I've been playing around a little with XBEL, thinking about making a
demo for a conference I'm going to in a couple of weeks. So far, what
I've done is to modify adr_parse.py to actually work with the latest
version of bookmark.py and to deal with command-line arguments and
also to modify bookmark.py to insert the XBEL public identifier.

Patch 1 (to adr_parse.py):

65c65
<             visited=parse_date(readfield(infile,"VISITED"))
---
>             parse_date(readfield(infile,"VISITED")) # Just throw this away
69c69
<             bms.add_folder(name,created,visited)
---
>             bms.add_folder(name,created)
78c78
<             bms.add_bookmark(name,created,visited,url)
---
>             bms.add_bookmark(name,created,visited,None,url)
87,88c87,106
<     bms=parse_adr(r"c:\programfiler\opera\opera3.adr")
<     bms.dump_xbel()
---
>     import sys
> 
>     if len(sys.argv)<2 or len(sys.argv)>3:
>         print
>         print "A simple utility to convert Opera bookmarks to XBEL."
>         print
>         print "Usage: "        
>         print "  adr_parse.py <adr-file> [<xbel-file>]"
>         sys.exit(1)        
>     
>     bms=parse_adr(sys.argv[1])
> 
>     if len(sys.argv)==3:
>         out=open(sys.argv[2],"w")
>         bms.dump_xbel(out)
>         out.close()
>     else:
>         bms.dump_xbel()
>         
>     # Done


Patch 2 (to bookmark.py):

42c42
<                   '<!DOCTYPE xbel SYSTEM "xbel.dtd">\n'
---
>                   '<!DOCTYPE xbel PUBLIC "+//IDN python.org//DTD XML Bookmark Exchange Language 1.0//EN//XML" "xbel.dtd">\n'

--Lars M.


From akuchlin@cnri.reston.va.us  Wed Nov 25 15:18:55 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Wed, 25 Nov 1998 10:18:55 -0500 (EST)
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
In-Reply-To: <wk67c5ugeu.fsf@ifi.uio.no>
References: <wk67c5ugeu.fsf@ifi.uio.no>
Message-ID: <13916.5182.99234.556728@amarok.cnri.reston.va.us>

Lars Marius Garshol writes:
>I've been playing around a little with XBEL, thinking about making a
>demo for a conference I'm going to in a couple of weeks. So far, what
>I've done is to modify adr_parse.py to actually work with the latest
>version of bookmark.py and to deal with command-line arguments and
>also to modify bookmark.py to insert the XBEL public identifier.

	Thanks, Lars!  Patches applied, and they also inspired me to
go and fix ns_parse.py and lynx_parse.py accordingly.  (I'm not sure
what to do for msie_parse.py; anyone want to contribute the right
Windows incantation to find the user's bookmark file?)

	The week before the conference, I came out with a 0.5
pre-release, but things intervened, and the code has continued to
change after the pre-release.  I'd really like to get a 0.5 release
out, so I'll try to make another pre-release, in preparation for a new
release by next Monday.  This release would be announced outside the
XML-SIG a bit.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Time itself flows on with constant motion, just like a river: for no more than
a river can the fleeting hour stand still. As wave is driven on by wave, and,
itself pursued, pursues the one before, so the moments of time at once flee
and follow, and are ever new.
    -- Ovid, _The Metamorphoses_


From MHammond@skippinet.com.au  Wed Nov 25 22:55:47 1998
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Thu, 26 Nov 1998 09:55:47 +1100
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
In-Reply-To: <13916.5182.99234.556728@amarok.cnri.reston.va.us>
Message-ID: <001701be18c6$be1a52e0$0801a8c0@bobcat>

> 	Thanks, Lars!  Patches applied, and they also inspired me to
> go and fix ns_parse.py and lynx_parse.py accordingly.  (I'm not sure
> what to do for msie_parse.py; anyone want to contribute the right
> Windows incantation to find the user's bookmark file?)

I could do this, but it will require the "win32api" module (for the
registry functions).  It is almost getting to the time where win32api
should be released by Guido!

If it is acceptable to have this dependency, then I will be happy to
make the change!  (It could obviously be done such that if "import
win32api" fails, we revert to the existing behaviour!)
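
The fallback could look roughly like the sketch below.  Two things in it
are assumptions rather than tested facts: that the favourites directory
is recorded under the "Shell Folders" registry key shown, and that the
hypothetical helper find_ie_favorites() is what msie_parse.py would
call.  The win32con module is used for the registry constants.

```python
import sys

try:
    import win32api
    import win32con
except ImportError:
    # Not on Windows, or the extensions are not installed.
    win32api = None

# Assumed registry location of the per-user shell folders.
SHELL_FOLDERS = r"Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders"


def find_ie_favorites(default):
    """Return the IE favourites directory, or `default` if it cannot be found."""
    if sys.platform != "win32" or win32api is None:
        return default  # revert to the existing behaviour
    try:
        key = win32api.RegOpenKeyEx(
            win32con.HKEY_CURRENT_USER, SHELL_FOLDERS, 0, win32con.KEY_READ)
        try:
            value, _value_type = win32api.RegQueryValueEx(key, "Favorites")
        finally:
            win32api.RegCloseKey(key)
        return value
    except win32api.error:
        return default


path = find_ie_favorites("favorites")
print(path)
```

With the import wrapped this way, the script keeps working unchanged on
machines without win32api.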

Mark.


From akuchlin@cnri.reston.va.us  Wed Nov 25 23:04:34 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Wed, 25 Nov 1998 18:04:34 -0500 (EST)
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
In-Reply-To: <001701be18c6$be1a52e0$0801a8c0@bobcat>
References: <13916.5182.99234.556728@amarok.cnri.reston.va.us>
 <001701be18c6$be1a52e0$0801a8c0@bobcat>
Message-ID: <13916.35975.503487.722442@amarok.cnri.reston.va.us>

Mark Hammond writes:
>> go and fix ns_parse.py and lynx_parse.py accordingly.  (I'm not sure
>> what to do for msie_parse.py; anyone want to contribute the right
>> Windows incantation to find the user's bookmark file?)
>
>I could do this, but it will require the "win32api" module (for the
>registry functions).  It is almost getting to the time where win32api
>should be released by Guido!

	That dependency shouldn't be a problem, and we can
conditionalize it after checking sys.platform.  Anyone want to do this 
for IE on the Mac?  (IE on Unix probably isn't a concern; when I run
it on Solaris, it grabs the X server and then freezes, so I have to
telnet it and kill my whole session.  Doubt it has many users...)

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
People marry most happily with their own kind. The trouble lies in the fact
that people usually marry at an age where they do not really know what their
own kind is.
    -- Robertson Davies, _A Voice from the Attic_


From akuchlin@cnri.reston.va.us  Fri Nov 27 15:55:40 1998
From: akuchlin@cnri.reston.va.us (A.M. Kuchling)
Date: Fri, 27 Nov 1998 10:55:40 -0500
Subject: [XML-SIG] DOM walker class
Message-ID: <199811271555.KAA00331@mira.erols.com>

The walk() method of the DOM Walker class is defined as follows:

    def walk(self, root):
        if root.get_nodeType() == DOCUMENT_NODE:
            c = root.get_documentElement()
            assert c.get_nodeType() == ELEMENT_NODE
            return self.walk1(c)
        else:
            return self.walk1(root)

This behaves unexpectedly if the Document node has several children,
as might happen if there are PIs preceding or following the root
element.  Only the root element will be walked, missing any other
children of the root, which becomes apparent if you're walking the
tree in order to print it.  
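
The effect is easy to reproduce.  The sketch below is written for
current Python 3 and uses xml.dom.minidom from the standard library as a
stand-in for PyDOM, so attributes such as documentElement and childNodes
take the place of the get_*() accessors, and collect() is a hypothetical
helper rather than part of Walker.  It compares starting from the root
element with starting from every child of the Document node:

```python
from xml.dom import minidom

SOURCE = ('<?xml-stylesheet href="s.css"?>'
          '<root><item/></root>'
          '<!-- trailer -->')


def collect(node):
    """Return the names of node and all its descendants, in document order."""
    names = [node.nodeName]
    for child in node.childNodes:
        names.extend(collect(child))
    return names


doc = minidom.parseString(SOURCE)

# Mirrors the current walk(): only the root element is visited.
root_only = collect(doc.documentElement)

# Visits every child of the Document node, so the PI and the comment appear.
all_children = []
for child in doc.childNodes:
    all_children.extend(collect(child))

print(root_only)      # ['root', 'item']
print(all_children)   # ['xml-stylesheet', 'root', 'item', '#comment']
```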

     How should this be fixed?  One choice is to change the
DOCUMENT_NODE case to:

        for c in root.get_childNodes():
            self.walk1(c)

However, this change really makes the distinction between walk() and
walk1() unnecessary.  walk() is basically there as a wrapper for
walk1(), to get the root element if it's a Document node; if we just
traverse all the children, this is consistent for any node type so
walk() and walk1() could be collapsed into one function.  This will
break code that subclasses Walker and overrides walk() or walk1() with
something customized.
	  
What do people think should be done?  Just fix walk(), or merge walk()
and walk1()?

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
May you go safe, my friend, across that dizzy way / No wider than a hair, by
which your people go / From earth to Paradise; may you go safe today / With
stars and space above, and time and stars below.
    -- Lord Dunsany


From Mike.Olson@FourThought.com  Sat Nov 28 22:32:07 1998
From: Mike.Olson@FourThought.com (Mike Olson)
Date: Sat, 28 Nov 1998 17:32:07 -0500
Subject: [XML-SIG] DOM walker class
References: <199811271555.KAA00331@mira.erols.com>
Message-ID: <366079E7.BEC0C587@FourThought.com>


A.M. Kuchling wrote:

> However, this change really makes the distinction between walk() and
> walk1() unnecessary.  walk() is basically there as a wrapper for
> walk1(), to get the root element if it's a Document node; if we just
> traverse all the children, this is consistent for any node type so
> walk() and walk1() could be collapsed into one function.  This will
> break code that subclasses Walker and overrides walk() or walk1() with
> something customized.
>
> What do people think should be done?  Just fix walk(), or merge walk()
> and walk1()?
>

I think they should be merged.  The current solution also does not allow you to
print comments, doc types, or anything else outside of the root element....

If you are worried about breaking customizations on this interface, you could
define walk1 and just have it call walk until everyone gets their code
modified....
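
Something along these lines, as a sketch only.  It is written for
current Python 3 with xml.dom.minidom standing in for PyDOM, and the
visit() hook and the "visited" list are made up for the example:

```python
from xml.dom import minidom


class Walker:
    """Merged walker: one walk() that treats every node type alike."""

    def __init__(self):
        self.visited = []

    def visit(self, node):
        # Hypothetical hook; a real subclass would do its work here.
        self.visited.append(node.nodeName)

    def walk(self, node):
        """Visit node, then walk each of its children in document order."""
        self.visit(node)
        for child in node.childNodes:
            self.walk(child)

    def walk1(self, node):
        """Temporary alias so existing callers of walk1() keep working."""
        return self.walk(node)


doc = minidom.parseString('<?pi data?><root><a/></root><!-- end -->')
walker = Walker()
walker.walk1(doc)
print(walker.visited)   # ['#document', 'pi', 'root', 'a', '#comment']
```

The alias only helps callers.  A subclass that overrides walk1() is no
longer reached from walk(), so those subclasses still have to be
updated.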



From H.Jansen@math.tudelft.nl  Mon Nov 30 09:49:24 1998
From: H.Jansen@math.tudelft.nl (Henk Jansen)
Date: Mon, 30 Nov 1998 10:49:24 +0100 (MET)
Subject: [XML-SIG] Re: XML-SIG digest, Vol 1 #156 - 1 msg
In-Reply-To: <199811291700.MAA18483@python.org> from "xml-sig-admin@python.org" at Nov 29, 98 12:00:43 pm
Message-ID: <199811300949.KAA23957@dutita4.twi.tudelft.nl>

> A.M. Kuchling wrote:
> 
> > However, this change really makes the distinction between walk() and
> > walk1() unnecessary.  walk() is basically there as a wrapper for
> > walk1(), to get the root element if it's a Document node; if we just
> > traverse all the children, this is consistent for any node type so
> > walk() and walk1() could be collapsed into one function.  This will
> > break code that subclasses Walker and overrides walk() or walk1() with
> > something customized.
> >
> > What do people think should be done?  Just fix walk(), or merge walk()
> > and walk1()?
> >
> 
> I think they should be merged.  The current solution also does not allow you to
> print comments, doc types, or anything else outside of the root element....
> 
> If you are worried about breaking customizations on this interface, you could
> define walk1 and just have it call walk until everyone gets their code
> modified....

Since the DOM hierarchy closely resembles a general tree type, I wonder
whether the following walk method, which is quite general, could serve
as a DOM tree walker interface.  (Note: I haven't checked the DOM code,
so I don't know how general it already is.  I'm also not sure whether it
will break code.)

def walk (_, co):
    """
    Walk the tree with co a callable object having:
         .atleaf()
         .preorder()
         .postorder()
    methods.
    """
    # check that co provides the three callback methods
    for name in ("atleaf", "preorder", "postorder"):
        assert hasattr (co, name), name
    return _._walk (co)

def _walk (_, co, depth=1):
    # recursive helper for walk(); see walk() for the co protocol
    for child in _.leaves:
        co.atleaf (child, depth)
    for child in _.branches:
        co.preorder (child, depth)
        child._walk (co, depth=depth+1)
        co.postorder (child, depth)
    return co


For instance, printing the tree goes as follows:


import StringIO

class TreeRepr:

    def __init__ (_):
        _.t = StringIO.StringIO ()

    def atleaf (_, child, depth):
        _.t.write ("--"*depth+`child`+'\n')

    def preorder (_, child, depth):
        _.t.write ("--"*depth+`child`+'\n')

    def postorder (_, child, depth):
        pass


class Node:

    # ... general tree node methods ...

    def __repr__ (_):
        # walk this node with a TreeRepr and return the collected text
        tr = TreeRepr ()
        tr = _._walk (tr)
        return tr.t.getvalue ()


Depending on the callable object, all sorts of functionality can be
exploited when walking the tree: translation, checking, finding, etc.
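
To tie this back to the DOM question, here is a sketch of how the same
callback protocol might be driven over a DOM tree.  It is written for
current Python 3, with xml.dom.minidom standing in for PyDOM, and
walk_dom() is a hypothetical name.  One deliberate difference from the
code above: leaves and branches are visited interleaved, in document
order, rather than all leaves first, since order matters when printing a
document.

```python
import io
from xml.dom import minidom


def walk_dom(node, co, depth=1):
    """Walk node's children in document order, calling back into co."""
    for child in node.childNodes:
        if child.hasChildNodes():
            # A node with children plays the role of a branch.
            co.preorder(child, depth)
            walk_dom(child, co, depth + 1)
            co.postorder(child, depth)
        else:
            # Childless nodes (text, comments, PIs, empty elements) are leaves.
            co.atleaf(child, depth)
    return co


class TreeRepr:
    """Collects one indented line per node."""

    def __init__(self):
        self.t = io.StringIO()

    def atleaf(self, child, depth):
        self.t.write("--" * depth + child.nodeName + "\n")

    # Branches are printed the same way as leaves.
    preorder = atleaf

    def postorder(self, child, depth):
        pass


doc = minidom.parseString("<?pi data?><root><a>text</a><b/></root>")
output = walk_dom(doc, TreeRepr()).t.getvalue()
print(output, end="")
```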

The _walk procedure could be extended with a "break condition":

def _walk (_, co, depth=1):
    # recursive helper for walk(); see walk() for the co protocol
    for child in _.leaves:
        if co.atleaf (child, depth):
            # stop walking the current list of leaves
            break
    for child in _.branches:
        if co.preorder (child, depth):
            # stop walking the current branch
            break
        child._walk (co, depth=depth+1)
        if co.postorder (child, depth):
            # stop walking the current branch
            break
    return co

This means that each loop stops as soon as one of the co.<methods>
returns a true value.  To abort the whole walk, the methods have to keep
returning true while the recursion unwinds.
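
A small self-contained sketch of such an early exit, written for current
Python 3.  The Node and Finder classes and the "name", "found" and
"visited" attributes are invented for the example.  Once the wanted node
has been seen, every callback keeps returning true, so each enclosing
loop breaks in turn:

```python
class Node:
    """Hypothetical tree node; leaves and branches hold child Nodes."""

    def __init__(self, name, leaves=(), branches=()):
        self.name = name
        self.leaves = list(leaves)
        self.branches = list(branches)

    def walk(self, co, depth=1):
        for child in self.leaves:
            if co.atleaf(child, depth):
                break
        for child in self.branches:
            if co.preorder(child, depth):
                break
            child.walk(co, depth + 1)
            if co.postorder(child, depth):
                break
        return co


class Finder:
    """Stops the walk once a node with the wanted name has been seen."""

    def __init__(self, wanted):
        self.wanted = wanted
        self.found = None
        self.visited = []

    def _check(self, child, depth):
        if self.found is None:
            self.visited.append(child.name)
            if child.name == self.wanted:
                self.found = child
        # Stays true after a match, so every later callback also stops.
        return self.found is not None

    atleaf = preorder = _check

    def postorder(self, child, depth):
        return self.found is not None


tree = Node("root",
            leaves=[Node("a")],
            branches=[Node("b", leaves=[Node("c"), Node("d")]),
                      Node("e", leaves=[Node("f")])])
finder = tree.walk(Finder("c"))
print(finder.visited)   # ['a', 'b', 'c']
```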

Henk.

-- 
  -----------------------------------------------------------------------
 | Henk Jansen   http://dutita0.twi.tudelft.nl/WAGM/people/H.Jansen.html |
 | Delft University of Technology             |  hjansen@math.tudelft.nl |
 | > Information Technology and Systems (ITS) |      Mekelweg 4, 2628 CD |
 | >> Mathematics (TWI)                       |   Delft, The Netherlands |
 | >>> Applied Analysis (TA)                  | phone: +31(0)15-278-7295 |
 | >>>> Analysis of Large Scale Models (WAGM) | fax:   +31(0)15-278-7209 |
  -----------------------------------------------------------------------


From Fred L. Drake, Jr." <fdrake@acm.org  Mon Nov 30 20:41:26 1998
From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake)
Date: Mon, 30 Nov 1998 15:41:26 -0500 (EST)
Subject: [XML-SIG] DOM walker class
In-Reply-To: <199811271555.KAA00331@mira.erols.com>
References: <199811271555.KAA00331@mira.erols.com>
Message-ID: <13923.758.240371.779901@weyr.cnri.reston.va.us>

A.M. Kuchling writes:
 > What do people think should be done?  Just fix walk(), or merge walk()
 > and walk1()?

  Merge the two to simply be walk().  This package doesn't have a 1.0
yet, so there's no compelling need to be particularly concerned about
backward compatibility.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191