From Fred L. Drake, Jr."

The problem with the Grail web site has now been fixed. Please
accept my apologies for a premature announcement.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.
Reston, VA 20191

From Stuart.Hungerford@cmis.CSIRO.AU Tue Nov 10 23:44:03 1998
From: Stuart.Hungerford@cmis.CSIRO.AU (Stuart Hungerford)
Date: Wed, 11 Nov 1998 10:44:03 +1100 (EST)
Subject: [XML-SIG] Looking for PyDOM examples...
Message-ID: <199811102344.KAA12919@aquarius.act.cmis.CSIRO.AU>

Folks,

My project is using Python to take data out of a SQL Server database
(via the ODBC stuff on Windows) and using PyDOM to generate XML output.

I suspect strongly that there are idioms and techniques for using PyDOM
that I'm not aware of. Does anyone have any good examples of PyDOM in
use (beyond the demos in the 0.5 package)?

Thanks,

From Jack.Jansen@cwi.nl Wed Nov 11 15:36:46 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Wed, 11 Nov 1998 16:36:46 +0100
Subject: [XML-SIG] Dom in XML 0.5 package
In-Reply-To: Message by Jeff.Johnson@stn.siemens.com ,
    Thu, 5 Nov 1998 11:42:35 -0500 , <852566B3.005BBCC4.00@BI01.boca.ssc.siemens.com>
Message-ID:

Just today I switched to the XML CVS tree, but I'm getting more and more the
idea that this may not have been such a bright move.

I'm especially having problems with the dom stuff. Various things (like
DcBuilder) have disappeared, even though the example scripts still try to use
them. That is easily fixed, but what is more of a problem is that various
modules use bits of various other modules that have disappeared. For instance,
transformer.Transformer uses DOMFactory(), but DOMFactory() is nowhere to be
found anymore...

Also, with the way stuff is organized in the XML CVS tree it has become (to
me, at least) unclear who is responsible for what, otherwise I could have
mailed this message straight to the author.

Is there a newer version of the dom stuff available? If so, where can I get
it?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From Jack.Jansen@cwi.nl Wed Nov 11 15:59:10 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Wed, 11 Nov 1998 16:59:10 +0100
Subject: [XML-SIG] Dom in XML 0.5 package
In-Reply-To: Message by Jack Jansen , Wed, 11 Nov 1998 16:36:46 +0100 ,
Message-ID:

> I'm especially having problems with the dom stuff. Various things (like
> DcBuilder) have disappeared, even though the example scripts still try to use
> them. That is easily fixed, but what is more of a problem is that various
> modules use bits of various other modules that have disappeared. For instance,
> transformer.Transformer uses DOMFactory(), but DOMFactory() is nowhere to be
> found anymore...

Following up on my own message: the whole Transformer class appears to be
broken. It clearly seems to expect the old version of xml.dom.core...

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From Jeff.Johnson@stn.siemens.com Thu Nov 12 16:42:17 1998
From: Jeff.Johnson@stn.siemens.com (Jeff.Johnson@stn.siemens.com)
Date: Thu, 12 Nov 1998 11:42:17 -0500
Subject: [XML-SIG] DOM ProcessingInstruction problem
Message-ID: <852566BA.005B93A9.00@BI01.boca.ssc.siemens.com>

I came across an exception (see below) when calling core.Document.toxml(). I
made a temporary fix to core.py which is at the bottom of this email. I don't
know if the line I added is the way to fix the problem or if all references to
"_node.target" should be changed to "_node.data".

And please don't forget about adding support for comments (or tell me why you
won't).

Traceback (innermost last):
  File "cml.py", line 64, in ?
    test('cmlexc01.xml','cml.htm')
  File "cml.py", line 59, in test
    c.readSgml(fileNameIn)
  File "cml.py", line 23, in readSgml
    print "HI JEFF",self.sgml.toxml()
  File "C:\Python\xml\dom\core.py", line 931, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 670, in toxml
    s = s + n.toxml()
  File "C:\Python\xml\dom\core.py", line 901, in toxml
    return ""
AttributeError: target

    def createProcessingInstruction(self, target, data):
        "Return a new ProcessingInstruction object."
        d = _nodeData(PROCESSING_INSTRUCTION_NODE)
        d.name = target
        d.value = data
        d.target = data    # HAD TO ADD THIS LINE
        return ProcessingInstruction(d, None, self)

Hoping-the-conference-ends-soon-so-the-cvs-tree-gets-updated-ly yours,
Jeff

From akuchlin@cnri.reston.va.us Sun Nov 15 21:44:48 1998
From: akuchlin@cnri.reston.va.us (A.M. Kuchling)
Date: Sun, 15 Nov 1998 16:44:48 -0500
Subject: [XML-SIG] IPC7 results
Message-ID: <199811152144.QAA00557@mira.erols.com>

The XML-SIG session at IPC7 produced good results, giving us a future
direction to move in. The issues I wanted to discuss were:

    1) Anything need to be dropped from the package before 1.0?
    2) Anything need to be added to the package before we can call it 1.0?
    3) What to do about Unicode?
    4) What do we do after 1.0?

The near-term actions will be:

* Lobby for adding sgmlop.c to 1.5.2, because it's generally useful
  and will remove some redundancy from the XML package.

* Two critical issues for version 1.0 of the XML package are
  namespaces and Unicode. For Unicode support, we're going to include
  a Unicode type in the XML package, probably Martin von Löwis's
  wstring module.
  Namespace support will probably be added as an extension to SAX and
  the DOM interface; we'll have to discuss what this should look like.

* More demos should be added that aren't small toy applications.

* For interoperability, people want to be able to marshal Python data
  structures into XML. My misgivings about which DTD to support were
  answered by the response: "Support them all."

* The XML package will be divided into a base and extension package.
  What we currently have will go into the base package; it'll be the
  minimal requirement for XML processing. The extension package will
  include higher-level stuff, such as code for any DTD we deem
  significant, XSL, XLink, and other things.

The discussion about post-1.0 didn't produce any definite results, so
we'll worry about that when the time comes.

--
A.M. Kuchling           http://starship.skyport.net/crew/amk/
A pig can learn more tricks than a dog, but has too much sense to want to do
it.
    -- Robertson Davies, _The Table Talk of Samuel Marchbanks_

From larsga@ifi.uio.no Sun Nov 15 22:03:54 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 15 Nov 1998 23:03:54 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: <199811152144.QAA00557@mira.erols.com>
References: <199811152144.QAA00557@mira.erols.com>
Message-ID:

* A. M. Kuchling
|
| * Two critical issues for version 1.0 of the XML package are
| namespaces and Unicode.

I'm not so sure that we need to worry about namespaces. From what I
hear enthusiasm about them in the W3C is waning, nor does there seem
to be all that much enthusiasm among implementors.

| Namespace support will probably be added as an extension to SAX and
| the DOM interface; we'll have to discuss what this should look like.

The trouble is that it will be very hard (if at all possible) to do
this without doing damage to backwards compatibility.

In other words, we should wait and see what happens with SAX and DOM
and then follow up on it. I think we can go ahead and do 1.0 without
namespaces.
Other than that everything looked good to me. I'll take a look at the
wstring module you mentioned.

--Lars M. (who wishes he could have been there...)

From Jack.Jansen@cwi.nl Mon Nov 16 13:18:26 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 16 Nov 1998 14:18:26 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: Message by Lars Marius Garshol , 15 Nov 1998 23:03:54 +0100 ,
Message-ID:

> I'm not so sure that we need to worry about namespaces. From what I
> hear enthusiasm about them in the W3C is waning, nor does there seem
> to be all that much enthusiasm among implementors.

Oh? I know that _I_ am pretty enthusiastic about them, and envision using them
for various things...

> The trouble is that it will be very hard (if at all possible) to do
> this without doing damage to backwards compatibility.

This, I think, may not be so difficult if we specify a couple of things in
advance. For instance (and this is just an example) I can envision that we
specify that in DOM you should always check nodes for being of a type you
understand before processing them. Then we could add namespaces to a later
release of DOM by adding an API to tell which namespaces your app understands
and hiding elements and attributes of other namespaces as different nodetypes.

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From larsga@ifi.uio.no Mon Nov 16 13:37:16 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: Mon, 16 Nov 1998 14:37:16 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To:
References:
Message-ID: <3.0.1.32.19981116143716.00de6400@ifi.uio.no>

* Lars Marius Garshol
>
> I'm not so sure that we need to worry about namespaces. From what I
> hear enthusiasm about them in the W3C is waning, nor does there seem
> to be all that much enthusiasm among implementors.

* Jack Jansen
>
> Oh? I know that _I_ am pretty enthusiastic about them, and envision using
> them for various things...

What kinds of things? And why are you enthusiastic about them?

I envision a lot of pain in implementing them (if we are to do it properly)
so I'd like to know why I have to suffer if I have to. :)

* Lars Marius Garshol
>
> The trouble is that it will be very hard (if at all possible) to do
> this without doing damage to backwards compatibility.

* Jack Jansen
>
> This, I think, may not be so difficult if we specify a couple of things in
> advance. For instance (and this is just an example) I can envision that we
> specify that in DOM you should always check nodes for being of a type you
> understand before processing them. Then we could add namespaces to a later
> release of DOM by adding an API to tell which namespaces your app understands
> and hiding elements and attributes of other namespaces as different
> nodetypes.

This sounds like a viable alternative, even if it is just a limited form of
support. However, you can do exactly the same (and much more) with
architectural forms, which we already have support for via Geir Ove's xmlarch
module. Why do you want to use namespaces instead?

Also, perhaps we should add to the DOM implementations some standard way of
inserting a SAX ParserFilter (something we should perhaps also work on)
between the parser and the DOM. This would enable us to automate things
like removing whitespace, joining blocks of PCDATA that were separated by
buffer boundaries in the parser, doing architectural processing, (for those
who want it) doing namespace filtering, filtering out XLinks for special
processing etc etc

--Lars M.

From gstein@lyra.org Mon Nov 16 13:46:22 1998
From: gstein@lyra.org (Greg Stein)
Date: Mon, 16 Nov 1998 05:46:22 -0800
Subject: [XML-SIG] IPC7 results
References:
Message-ID: <36502CAE.31711751@lyra.org>

Jack Jansen wrote:
>
> > I'm not so sure that we need to worry about namespaces. From what I
> > hear enthusiasm about them in the W3C is waning, nor does there seem
> > to be all that much enthusiasm among implementors.
>
> Oh? I know that _I_ am pretty enthusiastic about them, and envision using them
> for various things...

I very much agree. At IPC7, we noted that the WebDAV protocol *requires*
namespaces, and that SMIL also requires namespaces. Since there are
several Python projects that are based on these protocols, then it is
quite a necessity to have namespace support.

Further, I haven't seen anything about the W3C interest waning. Please
corroborate that with a reference. When the WebDAV protocol was being
processed for final call in the IETF, they made the WG update to the
latest XML Namespaces proposal (WebDAV was still using the PI notation).
I don't think they'd be so hard-core about the change if they felt
namespaces were "on the out."

> > The trouble is that it will be very hard (if at all possible) to do
> > this without doing damage to backwards compatibility.
>
> This, I think, may not be so difficult if we specify a couple of things in
> advance. For instance (and this is just an example) I can envision that we
> specify that in DOM you should always check nodes for being of a type you
> understand before processing them. Then we could add namespaces to a later
> release of DOM by adding an API to tell which namespaces your app understands
> and hiding elements and attributes of other namespaces as different nodetypes.

Well, just a quick note: nobody suggested changing the SAX interface (if
people seem to have received that impression from Andrew's email). It is
very easy to have a teeny layer over SAX to process element and
attribute names into name/namespace pairs. I have done this quite
successfully within the callback from the Expat parser (see
dav_xmlparse.c in my mod_dav distribution).

Regarding the DOM: it should be possible to just attach a namespace URI
attribute to each node and attribute object.
Since just having the information available doesn't immediately imply
the client will check it, the possibility of hiding nodes/attrs is quite
interesting...

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

From gstein@lyra.org Mon Nov 16 13:49:05 1998
From: gstein@lyra.org (Greg Stein)
Date: Mon, 16 Nov 1998 05:49:05 -0800
Subject: [XML-SIG] IPC7 results
References: <3.0.1.32.19981116143716.00de6400@ifi.uio.no>
Message-ID: <36502D51.3C5E713F@lyra.org>

Lars Marius Garshol wrote:
>
> * Lars Marius Garshol
> >
> > I'm not so sure that we need to worry about namespaces. From what I
> > hear enthusiasm about them in the W3C is waning, nor does there seem
> > to be all that much enthusiasm among implementors.
>
> * Jack Jansen
> >
> > Oh? I know that _I_ am pretty enthusiastic about them, and envision using
> > them for various things...
>
> What kinds of things? And why are you enthusiastic about them?
>
> I envision a lot of pain in implementing them (if we are to do it properly)
> so I'd like to know why I have to suffer if I have to. :)

Per my other email, SMIL is a protocol defined to use XML namespaces.
CWI has been working on applications, for a long while now, that use
SMIL (note that Jack works at CWI, and I'd guess *on* that project).

WebDAV is no small potatoes either :-)

-g

--
Greg Stein, http://www.lyra.org/

From akuchlin@cnri.reston.va.us Mon Nov 16 14:31:18 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 16 Nov 1998 09:31:18 -0500 (EST)
Subject: [XML-SIG] IPC7 results
In-Reply-To:
References: <199811152144.QAA00557@mira.erols.com>
Message-ID: <13904.13841.271323.414543@amarok.cnri.reston.va.us>

Lars Marius Garshol writes:
>I'm not so sure that we need to worry about namespaces. From what I
>hear enthusiasm about them in the W3C is waning, nor does there seem
>to be all that much enthusiasm among implementors.

I'm surprised to hear that, since various standards are using
namespaces.
Sjoerd and Greg have already pointed out SMIL and DAV; I'll add RDF.

>The trouble is that it will be very hard (if at all possible) to do
>this without doing damage to backwards compatibility.

Since it's already possible to do namespace handling "by hand" -- look
for attributes like xmlns:??? in your elements, and keep track of them
-- I was thinking of simply providing a new NamespaceAwareSAXHandler
class that came with the namespace handling built-in.

>Other than that everything looked good to me. I'll take a look at the
>wstring module you mentioned.

I added it to the CVS tree on Sunday evening. The module is simply
built and installed when you compile the package, but nothing else has
been modified to make use of it.

--
A.M. Kuchling           http://starship.skyport.net/crew/amk/
"Not even Kit Marlowe will be able to gainsay that."
"You have not heard? Marlowe is dead, Will. He died in Deptford, three
weeks back, of a knife wound to the head."
    -- Shakespeare and Dream, in SANDMAN #19: "A Midsummer Night's Dream"

From Jack.Jansen@cwi.nl Mon Nov 16 15:13:27 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 16 Nov 1998 16:13:27 +0100
Subject: [XML-SIG] IPC7 results
In-Reply-To: Message by "Andrew M. Kuchling" ,
    Mon, 16 Nov 1998 09:31:18 -0500 (EST) , <13904.13841.271323.414543@amarok.cnri.reston.va.us>
Message-ID:

SMIL is indeed one of the reasons I want namespaces. SMIL doesn't require
namespaces (as someone suggested), but we definitely want them to be able to
incorporate our cmif-specific features in a SMIL document.

And to answer Lars' question "why I don't use architectural forms": because
I'm not familiar enough with them, I guess. Namespaces seem like a nice
lightweight mechanism to allow easy reuse of standards.

What I would like to do (i.e. what I would like us, as python-xml sig to
do:-), before we go off and implement namespaces in the various python
modules is to determine how people would want to use namespaces and how this
would be facilitated in the API. (Or, perhaps better, to find out how other
groups such as the DOM people envision doing this).

I can think of two ways in which I might want to treat unknown namespaces,
and each would require a slightly different API in DOM (SAX probably isn't as
much of a problem):

- Pretend that stuff in unrecognized namespaces isn't there at all,
- Treat stuff in unrecognized namespaces as opaque (i.e. leave it in the
  tree, but during transforms and such treat it as you would PCDATA)

For known namespaces there are again various issues. I might want to treat
one of the namespaces as "primary", where the tag/element names would be
simple strings (backward compatible) and names from other namespaces are
returned as "ns:elemname" or ("ns", "elemname"). But, for other applications
I might want the namespaces to be treated pretty much separately.

And, of course, there are probably quite a few applications that are happy
enough if we just treat ":" as part of the identifier... (half a :-)

--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From jtauber@jtauber.com Mon Nov 16 15:27:12 1998
From: jtauber@jtauber.com (James Tauber)
Date: Mon, 16 Nov 1998 23:27:12 +0800
Subject: [XML-SIG] IPC7 results
Message-ID: <005201be1175$9709cbc0$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: Jack Jansen

>And to answer Lars' question "why I don't use architectural forms": because
>I'm not familiar enough with them, I guess. Namespaces seem like a nice
>lightweight mechanism to allow easy reuse of standards.
The whole point of namespaces is to enable me to distinguish my FOO from your
(or SMIL's) FOO. That's all.

If you want to do any more (like saying my FOO is the same as your BAR), then
architectural forms are great. If you don't need any more, they are overkill.

People who have a problem with namespaces seem to expect them to do more than
they are actually intended for.

James

From wes@rishel.com Mon Nov 16 16:45:08 1998
From: wes@rishel.com (Wes Rishel)
Date: Mon, 16 Nov 1998 08:45:08 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <36502D51.3C5E713F@lyra.org>
Message-ID: <000601be1180$78b58280$f98f2499@Wes>

I am part of a team that is working on representing the Health Level-7
protocol in XML. This protocol is used by 90% of the hospitals in the US and
in several countries in Europe and the Pacific Rim.

The essence of the protocol is messages (clumps of data) that are transmitted
among various systems in response to a trigger event, such as "the physician
ordered a chest x-ray for the patient".

We derive the clumps of data from an O-O model. The methodology, which
predates our interest in XML, has always assumed a naming scope similar to
one used in most programming languages, where this is not a problem.

    Patient data
        Person data
            name
            religion
            date of birth
            id number

    Physician data
        Person data
            name
            id number
            pager number

This has presented a problem because using XML we can have only a single
content model for Person data. Name spaces would have presented a clear and
elegant solution.

Surely we are not alone in this matter?

Thanks,
W

From jtauber@jtauber.com Mon Nov 16 17:29:22 1998
From: jtauber@jtauber.com (James Tauber)
Date: Tue, 17 Nov 1998 01:29:22 +0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
Message-ID: <00c201be1186$a87851e0$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: Wes Rishel
[...]
>    Patient data
>        Person data
>            name
>            religion
>            date of birth
>            id number
>
>    Physician data
>        Person data
>            name
>            id number
>            pager number
>
>This has presented a problem because using XML we can have only a single
>content model for Person data. Name spaces would have presented a clear and
>elegant solution.
>
>Surely we are not alone in this matter?

Namespaces are not the solution for context sensitive content models.

Why not just have two separate element types?

What do namespaces give you that this doesn't?

If you want to associate the two types of person-data (so you can have an
application that does things with both) just have FIXED attributes (a la
architectural forms):

James

From fleck@informatik.uni-bonn.de Mon Nov 16 18:26:17 1998
From: fleck@informatik.uni-bonn.de (Markus Fleck)
Date: Mon, 16 Nov 1998 19:26:17 +0100
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
References: <00c201be1186$a87851e0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <36506E49.7F5D@informatik.uni-bonn.de>

James Tauber wrote:
>
>
>
> What do namespaces give you that this doesn't?

Let me quote Tim Berners-Lee again (from the WWW7 Conference):

    "You need to build a system that is futureproof; it's no good just
    making a modular system," he said. "You need to realize that your
    system is just going to be a module in some bigger system to come,
    and so you have to be part of something else, and it's a bit of a
    way of life."

In other words, namespaces allow you to use globally unique identifiers
without needing to revert to non-descriptive and ugly numerical UUIDs.

So with namespaces, it would be possible to exchange or convert data from
other hospitals that use differently defined "*-person-data" structures.

Yours,
Markus.
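
[Archive note, not part of the original message.] The point about telling
one hospital's person-data from another's can be sketched with the "by
hand" approach mentioned earlier in this thread: watch for xmlns attributes
and resolve each prefixed name to a (namespace URI, local name) pair. This
is only a rough sketch that assumes present-day Python 3 and its standard
xml.sax module, not the 1998 XML package under discussion. The class name
ByHandNamespaceHandler, the element names, and the hospital URIs are all
made up for illustration.

```python
import xml.sax


class ByHandNamespaceHandler(xml.sax.ContentHandler):
    """Resolve element names to (namespace URI, local name) pairs by hand."""

    def __init__(self):
        super().__init__()
        self.scopes = [{}]   # stack of prefix -> URI maps, one per open element
        self.seen = []       # resolved (uri, local) pairs in document order

    def startElement(self, name, attrs):
        # Start from the declarations inherited from the enclosing element.
        scope = dict(self.scopes[-1])
        for attr_name in attrs.getNames():
            value = attrs.getValue(attr_name)
            if attr_name == "xmlns":
                scope[""] = value                        # default namespace
            elif attr_name.startswith("xmlns:"):
                scope[attr_name[len("xmlns:"):]] = value
        self.scopes.append(scope)
        # Without a colon, rpartition yields an empty prefix, which maps to
        # the default namespace (or None when none has been declared).
        prefix, _, local = name.rpartition(":")
        self.seen.append((scope.get(prefix), local))

    def endElement(self, name):
        self.scopes.pop()


# Hypothetical document: two hospitals, each with its own person-data.
DOC = b"""<records xmlns:a="http://hospital-a.example/ns"
                   xmlns:b="http://hospital-b.example/ns">
  <a:person-data/>
  <b:person-data/>
</records>"""

handler = ByHandNamespaceHandler()
xml.sax.parseString(DOC, handler)
for uri, local in handler.seen:
    print(uri, local)
```

The two person-data elements share a local name but resolve to different
pairs, which is all that "distinguishing my FOO from your FOO" asks for.
Attribute names are left unresolved here to keep the sketch short.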
--
////////////////////////////////////////////////////////////////////////////
Markus B Fleck - University of Bonn - CS Department IV - WHOIS MF5079
UNIX Administrator - comp.lang.python.announce Moderator
"GNU Gather" Free Internet Groupware Project - http://cscw.net/gather/
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

From akuchlin@cnri.reston.va.us Mon Nov 16 21:22:24 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 16 Nov 1998 16:22:24 -0500 (EST)
Subject: [XML-SIG] Dom in XML 0.5 package
In-Reply-To:
References: <852566B3.005BBCC4.00@BI01.boca.ssc.siemens.com>
Message-ID: <13904.36750.540685.658864@amarok.cnri.reston.va.us>

Jack Jansen writes:
>Just today I switched to the XML CVS tree, but I'm getting more and more the
>idea that this may not have been such a bright move.
>
>I'm especially having problems with the dom stuff. Various things (like
>DcBuilder) have disappeared, even though the example scripts still try to use
>them. That is easily fixed, but what is more of a problem is that various
>modules use bits of various other modules that have disappeared. For instance,
>transformer.Transformer uses DOMFactory(), but DOMFactory() is nowhere to be
>found anymore...

That doesn't surprise me; I've been almost exclusively working on
core.py and ignoring the other things like walker.py and
transformer.py and the demo scripts, only fixing them when people
reported problems. Last night I checked in some changes that may have
fixed some of the problems, but they haven't really been tested.

>Also, with the way stuff is organized in the XML CVS tree it has become (to
>me, at least) unclear who is responsible for what, otherwise I could have
>mailed this message straight to the author.

It's me; Stefane isn't responsible for it any more because I've
mercilessly hacked up his code, making it practically a complete
rewrite.

--
A.M. Kuchling           http://starship.skyport.net/crew/amk/
Today I live in the gray, muffled, smelless, puffy, tasteless half-world of
those who have colds.
    -- Robertson Davies, _The Diary of Samuel Marchbanks_

From jtauber@jtauber.com Tue Nov 17 01:42:55 1998
From: jtauber@jtauber.com (James Tauber)
Date: Tue, 17 Nov 1998 09:42:55 +0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
Message-ID: <001a01be11cb$99990c10$4a850786@ecn08.curtin.edu.au>

>So with namespaces, it would be possible to exchange
>or convert data from other hospitals that use
>differently defined "*-person-data" structures.

Yes, but this is not what the original poster said he was using namespaces
for. His example had a context sensitive content model within the one DTD.

James

From wes@rishel.com Tue Nov 17 03:24:34 1998
From: wes@rishel.com (Wes Rishel)
Date: Mon, 16 Nov 1998 19:24:34 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <001a01be11cb$99990c10$4a850786@ecn08.curtin.edu.au>
Message-ID: <002201be11d9$cc342f20$643cfad0@Wes>

> -----Original Message-----
> >So with namespaces, it would be possible to exchange
> >or convert data from other hospitals that use
> >differently defined "*-person-data" structures.
>
>
> Yes, but this is not what the original poster said he was using namespaces
> for. His example had a context sensitive content model within the one DTD.
>

Actually it is a large set of DTDs that will continue to evolve over the
years. There are about 110 classes in the information model; the various
permutations of them that would constitute the information structure for a
single DTD runs to a much higher number.
W

From wunder@infoseek.com Tue Nov 17 17:01:05 1998
From: wunder@infoseek.com (Walter Underwood)
Date: Tue, 17 Nov 1998 09:01:05 -0800
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <00c201be1186$a87851e0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <3.0.5.32.19981117090105.00af8c40@corp>

>From: Wes Rishel
>[...]
>
>>    Patient data
>>        Person data
>>            name
>>            religion
>>            date of birth
>>            id number
>>
>>    Physician data
>>        Person data
>>            name
>>            id number
>>            pager number
>>
>>This has presented a problem because using XML we can have only a single
>>content model for Person data. Name spaces would have presented a clear and
>>elegant solution.
>>
>>Surely we are not alone in this matter?

You are not. This may be similar to the model in SNMP MIBs.
Those are somewhat different from the usual object model.
Basically, if a slot is used, it should mean the same thing,
but you don't have to use all the slots. Sort of a cross between
a data dictionary and an object model. And really hard to
represent in existing object models!

A different comment -- it sounds like you are trying to get
the DTD to enforce the model, rather than just making something
that can be parsed. There are lots and lots of constraints that
cannot be expressed in a DTD (age is positive, these references
form a tree), so enforcing the exact sub-elements of each element
is just one more thing that can't be enforced in the DTD.
Even if it was possible to specify what was legal, you couldn't
specify that all elements must be there.

Since you've got to do post-parsing checking anyway, trying to
express too much stuff in the DTD is probably wasted effort.

wunder

Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://www.best.com/~wunder/
1-408-543-6946

From larsga@ifi.uio.no Tue Nov 17 17:52:24 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 17 Nov 1998 18:52:24 +0100
Subject: [XML-SIG] Support for Validating Parsers
In-Reply-To: <3.0.6.32.19981106092115.00927690@gpo.iol.ie>
References: <3.0.6.32.19981106092115.00927690@gpo.iol.ie>
Message-ID:

* Sean Mc Grath
|
| 1) Run NSGMLS with os.system or os.open() and pick up the ESIS.
| This can serve as input to PyDOM. (Has anyone done a SAX driver
| for ESIS yet? If no, then I will offer to write one. ((
| Dublin->Chicago->Houston should be plenty of flight time
| for this!)).

I have one that can read ESIS from files, SP and SP -wxml, but it
can't handle error messages properly on Win32. I think the problem is
caused by SP doing something strange to emulate stderr on Win32 where
this doesn't exist at all. Once I can handle the error messages I will
include this in the saxlib driver package.

So far I've thought of these possible avenues on Win32:

 - redirect error msgs to a temporary file with the -f option, ignore
   the file and delete it afterwards

 - redirect error msgs to a temporary file with the -f option and
   check it between events for new errors

 - some secret ritual involving dead bats, black candles and a Bill
   Gates doll

If anyone has any better ideas or has any details about the SP C++
source I'd be glad to know about them. Also, if people are impatient I
can release it as it is.

--Lars M.

From larsga@ifi.uio.no Tue Nov 17 18:03:15 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 17 Nov 1998 19:03:15 +0100
Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results)
In-Reply-To: <002201be11d9$cc342f20$643cfad0@Wes>
References: <002201be11d9$cc342f20$643cfad0@Wes>
Message-ID:

* Wes Rishel
|
| Actually it is a large set of DTDs that will continue to evolve over
| the years.
There are about 110 classes in the information model; the | various permutations of them that would constitute the informatin | structure for a single DTD runs to a much higher. In this case I would recommend that you take a close look at architectural forms, which lend something that is somewhat reminiscent of OO inheritance to DTDs. Architectural forms also do many other things that may be useful to you. I would recommend anyone planning to do serious work with XML to take a look at architectural forms. (Just as I would recommend anyone doing serious programming to look at Python instead of stopping at Perl.) --Lars M. From larsga@ifi.uio.no Tue Nov 17 18:15:36 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 17 Nov 1998 19:15:36 +0100 Subject: [XML-SIG] IPC7 results In-Reply-To: <36502CAE.31711751@lyra.org> References: <36502CAE.31711751@lyra.org> Message-ID: * Greg Stein | | I very much agree. At IPC7, we noted that the WebDAV protocol | *requires* namespaces, [...] Since there are several Python projects | that are based on these protocols, then it is quite a necessity to | have namespace support. If it is a necessity then I guess we'll just have to go ahead and do it. I think the best way would be to make a SAX ParserFilter that does the namespace processing. When I say ParserFilter I'm thinking of something like the ParserFilters John Cowan made for Java-SAX. It would simply be a SAX DocumentHandler that rode on the back of other SAX parsers pretending to be a SAX parser to its clients. I already have code for doing some of these things in xmlproc (it's not used, but it's there). I can move it out into a filter and add a sketch of what's missing as well as making a sketch of the filters. | Regarding the DOM: it should be possible to just attach a namespace | URI attribute to each node and attribute object. 
Since just having | the information available doesn't immediately imply the client will | check it, the possibility of hiding nodes/attrs is quite | interesting... One way to do this might be to have a DOM extension module that used the factories to sneak in objects with the extra namespace attribute and extra methods for handling the objects. This module could also extend the builders to work with the filter. FYI: One can do an equivalent of this already with xmlarch. Just use xmlarch as a set of filters (one for each of your architectures) and you can build DOM trees from the filtered events for each architecture. This requires no programming beyond setting up the filters, just some PIs and #FIXED attributes in your document and DTD. --Lars M. From larsga@ifi.uio.no Tue Nov 17 23:30:51 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 18 Nov 1998 00:30:51 +0100 Subject: [XML-SIG] ParserFilter proposal Message-ID: OK, I've now hacked together a proposal for a general SAX ParserFilter API, with implementations of two filters: 'keep character data together' and namespaces. (The latter is just a rough sketch riddled with 'FIXME' comments.) The whole thing is just a proposal, and consists of readable source with two simple demos with sample documents. You can download it as a 5k zip file from: Comments, anyone? Is this the way to do the SAX side of this? And, Geir Ove, what do you think? Could xmlarch be fitted into this as a ParserFilter? (Didn't have time to look at it.) --Lars M. From grove@infotek.no Wed Nov 18 08:27:56 1998 From: grove@infotek.no (Geir Ove Gronmo) Date: Wed, 18 Nov 1998 09:27:56 +0100 Subject: [XML-SIG] ParserFilter proposal Message-ID: <199811180827.JAA24079@mail.infotek.no> At 00:30 18.11.98 +0100, you wrote: >OK, I've now hacked together a proposal for a general SAX ParserFilter >API, with implementations of two filters: 'keep character data >together' and namespaces. (The latter is just a rough sketch riddled >with 'FIXME' comments.)
The whole thing is just a proposal, and >consists of readable source with two simple demos with sample >documents. >Comments, anyone? Is this the way to do the SAX side of this? Is this how it's done in Java by John Cowan? I've not been able to check that out yet, but I will soon. >And, Geir Ove, what do you think? Could xmlarch be fitted into this as >a ParserFilter? (Didn't have time to look at it.) xmlarch could definitely be fitted into a ParserFilter. I don't see any problems with this at all. Since xmlarch is written as a DocumentHandler, only minor modifications would probably have to be done. I originally wrote xmlarch as a wrapper around a Parser object, but soon realized that that was overkill. Only XML events from a DocumentHandler are needed to write an architectural forms processor. The next release of xmlarch is probably going to be independent of SAX. I've been thinking of removing the direct connection to SAX, and instead making it a more general module. Wrappers/plugins could then be written for SAX (both DocumentHandler and ParserFilter), DOM and other kinds of input. Geir O. From larsga@ifi.uio.no Wed Nov 18 08:45:08 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 18 Nov 1998 09:45:08 +0100 Subject: [XML-SIG] ParserFilter proposal In-Reply-To: <199811180827.JAA24079@mail.infotek.no> References: <199811180827.JAA24079@mail.infotek.no> Message-ID: * Geir Ove Gronmo | | Is this how it's done in Java by John Cowan? No, it's not. I've separated the filter and the DocumentHandler, while he has the filter implement DocumentHandler, AttributeList and the other handlers. He also lacks the factory stuff I did, but has at least settled on a policy with the other handlers. See: Should we align ourselves with his proposal? It's not turned into a standard, whether de facto (OK, Simon St.Laurent uses it) or de jure. | >And, Geir Ove, what do you think? Could xmlarch be fitted into this as | >a ParserFilter? (Didn't have time to look at it.)
| | xmlarch could definitely be fitted into a ParserFilter. | | I don't see any problems with this at all. Since xmlarch is written as a | DocumentHandler, only minor modifications would probably have to be done. | | I originally wrote xmlarch as a wrapper around a Parser object, but soon | realized that that was overkill. Only XML events from a DocumentHandler are | needed to write an architectural forms processor. | | The next release of xmlarch is probably going to be independent of SAX. | I've been thinking of removing the direct connection to SAX, and instead | making it a more general module. Wrappers/plugins could then be written for | SAX (both DocumentHandler and ParserFilter), DOM and other kinds of input. | | Geir O. | | | _______________________________________________ | XML-SIG maillist - XML-SIG@python.org | http://www.python.org/mailman/listinfo/xml-sig From larsga@ifi.uio.no Wed Nov 18 08:51:34 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 18 Nov 1998 09:51:34 +0100 Subject: [XML-SIG] ParserFilter proposal In-Reply-To: <199811180827.JAA24079@mail.infotek.no> References: <199811180827.JAA24079@mail.infotek.no> Message-ID: (Pardon the last email. It turns out that not only C-c C-c, but also C-c C-s sends emails in Gnus. That is, it did. I turned it off now.) * Geir Ove Gronmo | | Is this how it's done in Java by John Cowan? No, it's not. I've separated the filter and the DocumentHandler, while he has the filter implement DocumentHandler, Locator, AttributeList and the other handlers. I'm not sure I really like that approach. He also lacks the factory stuff I did, but has at least settled on a policy with the other handlers. See: Should we align ourselves with his proposal? It's not turned into a standard, whether de facto (OK, Simon St.Laurent uses it) or de jure. * Lars Marius Garshol | | And, Geir Ove, what do you think? Could xmlarch be fitted into this as | a ParserFilter?
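To make the ParserFilter idea in this thread concrete, here is a minimal sketch of a 'keep character data together' filter. It is not the proposed API from the zip file, which is not reproduced in this archive. It assumes the `XMLFilterBase` class from the `xml.sax.saxutils` module of later Python standard libraries, and the names `CoalesceText`, `Recorder` and `filtered_events` are hypothetical, chosen only for this illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class CoalesceText(XMLFilterBase):
    """Sits between a parser and its client and merges adjacent text events."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._pending = []

    def _flush(self):
        # Forward all buffered text downstream as a single characters() event.
        if self._pending:
            super().characters("".join(self._pending))
            self._pending = []

    def characters(self, content):
        # Hold text back until the next element boundary.
        self._pending.append(content)

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)

    # A fuller filter would also flush before processing instructions
    # and other events; that is left out of this sketch.


class Recorder(ContentHandler):
    """Hypothetical client handler that records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


def filtered_events(data):
    """Parse the bytes in data through the filter and return the recorded events."""
    filt = CoalesceText(xml.sax.make_parser())  # the filter wraps a real parser
    recorder = Recorder()
    filt.setContentHandler(recorder)  # clients treat the filter as a parser
    filt.parse(io.BytesIO(data))
    return recorder.events
```

The client only ever talks to the filter, which is the "pretending to be a SAX parser" arrangement described earlier in the thread; whether the filter and the handler should be one object or two is exactly the design question under discussion here.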
* Geir Ove Gronmo | | xmlarch could definitely be fitted into a ParserFilter. That's a good sign, at least. :) | The next release of xmlarch is probably going to be independent of | SAX. I've been thinking of removing the direct connection to SAX, | and instead making it a more general module. Wrappers/plugins could | then be written for SAX (both DocumentHandler and ParserFilter), DOM | and other kinds of input. This reminds me: the Java people have made a DOM walker that fires SAX events, called DOMParser. Is this something we want? --Lars M. From grove@infotek.no Wed Nov 18 08:54:37 1998 From: grove@infotek.no (Geir Ove Gronmo) Date: Wed, 18 Nov 1998 09:54:37 +0100 Subject: [XML-SIG] ParserFilter proposal In-Reply-To: References: <199811180827.JAA24079@mail.infotek.no> <199811180827.JAA24079@mail.infotek.no> Message-ID: <199811180854.JAA24275@mail.infotek.no> At 09:45 18.11.98 +0100, you wrote: >| Is this how it's done in Java by John Cowan? > >No, it's not. I've separated the filter and the DocumentHandler, while >he has the filter implement DocumentHandler, AttributeList and the >other handlers. Yes, I think that's a good thing to do. >He also lacks the factory stuff I did, but has at least settled on a >policy with the other handlers. > >See: > > >Should we align ourselves with his proposal? It's not turned into a >standard, whether de facto (OK, Simon St.Laurent uses it) or de jure. I don't see a need to do that. Your proposal seems to be superior to the one written in Java. Perhaps someone should write a Java version of the Python ParserFilter? :-) Geir O. From jdnier@execpc.com Wed Nov 18 15:00:04 1998 From: jdnier@execpc.com (David Niergarth) Date: Wed, 18 Nov 1998 09:00:04 -0600 (CST) Subject: [XML-SIG] Support for Validating Parsers In-Reply-To: Message-ID: On 17 Nov 1998, Lars Marius Garshol wrote: > I have one that can read ESIS from files, SP and SP -wxml, but it > can't handle error messages properly on Win32.
I think the problem is > caused by SP doing something strange to emulate stderr on Win32 where > this doesn't exist at all. Are you "handling" the errors or ignoring the errors? If you have parsing errors (as opposed to warnings) it seems like you'd usually need/want to fix them first, then make esis again. > - redirect error msgs to a temporary file with the -f option, ignore > the file and delete it afterwards > - redirect error msgs to a temporary file with the -f option and check > it between events for new errors I don't know a way to suppress errors (like -s). You can limit the maximum number of errors reported with -E. Unfortunately, -E0 means "no limit"; however, -E1 at least keeps the error output to a minimum. You can play with redirection, in WinNT, e.g., nsgmls file 2>&0 > esis_file 2 is error and I think 0 is stdin, which makes this twisted, but the error "stream" effectively disappears. This only works on NT (cmd.exe); in 95/98 you'll end up with a file called &0! If there were only a /dev/null.... --David Niergarth From larsga@ifi.uio.no Wed Nov 18 15:23:45 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 18 Nov 1998 16:23:45 +0100 Subject: [XML-SIG] Support for Validating Parsers In-Reply-To: References: Message-ID: * Lars Marius Garshol | | I have one that can read ESIS from files, SP and SP -wxml, but it | can't handle error messages properly on Win32. I think the problem | is caused by SP doing something strange to emulate stderr on Win32 | where this doesn't exist at all. * David Niergarth | | Are you "handling" the errors or ignoring the errors? Both, in a sense. I want to detect them and report them to the SAX ErrorHandler. The problem is that I need to interleave them with the other callback events. Ignoring them is only a last resort on Win32 if all else fails (or possibly reporting them after all the data events).
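As an illustration of the interleaving wanted here, the following is a minimal sketch that merges a child process's error stream into its output stream, so error lines arrive in order between the data lines. It assumes the `subprocess` module of later Python versions, which was not available when this thread was written, and it uses a stand-in child process rather than nsgmls itself. The names `run_merged`, `split_events` and `CHILD`, and the `nsgmls:` prefix test, are assumptions made for the example.

```python
import subprocess
import sys


def run_merged(argv):
    """Run argv and return stdout and stderr as one list of lines, in arrival order."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # send the child's stderr into the same pipe
        text=True,
    )
    return proc.stdout.splitlines()


def split_events(lines, error_prefix="nsgmls:"):
    """Tag each line as an error message or an ESIS event, keeping the order."""
    return [
        ("error" if line.startswith(error_prefix) else "esis", line)
        for line in lines
    ]


# Stand-in for the real parser: it writes one ESIS line, one error message,
# then another ESIS line, flushing each time so the order is preserved.
CHILD = (
    "import sys\n"
    "sys.stdout.write('(PLAY\\n'); sys.stdout.flush()\n"
    "sys.stderr.write('nsgmls:doc.xml:6:17:E: demo error\\n'); sys.stderr.flush()\n"
    "sys.stdout.write(')PLAY\\n'); sys.stdout.flush()\n"
)

events = split_events(run_merged([sys.executable, "-c", CHILD]))
```

A SAX driver built this way could route the "error" entries to the ErrorHandler and the "esis" entries to the DocumentHandler as they are read. Whether the real nsgmls flushes its two streams in a way that preserves ordering on Win32 is not something this sketch can confirm.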
| If you have parsing errors (as opposed to warnings) seems like you'd | usually need/want to fix them first, then make esis again. Of course, but this is something the user must deal with after getting the error messages from his/her application through the SAX driver. | I don't know a way to suppress errors (like -s). Well, I don't really want to suppress the errors, since the ErrorHandler should be told about them. What I want is for them to appear interleaved in the normal ESIS stream as they do when nsgmls is run from the command line. However, os.popen fails to accomplish this, instead the errors are still printed to the console, while everything else is forwarded to my SAX driver. --Lars M. From wes@rishel.com Wed Nov 18 16:12:41 1998 From: wes@rishel.com (Wes Rishel) Date: Wed, 18 Nov 1998 08:12:41 -0800 Subject: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 results) In-Reply-To: <3.0.5.32.19981117090105.00af8c40@corp> Message-ID: <000901be130e$455c4f60$28862499@Wes> > -----Original Message----- > From: xml-sig-admin@python.org [mailto:xml-sig-admin@python.org]On > Behalf Of Walter Underwood > Sent: Tuesday, November 17, 1998 9:01 AM > To: xml-sig@python.org > Subject: Re: [XML-SIG] Value of namespaces (was RE: [XML-SIG] IPC7 > results) > > A different comment -- it sounds like you are trying to get the > DTD to enforce the model, rather than just making something that > can be parsed. There are lots and lots of constraints that cannot > be expressed in a DTD (age is positive, these references form a > tree), so enforcing the exact sub-elements of each element is > just one more thing that can't be enforced in the DTD. Even if > it was possible to specify what was legal, you couldn't specify > that all elements must be there. > > Since you've got to do post-parsing checking anyway, trying to > express too much stuff in the DTD is probably wasted effort. We agree with you completely. 
Indeed, the debate that keeps resurfacing in our discussions is whether there is any substantial benefit to using a validating parser. At least in our context, which includes XML and other representations of the same content, we have to have a metamodel that essentially duplicates the content model anyway. From akuchlin@cnri.reston.va.us Wed Nov 18 17:22:12 1998 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Wed, 18 Nov 1998 12:22:12 -0500 (EST) Subject: [XML-SIG] Value of namespaces In-Reply-To: <000901be130e$455c4f60$28862499@Wes> References: <3.0.5.32.19981117090105.00af8c40@corp> <000901be130e$455c4f60$28862499@Wes> Message-ID: <13906.62735.313166.784251@amarok.cnri.reston.va.us> Wes Rishel writes: >Indeed, the debate that keeps resurfacing in our discussions is whether >there is any substantial benefit to using a validating parser. At least in >our context, which includes XML and other representations of the same >content, we have to have a metamodel that essentially duplicates the content >model anyway. And that's perfectly all right. There's no rule that says you must have a DTD for your XML documents, and for some applications you may only care about well-formedness. You lose something, in that the only thing that can verify the correctness of your XML documents is custom-written code, and it may not be obvious what the code accepts, and XML editors can't use the DTD to assist the author, but those factors might not be important in some cases. For example, we have an XML format for representing process steps, and no effort has been made to write a DTD yet, because the current structuring is preliminary at best. -- A.M. Kuchling http://starship.skyport.net/crew/amk/ My life is strobed like lightning by a follow-spot, and looking backwards I can only see the corpses of the animals and birds who strutted with me on the darkened stage and helped me fool them all. 
-- Zatara, in BOOKS OF MAGIC #1 From stuart.hungerford@cmis.csiro.au Wed Nov 18 23:02:50 1998 From: stuart.hungerford@cmis.csiro.au (Stuart Hungerford) Date: Thu, 19 Nov 1998 10:02:50 +1100 Subject: [XML-SIG] Looking for substantial PyDOM examples... Message-ID: <4.1.19981119095926.00a89e40@mailhost.act.cmis.csiro.au> Hi all, I've been experimenting with PyDOM in the 0.5 XML stuff release, and I'm starting to feel that I may not be making the best use of the DOM in this Python-flavoured implementation. Can some kind person share some examples of PyDOM being used for some non-trivial chores with me? Stu ----------------------------------------------------------------------- Stuart.Hungerford@cmis.csiro.au Voice : +61 2 62167061 CSIRO Mathematical and Information Sciences Fax : +61 2 62167111 Canberra, AUSTRALIA GPO Box 664, Canberra, ACT 2601 From Fred L. Drake, Jr." References: Message-ID: <13907.22123.832713.317168@weyr.cnri.reston.va.us> David Niergarth writes: > You can play with redirection, in WinNT, e.g., > > nsgmls file 2>&0 > esis_file You should be able to use 2>NUL to send errors to the equivalent of /dev/null. Haven't tested, though: that would require at least a 90 degree chair rotation! ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 From jdnier@execpc.com Thu Nov 19 17:03:42 1998 From: jdnier@execpc.com (David Niergarth) Date: Thu, 19 Nov 1998 11:03:42 -0600 Subject: [XML-SIG] Support for Validating Parsers Message-ID: <002801be13de$90b3ea10$c56ccfa9@ep3> Lars Marius Garshol: >What I want is for them to appear interleaved in the normal ESIS >stream as they do when nsgmls is run from the command line. However, >os.popen fails to accomplish this, instead the errors are still >printed to the console, while everything else is forwarded to my SAX >driver.
I couldn't get the interleaved behavior seen when running from the command line but the following prepends the error messages to the esis: (Useful? Not sure if you'll get the same behavior in W95/98.) >>> s = os.popen("nsgmls -wxml com_err.xml 2>>&1", "r") >>> print s.read()[0:300] nsgmls:com_sm.xml:6:17:E: end tag for element "PP" which is not open nsgmls:com_sm.xml:7:5:E: end tag for "P" omitted, but OMITTAG NO was specified nsgmls:com_sm.xml:6:2: start tag was here ?xml version="1.0" (PLAY (TITLE -The Comedy of Errors )TITLE (FM (P -FM Text.\n\012\011 )P )FM (PERSONAE (TITL >>> Fred Drake: > You should be able to use 2>NUL to send errors to the equivalent of > /dev/null. Haven't tested, though: that would require at least a 90 > degree chair rotation! ;-) No need to get up; it works as advertized. (Wow, never knew you could do that!-) --David Niergarth From Fred L. Drake, Jr." References: <002801be13de$90b3ea10$c56ccfa9@ep3> Message-ID: <13908.20627.117681.313512@weyr.cnri.reston.va.us> I wrote: > /dev/null. Haven't tested, though: that would require at least a 90 > degree chair rotation! ;-) David Niergarth writes: > No need to get up; it works as advertized. (Wow, never knew you could do Get up?? No, this is a swivel chair... I'm still tightly bound to my mailer, recovering from the conference. Swivelling is simply too much of a distraction! ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 From uche.ogbuji@fourthought.com Fri Nov 20 09:32:58 1998 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Fri, 20 Nov 1998 02:32:58 -0700 Subject: [XML-SIG] DOM Walker -> SAX In-Reply-To: Your message of "18 Nov 1998 09:51:34 +0100." Message-ID: <199811200933.CAA00568@malatesta.local> > This reminds me: the Java people have made a DOM walker that fires SAX > events, called DOMParser. Is this something we want? It sounds interesting, but I'm at a loss to think up a serious need.
All I can think of is if a user had invested a lot of effort in an app that was originally designed to parse XML, that now needs to be plugged into the output of another app that manipulates DOM-objects. But is this a significant enough need to provide more than the obvious solution of walking the DOM tree to print out the doc, and then feeding this to the SAX app? Perhaps I'm missing something. -- Uche Ogbuji uche.ogbuji@fourthought.com (970)481-0805 Consulting Member, FourThought LLC (Open Enterprise Architects) Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From Fred L. Drake, Jr." nsgmls can output "e" events if the next element has a declared content type of EMPTY when given the "-oempty" option. This is really only interesting if we're generating an SGML output document, but I found it useful to generate them from a LaTeX->ESIS conversion tool I've been playing with. Feeding them to EsisBuilder should not cause an exception; they can be safely ignored. The patch below implements this. Andrew, please merge this with the CVS tree. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 Index: esis_builder.py =================================================================== RCS file: /projects/cvsroot/xml/dom/esis_builder.py,v retrieving revision 1.2 diff -c -c -r1.2 esis_builder.py *** esis_builder.py 1998/11/18 23:57:12 1.2 --- esis_builder.py 1998/11/20 15:53:51 *************** *** 60,67 **** elif event == 'C': return else: ! sys.stderr.write('Unknow event: ' + `line` + '\n') backslash = r"\\" --- 60,73 ---- elif event == 'C': return + elif event == 'e': + # Indicates that this is an empty element; + # only produced by nsgmls for -oempty. We + # can safely ignore it. + pass + else: ! sys.stderr.write('Unknown event: ' + `line` + '\n') backslash = r"\\" From Fred L. Drake, Jr." 
This patch fixes the Document.createElement() interface to accept either or both a dictionary of attribute name/value pairs or keywords on the command line. This fixes problems with the EsisBuilder as well as making the interface more flexible. Document.toxml() now uses all its children to generate the XML representation; this ensures that processing instructions and comments will be included exactly as they are included in the tree. Andrew, please integrate these with the CVS repo. This is just what we talked about earlier today. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 Index: core.py =================================================================== RCS file: /projects/cvsroot/xml/dom/core.py,v retrieving revision 1.30 diff -c -c -r1.30 core.py *** core.py 1998/11/16 03:52:02 1.30 --- core.py 1998/11/20 22:45:11 *************** *** 931,943 **** s = '\n' if self.documentType: s = s + self.documentType ! if len(self._node.children): ! n = self._node.children[0] n = NODE_CLASS[ n.type ] (n, self, self) s = s + n.toxml() return s ! def createElement(self, tagName, **kwdict): "Return a new Element object." d = _nodeData(ELEMENT_NODE) --- 931,942 ---- s = '\n' if self.documentType: s = s + self.documentType ! for n in self._node.children: n = NODE_CLASS[ n.type ] (n, self, self) s = s + n.toxml() return s ! def createElement(self, tagName, dict={}, **kwdict): "Return a new Element object." 
d = _nodeData(ELEMENT_NODE) *************** *** 945,950 **** --- 944,950 ---- d.value = None d.attributes = {} elem = Element(d, None, self) + kwdict.update(dict) for name, value in kwdict.items(): elem.setAttribute(name, value) return elem From dkuhlman@enterpriselink.com Fri Nov 20 23:35:46 1998 From: dkuhlman@enterpriselink.com (Dave Kuhlman) Date: Fri, 20 Nov 1998 15:35:46 -0800 Subject: [XML-SIG] DOM Walker -> SAX References: <199811200933.CAA00568@malatesta.local> Message-ID: <3655FCD1.5384C0C2@EnterpriseLink.com> Those of you who are interested in tree walking might want to look at PCCTS. PCCTS (Purdue Compiler Construction Tool Set, but now called ANTLR, see http://www.ANTLR.org/) is intended as a replacement for yacc/lex, the UNIX parser generators. The PCCTS distribution also contains Sorcerer. PCCTS is used to generate a parser that builds a parse tree. Sorcerer is used to generate a "tree parser" that can be used to walk the parse tree and produce an abstract syntax tree with annotated nodes. The idea is to use Sorcerer to produce tree transformations. I can see a use for a similar tool when processing XML: Use the DOM parser to build a DOM tree, which is application neutral. Then use the tree walker to transform the DOM tree into a new tree that is application specific and is tailored for use by the application code. The tree walker is actually a set of rules that describe how to recognize nodes (branches?) in the DOM and how to transform that node or branch into an application specific node or branch. As an example, I recently wrote a Java XML SAX-based parser built using Aelfred that creates a tree structure of instances of Java classes that I have defined and implemented. The tree represents a Web page which contains input items which contain style information, etc. In this parser application I had to create each object or node in the tree, fill in member variables (e.g. from attributes in the XML element for the object), and insert it into the tree.
For a future project I can dream about being able to define a transformation on the nodes in a DOM that would produce the nodes/objects in my tree structure. Admittedly, this task would have been much easier in Python than in Java. But, it might be easier still and also more orderly using a tree match and transformation tool. Maybe this is why Uche is "at a loss". Python makes this kind of work too easy. But, put yourself in the shoes of someone struggling with a low-level language like Java ... -- Dave uche.ogbuji@fourthought.com wrote: > > > This reminds me: the Java people have made a DOM walker that fires SAX > > events, called DOMParser. Is this something we want? > > It sounds interesting, but I'm at a loss to think up a serious need. All I can > think of is if a user had invested a lot of effort in an app that was > originally designed to parse XML, that now needs to be plugged into the output > of another app that manipulates DOM-objects. But is this a significant enough > need to provide more than the obvious solution of walking the DOM tree to > print out the doc, and then feeding this to the SAX app? > > Perhaps I'm missing something. > > -- > Uche Ogbuji > uche.ogbuji@fourthought.com (970)481-0805 > Consulting Member, FourThought LLC (Open Enterprise Architects) > Software engineering, project management, Intranets and Extranets > http://FourThought.com http://OpenTechnology.org > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://www.python.org/mailman/listinfo/xml-sig -- Dave Kuhlman EnterpriseLink Technology Corp http://www.enterpriselink.com 2542 S. Bascom Ave., Suite #203 Campbell, CA 95008 dkuhlman@EnterpriseLink.com 408-558-2011 From Fred L. Drake, Jr." The patch below didn't appear to make it into the update; this is what allows nodes other than the documentElement to get written by the Document.toxml() method. -Fred -- Fred L. Drake, Jr.
Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 Index: core.py =================================================================== RCS file: /projects/cvsroot/xml/dom/core.py,v retrieving revision 1.31 diff -c -c -r1.31 core.py *** core.py 1998/11/21 02:48:12 1.31 --- core.py 1998/11/21 03:13:43 *************** *** 931,938 **** s = '\n' if self.documentType: s = s + self.documentType ! if len(self._node.children): ! n = self._node.children[0] n = NODE_CLASS[ n.type ] (n, self, self) s = s + n.toxml() return s --- 931,937 ---- s = '\n' if self.documentType: s = s + self.documentType ! for n in self._node.children: n = NODE_CLASS[ n.type ] (n, self, self) s = s + n.toxml() return s From uche.ogbuji@fourthought.com Sat Nov 21 04:17:45 1998 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Fri, 20 Nov 1998 21:17:45 -0700 Subject: [XML-SIG] Announcement: 4DOM 0.6.1, an implementation of the W3C DOM Spec in Python Python Message-ID: <199811210417.VAA15364@malatesta.local> FourThought LLC (http://FourThought.com) announces the release of 4DOM 0.6.1 ----------------------- A CORBA-aware implementation of the W3C's Document Object Model in Python 4DOM is a close implementation of the DOM, including DOM Core level 1, DOM HTML level 1, and a few utility and helper components. 4DOM was designed from the start to work in a CORBA environment. Currently, two ORB environments are supported, both open-source: Fnorb and ILU. One or the other is required. 4DOM is designed to allow developers to rapidly design applications that read, write or manipulate HTML and XML. Changes since 0.6.0 =================== - added ILU support with a series of kludges (all designed to minimize effect on existing DOM code): o Use ILU's python-stubber in makefile rather than fnidl o python-stubber generates *IF__skel rather than fnidl's *IF_skel, so copy the files so both names are available.
o add config modules for DOM core and HTML, globally imported, which creates dummy INTERFACENAME_skel classes because ILU does not append "_skel" to skeleton class names as Fnorb does: it uses module-scoping for the distinction. o Add variables using Fnorb-style constant naming (INTERFACENAME.CONSTANTNAME) to refer to the ILU-style constants (INTERFACENAME_CONSTANTNAME) o Brutally hack all 4DOM source files during make to change Fnorb-style invocations for DOMException (raise DOMException(EXCEPTNAME)) into ILU-style (raise DOMException, DOMException__omgidl_exctype(EXCEPTNAME)) - added the #pragma prefix "fourthought.com" to all IDL files - Document.repr() now includes the DOCTYPE More info and Obtaining 4DOM ============================ Please see http://OpenTechnology.org/projects/4DOM 4DOM is distributed under the terms of the GNU Library Public License (LGPL). http://www.gnu.org/copyleft/lgpl.html -- Uche Ogbuji uche.ogbuji@fourthought.com Consulting Member, FourThought LLC (Open Enterprise Architects) Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From kajiyama@etl.go.jp Sun Nov 22 04:57:54 1998 From: kajiyama@etl.go.jp (Tamito Kajiyama) Date: Sun, 22 Nov 98 04:57:54 JST Subject: [XML-SIG] [Q] Namespace Message-ID: <9811211957.AA19395@etlibs2.etl.go.jp> Hi. I'd like to do some experiments about RDF, so I'm writing a limited RDF parser with the required XML namespace support. I have questions about XML namespace. First, according to the specification (Subsection 5.2), "If the URI in a default namespace declaration is empty, then unprefixed elements in the scope of the declaration are not considered to be in any namespace." I cannot understand what this means. What should a parser do on such unprefixed elements? Really nothing to do? Second, what is the initial value of the default namespace before the first declaration of the default namespace appears? 
Is the default namespace undefined at first? I think it is reasonable that the namespace associated to the DTD of the XML document, if any, would be the first default namespace. Thank you, -- KAJIYAMA, Tamito From larsga@ifi.uio.no Sat Nov 21 20:10:11 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 21 Nov 1998 21:10:11 +0100 Subject: [XML-SIG] [Q] Namespace In-Reply-To: <9811211957.AA19395@etlibs2.etl.go.jp> References: <9811211957.AA19395@etlibs2.etl.go.jp> Message-ID: * Tamito Kajiyama | | First, according to the specification (Subsection 5.2), "If the URI | in a default namespace declaration is empty, then unprefixed | elements in the scope of the declaration are not considered to be in | any namespace." I cannot understand what this means. What should a | parser do on such unprefixed elements? Really nothing to do? This is used for turning off a default namespace: | Second, what is the initial value of the default namespace before | the first declaration of the default namespace appears? Is the | default namespace undefined at first? So it is. The namespace is not defined until an explicit declaration appears, which is entirely reasonable to me, at least. | I think it is reasonable that the namespace associated to the DTD of | the XML document, if any, would be the first default namespace. What do you mean by this? I can't follow you here. --Lars M. From kajiyama@etl.go.jp Sun Nov 22 06:10:25 1998 From: kajiyama@etl.go.jp (Tamito Kajiyama) Date: Sun, 22 Nov 98 06:10:25 JST Subject: [XML-SIG] [Q] Namespace In-Reply-To: (message from Lars Marius Garshol on 21 Nov 1998 21:10:11 +0100) Message-ID: <9811212110.AA19430@etlibs2.etl.go.jp> Lars Marius Garshol writes: | | * Tamito Kajiyama | | | | First, according to the specification (Subsection 5.2), "If the URI | | in a default namespace declaration is empty, then unprefixed | | elements in the scope of the declaration are not considered to be in | | any namespace." I cannot understand what this means. 
What should a | | parser do on such unprefixed elements? Really nothing to do? | | This is used for turning off a default namespace: | | | | | What should a validating parser do on the unprefixed element `bar'? | | Second, what is the initial value of the default namespace before | | the first declaration of the default namespace appears? Is the | | default namespace undefined at first? | | So it is. The namespace is not defined until an explicit declaration | appears, which is entirely reasonable to me, at least. When the default namespace is undefined, what should a validating parser do for unprefixed elements? | | I think it is reasonable that the namespace associated to the DTD of | | the XML document, if any, would be the first default namespace. | | What do you mean by this? I can't follow you here. I understand that a namespace has an associated schema (e.g. DTD, RDF schema), and a parser validates a prefixed element by referring the schema associated to the namespace prefix. So, If an XML document has its DTD specified by , there is a namespace associated to the DTD. Is my understanding correct? Thank you, -- KAJIYAMA, Tamito From larsga@ifi.uio.no Sat Nov 21 21:28:29 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 21 Nov 1998 22:28:29 +0100 Subject: [XML-SIG] [Q] Namespace In-Reply-To: <9811212110.AA19430@etlibs2.etl.go.jp> References: <9811212110.AA19430@etlibs2.etl.go.jp> Message-ID: * Tamito Kajiyama | | What should a validating parser do on the unprefixed element `bar'? | | [...] | | When the default namespace is undefined, what should a validating | parser do for unprefixed elements? Nothing. :) If asked, it should reply that the namespace is undefined. Remember, the namespace is just that, an identifier, and nothing more. It's just used to be able to uniquely identify elements and attributes. | I understand that a namespace has an associated schema (e.g. DTD, | RDF schema), It does not. 
A namespace is just a URI used to distinguish a set of names from all
other names, globally.  Of course, in our minds there is usually an
association between the namespace and a schema/DTD, but the parser
knows nothing of this.

| and a parser validates a prefixed element by referring the schema
| associated to the namespace prefix.

In fact, validation as defined in XML 1.0 does not work with
namespaces, which is a point against them.  A prefixed element is
invalid if it was not declared with the prefix...

| So, If an XML document has its DTD specified by ,
| there is a namespace associated to the DTD.  Is my understanding
| correct?

No. :) There is no requirement that there be any namespaces at all in
XML documents, and like I said namespaces and DTDs don't work very
well together.

What namespaces do is e.g. to allow you to use the TITLE element from
both HTML and DocBook in the same DTD, and still be able to tell them
apart, by associating instances of the TITLE element type with
different namespaces.

--Lars M.

From kajiyama@etl.go.jp Sun Nov 22 07:46:14 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Sun, 22 Nov 98 07:46:14 JST
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: (message from Lars Marius Garshol on 21 Nov 1998 22:28:29 +0100)
Message-ID: <9811212246.AA19493@etlibs2.etl.go.jp>

Lars Marius Garshol writes:
|
| A namespace is just a URI used to distinguish a set of names from all
| other names, globally.

I see.

| Of course, in our minds there is usually an association between the
| namespace and a schema/DTD, but the parser knows nothing of this.

What I want to do is some experiments about RDF.  In an RDF
instance, RDF schemas are specified using namespace.  So, I'm
writing an RDF parser that, for each namespace declaration,
retrieves the RDF file specified by the URI, parses it, and
constructs an internal representation of the RDF schema for further
validation.

Is this good practice?

It seems that my parser knows something about namespace...
| In fact, validation as defined in XML 1.0 does not work with
| namespaces, which is a point against them.  A prefixed element is
| invalid if it was not declared with the prefix...

Hmm, it's surprising.  What will happen to XML 1.0 when the namespace
specification becomes a W3C recommendation?

Thank you,

--
KAJIYAMA, Tamito

From larsga@ifi.uio.no Sat Nov 21 23:02:42 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 22 Nov 1998 00:02:42 +0100
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: <9811212246.AA19493@etlibs2.etl.go.jp>
References: <9811212246.AA19493@etlibs2.etl.go.jp>
Message-ID:

* Tamito Kajiyama
|
| What I want to do is some experiments about RDF.  In an RDF
| instance, RDF schemas are specified using namespace.  So, I'm
| writing an RDF parser that, for each namespace declaration,
| retrieves the RDF file specified by the URI, parses it, and
| constructs an internal representation of the RDF schema for further
| validation.

Hmmm.  Do you mean that you're writing your own XML parser, or are you
building on top of SAX, DOM or some parser API?

| Is this good practice?

It sounds good to me, at least.

| It seems that my parser knows something about namespace...

Nothing wrong with that, you just have to keep the different layers of
the different specs separate in your mind (and parser :).

| What will happen to XML 1.0 when the namespace specification becomes
| a W3C recommendation?

Good question.  I don't really know.  A reasonable guess would be that
the SGML DTD syntax is ditched in favour of an XML-based syntax that
is namespace-aware.  Or that both are retained.  Of course, this means
that XML will have two different schema languages, only one of them
SGML-compatible.

But, like I say, this is just a guess.

--Lars M.
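
The namespace points made in this thread (no default namespace until
one is declared, and an empty URI switching the default off again) can
be sketched as follows.  This is only an illustration under stated
assumptions: it uses Python 3 and its standard xml.etree.ElementTree
module rather than the 1998 XML package discussed here, and the URI
http://example.org/ns and the element names are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <foo> declares a default namespace, <bar>
# turns it off again with an empty URI, <qux> inherits it from <foo>.
DOC = (
    '<foo xmlns="http://example.org/ns">'
    '<bar xmlns=""><baz/></bar>'
    '<qux/>'
    '</foo>'
)


def split_name(tag):
    """Split an ElementTree tag into (namespace URI or None, local name)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


# Walk the elements in document order and record each qualified name.
names = [split_name(el.tag) for el in ET.fromstring(DOC).iter()]
for uri, local in names:
    print(local, "->", uri)
```

Here foo and qux are reported in the example namespace, while bar and
baz are reported with no namespace at all.  Nothing is fetched from
the URI; it acts purely as an identifier, which is the distinction
Lars draws between a namespace and a schema.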
From kajiyama@etl.go.jp Sun Nov 22 08:24:57 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Sun, 22 Nov 98 08:24:57 JST
Subject: [XML-SIG] [Q] Namespace
In-Reply-To: (message from Lars Marius Garshol on 22 Nov 1998 00:02:42 +0100)
Message-ID: <9811212324.AA19540@etlibs2.etl.go.jp>

Lars Marius Garshol writes:
|
| * Tamito Kajiyama
| |
| | What I want to do is some experiments about RDF.  In an RDF
| | instance, RDF schemas are specified using namespace.  So, I'm
| | writing an RDF parser that, for each namespace declaration,
| | retrieves the RDF file specified by the URI, parses it, and
| | constructs an internal representation of the RDF schema for further
| | validation.
|
| Hmmm.  Do you mean that you're writing your own XML parser, or are you
| building on top of SAX, DOM or some parser API?

I'm building my RDF parser on the top of SAX, using the Python XML
Package (version 0.4).

| | Is this good practice?
|
| It sounds good to me, at least.

Thank you for the kind replies, in spite of midnight ;-)

--
KAJIYAMA, Tamito

From kajiyama@etl.go.jp Mon Nov 23 01:44:35 1998
From: kajiyama@etl.go.jp (Tamito Kajiyama)
Date: Mon, 23 Nov 98 01:44:35 JST
Subject: [XML-SIG] [Q] SAX Exception
Message-ID: <9811221644.AA19959@etlibs2.etl.go.jp>

Hi.

I'm writing a parser using the SAX module, and want to raise
exceptions in the methods of saxlib.DocumentHandler
(e.g. startElement) so that the exceptions are handled by
saxlib.ErrorHandler's error and fatalError methods.  How can I
achieve it?

--
KAJIYAMA, Tamito

From larsga@ifi.uio.no Sun Nov 22 17:09:08 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 22 Nov 1998 18:09:08 +0100
Subject: [XML-SIG] [Q] SAX Exception
In-Reply-To: <9811221644.AA19959@etlibs2.etl.go.jp>
References: <9811221644.AA19959@etlibs2.etl.go.jp>
Message-ID:

* Tamito Kajiyama
|
| I'm writing a parser using the SAX module, and want to raise
| exceptions in the methods of saxlib.DocumentHandler
| (e.g.
| startElement) so that the exceptions are handled by
| saxlib.ErrorHandler's error and fatalError methods.  How can I
| achieve it?

You can't achieve it that way.  The SAX exception classes are for the
cases where the parser throws an exception instead of reporting the
error via a method.  When the parser throws an exception it loses its
internal state and in these cases the parse is aborted.  (This also
applies when you throw an exception from inside a callback method.)

In other words, what you need to do is to call those methods directly,
which requires you to have a reference to the ErrorHandler yourself.

I'd recommend that you simply wrap the SAX driver completely so that
your clients have no direct access to it.  That way you can keep track
of the ErrorHandler object.

--Lars M.

From H.Jansen@math.tudelft.nl Mon Nov 23 08:32:01 1998
From: H.Jansen@math.tudelft.nl (Henk Jansen)
Date: Mon, 23 Nov 1998 09:32:01 +0100 (MET)
Subject: [XML-SIG] PCCTS in python: yapps.
Message-ID: <199811230832.JAA02994@dutita4.twi.tudelft.nl>

Amit Patel has built a Python recursive-descent parser generator
modeled after PCCTS (http://theory.stanford.edu/~amitp/Yapps/).  I'm
currently using this tool for a simulation modeling language, and it
is indeed a very nice and easy-to-understand tool (mainly because
it's all Python: no segmentation faults, bus errors etc. -- by the
way, I found ANTLR very slow in creating the grammar).

Personally, I would like to see more PCCTS-like features added to
Yapps:

  - LL(k), k>1
  - semantic predicates
  - ...

and maybe some critical parts written as compiled modules (which
maybe could be borrowed from PCCTS...?)

Hope this will help in finding a suitable XML/DOM parser-generator.

Henk.

---------
Alcibiades, when at the dinner table of Agathon:

"""When we listen to anyone else talking, however eloquent he is, we
don't really care a damn what he says ... I have heard Pericles and
all the other great orators ... but they never ... turned my soul
upside down.
But this [Socrates] has often left me in such a state of mind that I
have felt that I simply could not go on living the way I did.  He
makes me admit that while I am spending my time on politics I am
neglecting all the things that are crying for attention in myself."""

--
-----------------------------------------------------------------------
| Henk Jansen  http://dutita0.twi.tudelft.nl/WAGM/people/H.Jansen.html |
| Delft University of Technology            | hjansen@math.tudelft.nl  |
| > Information Technology and Systems (ITS)| Mekelweg 4, 2628 CD      |
| >> Mathematics (TWI)                      | Delft, The Netherlands   |
| >>> Applied Analysis (TA)                 | phone: +31(0)15-278-7295 |
| >>>> Analysis of Large Scale Models (WAGM)| fax: +31(0)15-278-7209   |
-----------------------------------------------------------------------

From Fred L. Drake, Jr."

I've attached a patch below that includes the last patch I sent
Friday evening (since it's not in yet; apply this to the CVS version),
and fixes the childNodes attribute of the Document object.

The Node.get_childNodes() method creates a NodeList which has
self.get_ownerDocument() as the owner.  When used from the Document
class, the owner is None, but the owner for the children is self.
Without the fix, using nodes accessed from .childNodes could easily
cause WrongDocumentException to be raised.

This may not be a problem in a typical application, where (I expect)
the Document instance is mostly used as node factory and source for
Document.documentElement, but for some particularly weird conversion
scripts I'm working on where I'm starting with multi-rooted documents,
this can be a real problem.  Yes, I know XML only allows a single
root; that's one reason a conversion script is needed!  ;-)

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.  Reston, VA  20191

From Fred L. Drake, Jr."

The toxml() method in the DOM implementation is convenient, but not
always what's needed.
There are two specific problems: it creates a string in memory with
the entire document representation, and it can only produce the XML
form of the document.  I'd also like to be able to generate an ESIS
stream or SGML from the DOM, and I don't need the entire
representation to be in memory.

I propose the addition of three methods; these could be functions in
a utility module just as easily.  Each method should accept a
file-like object that supports at least the write() method.  The
methods are:

    def write_esis(self, file):
        """Write an ESIS stream on file."""

    def write_sgml(self, file, knownempties=[]):
        """Write an SGML instance on file.  `knownempties' should
        be a list of GIs of element types declared to be empty."""

    def write_xml(self, file):
        """Write an XML instance on file."""

Does anyone have any opinions as to whether these should be methods
or utility functions?  As I think about it, using functions may make
more sense, esp. since different functions may be needed for different
SGML declarations.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.  Reston, VA  20191

From fredrik@pythonware.com Mon Nov 23 15:30:59 1998
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 23 Nov 1998 16:30:59 +0100
Subject: [XML-SIG] Extended DOM interface proposal.
Message-ID: <000d01be16f6$46b33c70$f29b12c2@pythonware.com>

>  I propose the addition of three methods; these could be functions in
>a utility module just as easily.  Each method should accept a
>file-like object that supports at least the write() method.  The
>methods are:
>
>    def write_esis(self, file):
>        """Write an ESIS stream on file."""
>
>    def write_sgml(self, file, knownempties=[]):
>        """Write an SGML instance on file.  `knownempties' should
>        be a list of GIs of element types declared to be empty."""
>
>    def write_xml(self, file):
>        """Write an XML instance on file."""
>
>  Does anyone have any opinions as to whether these should be methods
>or utility functions?
>  As I think about it, using functions may make
>more sense, esp. since different functions may be needed for different
>SGML declarations.

IMHO, using objects makes even more sense -- so why
not use the Visitor pattern?

1. add an accept method which takes any object
   implementing the DOMVisitor class as its single
   argument, and calls the appropriate methods on
   that object.

2. the Visitor interface is probably identical to
   the SAX API...

3. which leads us back to the DOM->SAX question...

Cheers /F
fredrik@pythonware.com
http://www.pythonware.com

From Fred L. Drake, Jr."
References: <000d01be16f6$46b33c70$f29b12c2@pythonware.com>
Message-ID: <13913.32857.93445.571115@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > IMHO, using objects makes even more sense -- so why
 > not use the Visitor pattern?

  This would be fine for me.

 > 1. add an accept method which takes any object
 >    implementing the DOMVisitor class as its single

  The DOMVisitor *class*?  Shouldn't that be interface?  Or was it
protocol?  ;-)

 > 2. the Visitor interface is probably identical to
 >    the SAX API...

  SAX is insufficient; I'd at least like to preserve comments.

 > 3. which leads us back to the DOM->SAX question...

  Yes.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.  Reston, VA  20191

From fredrik@pythonware.com Mon Nov 23 15:48:33 1998
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 23 Nov 1998 16:48:33 +0100
Subject: [XML-SIG] Extended DOM interface proposal.
Message-ID: <002101be16f8$bb7f5fa0$f29b12c2@pythonware.com>

> > 1. add an accept method which takes any object
> >    implementing the DOMVisitor class as its single
>
>  The DOMVisitor *class*?  Shouldn't that be interface?  Or was it
>protocol?  ;-)

behaviour!?

nah.  I think I prefer interface.  so here's the fix:

    message = string.replace(
        message,
        "DOMVisitor class",
        "DOMVisitor interface"
        )

> > 2. the Visitor interface is probably identical to
> >    the SAX API...
>
>  SAX is insufficient; I'd at least like to preserve comments.

extended SAX, anyone?

Cheers /F
fredrik@pythonware.com
http://www.pythonware.com

From akuchlin@cnri.reston.va.us Mon Nov 23 19:24:53 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 23 Nov 1998 14:24:53 -0500 (EST)
Subject: [XML-SIG] Extended DOM interface proposal.
In-Reply-To: <13913.31354.802344.207085@weyr.cnri.reston.va.us>
References: <13913.31354.802344.207085@weyr.cnri.reston.va.us>
Message-ID: <13913.46595.766607.862556@amarok.cnri.reston.va.us>

Fred L. Drake writes:
>  I propose the addition of three methods; these could be functions in
>a utility module just as easily.  Each method should accept a
>file-like object that supports at least the write() method.  The
	...
>  Does anyone have any opinions as to whether these should be methods
>or utility functions?  As I think about it, using functions may make
>more sense, esp. since different functions may be needed for different
>SGML declarations.

IMHO functions would be preferable, though it might be workable if it
only required a very small number of methods.  A small number of
methods doesn't appear likely, though, because of all the many
possible variations on output: SGML, XML, ESIS?  Pretty-printed
SGML/XML or not?  Etc...

--
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Your son's head is valuable to you, and I am attached to mine. Indeed,
hitherto we have been inseparable.
    -- Lady Johanna Constantine, in SANDMAN #29: "Thermidor"

From Fred L. Drake, Jr."
References: <13913.30589.414085.745874@weyr.cnri.reston.va.us> <13913.47103.230377.539181@amarok.cnri.reston.va.us>
Message-ID: <13913.50847.502345.58043@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > 	You didn't attach the patch...  On the other hand, I can
 > probably fix it myself from your description.

  Oops!  It's below.
Yeah, I know you could, but some people like patches because they
know there've been no typos introduced between testing and the change
information.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.  Reston, VA  20191

Index: core.py
===================================================================
RCS file: /projects/cvsroot/xml/dom/core.py,v
retrieving revision 1.31
diff -c -c -r1.31 core.py
*** core.py	1998/11/21 02:48:12	1.31
--- core.py	1998/11/23 20:29:12
***************
*** 931,938 ****
          s = '\n'
          if self.documentType:
              s = s + self.documentType
!         if len(self._node.children):
!             n = self._node.children[0]
              n = NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
--- 931,937 ----
          s = '\n'
          if self.documentType:
              s = s + self.documentType
!         for n in self._node.children:
              n = NODE_CLASS[ n.type ] (n, self, self)
              s = s + n.toxml()
          return s
***************
*** 1038,1043 ****
--- 1037,1045 ----
          return self.documentType
      def get_implementation(self):
          return self.DOMImplementation
+ 
+     def get_childNodes(self):
+         return NodeList(self._node.children, self, self)
  
      def get_documentElement(self):
          """Return the root element of the Document object, or None

From larsga@ifi.uio.no Tue Nov 24 17:52:41 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Nov 1998 18:52:41 +0100
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
Message-ID:

Hi!

I've been playing around a little with XBEL, thinking about making a
demo for a conference I'm going to in a couple of weeks.  So far, what
I've done is to modify adr_parse.py to actually work with the latest
version of bookmark.py and to deal with command-line arguments and
also to modify bookmark.py to insert the XBEL public identifier.
Patch 1 (to adr_parse.py):

65c65
< visited=parse_date(readfield(infile,"VISITED"))
---
> parse_date(readfield(infile,"VISITED")) # Just throw this away
69c69
< bms.add_folder(name,created,visited)
---
> bms.add_folder(name,created)
78c78
< bms.add_bookmark(name,created,visited,url)
---
> bms.add_bookmark(name,created,visited,None,url)
87,88c87,106
< bms=parse_adr(r"c:\programfiler\opera\opera3.adr")
< bms.dump_xbel()
---
> import sys
>
> if len(sys.argv)<2 or len(sys.argv)>3:
>     print
>     print "A simple utility to convert Opera bookmarks to XBEL."
>     print
>     print "Usage: "
>     print " adr_parse.py []"
>     sys.exit(1)
>
> bms=parse_adr(sys.argv[1])
>
> if len(sys.argv)==3:
>     out=open(sys.argv[2],"w")
>     bms.dump_xbel(out)
>     out.close()
> else:
>     bms.dump_xbel()
>
> # Done

Patch 2 (to bookmark.py):

42c42
< '\n'
---
> '\n'

--Lars M.

From akuchlin@cnri.reston.va.us Wed Nov 25 15:18:55 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Wed, 25 Nov 1998 10:18:55 -0500 (EST)
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
In-Reply-To:
References:
Message-ID: <13916.5182.99234.556728@amarok.cnri.reston.va.us>

Lars Marius Garshol writes:
>I've been playing around a little with XBEL, thinking about making a
>demo for a conference I'm going to in a couple of weeks. So far, what
>I've done is to modify adr_parse.py to actually work with the latest
>version of bookmark.py and to deal with command-line arguments and
>also to modify bookmark.py to insert the XBEL public identifier.

	Thanks, Lars!  Patches applied, and they also inspired me to
go and fix ns_parse.py and lynx_parse.py accordingly.  (I'm not sure
what to do for msie_parse.py; anyone want to contribute the right
Windows incantation to find the user's bookmark file?)

	The week before the conference, I came out with a 0.5
pre-release, but things intervened, and the code has continued to
change after the pre-release.
I'd really like to get a 0.5 release out, so I'll try to make another
pre-release, in preparation for a new release by next Monday.  This
release would be announced outside the XML-SIG a bit.

--
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Time itself flows on with constant motion, just like a river: for no
more than a river can the fleeting hour stand still. As wave is driven
on by wave, and, itself pursued, pursues the one before, so the
moments of time at once flee and follow, and are ever new.
    -- Ovid, _The Metamorphoses_

From MHammond@skippinet.com.au Wed Nov 25 22:55:47 1998
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Thu, 26 Nov 1998 09:55:47 +1100
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
In-Reply-To: <13916.5182.99234.556728@amarok.cnri.reston.va.us>
Message-ID: <001701be18c6$be1a52e0$0801a8c0@bobcat>

> Thanks, Lars!  Patches applied, and they also inspired me to
> go and fix ns_parse.py and lynx_parse.py accordingly.  (I'm not sure
> what to do for msie_parse.py; anyone want to contribute the right
> Windows incantation to find the user's bookmark file?)

I could do this, but it will require the "win32api" module (for the
registry functions).  It is almost getting to the time where win32api
should be released by Guido!

If it is acceptable to have this dependency, then I will be happy to
make the change!  (It could obviously be done such that if "import
win32api" fails, we revert to the existing behaviour!)

Mark.

From akuchlin@cnri.reston.va.us Wed Nov 25 23:04:34 1998
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Wed, 25 Nov 1998 18:04:34 -0500 (EST)
Subject: [XML-SIG] Patches to adr_parse.py and bookmark.py
In-Reply-To: <001701be18c6$be1a52e0$0801a8c0@bobcat>
References: <13916.5182.99234.556728@amarok.cnri.reston.va.us> <001701be18c6$be1a52e0$0801a8c0@bobcat>
Message-ID: <13916.35975.503487.722442@amarok.cnri.reston.va.us>

Mark Hammond writes:
>> go and fix ns_parse.py and lynx_parse.py accordingly.
>> (I'm not sure what to do for msie_parse.py; anyone want to
>> contribute the right Windows incantation to find the user's
>> bookmark file?)
>
>I could do this, but it will require the "win32api" module (for the
>registry functions).  It is almost getting to the time where win32api
>should be released by Guido!

	That dependency shouldn't be a problem, and we can
conditionalize it after checking sys.platform.  Anyone want to do this
for IE on the Mac?  (IE on Unix probably isn't a concern; when I run
it on Solaris, it grabs the X server and then freezes, so I have to
telnet it and kill my whole session.  Doubt it has many users...)

--
A.M. Kuchling			http://starship.skyport.net/crew/amk/
People marry most happily with their own kind. The trouble lies in the
fact that people usually marry at an age where they do not really know
what their own kind is.
    -- Robertson Davies, _A Voice from the Attic_

From akuchlin@cnri.reston.va.us Fri Nov 27 15:55:40 1998
From: akuchlin@cnri.reston.va.us (A.M. Kuchling)
Date: Fri, 27 Nov 1998 10:55:40 -0500
Subject: [XML-SIG] DOM walker class
Message-ID: <199811271555.KAA00331@mira.erols.com>

The walk() method of the DOM Walker class is defined as follows:

    def walk(self, root):
        if root.get_nodeType() == DOCUMENT_NODE:
            c = root.get_documentElement()
            assert c.get_nodeType() == ELEMENT_NODE
            return self.walk1(c)
        else:
            return self.walk1(root)

This behaves unexpectedly if the Document node has several children,
as might happen if there are PIs preceding or following the root
element.  Only the root element will be walked, missing any other
children of the root, which becomes apparent if you're walking the
tree in order to print it.

How should this be fixed?  One choice is to change the DOCUMENT_NODE
case to:

            for c in root.get_childNodes():
                self.walk1(c)

However, this change really makes the distinction between walk() and
walk1() unnecessary.
walk() is basically there as a wrapper for walk1(), to get the root
element if it's a Document node; if we just traverse all the
children, this is consistent for any node type so walk() and walk1()
could be collapsed into one function.  This will break code that
subclasses Walker and overrides walk() or walk1() with something
customized.

What do people think should be done?  Just fix walk(), or merge walk()
and walk1()?

--
A.M. Kuchling			http://starship.skyport.net/crew/amk/
May you go safe, my friend, across that dizzy way / No wider than a hair, by
which your people go / From earth to Paradise; may you go safe today / With
stars and space above, and time and stars below.
    -- Lord Dunsany

From Mike.Olson@FourThought.com Sat Nov 28 22:32:07 1998
From: Mike.Olson@FourThought.com (Mike Olson)
Date: Sat, 28 Nov 1998 17:32:07 -0500
Subject: [XML-SIG] DOM walker class
References: <199811271555.KAA00331@mira.erols.com>
Message-ID: <366079E7.BEC0C587@FourThought.com>

A.M. Kuchling wrote:

> However, this change really makes the distinction between walk() and
> walk1() unnecessary.  walk() is basically there as a wrapper for
> walk1(), to get the root element if it's a Document node; if we just
> traverse all the children, this is consistent for any node type so
> walk() and walk1() could be collapsed into one function.  This will
> break code that subclasses Walker and overrides walk() or walk1() with
> something customized.
>
> What do people think should be done?  Just fix walk(), or merge walk()
> and walk1()?
>

I think they should be merged.  The current solution also does not
allow you to print comments, doc types, or anything else outside of
the root element....

If you are worried about breaking customizations on this interface,
you could define walk1 and just have it call walk until everyone gets
their code modified....

> --
> A.M.
> Kuchling			http://starship.skyport.net/crew/amk/
> May you go safe, my friend, across that dizzy way / No wider than a hair, by
> which your people go / From earth to Paradise; may you go safe today / With
> stars and space above, and time and stars below.
>     -- Lord Dunsany
>
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig

From H.Jansen@math.tudelft.nl Mon Nov 30 09:49:24 1998
From: H.Jansen@math.tudelft.nl (Henk Jansen)
Date: Mon, 30 Nov 1998 10:49:24 +0100 (MET)
Subject: [XML-SIG] Re: XML-SIG digest, Vol 1 #156 - 1 msg
In-Reply-To: <199811291700.MAA18483@python.org> from "xml-sig-admin@python.org" at Nov 29, 98 12:00:43 pm
Message-ID: <199811300949.KAA23957@dutita4.twi.tudelft.nl>

> A.M. Kuchling wrote:
>
> > However, this change really makes the distinction between walk() and
> > walk1() unnecessary.  walk() is basically there as a wrapper for
> > walk1(), to get the root element if it's a Document node; if we just
> > traverse all the children, this is consistent for any node type so
> > walk() and walk1() could be collapsed into one function.  This will
> > break code that subclasses Walker and overrides walk() or walk1() with
> > something customized.
> >
> > What do people think should be done?  Just fix walk(), or merge walk()
> > and walk1()?
> >
>
> I think they should be merged.  The current solution also does not allow you to
> print comments, doc types, or anything else outside of the root element....
>
> If you are worried about breaking customizations on this interface, you could
> define walk1 and just have it call walk until everyone gets their code
> modified....

Since the DOM hierarchy resembles much of a general tree type, I
wonder if the following walk method, which is quite general, could
serve as a DOM tree walker interface (Note: I haven't checked the DOM
code so I don't know how general it is already.
I'm also not sure if it will break code or not.):

    def walk (_, co):
        """
        Walk the tree with co a callable object having:
           .atleaf()
           .preorder()
           .postorder()
        methods.
        """
        assert ... _co_has_these_methods_ ...
        _._walk (co)

    def _walk (_, co, depth=1):
        __doc__ = Node.walk.__doc__
        for child in _.leaves:
            co.atleaf (child, depth)
        for child in _.branches:
            co.preorder (child, depth)
            child._walk (co, depth=depth+1)
            co.postorder (child, depth)
        return co

For instance, printing the tree goes as follows:

    import StringIO

    class TreeRepr:
        def __init__ (_):
            _.t = StringIO.StringIO ()
        def atleaf (_, child, depth):
            _.t.write ("--"*depth+`child`+'\n')
        def preorder (_, child, depth):
            _.t.write ("--"*depth+`child`+'\n')
        def postorder (_, child, depth):
            pass

    class Node:
        ... general tree node methods.
        def __repr__ (_):
            repr = TreeRepr()
            _._walk (repr)
            return repr.t.getvalue()

Depending on the callable object, all sorts of functionality can be
exploited when walking the tree: translation, checking, finding, etc.

The _walk procedure could be extended with a "break condition":

    def _walk (_, co, depth=1):
        __doc__ = Node.walk.__doc__
        for child in _.leaves:
            if co.atleaf (child, depth):
                # stop walking the current list of leaves
                break
        for child in _.branches:
            if co.preorder (child, depth):
                # stop walking the current branch
                break
            child._walk (co, depth=depth+1)
            if co.postorder (child, depth):
                # stop walking the current branch
                break
        return co

This means that walking the tree can be aborted when one of the co.
methods returns a true value.

Henk.
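
A runnable sketch of the callback walker with the break condition
described above, for readers who want to try it.  It assumes Python 3,
and the Node constructor, the TreeLines class and the stop_at
parameter are invented for the illustration; only the
atleaf/preorder/postorder protocol and the leaves/branches split come
from the message.

```python
class Node:
    """Hypothetical tree node with separate leaf and branch children."""

    def __init__(self, name, leaves=(), branches=()):
        self.name = name
        self.leaves = list(leaves)
        self.branches = list(branches)

    def walk(self, co, depth=1):
        """Walk the tree, calling co.atleaf/preorder/postorder.

        A true return value from a callback stops the loop it was
        called from, not the whole traversal.
        """
        for child in self.leaves:
            if co.atleaf(child, depth):
                break
        for child in self.branches:
            if co.preorder(child, depth):
                break
            child.walk(co, depth + 1)
            if co.postorder(child, depth):
                break
        return co


class TreeLines:
    """Callback object that records one indented line per node."""

    def __init__(self, stop_at=None):
        self.lines = []
        self.stop_at = stop_at

    def atleaf(self, child, depth):
        self.lines.append("--" * depth + child.name)
        # Ask the walker to skip the remaining leaves at this level.
        return child.name == self.stop_at

    def preorder(self, child, depth):
        self.lines.append("--" * depth + child.name)
        return False

    def postorder(self, child, depth):
        return False


tree = Node("root",
            leaves=[Node("a"), Node("b")],
            branches=[Node("sub", leaves=[Node("c")])])

full = tree.walk(TreeLines()).lines
cut = tree.walk(TreeLines(stop_at="a")).lines
print("\n".join(full))
```

With stop_at="a" the leaf b is skipped but the branch sub is still
walked, which shows that the break condition is local to one list of
children.  Aborting the entire walk would need the return value to be
passed back up through the recursive calls.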
--
-----------------------------------------------------------------------
| Henk Jansen  http://dutita0.twi.tudelft.nl/WAGM/people/H.Jansen.html |
| Delft University of Technology            | hjansen@math.tudelft.nl  |
| > Information Technology and Systems (ITS)| Mekelweg 4, 2628 CD      |
| >> Mathematics (TWI)                      | Delft, The Netherlands   |
| >>> Applied Analysis (TA)                 | phone: +31(0)15-278-7295 |
| >>>> Analysis of Large Scale Models (WAGM)| fax: +31(0)15-278-7209   |
-----------------------------------------------------------------------

From Fred L. Drake, Jr."
References: <199811271555.KAA00331@mira.erols.com>
Message-ID: <13923.758.240371.779901@weyr.cnri.reston.va.us>

A.M. Kuchling writes:
 > What do people think should be done?  Just fix walk(), or merge walk()
 > and walk1()?

  Merge the two to simply be walk().  This package doesn't have a 1.0
yet, so there's no compelling need to be particularly concerned about
backward compatibility.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.  Reston, VA  20191
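
A minimal sketch of what a merged walk() could look like, with walk1
kept as an alias along the lines Mike Olson suggested.  It is not the
Walker class from the XML package: it assumes Python 3 and the
standard xml.dom.minidom module, and the visit() hook and the seen
list are hypothetical names used only for this illustration.

```python
from xml.dom import minidom


class Walker:
    """Visit a node and all of its descendants in document order."""

    def __init__(self):
        self.seen = []

    def visit(self, node):
        # Hypothetical hook; a real walker would dispatch on node type.
        self.seen.append(node.nodeName)

    def walk(self, node):
        # No special case for Document nodes: every child is traversed,
        # so PIs and comments outside the root element are not lost.
        self.visit(node)
        for child in node.childNodes:
            self.walk(child)

    # Kept so that callers of the old name continue to work.
    walk1 = walk


doc = minidom.parseString("<?style sheet?><!--note--><root><a/>text</root>")
walker = Walker()
walker.walk(doc)
print(walker.seen)
```

The processing instruction and the comment that precede the root
element are both visited, which is the case the original walk()
missed.  Subclasses that overrode walk1() would still need updating,
since the recursion now goes through walk().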