From stuart.hungerford@webone.com.au Wed Mar 3 10:07:06 1999 From: stuart.hungerford@webone.com.au (Stuart Hungerford) Date: Wed, 3 Mar 1999 21:07:06 +1100 Subject: [XML-SIG] A bit off topic: XML advice needed... Message-ID: <001701be655d$9861e930$0301a8c0@restless.com> Hi all, This is the text of a message I posted earlier to comp.text.xml. Since the people on this list are such nice folks, I thought I'd ask your opinions too. I apologize in advance for an off-topic posting, but good from-the-trenches XML advice is hard to find. ------------------------------------------------------------- I have a question about designing XML DTD's that I suspect shows me trying to think of XML too much like a programming language, and not enough like a markup language. Anyways, suppose I have a DTD that describes elements for dates, and in particular the element. Now suppose I want to re-use my element definition in a DTD for marking up "events". Each event has a start date and end date. (Please forgive any terminology errors here). My first mental reaction (in a sort of pseudocode is): class date { ... }; class event { start_date : date; end_date : date; }; But how to express the XML equivalent? One way would be: Use an external ENTITY to textually include the date.dtd in events.dtd and define That way I get self-documenting names for the date elements, but at the cost of another level of markup needed for each date in all my XML documents that use events.dtd. I could also: Use an external ENTITY to textually include the date.dtd in events.dtd and define: That way I can create markup like: ... ... But now I have to check in an application whether there's exactly one start and end etc--I seem to have given up some validity checking benefits for a level of markup tags. 
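The application-side check mentioned above ("exactly one start and end") could look something like the sketch below. It is only an illustration in present-day Python using xml.etree.ElementTree. The element and attribute names (`event`, `date`, `role`) are invented for this example, since the declarations themselves are not shown, and `check_event` is a hypothetical helper rather than part of any XML-SIG package.

```python
import xml.etree.ElementTree as ET


def check_event(event):
    """Return a list of problems found in an <event> element (empty if none).

    Assumes (hypothetically) that each date is a <date> child carrying a
    role="start" or role="end" attribute.
    """
    problems = []
    for role in ("start", "end"):
        # Count the <date> children that claim this role.
        count = sum(1 for d in event.findall("date") if d.get("role") == role)
        if count != 1:
            problems.append(
                "expected exactly one %s date, found %d" % (role, count)
            )
    return problems


good = ET.fromstring(
    '<event><date role="start">1999-03-03</date>'
    '<date role="end">1999-03-04</date></event>'
)
bad = ET.fromstring(
    '<event><date role="start">1999-03-03</date>'
    '<date role="start">1999-03-04</date></event>'
)

print(check_event(good))
print(check_event(bad))
```

This is the trade-off described above in code form: the counting that a stricter content model would give for free from a validating parser moves into the application.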
There's probably lots of other ways to do this--the question is,
which are the "good" ways, where "good" in my case means
I can re-use existing definitions and still come out with a
DTD and documents that are not too hard to understand.

I guess I'm looking for the gang-of-four patterns book for
XML, but in the meantime can anyone share their wisdom
and experiences on this issue?

From Matt Gushee Wed Mar 3 10:08:22 1999
From: Matt Gushee (Matt Gushee)
Date: Wed, 3 Mar 1999 19:08:22 +0900
Subject: [XML-SIG] A bit off topic: XML advice needed...
In-Reply-To: <001701be655d$9861e930$0301a8c0@restless.com>
References: <001701be655d$9861e930$0301a8c0@restless.com>
Message-ID: <199903031008.TAA13056@crab91.it.osha.sut.ac.jp>

Stuart Hungerford writes:
 > Anyways, suppose I have a DTD that describes elements for
 > dates, and in particular the element. Now suppose I
 > want to re-use my element definition in a DTD for
 > marking up "events". Each event has a start date and end
 > date.

I'm not much more than a beginner myself, but my off-the-cuff approach
would be to make the start-date and end-date attributes of the , rather
than elements. If there's a particular reason you need them to be
elements, then maybe you could do something like

Afraid I can't really explain why I like this approach. Undoubtedly,
there are people with more expertise who will give you better answers.

Matt Gushee
Oshamanbe, Hokkaido, Japan

From betty@eccnet.eccnet.com Wed Mar 3 13:34:25 1999
From: betty@eccnet.eccnet.com (Betty Harvey)
Date: Wed, 3 Mar 1999 08:34:25 -0500 (EST)
Subject: [XML-SIG] A bit off topic: XML advice needed...
In-Reply-To: <001701be655d$9861e930$0301a8c0@restless.com>
Message-ID:

Hi Stuart:

Your question touches on a very important question in XML about
when to use elements and when to use attributes. As you are
experiencing, it is not always a 'cut and dried' decision.
My rule of thumb which usually works out to be 90% reliable is if the information is going to be displayed use an element. However, there are nuances and exceptions which you have found. I can't solve the question you have below but maybe I can give you some practical advice and have you answer some questions about the application: 1. Will people be creating the start-date and end-date? If the answer is yes, you may want to think about creating the elements and . You can control the input from the user better by creating explicit elements. An exception to the above though is if you are going to be supplying a static form-based application for the end-user to complete. If this is the case, the form can supply the input rules and the application can perform the mark-up. 2. Will the application create the dates? If dates are going to be created by the application without 'the person in-the-loop', I would tend to make the 'start' and 'end' as attributes of the element. The application itself can ensure that the required information is available to provide proper tagging. When you are creating your DTD, every situation tends to be a little different and it requires some analysis of the processes that will be involved in the final application. Hope this helps. Betty /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/ Betty Harvey | Phone: 301-540-8251 FAX: 4268 Electronic Commerce Connection, Inc. | 13017 Wisteria Drive, P.O. Box 333 | Germantown, Md. 20874 | harvey@eccnet.eccnet.com | Washington,DC SGML/XML Users Grp URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/ /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/ On Wed, 3 Mar 1999, Stuart Hungerford wrote: > Hi all, > > This is the text of a message I posted earlier to > comp.text.xml. Since the people on this list are > such nice folks, I thought I'd ask your opinions > too. 
>
>
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig
>

From Fred L. Drake, Jr."
References: <001701be655d$9861e930$0301a8c0@restless.com>
Message-ID: <14045.27224.282232.478763@weyr.cnri.reston.va.us>

Stuart Hungerford writes:
 > My first mental reaction (in a sort of pseudocode is):
 >
 > class date
 ...
 > class event

Stuart,
  Another possibility you may want to seriously consider (and may have
thought of by now), but that hasn't been proposed, would be to use a
notation to specify that something is a date (specifying a data type),
and elements (or attributes) for the structural aspects:

Your document might then look like this:

1999-03-03 1999-03-04 ...

The concerns brought up by Betty and Matt still apply.

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.
Reston, VA 20191

From wunder@infoseek.com Wed Mar 3 18:27:27 1999
From: wunder@infoseek.com (Walter Underwood)
Date: Wed, 03 Mar 1999 10:27:27 -0800
Subject: [XML-SIG] A bit off topic: XML advice needed...
In-Reply-To: <001701be655d$9861e930$0301a8c0@restless.com>
Message-ID: <3.0.5.32.19990303102727.03dcf9f0@corp>

At 09:07 PM 3/3/99 +1100, Stuart Hungerford wrote:
>
>Anyways, suppose I have a DTD that describes elements for
>dates, and in particular the element. Now suppose I
>want to re-use my element definition in a DTD for
>marking up "events". Each event has a start date and end
>date. (Please forgive any terminology errors here).

A different approach is to use a date-range format in the data,
rather than in the XML. Here is an ISO 8601 time period with a
specific start date:

   1985-04-12T23:20:50/P1Y2M15DT12H

That is a period that starts at 23:20:50 on 12 April 1985, and
lasts for one year, two months, 15 days, and 12 hours.
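As a rough illustration of how an application might pull such a value apart, here is a minimal sketch in present-day Python. The function name `parse_period` and the regular expression are assumptions made for this example. It handles only the start/duration form shown above, with whole-number components, and not the full ISO 8601 grammar.

```python
import re
from datetime import datetime

# Duration part only, e.g. "P1Y2M15DT12H". Every component is optional;
# weeks and fractional values are deliberately not handled in this sketch.
DURATION = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_period(text):
    """Split 'start/duration' into a datetime and a dict of components."""
    start_text, duration_text = text.split("/", 1)
    start = datetime.fromisoformat(start_text)
    match = DURATION.match(duration_text)
    if match is None:
        raise ValueError("unsupported duration: %r" % duration_text)
    # Keep only the components that were actually present.
    duration = {
        name: int(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    return start, duration


start, duration = parse_period("1985-04-12T23:20:50/P1Y2M15DT12H")
print(start)
print(duration)
```

The duration is kept as separate components on purpose: turning "one year, two months" into an end date needs calendar-aware arithmetic, which this sketch does not attempt.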
See a summary of the ISO spec here:

   http://www.cl.cam.ac.uk/~mgk25/iso-time.html

ISO 8601:1988 here:

   http://www.iso.ch/markete/8601.pdf

and ISO 8601:1998 (draft) here:

   http://www.cl.cam.ac.uk/~mgk25/8601v04.pdf

W3.org recommends a subset of the ISO date format for web use.
That subset ("profile") is online at www.w3.org, somewhere.
It does not include the part of the spec dealing with time periods.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946

From Stuart Hungerford"

Hi again,

Many thanks to the people who sent me very thoughtful
responses to my earlier question about "design"
alternatives in XML.

At this point, I'm starting to realize there's another
dimension to this issue, and that is of DTD re-use.

I'm wondering if it's possible to apply markup design
principles on one hand (e.g. using an attribute vs.
an element) and also achieve a re-useable and
easily grown DTD?

Now if I could just get a Python to XML translator
I could do it all in Python... ;-)

From l.szyster@ibm.net Thu Mar 4 09:16:17 1999
From: l.szyster@ibm.net (Laurent Szyster)
Date: Thu, 04 Mar 1999 10:16:17 +0100
Subject: [XML-SIG] A bit off topic: XML advice needed...
References: <005d01be65e6$3d93ea60$9d462c8a@act.cmis.csiro.au>
Message-ID: <36DE4F61.973B8CE0@ibm.net>

Stuart Hungerford wrote:
>
> Now if I could just get a Python to XML translator
> I could do it all in Python... ;-)

Do you mean something like the Pickle module, but producing XML
instead of a custom text format?

Since the XML documents should reflect the constraints specific to
each instance's class, such a translator will need to produce both
the DTD's and the documents, for each 'root' element.

All instance members of type String, Int, Long, Float and None should
be translated as ATTRIBUTES while instances and collections (such as
tuple, dict and list) will be translated as ELEMENTS.
The distinction is based on whether the object can contain others or not.

To allow the translator to produce a DTD _and_ reuse existing element
definitions, we may add "hidden" class members (starting with "_xml")
holding the element and attribute definitions. Like,

    class date:
        _xml_ELEMENT = ('#CDATA',)
        _xml_ATTLIST = ('notation NOTATION', 'value IDREF #REQUIRED')
        notation = "iso8601-w3c"   # this is the default for the
                                   # 'notation' attribute
        def __init__(self, value=None):
            self.value = value
        ...

    class event:
        _xml_ELEMENT = ('startdate', 'enddate')
        _xml_ATTLIST = ('name',)
        name = "unknown"
        def __init__(self):
            self.name = 'noname'
            self.startdate = date()
            self.enddate = date()
        ...

Would result in the following document with its DTD included:

]>
... here comes the document root ....

The nice thing with this design is that it may allow adding XML
pickling to existing classes, at a relatively low cost in terms of
work (because DTD's are generated automagically).

Any volunteers? ;-)

Laurent

From Fred L. Drake, Jr."
References: <005d01be65e6$3d93ea60$9d462c8a@act.cmis.csiro.au>
Message-ID: <14046.37313.581924.655649@weyr.cnri.reston.va.us>

Stuart Hungerford writes:
 > I'm wondering if it's possible to apply markup design
 > principles on one hand (e.g. using an attribute vs.
 > an element) and also achieve a re-useable and
 > easily grown DTD?

Stuart,
  If DTD design is new for you, I'd start by A) just doing it, and B)
reading about it. There are at least three books I'm aware of: Eve
Maler's "Designing SGML DTDs" (which is pretty good), Dave Megginson's
"Structuring XML Documents" (which I haven't had time to read), and
Rick Jelliffe's "The XML/SGML Cookbook" (which I also have not had
time to read).
  Once you've done that a bit, the issues begin to become more clear,
and you'll see how the difficulties and alternatives interact. I'd
have to say this is far more art than science!

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr.
Reston, VA 20191

From Jeff.Johnson@icn.siemens.com Thu Mar 4 23:32:13 1999
From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com)
Date: Thu, 4 Mar 1999 18:32:13 -0500
Subject: [XML-SIG] HtmlBuilder
Message-ID: <8525672A.0080F99B.00@li01.lm.ssc.siemens.com>

Hi all,

I use a program for which I do not have the source code to convert RTF to
HTML. I then use xml.dom to reformat it, add navigation bars, fix links,
etc. Very rarely, the RTF to HTML converter will throw an end tag into a
document without a preceding start tag. This causes HtmlBuilder to start
popping elements off its stack while looking for the matching start tag,
including the outermost elements. When it runs out of stack it happily
continues and the DOM is created. Unfortunately, each following element
that should have been nested inside the document element then becomes a
sibling of the document element. This produces an invalid DOM document
but no exceptions are thrown. Eventually I do something with the DOM that
calls the method Node.get_documentElement which raises a
HierarchyRequestException because there is more than one root element.

Since I don't have the source code for the RTF to HTML converter, I can't
fix it. I did however add two lines of code to HtmlBuilder that will allow
these bogus end tags to be ignored. I hope that the new code can be added
to the CVS tree.

If this were an XML document I would rather raise an exception and reject
the document. XML should be perfectly well formed. Since it is an HTML
document I am more inclined to fix what can be fixed logically because
there are already so many invalid HTML files in the world. I often run
into this same problem when processing hand made HTML files. This
modification might allow the XML-SIG DOM implementation to be used to
clean up the existing HTML mess.
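The idea described above, skipping an end tag that has no open start tag instead of unwinding the whole stack, can be tried out independently of HtmlBuilder. The sketch below uses `html.parser.HTMLParser` from the present-day Python standard library, not the 1999 XML-SIG code. The class `TolerantStack` and its `ignored` list are names invented for this illustration, and void elements such as BR are not special-cased the way the real builder's list of empty tags handles them.

```python
from html.parser import HTMLParser


class TolerantStack(HTMLParser):
    """Track open elements, skipping end tags that were never opened."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open tags, innermost last
        self.ignored = []   # stray end tags that were skipped

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            # No matching start tag anywhere: leave the stack alone.
            self.ignored.append(tag)
            return
        # Close everything down to and including the matching start tag.
        while self.stack.pop() != tag:
            pass


parser = TolerantStack()
parser.feed("<html><body><p>text</b></p></body>")
parser.close()
print(parser.stack)
print(parser.ignored)
```

Without the membership test, the stray end tag would pop every open element while searching for a start tag that is not there, which is the runaway behaviour described in the message.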
I added the following three lines:

    if tag not in self.stack:
        #print "ignoring end tag with no start", tag
        break

to the following method of HtmlBuilder:

    def unknown_endtag(self, tag):
        tag = string.upper(tag)
        #print 'ending', tag

        while self.stack:
            if tag not in self.stack:
                #print "ignoring end tag with no start", tag
                break
            if tag in self.empties:
                continue
            start_tag = self.stack[-1]
            del self.stack[-1]
            Builder.endElement(self, start_tag)
            if start_tag == tag:
                break

The entire file is attached:

(See attached file: html_builder.py)

Cheers,
Jeff

From Stuart Hungerford"
>Stuart Hungerford wrote:
>>
>> Now if I could just get a Python to XML translator
>>
I could do it all in Python... ;-) > >Do you mean something like the Pickle module, but producing XML >instead of custom text format? Yes -- I was only half joking when I wrote that. Being frustrated with XML I was yearning for the clean, simple object model, inheritance and module system that I've come to love in Python. I wonder if it's possible, if we assume the DTD is created independently and that there's a simple mapping from Python classes and methods to elements and instance members to attributes? I know, I need to get out more... Stu From randrus@edocs.com Fri Mar 5 00:08:01 1999 From: randrus@edocs.com (ross andrus) Date: Thu, 04 Mar 1999 19:08:01 -0500 Subject: [XML-SIG] Re: A bit off topic: XML advice needed... (Laurent Szyster) References: <009401be6697$195be420$9d462c8a@act.cmis.csiro.au> Message-ID: <36DF2061.9D3D103B@edocs.com> Stuart Hungerford wrote: > > >Stuart Hungerford wrote: > >> > >> Now if I could just get a Python to XML translator > >> I could do it all in Python... ;-) > > > >Do you mean something like the Pickle module, but producing XML > >instead of custom text format? > > Yes -- I was only half joking when I wrote that. Being frustrated > with XML I was yearning for the clean, simple object model, > inheritance and module system that I've come to love in Python. > > I wonder if it's possible, if we assume the DTD is created > independently and that there's a simple mapping from Python > classes and methods to elements and instance members to > attributes? > > I know, I need to get out more... If you haven't seen it, this might be interesting... 
http://jabr.ne.mediaone.net/documents/xmop.htm > Stu > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://www.python.org/mailman/listinfo/xml-sig -- Ross Andrus randrus@edocs.com eDocs, Inc 508.651.8889 x2104 From jtauber@jtauber.com Fri Mar 5 13:17:26 1999 From: jtauber@jtauber.com (James Tauber) Date: Fri, 5 Mar 1999 21:17:26 +0800 Subject: [XML-SIG] A bit off topic: XML advice needed... Message-ID: <00a301be670a$880deb20$0300000a@othniel.cygnus.uwa.edu.au> >But how to express the XML equivalent? One way would be: > > Use an external ENTITY to textually include the date.dtd in > events.dtd and define > > > > > > That way I get self-documenting names for the date elements, > but at the cost of another level of markup needed for each date > in all my XML documents that use events.dtd. Don't be afraid to take this sort of approach where embedded element types seems to alternate between entity (in the analysis sense, not SGML/XML) and relationship. There are a lot of advantages as you are explicitly indicating that events have start-date and end-dates and that these are, in turn, dates. I would also consider using NOTATIONs on the data elements if you need to tell your application what particular date format you are using in any given instance. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net From Fred L. Drake, Jr." References: <8525672A.0080F99B.00@li01.lm.ssc.siemens.com> Message-ID: <14048.25457.35824.951369@weyr.cnri.reston.va.us> Jeff.Johnson@icn.siemens.com writes: > it. I did however add two lines of code to HtmlBuilder that will allow > these bogus end tags to be ignored. I hope that the new code can be added Jeff, I think this would be fine. 
Perhaps a parameter to the constructor, or just an instance variable, could be used to determine whether an exception is raised or the problem silently ignored. > def unknown_endtag(self, tag): > tag = string.upper(tag) > #print 'ending', tag > > while self.stack: > if tag not in self.stack: > #print "ignoring end tag with no start", > tag > break ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The whole bit that you added could be moved before the while loop; that would avoid the extra overhead without loss of functionality. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 From Jean-Michel.Bruel@univ-pau.fr Mon Mar 8 10:53:56 1999 From: Jean-Michel.Bruel@univ-pau.fr (Jean-Michel BRUEL) Date: Mon, 8 Mar 1999 11:53:56 +0100 (MET) Subject: [XML-SIG] [CFP:] UML'99 (2nd Call For Paper) Message-ID: <199903081053.LAA18817@crisv4.univ-pau.fr> [apologies if you receive multiple copies of this announcement] ================================================================= 2nd Call for Papers <>'99 ================================================================= Second International Conference on the Unified Modeling Language October 28-30, 1999, Fort Collins, Colorado, USA (just before OOPSLA) ================================================================= http://www.cs.colostate.edu/UML99 ================================================================= Important dates (deadlines are hard!): Deadline for abstract 05 May 1999 Deadline for submission 15 May 1999 Notification to authors 15 July 1999 Final version of accepted papers 25 August 1999 Submissions: Submit your 10-15 page manuscript electronically in Postscript or pdf using the Springer LNCS style. Details are available at the conference web page. The <>'99 proceedings will be published by Springer-Verlag in the LNCS series. Further Information: Robert B. 
France E-mail: france@cs.colostate.edu Computer Science Department Tel: 970-491-6356 Colorado State University Fax: 970-491-2466 Fort Collins, CO 80523, USA Bernhard Rumpe E-mail: rumpe@in.tum.de Institut fuer Informatik Tel: 0049-89-289-28129 T. Universitaet Muenchen Fax: 0049-89-289-28183 80290 Muenchen, Germany Sponsored by IEEE Computer Society Technical Committee on Complexity in Computing In Cooperation with ACM SIGSOFT With the Support of OMG From Dickon.Reed@cl.cam.ac.uk Mon Mar 8 15:29:32 1999 From: Dickon.Reed@cl.cam.ac.uk (Dickon Reed) Date: Mon, 08 Mar 1999 15:29:32 +0000 Subject: [XML-SIG] newbie HTMLBuilder question; avoiding HTML escaping Message-ID: Hello, I've recently started playing about with the python XML stuff. I'm currently trying to build, using HTMLBuilder, a HTML document containing things like non-breaking spaces (ie a   in the HTML). I can't however find a simple way of getting this through the HTMLBuilder without it being escaped by the doText method of XmlWriter. Am I missing something obvious? What is the best way of try to do this? (As a hacky workaround, I added magic characters to the start of my strings by means of a new method in XML builder, and hacked XmlWriter doText not to call escape if it detects the magic characters, but this is clearly inelegant. I'd attached patches if this wasn't such a nasty hack). (If anyone is interested, an early version of my code is at http://www.cl.cam.ac.uk/~dr10009/xmltophys-0.01.tar.gz, so people can see what I'm doing). Thanks, Dickon From akuchlin@cnri.reston.va.us Mon Mar 8 15:41:04 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 8 Mar 1999 10:41:04 -0500 (EST) Subject: [XML-SIG] newbie HTMLBuilder question; avoiding HTML escaping In-Reply-To: References: Message-ID: <14051.60894.791821.390962@amarok.cnri.reston.va.us> Dickon Reed writes: >HTML). 
I can't however find a simple way of getting this through the >HTMLBuilder without it being escaped by the doText method of >XmlWriter. Am I missing something obvious? What is the best way of try >to do this? HTMLBuilder is trying to produce a proper DOM tree, and therefore the right thing to do is to create EntityReference nodes. Unfortunately, neither the HTMLBuilder class, nor the Builder class from which HTMLBuilder derives, provides a method to do that. This is an omission in the Builder classes. The solution is, therefore, to add a handle_entityref() method to HTMLBuilder's parsing, and an entityReference() method to the Builder class. I'll work on this tonight, and try to get the changes checked in before tomorrow. -- A.M. Kuchling http://starship.python.net/crew/amk/ What on earth is less likely than *two* committees to produce a seamless web of anything but intrigue and deficit? Who said "three committees"? -- Stan Kelly-Bootle, _The Computer Contradictionary_ From Jeff.Johnson@icn.siemens.com Mon Mar 8 17:24:27 1999 From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com) Date: Mon, 8 Mar 1999 12:24:27 -0500 Subject: [XML-SIG] HtmlBuilder Message-ID: <8525672E.005F38BA.00@li01.lm.ssc.siemens.com> Oops, thanks for finding the overhead. I figured someone could have optimized it somehow but I should have caught that at least :) I like the idea of allowing the user to toggle raising exceptions or ignoring the error. Thanks again, Jeff "Fred L. Drake" on 03/05/99 06:06:25 PM Please respond to "Fred L. Drake, Jr." To: Jeff Johnson/Service/ICN cc: xml-sig@python.org Subject: Re: [XML-SIG] HtmlBuilder Jeff.Johnson@icn.siemens.com writes: > it. I did however add two lines of code to HtmlBuilder that will allow > these bogus end tags to be ignored. I hope that the new code can be added Jeff, I think this would be fine. 
Perhaps a parameter to the constructor, or just an instance variable, could be used to determine whether an exception is raised or the problem silently ignored. > def unknown_endtag(self, tag): > tag = string.upper(tag) > #print 'ending', tag > > while self.stack: > if tag not in self.stack: > #print "ignoring end tag with no start", > tag > break ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The whole bit that you added could be moved before the while loop; that would avoid the extra overhead without loss of functionality. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 From akuchlin@cnri.reston.va.us Mon Mar 8 17:31:03 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 8 Mar 1999 12:31:03 -0500 (EST) Subject: [XML-SIG] HtmlBuilder In-Reply-To: <8525672E.005F38BA.00@li01.lm.ssc.siemens.com> References: <8525672E.005F38BA.00@li01.lm.ssc.siemens.com> Message-ID: <14052.1946.956463.843024@amarok.cnri.reston.va.us> Jeff.Johnson@icn.siemens.com writes: [on ignoring bogus end tags] >I like the idea of allowing the user to toggle raising exceptions or >ignoring the error. Indeed. Here's a proposal for the interface: b = HTMLBuilder( ignore_mismatched_end_tags = 1 ) (This would be implemented by allowing keyword arguments to the constructor, and saving a copy of the keyword dictionary. That allows for new options in future, and saves the constructor from having lots of lines like self.ignore_mismatched_end_tags = ignore_mismatched_end_tags .) The default behaviour would be to raise an exception, which is in keeping with Python's general philosophy. -- A.M. Kuchling http://starship.python.net/crew/amk/ Athens built the Acropolis. Corinth was a commercial city, interested in purely materialistic things. Today we admire Athens, visit it, preserve the old temples, yet we hardly ever set foot in Corinth. -- Harold Urey From Fred L. Drake, Jr." 
References: <8525672E.005F38BA.00@li01.lm.ssc.siemens.com> <14052.1946.956463.843024@amarok.cnri.reston.va.us> Message-ID: <14052.2632.589988.940343@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > b = HTMLBuilder( ignore_mismatched_end_tags = 1 ) > > (This would be implemented by allowing keyword arguments to > the constructor, and saving a copy of the keyword dictionary. That > allows for new options in future, and saves the constructor from > having lots of lines like > self.ignore_mismatched_end_tags = ignore_mismatched_end_tags .) And completely avoids the checks for matching parameter keywords to formal parameter names. With explicitly named options, errors can be caught earlier. > The default behaviour would be to raise an exception, which is > in keeping with Python's general philosophy. I agree. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 From Jeff.Johnson@icn.siemens.com Mon Mar 8 18:08:10 1999 From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com) Date: Mon, 8 Mar 1999 13:08:10 -0500 Subject: [XML-SIG] HtmlBuilder Message-ID: <8525672E.006339EE.00@li01.lm.ssc.siemens.com> Another problem I often see in hand-made HTML files is improperly nested tags. I wonder if there is a clean way to detect those errors? Example: Bold italic text with end tags in wrong order. Actually, I guess the ignore_mismatched_end_tags will fix this specific error. When the tag is read it will pop off the open tag as it always did, then the tag will be ignored with the new fix. Never mind :) Thanks for adding this to the CVS tree! I've already told my HTML hacker friends that I can now work with and fix their buggy web pages for them. A few months back I swore to them that Python XML was the greatest tool I knew of to change their banner code but then I couldn't read half of the files due to bad end tags. Cheers, Jeff "Andrew M. 
Kuchling" on 03/08/99 12:31:03 PM To: xml-sig@python.org cc: (bcc: Jeff Johnson/Service/ICN) Subject: Re: [XML-SIG] HtmlBuilder Jeff.Johnson@icn.siemens.com writes: [on ignoring bogus end tags] >I like the idea of allowing the user to toggle raising exceptions or >ignoring the error. Indeed. Here's a proposal for the interface: b = HTMLBuilder( ignore_mismatched_end_tags = 1 ) (This would be implemented by allowing keyword arguments to the constructor, and saving a copy of the keyword dictionary. That allows for new options in future, and saves the constructor from having lots of lines like self.ignore_mismatched_end_tags = ignore_mismatched_end_tags .) The default behaviour would be to raise an exception, which is in keeping with Python's general philosophy. -- A.M. Kuchling http://starship.python.net/crew/amk/ Athens built the Acropolis. Corinth was a commercial city, interested in purely materialistic things. Today we admire Athens, visit it, preserve the old temples, yet we hardly ever set foot in Corinth. -- Harold Urey _______________________________________________ XML-SIG maillist - XML-SIG@python.org http://www.python.org/mailman/listinfo/xml-sig From Fred L. Drake, Jr." References: <8525672E.006339EE.00@li01.lm.ssc.siemens.com> Message-ID: <14052.4981.493573.852261@weyr.cnri.reston.va.us> Jeff.Johnson@icn.siemens.com writes: > Another problem I often see in hand-made HTML files is improperly nested > tags. I wonder if there is a clean way to detect those errors? ... > Actually, I guess the ignore_mismatched_end_tags will fix this specific > error. When the tag is read it will pop off the open tag as it > always did, then the tag will be ignored with the new fix. Never mind Jeff, Well, this specific error would be handled, but not a lot of variations. That's probably something best handled by a sub-class, as you'll probably identify a lot of weird cases, and many may be specific to the group of authors you're supporting. 
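[Editorial sketch] The sub-class idea discussed in this thread could look roughly like the following. This is only a minimal illustration written against the standard library's html.parser.HTMLParser, not against the HTMLBuilder class from the XML-SIG distribution, whose internals are not shown here. The names TolerantParser, MismatchedEndTag and handle_mismatched_end_tag are hypothetical; only the ignore_mismatched_end_tags option name is taken from the proposal quoted above. The option is an explicitly named parameter, and the membership test runs once before the stack is unwound.

```python
from html.parser import HTMLParser


class MismatchedEndTag(Exception):
    """Hypothetical error for an end tag that has no open start tag."""


class TolerantParser(HTMLParser):
    """Tracks open tags and can skip end tags that were never opened."""

    def __init__(self, ignore_mismatched_end_tags=False):
        super().__init__()
        # Explicitly named option, so a misspelt keyword fails at call time.
        self.ignore_mismatched_end_tags = ignore_mismatched_end_tags
        self.stack = []    # currently open tags, innermost last
        self.ignored = []  # end tags that were skipped

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Check once, before unwinding the stack.
        if tag not in self.stack:
            self.handle_mismatched_end_tag(tag)
            return
        # Pop until the matching start tag has been closed.
        while self.stack.pop() != tag:
            pass

    def handle_mismatched_end_tag(self, tag):
        # Hook that a sub-class may override, e.g. to print a warning.
        if not self.ignore_mismatched_end_tags:
            raise MismatchedEndTag(tag)
        self.ignored.append(tag)


parser = TolerantParser(ignore_mismatched_end_tags=True)
parser.feed("<b><i>bold italic</b></i>")
parser.close()
print(parser.stack, parser.ignored)
```

With the flag left at its default, the stray end tag raises MismatchedEndTag instead of being recorded. The sketch does not treat void elements such as a bare BR specially, so those would stay on the stack; a real fixer class would need a table of such cases.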
Grail includes a few hints buried throughout the HTMLParser and GrailHTMLParser classes, but no concise listing of the problems and workarounds we developed. I think having an "HTML Fixer" class would be really nice! As an aside: expect a new Grail release in the near future. There are a few maintenance issues, and a license that will make more people happy! -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives 1895 Preston White Dr. Reston, VA 20191 From deltab@ps.cus.umist.ac.uk Mon Mar 8 18:39:06 1999 From: deltab@ps.cus.umist.ac.uk (Daniel Biddle) Date: Mon, 8 Mar 1999 18:39:06 +0000 (GMT) Subject: [XML-SIG] HtmlBuilder In-Reply-To: <14052.4981.493573.852261@weyr.cnri.reston.va.us> Message-ID: On Mon, 8 Mar 1999, Fred L. Drake wrote: > Jeff.Johnson@icn.siemens.com writes: > > Another problem I often see in hand-made HTML files is improperly nested > > tags. I wonder if there is a clean way to detect those errors? : > [...] I think having an "HTML Fixer" class would be really nice! You might want to take a look at HTML Tidy then: | When editing HTML it's easy to make mistakes. Wouldn't it be nice if | there was a simple way to fix these mistakes automatically and tidy up | sloppy editing into nicely layed out markup? Well now there is! Dave | Raggett's HTML TIDY is a free utility for doing just that. It also works | great on the atrociously hard to read markup generated by specialized HTML | editors and conversion tools, and can help you identify where you need | to pay further attention on making your pages more accessible to people with | disabilities. Best of all, it comes with complete source, so it shouldn't be too hard to make a HTML Fixer class out of it. You can find HTML Tidy at . -- Daniel Biddle From Fred L. Drake, Jr." References: <14052.4981.493573.852261@weyr.cnri.reston.va.us> Message-ID: <14052.6899.349323.101431@weyr.cnri.reston.va.us> I said: > [...] I think having an "HTML Fixer" class would be really nice! 
Daniel Biddle replied:
> You might want to take a look at HTML Tidy then:

Yes, I'm aware of HTML Tidy. I was thinking not so much of cleaning up the HTML for a site as of being able to load it into an arbitrary program. I've no beef with Dan, but having it as a Python class can be useful as well; this would definitely be nice in the context of a Web browser or, perhaps, a program that performs transformations of pages, or even converts HTML documents to some other format (PostScript leaps to mind). There's a certain impedance mismatch between stylesheet application and ill-formed documents.

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives
1895 Preston White Dr. Reston, VA 20191

From akuchlin@cnri.reston.va.us Mon Mar 8 20:17:01 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 8 Mar 1999 15:17:01 -0500 (EST)
Subject: [XML-SIG] HtmlBuilder
In-Reply-To: <14052.6899.349323.101431@weyr.cnri.reston.va.us>
References: <14052.4981.493573.852261@weyr.cnri.reston.va.us> <14052.6899.349323.101431@weyr.cnri.reston.va.us>
Message-ID: <14052.11711.539764.180720@amarok.cnri.reston.va.us>

Fred L. Drake writes:
> Yes, I'm aware of HTML Tidy. I was thinking not so much of cleaning
>up the HTML for a site as being able to load it into an arbitrary
>program. I've no beef with Dan, but having it as a Python class can
>be useful as well; this would definitely be nice in the context of a

Indeed. In particular, it would be useful for Web discussion forums and other applications where users can produce HTML to be included. Some sites, such as slashdot.org, attempt to restrict the tags the users can use; you can't use
, for example.  But that
doesn't prevent a user entering an unclosed 
    tag, which will mess up the rest of the page. It would be far more powerful to parse the possibly-bogus HTML and produce a well-formed rendering of it. Unclosed tags could be handled now; just use a forgiving version of HTMLBuilder to get a DOM tree, and output well-formed HTML from the tree. But that wouldn't handle invalid HTML (like using
<LI> outside of <UL> or <OL>
        ) or style that's bad but legal (images without ALT attributes). I've been toying with the idea of converting my Web pages to XML-compatible HTML for a while, and may play with this a bit. -- A.M. Kuchling http://starship.python.net/crew/amk/ I am so scared. It's strange. For many thousand years I have prayed for death. I have prayed to all the gods for peace and relief and... I have prayed for an ending. -- Orpheus, in SANDMAN #49: "Brief Lives:9" From jps@warpspeed.net Tue Mar 9 07:59:07 1999 From: jps@warpspeed.net (Jeff Stearns) Date: Mon, 8 Mar 1999 23:59:07 -0800 Subject: [XML-SIG] Trouble with DLL file in XML distribution Message-ID: <000001be6a02$b5575170$3f02a8c0@norton.warpspeedcom.com> Folks - I'm reporting trouble with the XML beta source distribution from http://www.python.org/topics/xml/download.html. After unpacking, I'm trying to run the various test suites. This reveals trouble with the DLL file in the windows subdirectory. The Python interpreter reports that windows/sgmlop.dll is not a valid Windows NT DLL. The error code is 193. I'm running Windows NT 4.0 (service pack 3) on a Pentium. Sadly, I don't have the tools to regenerate this dll myself. Jeff Stearns / WarpSpeed Communications / 925-398-1048 / jps@warpspeed.net From Jacco.van.Ossenbruggen@cwi.nl Tue Mar 9 15:39:12 1999 From: Jacco.van.Ossenbruggen@cwi.nl (J.R. van Ossenbruggen) Date: Tue, 09 Mar 1999 16:39:12 +0100 Subject: [XML-SIG] Q: Wrapping DOMs Message-ID: Hello, My application provides a DOM-based API to its internal data-structure. This data-structure represents the XML document loaded into the application, but does not have the same hierarchical structure. What I want to do is to wrap the current DOM interface (DOM1) by a new one (DOM2), so that DOM2 reflects the structure of the XML document, and not that of my internal data-structure. 
Changes to the data-structure made by the application should be reflected in (both) DOMs, and the modification of the internal data-structure via the DOM(s) should also be supported. Any comments (has someone done something similar before?) are welcome... Jacco From jfarr@real.com Tue Mar 9 19:21:41 1999 From: jfarr@real.com (Jonothan Farr) Date: Tue, 9 Mar 1999 11:21:41 -0800 Subject: [XML-SIG] Re: HtmlBuilder Message-ID: <053101be6a62$0fc376c0$eb0210ac@two235.dev.prognet.com> >Jeff.Johnson@icn.siemens.com writes: > [on ignoring bogus end tags] >>I like the idea of allowing the user to toggle raising exceptions or >>ignoring the error. > >Indeed. Here's a proposal for the interface: > >b = HTMLBuilder( ignore_mismatched_end_tags = 1 ) >From: "Fred L. Drake" >To: Jeff.Johnson@icn.siemens.com > >Jeff, > Well, this specific error would be handled, but not a lot of >variations. That's probably something best handled by a sub-class, as >you'll probably identify a lot of weird cases, and many may be >specific to the group of authors you're supporting. If you're going to subclass, you could also just call a handle_mismatched_end_tag() method which raises an exception in the base class, but can be overridden in the derived class to do nothing or perhaps just print out a warning. --jfarr From jps@warpspeed.net Wed Mar 10 01:02:02 1999 From: jps@warpspeed.net (Jeff Stearns) Date: Tue, 9 Mar 1999 17:02:02 -0800 Subject: [XML-SIG] Trouble report #2 for XML beta source distribution Message-ID: <000a01be6a91$9b77bb70$3f02a8c0@norton.warpspeedcom.com> Folks - I'm reporting trouble with the XML beta source distribution from http://www.python.org/topics/xml/download.html. I'm trying to run the various demos and test suites. They generally fail on my NT box. One problem is that most of the imports fail because they use Python package syntax, and I think that the distribution is missing some files needed for packages to work. 
For example, I think that the collection should include a file xml/sax/__init__.py which contains the single line __all__ = ['saxexts', 'saxlib', 'saxutils'] Likewise, there should be an __init__.py file for the dom subdirectory. Other directories may be affected; I haven't gotten that far. Jeff Stearns / WarpSpeed Communications / 925-398-1048 / jps@warpspeed.net From jps@warpspeed.net Wed Mar 10 01:32:22 1999 From: jps@warpspeed.net (Jeff Stearns) Date: Tue, 9 Mar 1999 17:32:22 -0800 Subject: [XML-SIG] Trouble report #3 for XML beta source distribution Message-ID: <000b01be6a95$d83f21c0$3f02a8c0@norton.warpspeedcom.com> Folks - I'm reporting trouble with the XML beta source distribution from http://www.python.org/topics/xml/download.html. I believe that there's an error in the file xml/test/quotations.xml. The start tag doesn't match the DOCTYPE in the DTD. I don't think that's an intentional error. To fix this bug, the contents of the file should be enclosed within ... tags as specified by the DTD. Jeff Stearns / WarpSpeed Communications / 925-398-1048 / jps@warpspeed.net From tony.mcdonald@ncl.ac.uk Wed Mar 10 09:32:35 1999 From: tony.mcdonald@ncl.ac.uk (Tony McDonald) Date: Wed, 10 Mar 1999 09:32:35 +0000 Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML Message-ID: Hi all, I've recently fallen onto 'XML in python' from a 'XML in Perl/PHP' perspective and after downloading and playing with the XML-0.5 release I'm mightily impressed!. As I'm new to Python, I keep finding things that make me go 'ooo, thats neat!'. One thing I really need to do is to 'query' XML documents, and extract portions of them. In Perl I would use the XQL module, based on the XQL http://www.w3.org/TandS/QL/QL98/pp/xql.html W3C note, which allows you to select tags, and optionally, their children based upon quite a few search criteria (eg value of attribute or whether the parent has an attribute or content that matches your search term). 
Is there anything like this in the XML-Python world? any pointers gratefully received! ta tone ------ Dr Tony McDonald, FMCC, Networked Learning Environments Project The Medical School, Newcastle University Tel: +44 191 222 5888 Fingerprint: 3450 876D FA41 B926 D3DD F8C3 F2D0 C3B9 8B38 18A2 From akuchlin@cnri.reston.va.us Wed Mar 10 15:56:19 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Wed, 10 Mar 1999 10:56:19 -0500 (EST) Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL Message-ID: <14054.38263.515856.672209@amarok.cnri.reston.va.us> Hey, kids! Time to fire up JPython and get going on the second prize, by writing an XSL implementation.. [Reset any follow-ups to the XML-SIG, please...] From http://news.com/News/Item/0,4,33534,00.html: Sun will put up $30,000 for implementations of XSL to be added to the Mozilla.org open source effort, developing the source code to Netscape Communications' Communicator browser. This implementation would be a plug-in that would provide XSL formatting capabilities for the Mozilla browser and would fall under the Mozilla public license. The second set of prizes, funded in part by Adobe, will provide a $40,000 first prize and a $20,000 second prize for a print-oriented batch formatter written in Sun's Java programming language and that supports Adobe's portable document format (PDF). The batch formatter will let a printer process information from style sheets when printing batches of data. -- A.M. Kuchling http://starship.python.net/crew/amk/ Two paradoxes are better than one; they may even suggest a solution. -- Edward Teller From Fred L. Drake, Jr." References: <14054.38263.515856.672209@amarok.cnri.reston.va.us> Message-ID: <14054.38922.961633.521147@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > Hey, kids! Time to fire up JPython and get going on the second prize, > by writing an XSL implementation.. ... 
> The second set of prizes, funded in part by Adobe, will provide a > $40,000 first prize and a $20,000 second prize for a print-oriented > batch formatter written in Sun's Java programming language and that > supports Adobe's portable document format (PDF). The batch Andrew, Nice try, but no cigar! It says *written* in Java, not just "100% Pure Java". There's a real difference there! -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From akuchlin@cnri.reston.va.us Wed Mar 10 16:18:31 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Wed, 10 Mar 1999 11:18:31 -0500 (EST) Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL In-Reply-To: <14054.38922.961633.521147@weyr.cnri.reston.va.us> References: <14054.38263.515856.672209@amarok.cnri.reston.va.us> <14054.38922.961633.521147@weyr.cnri.reston.va.us> Message-ID: <14054.39522.817100.497278@amarok.cnri.reston.va.us> Fred L. Drake writes: > Nice try, but no cigar! It says *written* in Java, not just "100% >Pure Java". There's a real difference there! I had 2 reasons for considering it still relevant, despite the "written in Java" requirement. 1) I got the impression Python was good for prototyping; prototype it in Python, and then port it. Isn't that one of the advantages of JPython? 2) Perhaps an XSL processor with embedded JPython would be useful above just an XSL processor; I haven't read the XSL draft in a while, and don't know if an embedded scripting language would add additional power. -- A.M. Kuchling http://starship.python.net/crew/amk/ Confront a child, a puppy, and a kitten with a sudden danger; the child will turn instinctively for more assistance, the puppy will grovel in abject submission, the kitten will brace its tiny body for a frantic resistance. -- H.H. Munro From Fred L. Drake, Jr." 
References: <14054.38263.515856.672209@amarok.cnri.reston.va.us> <14054.38922.961633.521147@weyr.cnri.reston.va.us> <14054.39522.817100.497278@amarok.cnri.reston.va.us> Message-ID: <14054.40447.582089.969037@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > 1) I got the impression Python was good for prototyping; > prototype it in Python, and then port it. Isn't that one of the Gosh, that's not what I use it for... ;-) > 2) Perhaps an XSL processor with embedded JPython would be > useful above just an XSL processor; I haven't read the XSL draft in a > while, and don't know if an embedded scripting language would add > additional power. My understanding is that the scripting facility has been removed, but I'd have to look to be certain. I think it clear that a scripting engine embedded with a styling or transformation engine would be useful, especially Python. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From paul@prescod.net Wed Mar 10 15:49:29 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 10 Mar 1999 08:49:29 -0700 Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL References: <14054.38263.515856.672209@amarok.cnri.reston.va.us> <14054.38922.961633.521147@weyr.cnri.reston.va.us> <14054.39522.817100.497278@amarok.cnri.reston.va.us> Message-ID: <36E69489.E5E1CAB8@prescod.net> "Andrew M. Kuchling" wrote: > > 2) Perhaps an XSL processor with embedded JPython would be > useful above just an XSL processor; I haven't read the XSL draft in a > while, and don't know if an embedded scripting language would add > additional power. It would, but the mechanism for embedding scripting is not yet defined. 
--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News

From dieter@handshake.de Wed Mar 10 20:14:42 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Wed, 10 Mar 1999 20:14:42 +0000 (/etc/localtime)
Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML
In-Reply-To:
References:
Message-ID: <14054.53469.134597.176701@lindm.dm>

Hello Tony

> One thing I really need to do is to 'query' XML documents, and
> extract portions of them. In Perl I would use the XQL module, based
> on the XQL http://www.w3.org/TandS/QL/QL98/pp/xql.html W3C note,
> which allows you to select tags, and optionally, their children
> based upon quite a few search criteria (eg value of attribute or
> whether the parent has an attribute or content that matches your
> search term).

If you can first transform your XML document into a DOM, you may find "xsl-pattern" useful. "xsl-pattern" is an implementation of the XSL pattern subset (which is an XQL subset) -- see URL:http://www.handshake.de/pyprojects/xslpattern.html for details.

Recently, 4XSL has been announced on this list (look into the archive). At the time of the announcement, only match patterns had been implemented. But this may have changed in the meantime.

Dieter

From Jeff.Johnson@icn.siemens.com Wed Mar 10 21:57:18 1999
From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com)
Date: Wed, 10 Mar 1999 16:57:18 -0500
Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML
Message-ID: <85256730.00782A3B.00@li01.lm.ssc.siemens.com>

Looks like the tilde was missing from that URL...
this one works: http://www.handshake.de/~dieter/pyprojects/xslpattern.html

Cheers :)

Dieter Maurer on 03/10/99 03:14:42 PM
To: tony.mcdonald@ncl.ac.uk (Tony McDonald)
cc: xml-sig@python.org (bcc: Jeff Johnson/Service/ICN)
Subject: Re: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML

Hello Tony

> One thing I really need to do is to 'query' XML documents, and
> extract portions of them. In Perl I would use the XQL module, based
> on the XQL http://www.w3.org/TandS/QL/QL98/pp/xql.html W3C note,
> which allows you to select tags, and optionally, their children
> based upon quite a few search criteria (eg value of attribute or
> whether the parent has an attribute or content that matches your
> search term).

If you can first transform your XML document into a DOM, you may find "xsl-pattern" useful. "xsl-pattern" is an implementation of the XSL pattern subset (which is an XQL subset) -- see URL:http://www.handshake.de/pyprojects/xslpattern.html for details.

Recently, 4XSL has been announced on this list (look into the archive). At the time of the announcement, only match patterns had been implemented. But this may have changed in the meantime.

Dieter

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://www.python.org/mailman/listinfo/xml-sig

From JWight@bigfoot.com Thu Mar 11 02:09:20 1999
From: JWight@bigfoot.com (Jonathan Wight)
Date: Wed, 10 Mar 1999 21:09:20 -0500
Subject: [XML-SIG] Getting expat working on MacOS.
Message-ID: <199903110206.VAA07741@smtp1.mindspring.com>

I downloaded xml-0.5 from the python/xml-sig page on the python.org website. Unfortunately my version of CodeWarrior is too old to load the expat project. Can someone advise me on how to get expat to compile on the Mac? I've fixed the minor problems but a few remain, especially with the xmltok files. Help needed. Thanks in advance.
Jon

From mike.olson@fourthought.com Thu Mar 11 05:06:43 1999
From: mike.olson@fourthought.com (Mike Olson)
Date: Wed, 10 Mar 1999 23:06:43 -0600
Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL
References: <14054.38263.515856.672209@amarok.cnri.reston.va.us> <14054.38922.961633.521147@weyr.cnri.reston.va.us> <14054.39522.817100.497278@amarok.cnri.reston.va.us> <14054.40447.582089.969037@weyr.cnri.reston.va.us>
Message-ID: <36E74F63.534B990@fourthought.com>

We've been working on something (based a lot on Paul's ideas discussed earlier on the list) where we take XSL patterns and turn them into compiled patterns, similar to re. You can then use these compiled pattern objects to select a list of nodes that match, given a context node.

>>> p = CreateXSLPatternObject('chapter/title[first-of-type() or last-of-type()]')
>>> results = p.select(node)

Each compiled pattern has a select and a match. The select returns a list of nodes that match, using a context node as the reference. The match returns 1 or 0 depending on whether the context node matches the pattern.

We then added a visitor of sorts. It lets you register callback functions with specific patterns, and then calls the first callback whose pattern the context node matches.

>>> e.register(p, ProcessChapterTitles)
>>> e.applyPattern(node)

would call ProcessChapterTitles with node as the context node, if node matches the pattern p.

With this it would be very easy to extend XSL to use Python as your scripting language, or in general to do functional programming based on a DOM tree. We have already extended it to implement most of the XSL spec, basically by defining callbacks for the different types of templates in the XSL document. These callbacks then register events in a SAX-like interface, i.e. start_element_event, attribute_event, etc.
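[Editorial sketch] To make the select/match/register idea concrete, here is a toy illustration using the standard library's xml.dom.minidom. It is not the 4XSL code described in this message. The Pattern and Dispatcher classes, the apply_pattern method name and the sample document are all hypothetical, and the "pattern" language is reduced to slash-separated element names with no predicates such as first-of-type().

```python
from xml.dom import minidom


class Pattern:
    """Toy compiled pattern: slash-separated element names."""

    def __init__(self, text):
        self.steps = text.split("/")

    def select(self, context):
        """Return the elements reached by following the steps from context."""
        nodes = [context]
        for name in self.steps:
            nodes = [child
                     for node in nodes
                     for child in node.childNodes
                     if child.nodeType == child.ELEMENT_NODE
                     and child.tagName == name]
        return nodes

    def match(self, node):
        """Return True if node and its ancestors spell out the steps."""
        for name in reversed(self.steps):
            if node is None or node.nodeType != node.ELEMENT_NODE:
                return False
            if node.tagName != name:
                return False
            node = node.parentNode
        return True


class Dispatcher:
    """Calls the first registered callback whose pattern matches a node."""

    def __init__(self):
        self.handlers = []  # (pattern, callback) pairs, in registration order

    def register(self, pattern, callback):
        self.handlers.append((pattern, callback))

    def apply_pattern(self, node):
        for pattern, callback in self.handlers:
            if pattern.match(node):
                return callback(node)
        return None


# Made-up sample document, for illustration only.
doc = minidom.parseString(
    "<book><chapter><title>One</title></chapter>"
    "<chapter><title>Two</title></chapter></book>")

p = Pattern("chapter/title")
titles = p.select(doc.documentElement)
print([t.firstChild.data for t in titles])

e = Dispatcher()
e.register(p, lambda node: node.firstChild.data.upper())
print(e.apply_pattern(titles[0]))
```

Compiling the pattern once and reusing the object for both select and match mirrors the comparison with re made above; a fuller implementation would parse real XSL pattern syntax instead of splitting on slashes.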
In its native form these callbacks just build a result tree, but they can be overridden.

We are interested in Sun's XSL bounty (as I am sure most are). Moving the Python to Java would be pretty straightforward work. However, we are looking for someone to team up with on the formatting object side of XSL.

If anyone is interested in the pattern matching code, let me know. I am now in the process of finishing up all of the XSL templates and should have something ready for public consumption by the end of the week or early next week.

Mike

"Fred L. Drake" wrote:

> Andrew M. Kuchling writes:
> > 1) I got the impression Python was good for prototyping;
> > prototype it in Python, and then port it. Isn't that one of the
>
> Gosh, that's not what I use it for... ;-)
>
> > 2) Perhaps an XSL processor with embedded JPython would be
> > useful above just an XSL processor; I haven't read the XSL draft in a
> > while, and don't know if an embedded scripting language would add
> > additional power.
>
> My understanding is that the scripting facility has been removed,
> but I'd have to look to be certain. I think it clear that a scripting
> engine embedded with a styling or transformation engine would be
> useful, especially Python.
>
> -Fred
>
> --
> Fred L. Drake, Jr.
> Corporation for National Research Initiatives
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig

--
Mike Olson
Member Consultant
FourThought LLC
http://www.fourthought.com
http://opentechnology.org

---
"No program is interesting in itself to a programmer. It's only interesting as long as there are new challenges and new ideas coming up."
--- Linus Torvalds

From uche.ogbuji@fourthought.com Thu Mar 11 05:23:09 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Wed, 10 Mar 1999 22:23:09 -0700
Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML
In-Reply-To: Your message of "Wed, 10 Mar 1999 20:14:42 GMT." <14054.53469.134597.176701@lindm.dm>
Message-ID: <199903110523.WAA03757@malatesta.local>

> Recently, 4XSL has been announced on this list (look into the
> archive). At the time of the announcement, only match patterns
> had been implemented. But, this may have changed meantime.
>
> Dieter

Well, we didn't so much announce 4XSL as mention that we were working on it and close to release. As it happened, this spawned an excellent conversation with Paul Prescod which led to a thorough re-design of 4XSL. The bad news is that this causes a delay in our release (2-3 weeks, it seems). The GOOD news is that 4XSL will support an API similar to what Paul suggested, and that since re-designing, we have found this new approach to be extraordinarily powerful.

We are now a bit further along with the re-designed 4XSL than we were with the old approach we mentioned earlier. We just want to add a few more transformation elements (we think we have just about all the pattern-matching), and do some more testing. I hope to be able to announce the first public version next week.

Thanks

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From uche.ogbuji@fourthought.com Thu Mar 11 05:29:29 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Wed, 10 Mar 1999 22:29:29 -0700
Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML
In-Reply-To: Your message of "Wed, 10 Mar 1999 09:32:35 GMT."
Message-ID: <199903110529.WAA03772@malatesta.local> > Hi all, > I've recently fallen onto 'XML in python' from a 'XML in Perl/PHP' > perspective and after downloading and playing with the XML-0.5 > release I'm mightily impressed!. As I'm new to Python, I keep finding > things that make me go 'ooo, thats neat!'. > > One thing I really need to do is to 'query' XML documents, and > extract portions of them. In Perl I would use the XQL module, based > on the XQL http://www.w3.org/TandS/QL/QL98/pp/xql.html W3C note, > which allows you to select tags, and optionally, their children > based upon quite a few search criteria (eg value of attribute or > whether the parent has an attribute or content that matches your > search term). > > Is there anything like this in the XML-Python world? > > any pointers gratefully received! Well, speaking of "pointers", you might want to look at Lars Garshol's PyPointers package for the purpose. It implements part of the XPointer spec, allowing you to point to portions of a document, and if you combine this with a DOM representation of your doc, you might be able to get what you need. It really depends on what exactly you need to extract/query: XPointer is not as ambitious as XQL. The package is at http://www.stud.ifi.uio.no/~larsga/download/python/xml/xptr.html It comes with examples that use PyDOM, and 4DOM ships with a modified xptr.py with support for 4DOM, so you have choice (ever a lovely thing). 
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From jtauber@jtauber.com Thu Mar 11 07:53:09 1999 From: jtauber@jtauber.com (James Tauber) Date: Thu, 11 Mar 1999 15:53:09 +0800 Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL Message-ID: <012501be6b94$3620be80$0300000a@othniel.cygnus.uwa.edu.au> > 1) I got the impression Python was good for prototyping; >prototype it in Python, and then port it. Isn't that one of the >advantages of JPython? That's what I did with FOP[1]. I wrote a partial FO to PDF formatter in Python, then ported it a class at a time using JPython until it was entirely in Java. James [1] http://www.jtauber.com/fop/ From digitome@iol.ie Thu Mar 11 08:37:42 1999 From: digitome@iol.ie (Sean Mc Grath) Date: Thu, 11 Mar 1999 08:37:42 +0000 Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL In-Reply-To: <012501be6b94$3620be80$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <3.0.6.32.19990311083742.0092bc00@gpo.iol.ie> A fable:- Once upon a time a developer whose first love was Python found himself deep in the dark, thick jungle known as C++. The big bad customer had bellowed "You must use C++ to get out of this jungle or you will not get paid!". It was clear to the developer that Python was the right tool for the job. What to do? A cunning plan was hatched over the campfire. The developer used Python in all its iridescent, commodious munificence to work on the problem and, with a few deft strokes, generated C++ ready for static compilation. The jungle was tamed. The customer paid up. The developer moved on to the next jungle with a faint hint of swagger, hiding a surreptitious grin in his docstrings... From Fred L. Drake, Jr."
References: <012501be6b94$3620be80$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <14055.58944.862224.562811@weyr.cnri.reston.va.us> James Tauber writes: > That's what I did with FOP[1]. I wrote a partial FO to PDF formatter in > Python, then ported class at a time using JPython until it was entirely in James, Any chance that the prototype might make an appearance on the FOP Web page? ;-) -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From tony.mcdonald@ncl.ac.uk Thu Mar 11 15:49:46 1999 From: tony.mcdonald@ncl.ac.uk (Tony McDonald) Date: Thu, 11 Mar 1999 15:49:46 +0000 Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML Message-ID: <199903111550.PAA16904@cheviot.ncl.ac.uk> > > > Looks like the tilde was missing from that URL... this one works: > > http://www.handshake.de/~dieter/pyprojects/xslpattern.html > > Cheers :) > > [snip] > If you can first transform your XML document into a DOM, > you may find "xsl-pattern" useful. > "xsl-pattern" is an implementation of the XSL pattern subset > (which is a XQL subset) -- see > URL:http://www.handshake.de/pyprojects/xslpattern.html > for details. > [snip] Many thanks for that Dieter (for the info.) and Jeff (for the URL :). Dieter, do you have any sample code where the package is 'doing its stuff'? I've downloaded it and had a look around - I had always thought that XSL was a stylesheet translation language for XML, and where I've seen it mentioned, DSSSL and Jade (both of which blew my head off!) haven't been far behind. Have I missed something really obvious? At the moment, I have well formed and valid XML generated from an Omnimark program. I want to search on it and extract from it, based on tag structure and attribute values, sub-parts of the original XML document. I then want to churn the resultant tag-soup into HTML and RTF. XQL 'hid' all the DOM stuff from me (enter a command line query and the XML that 'matched' would come spurting out).
The main problems are that the Perl XQL is quite slow, and it doesn't do any StyleSheet manipulations. It seems like it may be time to start playing with the big boys and DOM's, but being a Python newbie (via Zope) I'm a wee bit apprehensive about this.... :) So I guess *now* I'm asking - how do you create a DOM given that you have a well-formed XML document to feed it... Thanks for any pointers... tone From akuchlin@cnri.reston.va.us Thu Mar 11 16:06:26 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Thu, 11 Mar 1999 11:06:26 -0500 (EST) Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML In-Reply-To: <199903111550.PAA16904@cheviot.ncl.ac.uk> References: <199903111550.PAA16904@cheviot.ncl.ac.uk> Message-ID: <14055.59238.557383.671149@amarok.cnri.reston.va.us> Tony McDonald writes: >Dieter, do you have any sample code where the package is 'doing its stuff'? >I've downloaded it and had a look around - I had always thought that XSL was >a stylesheet translation language for XML, and where I've seen it mentioned, >DSSSL and Jade (both of which blew my head off!) haven't been far XSL seems to actually roll two ideas together. There's a generic transformation language, which takes a document tree and a specification for the transformation, and outputs the transformed document. This could turn any XML document into a document for any other XML DTD, so you could convert database entries into an HTMLish or DocBook-ish form. XSL also specifies a whole bunch of standard names for formatting documents into a printed form, so an XSL processor could take those and output matching TeX, PostScript, or whatever. However, the transformation part of XSL is useful even if you're not formatting for printing; the two ideas are separate and mushed together into one specification. I think Paul Prescod suggested dividing the XSL work into the transformation language and the formatting objects, and that seems like an excellent idea.
>So I guess *now* I'm asking - how do you create a DOM given that you have a >well-formed XML document to feed it... There's a section on that in the HOWTO: http://www.python.org/doc/howto/xml/node12.html . The easiest way would be to read from a file: from xml.dom import utils reader = utils.FileReader('foo.xml') reader.document is then the root of the DOM tree. I don't know anything about how the XQL stuff works. -- A.M. Kuchling http://starship.python.net/crew/amk/ "Since I'm so close to the pickle module, I just look at the pickles directly, as I'm pretty good at reading pickles." "As you all can imagine, this trick goes over really well at parties." -- Jim Fulton and Paul Everitt on the Bobo list, 17 Jul 1998 From jtauber@jtauber.com Thu Mar 11 22:23:25 1999 From: jtauber@jtauber.com (James Tauber) Date: Fri, 12 Mar 1999 06:23:25 +0800 Subject: [XML-SIG] News.com - Sun, Adobe offer bounty for XSL Message-ID: <012c01be6c0f$55909300$0300000a@othniel.cygnus.uwa.edu.au> I wrote: > > That's what I did with FOP[1]. I wrote a partial FO to PDF formatter in > > Python, then ported class at a time using JPython until it was entirely in Fred wrote: > Any chance that the prototype might make an appearance on the FOP >Web page? ;-) Well, it's probably too out of date to be of much use and I hadn't done much before I switched to Java. The PDF code could be turned into a relatively useful module, I guess. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net From akuchlin@cnri.reston.va.us Fri Mar 12 04:27:01 1999 From: akuchlin@cnri.reston.va.us (A.M.
Kuchling) Date: Thu, 11 Mar 1999 23:27:01 -0500 Subject: [XML-SIG] DTD for recipes Message-ID: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dialup.rcn.com> I've taken a first whack at a DTD for storing recipes. It's nowhere near complete, and there are a bunch of open issues. Please see http://starship.python.net/crew/amk/recipe/ for the DTD and a little sample document. As we did with XBEL, we can discuss the DTD here on the XML-SIG list. (Actually, LMG once used a recipe DTD as an example on Usenet, but that seems to have been purely for pedagogical purposes, not as a serious attempt at a DTD. See http://www.dejanews.com/[ST_rn=qs]/getdoc.xp?AN=299409885 for Lars' post.) -- A.M. Kuchling http://starship.python.net/crew/amk/ I am afraid of the worst, but I am not sure what that is. -- Abraham Rotstein From tony.mcdonald@ncl.ac.uk Fri Mar 12 08:10:13 1999 From: tony.mcdonald@ncl.ac.uk (Tony McDonald) Date: Fri, 12 Mar 1999 08:10:13 +0000 Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML In-Reply-To: <14055.59238.557383.671149@amarok.cnri.reston.va.us> References: <199903111550.PAA16904@cheviot.ncl.ac.uk> <199903111550.PAA16904@cheviot.ncl.ac.uk> Message-ID: > [ excellent description of XSL snipped for future ref.] > >>So I guess *now* I'm asking - how do you create a DOM given that you have a >>well-formed XML document to feed it... > > There's a section on that in the HOWTO: > http://www.python.org/doc/howto/xml/node12.html . The easiest way > would be to read from a file: > > from xml.dom import utils > reader = utils.FileReader('foo.xml') > reader.document is then the root of the DOM tree. Andrew, this is *a real help*. Thanks for the information, now all I need to do is to figure out how to plumb the DOM into Dieter's XSL module and I *think* I'm away.
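For anyone trying the "build a DOM from a well-formed document" step without the XML-SIG tree at hand, here is a minimal sketch of the same idea. It deliberately swaps in the standard library's xml.dom.minidom rather than the xml.dom.utils.FileReader quoted above, so it is an assumption-laden stand-in and not the package's own API; the sample document and the helper names load_dom and chapter_titles are made up for illustration.

```python
# Sketch only: uses xml.dom.minidom from the standard library, NOT the
# xml.dom.utils.FileReader mentioned in the thread. Sample data is hypothetical.
from xml.dom import minidom

SAMPLE = (
    '<book>'
    '<chapter no="1"><title>Intro</title></chapter>'
    '<chapter no="2"><title>More</title></chapter>'
    '</book>'
)


def load_dom(xml_text):
    """Parse well-formed XML text and return the Document node."""
    return minidom.parseString(xml_text)


def chapter_titles(document):
    """Collect the text of the first <title> child of every <chapter>."""
    titles = []
    for chapter in document.getElementsByTagName("chapter"):
        title = chapter.getElementsByTagName("title")[0]
        # firstChild is the text node holding the title string
        titles.append(title.firstChild.data)
    return titles


document = load_dom(SAMPLE)
root = document.documentElement  # the <book> element
```

To read from a file instead of a string, minidom.parse('foo.xml') returns the same kind of Document object.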
This is very refreshing for me - I've come from the XML-DEV and XML-L lists - and using Python I can start to see that some of the things I wanted to do are at least possible! (won't do Java, *can't* do Perl). Tone. ------ Dr Tony McDonald, FMCC, Networked Learning Environments Project The Medical School, Newcastle University Tel: +44 191 222 5888 Fingerprint: 3450 876D FA41 B926 D3DD F8C3 F2D0 C3B9 8B38 18A2 From larsga@ifi.uio.no Fri Mar 12 10:35:50 1999 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 12 Mar 1999 11:35:50 +0100 Subject: [XML-SIG] DTD for recipes In-Reply-To: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dialup.rcn.com> References: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dialup.rcn.com> Message-ID: * A. M. Kuchling | | I've taken a first whack at a DTD for storing recipes. It's nowhere | near complete, and there are a bunch of open issues. Please see | | http://starship.python.net/crew/amk/recipe/ | | for the DTD and a little sample document. In fact, this is pretty much what I envisioned, although simpler. As for your open issues: - Metadata in general: I think the DTD should in fact hard-wire optional elements for what we can usefully standardize (possibly in a separate metadata section), and then allow for extensibility, either through generic property name/value (and maybe values) sets or possibly ANY content models. (Possibly both.) - I think part of the point of using XML here is that you need not store nutritional information in the recipe, but can instead calculate it from the ingredient list and a nutritional database - Number of servings: another good thing about XML: we should be able to do automatic scaling of the ingredients (within reasonable limits, of course) - Ingredients: I think we need different kinds of ingredient elements, classified by amount type: exact, approximate and optional.
- Also, alternatives for ingredients would also be nice, as would some way of referring to the alternative from the steps, so that presentation software can choose the correct alternative - I think we need something more advanced than 'note', although I have no immediate ideas for the form it should take - Also, some means of referring to other recipes from a step would be nice, as in 'Prepare some white sauce, as described in the recipe for white sauce.' | (Actually, LMG once used a recipe DTD as an example on Usenet, but | that seems to have been purely for pedagogical purposes, not as a | serious attempt at a DTD. Actually, an online database of marked-up recipes is my favourite example when I give talks and try to explain just what is cool about XML. To get all excited, see: (page 30 and onwards) (search for RML) --Lars M. From tony.mcdonald@ncl.ac.uk Fri Mar 12 10:36:51 1999 From: tony.mcdonald@ncl.ac.uk (Tony McDonald) Date: Fri, 12 Mar 1999 10:36:51 +0000 Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML Message-ID: > > from xml.dom import utils > reader = utils.FileReader('foo.xml') > reader.document is then the root of the DOM tree. There doesn't seem to be a utils.py in the DOM distribution of XML-0.5... 47 % tar ztf xml-0.5.tar.gz | grep dom xml-0.5/demo/dom/ xml-0.5/demo/dom/building.py xml-0.5/demo/dom/domconv.py xml-0.5/demo/dom/html2html xml-0.5/dom/ xml-0.5/dom/.writer.py.swp xml-0.5/dom/README xml-0.5/dom/__init__.py xml-0.5/dom/builder.py xml-0.5/dom/core.py xml-0.5/dom/esis_builder.py xml-0.5/dom/html_builder.py xml-0.5/dom/sax_builder.py xml-0.5/dom/transform.py xml-0.5/dom/transformer.py xml-0.5/dom/walker.py xml-0.5/dom/writer.py xml-0.5/test/test_dom.py xml-0.5/test/output/test_dom I worked-around it though, so I'm still on track for figuring out the nuances of Python, DOM and XSL a bit more.... Tone. 
------ Dr Tony McDonald, FMCC, Networked Learning Environments Project The Medical School, Newcastle University Tel: +44 191 222 5888 Fingerprint: 3450 876D FA41 B926 D3DD F8C3 F2D0 C3B9 8B38 18A2 From tony.mcdonald@ncl.ac.uk Fri Mar 12 14:40:59 1999 From: tony.mcdonald@ncl.ac.uk (Tony McDonald) Date: Fri, 12 Mar 1999 14:40:59 +0000 Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML In-Reply-To: Message-ID: >> from xml.dom import utils >> reader = utils.FileReader('foo.xml') >> reader.document is then the root of the DOM tree. > > There doesn't seem to be a utils.py in the DOM distribution of XML-0.5... > My fault. It's in the CVS distribution, not the 0.5 tarball. tone From akuchlin@cnri.reston.va.us Fri Mar 12 14:54:12 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Fri, 12 Mar 1999 09:54:12 -0500 (EST) Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML In-Reply-To: References: Message-ID: <14057.10656.418323.925733@amarok.cnri.reston.va.us> Tony McDonald writes: >My fault. It's in the CVS distribution, not the 0.5 tarball. Sorry about that; I tend to assume people are following the CVS tree. We should try to get a 0.5.1 release out, since there have been a bunch of changes and bugfixes since the 0.5 release was made. -- A.M. Kuchling http://starship.python.net/crew/amk/ Tourist, Rincewind decided, meant "idiot". -- Terry Pratchett, _The Colour of Magic_ From Fred L. Drake, Jr." References: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dialup.rcn.com> Message-ID: <14057.17152.785958.173234@weyr.cnri.reston.va.us> Lars Marius Garshol writes: > - Metadata in general: I think the DTD should in fact hard-wire > optional elements for what we can usefully standardize (possibly in a > separate metadata section), and then allow for extensibility, either > through generic property name/value (and maybe values) sets or > possibly ANY content models. (Possibly both.) Lars, This sounds familiar!
We had a lot of discussion on the metadata issue for XBEL, and came up with a solution I still think is a little messy. As for ANY: isn't that just a little wide-open? I'd imagine that namespaces would be used for metadata, and then validation is out the door anyway until someone comes up with an accepted standard for validating documents which use namespaces. (It would be nice to have a way to specify "element content", or "element content, but not from this DTD".) > - Also, alternatives for ingredients would also be nice, as would > some way of referring to the alternative from the steps, so that > presentation software can choose the correct alternative Alternatives should be able to include comments to indicate *why* an alternate might be preferable for either dietary reasons (which could also be done through your nutritional database) or non-dietary reasons ("This pasta could be used instead of spaghetti when feeding children, because they like the exploding dinosaurs.") -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From paul@prescod.net Fri Mar 12 05:54:13 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 11 Mar 1999 22:54:13 -0700 Subject: [XML-SIG] Python, XML and XSL References: <14054.38263.515856.672209@amarok.cnri.reston.va.us> <14054.38922.961633.521147@weyr.cnri.reston.va.us> <14054.39522.817100.497278@amarok.cnri.reston.va.us> <14054.40447.582089.969037@weyr.cnri.reston.va.us> <36E74F63.534B990@fourthought.com> Message-ID: <36E8AC05.ED0A1F97@prescod.net> Mike Olson wrote: > > >>e.register(p,ProcessChapterTitles) > >>e.applyPattern(node) > > would call ProcessChapterTitles with node as a context node, if node matches > the pattern p That's neat stuff. It will really help me to "sell" XML as a replacement for expensive tools like Balise and Omnimark. The last thing we needed was a DOM-compatible pattern-based visitor. > We are interested in the Sun's XSL Bounty (as I am sure most are).
Moving the > python to Java would be pretty straightforward work. However we are looking > for someone to team up with on the formatting object side of XSL. Just so you know, the bounty is almost entirely about the formatting object side. There are a plethora of transformation-side implementations for Java (too many, thanks for doing a Python one!). Sun is trying to kick-start the formatting side. The media support was pretty vague because Sun hasn't worked out and published the details yet. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "The Excursion [Sport Utility Vehicle] is so large that it will come equipped with adjustable pedals to fit smaller drivers and sensor devices that warn the driver when he or she is about to back into a Toyota or some other object." -- Dallas Morning News From dieter@handshake.de Fri Mar 12 20:10:12 1999 From: dieter@handshake.de (Dieter Maurer) Date: Fri, 12 Mar 1999 20:10:12 +0000 (/etc/localtime) Subject: [XML-SIG] 'searching' XML documents to extract 'chunks' of XML In-Reply-To: References: <14055.59238.557383.671149@amarok.cnri.reston.va.us> Message-ID: <14057.28137.11691.527610@lindm.dm> Tony McDonald writes: > Andrew, this is *a real help*. Thanks for the information, now all I > need to do is to figure out how to plumb the DOM into the XSL module > of Dieters' and I *think* I'm away. Section 3 of URL:http://www.handshake.de/~dieter/pyprojects/xslpattern.html contains a small partial example. You use the "Parser" object to transform an XSL pattern string into an XSL pattern object. Pattern objects have several methods, the most essential of which are "select" and "match". Both get a DOM node as first parameter. "select" returns the list of DOM nodes selected by the pattern with the node as context (this is defined in the XSL working draft spec URL:http://www.w3.org/TR/1998/WD-xsl-19981216). 
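As a rough illustration of what "select" means here (a pattern evaluated with some node as its context, yielding a list of nodes), the sketch below uses the standard library's xml.etree.ElementTree and its limited XPath subset. This is not the xsl-pattern package or its Parser API; the sample document, the element names, and the helper names select and matches are hypothetical, and matches is only a crude approximation of pattern matching.

```python
# Sketch only: ElementTree's XPath subset stands in for XSL patterns.
# Nothing here is the xsl-pattern API; all names and data are illustrative.
import xml.etree.ElementTree as ET

SAMPLE = """
<book>
  <chapter no="1"><table no="1"/></chapter>
  <chapter no="3"><table no="1"/><table no="2"/></chapter>
</book>
"""


def select(context, path):
    """Return the list of elements selected by path, relative to context."""
    return context.findall(path)


def matches(node, root, path):
    """Crude stand-in for 'match': is node among those path selects from root?"""
    return any(node is candidate for candidate in root.findall(path))


root = ET.fromstring(SAMPLE)

# All chapters anywhere below the root.
chapters = select(root, ".//chapter")

# The table numbered 1 inside the chapter numbered 3.
tables = select(root, ".//chapter[@no='3']//table[@no='1']")
```

The design point is the same as in the description above: the pattern is compiled or interpreted once, and the context node decides what it is evaluated against.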
With "match", you can check whether the node is matched by the pattern (defined in the XSL spec, too). Suppose you have built a DOM tree in "domtree" (e.g. in the way Andrew has pointed out). Suppose, you want to select all "chapters" in the tree. You build the pattern object: p= Parser('//chapter') Now: chapters= p.select(domtree) gives you the list of "chapter" nodes in the tree. If you would like to access a required "no" attribute of the first chapter, you can use: no= Parser('@no').select(chapters[0])[0] (however, you probably would use PyDOM for this directly). A more complex example would be to select the "table" with "no=1" in the chapter with "no=3" (first table in third chapter): table= Parser('//table[@no="1" and ancestor(chapter[@no="3"])]').select(domtree)[0] or equivalently (but more efficiently): table= Parser('//chapter[@no="3"]//table[@no="1"]').select(domtree)[0] Good luck - Dieter From akuchlin@cnri.reston.va.us Fri Mar 12 22:00:51 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Fri, 12 Mar 1999 17:00:51 -0500 (EST) Subject: [XML-SIG] DTD for recipes In-Reply-To: References: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dialup.rcn.com> Message-ID: <14057.34888.549972.952256@amarok.cnri.reston.va.us> Lars Marius Garshol writes: > - Metadata in general: I think the DTD should in fact hard-wire > optional elements for what we can usefully standardize (possibly in a > separate metadata section), and then allow for extensibility, either > through generic property name/value (and maybe values) sets or > possibly ANY content models. (Possibly both.) Maybe use RDF? I don't know anything about metadata, and so am completely at sea on this issue.
> - Number of servings: another good thing about XML: we should be able > to do automatic scaling of the ingredients (within reasonable limits, > of course) We do need to store the number of servings for the recipe as given, though, to know what to scale it to; I don't think software could look at a recipe and say "Aha! 10 people!". Servings might be measured in various units: serves 10 people, makes 36 cookies. > - Ingredients: I think we need different kinds of ingredient > elements, classified by amount type: exact, approximate and optional. Ah, "optional" is a good idea; I'll add an 'optional' attribute to the ingredient element that defaults to "no". Still don't know what to do about approximate amounts, though. > - Also, alternatives for ingredients would also be nice, as would > some way of referring to the alternative from the steps, so that > presentation software can choose the correct alternative Hmm... Should the ingredient section of the DTD be changed to ( (ingredient | alt-ingredients)* | group*). The content of the 'group' element would then be: The 'alt-ingredients' element would just be a list of ingredients. I'm not sure how to refer to ingredients. Here's another design question: is it a good idea to have explicit elements for each section? Right now recipes are defined like this, with the sections being only implicitly visible: Contrast this with HTML, where you have HEAD and BODY elements, explicitly breaking the document into those two sections. In other words, should the above be changed to something like: I suppose this would make some processing jobs easier, but have no strong feelings on this. > - I think we need something more advanced than 'note', although I > have no immediate ideas for the form it should take. Anyone producing a cookbook from this DTD will have their own ideas for what textual commentary should accompany a recipe; I'd be happy with just providing an application-specific class attribute for 'note'. 
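The points above about storing the number of servings and marking ingredients as optional could be sketched roughly as follows. The markup is invented for the example (a servings attribute on recipe, and amount, unit and optional attributes on ingredient); the actual DTD under discussion may well look different, so treat every element and attribute name, and the helper scale_recipe, as assumptions.

```python
# Sketch only: hypothetical recipe markup, not the DTD from the thread.
from fractions import Fraction
import xml.etree.ElementTree as ET

SAMPLE = """
<recipe servings="4">
  <ingredient amount="2" unit="cup">flour</ingredient>
  <ingredient amount="1/2" unit="tsp" optional="yes">nutmeg</ingredient>
</recipe>
"""


def scale_recipe(recipe, new_servings):
    """Scale every ingredient amount from the stored servings to new_servings.

    Returns a list of (name, amount, unit, optional) tuples; amounts are
    exact Fractions so that 1/2 tsp does not turn into a rounded float.
    """
    base_servings = int(recipe.get("servings"))
    factor = Fraction(new_servings, base_servings)
    scaled = []
    for ingredient in recipe.findall("ingredient"):
        amount = Fraction(ingredient.get("amount")) * factor
        # 'optional' defaults to "no" when the attribute is absent
        optional = ingredient.get("optional", "no") == "yes"
        scaled.append(
            (ingredient.text.strip(), amount, ingredient.get("unit"), optional)
        )
    return scaled


recipe = ET.fromstring(SAMPLE)
scaled_for_six = scale_recipe(recipe, 6)
```

Without the stored servings value there is nothing to compute the scaling factor from, which is the reason given above for keeping it in the document.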
> - Also, some means of referring to other recipes from a step would be > nice, as in 'Prepare some white sauce, as described in > the recipe for white > sauce.' Yes; it should be possible to link to other recipes in the same document, or to a recipe in another collection. -- A.M. Kuchling http://starship.python.net/crew/amk/ The nice thing about list comprehensions is that their most useful forms could be implemented directly as light sugar for ordinary Python loops, leaving lambdas out of it entirely. You end up with a subtly different beast, but so far it appears to be a beast that's compatible with cuddly pythons. -- Tim Peters, 6 Aug 1998 From djsharpe@home.com Sun Mar 14 17:24:20 1999 From: djsharpe@home.com (Bruce Sharpe) Date: Sun, 14 Mar 1999 09:24:20 -0800 Subject: [XML-SIG] Python XML package and Windows Message-ID: <000501be6e3f$7fa0eb40$0400a8c0@surrey1.bc.wave.home.com> Hi, I am a recent convert to Python and am very interested in doing some XML processing with it. I have downloaded and successfully installed xml-0.5.zip on a Solaris machine, but the makefile breaks under Windows for both MSVC and DJGPP. Before I go digging much deeper into the reasons why it is not working for Windows, I thought I would check to see if I should expect it to work, and whether I should expect that just minor tweaking is required or a major effort. Looks like the XML-SIG is doing great work! I look forward to participating in it. Thanks. Bruce From wes@rishel.com Sun Mar 14 20:02:38 1999 From: wes@rishel.com (Wes Rishel) Date: Sun, 14 Mar 1999 12:02:38 -0800 Subject: [XML-SIG] Newbie question not answered? (was "RE: [XML-SIG] XML-Howto update--->Where?") Message-ID: <000201be6e55$9c4fbee0$69510018@c79145-a.almda1.sfba.home.com> Perhaps I am asking the wrong list, or the answer is just so obvious that I am wasting people's time. I apologize if that is so, and I will accept any advice, no matter how deeply enclosed in flames... 
The question is where to find the most recent complete XML package? The original note (below) detailed the places I tried unsuccessfully. I only have Windows available and I do not currently have a C/C++ development environment set up. If the answer is that I must setup a development environment, I will do so, as much as I would prefer to spend my time writing Python code. I would be grateful for any advice in terms of which environment to set up. (If it comes to a choice of spending a few hundred bucks more and facing less hassle to set up the environment, I would opt to spend the money and save the time.) Thanks in advance. Wes > -----Original Message----- > From: xml-sig-admin@python.org > [mailto:xml-sig-admin@python.org] On > Behalf Of Wes Rishel > Sent: Sunday, February 28, 1999 12:13 AM > To: xml-sig@python.org > Subject: RE: [XML-SIG] XML-Howto update--->Where? > > > > > Dinu C. Gherman writes: > > >sections, at least. Unfortunately, this update holds only > > >for the individual HTML pages, not the other formats and not > > >even the bundled HTML pages. Can we expect this to change > Kuchling writes: > > Anyway, the issue is fixed now; the PDF, PS, dvi, and > bundled HTML > > files should now all have been updated to the current version on > > python.org; the mirror sites will catch up soon. > > Where are people finding these updates, especially the > downloads in a > complete package? > http://www.python.org/topics/xml/docs.html contains the documents that Gherman and Kuchling are probably referring to, but only as a set of discrete HTML files. http://www.python.org/topics/xml/download.html contains a link to http://www.python.org/sigs/xml-sig/files/xml-0.4.tgz which is advertised as "the complete package" and seemingly has not changed since 8/6/98. http://www.python.org/sigs/xml-sig/files/ contains no files that have changed since Dec 4. Thanks in advance.
BTW, one recognizes that those who are busy advancing the state of the code base need to minimize the time spent documenting, but someone might consider bringing http://www.python.org/topics/xml/download.html up to release 0.5, since it has been out for nearly three months. Being a newbie, I struggled with rel 4 for a long time, because the page seemed authoritative and I was not sure of the status of stuff in the bare .../files directory. _______________________________________________ XML-SIG maillist - XML-SIG@python.org http://www.python.org/mailman/listinfo/xml-sig From MHammond@skippinet.com.au Mon Mar 15 05:00:14 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Mon, 15 Mar 1999 16:00:14 +1100 Subject: [XML-SIG] Newbie question not answered? (was "RE: [XML-SIG] XML-Howto update--->Where?") In-Reply-To: <000201be6e55$9c4fbee0$69510018@c79145-a.almda1.sfba.home.com> Message-ID: <003101be6ea0$b7705e50$0801a8c0@bobcat> > The question is where to find the most recent complete XML package? > The original note (below) detailed the places I tried > unsuccessfully. > > I only have Windows available and I do not currently have a C/C++ > development environment set up. If the answer is that I must setup a The page http://www.python.org/sigs/xml-sig/files/ has the releases. I just downloaded version 0.5, and it seems to include all the binaries you need for windows. testxml.py failed with: File "L:\src\pythonex\xml\test\regrtest.py", line 31, in ? import test_support ImportError: No module named test_support But all the individual tests seem fine. So you shouldn't need C++ at all! Mark. From akuchlin@cnri.reston.va.us Mon Mar 15 15:38:33 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 15 Mar 1999 10:38:33 -0500 (EST) Subject: [XML-SIG] Recipe DTD revision Message-ID: <14061.8662.109615.261704@amarok.cnri.reston.va.us> I've made some changes to the proposed recipe DTD.
The two most important open issues are metadata -- I'm still reading the RDF spec -- and how to represent a range as an amount, such as "2 - 3 cups". Once those issues are resolved, I'll feel confident enough to actually start writing code that uses the DTD. List of changes: * Renamed the 'note' to 'comment', and added a class attribute, so you can have , , or whatever. * Added 'optional' and 'precision' attributes to ingredient element. 'optional' has a yes/no value, and 'precision' is either 'exact' or 'approximate'. * Added 'id' attributes to recipe, ingredient, alt-ingredient, group, and comment elements, that all have the ID type, and are intended to allow referring to a specific element. * I'm not going to bother lowercasing the element names in the IBTWSH DTD, which is used to provide HTML-like elements for textual content. While it's inconsistent to have the recipe elements all be lowercase, and the HTML elements uppercase, documents will almost certainly be generated by software. I've tried entering a recipe by hand, and found the most annoying part is entering the list of ingredients, because it's so mark-up heavy. You really, really want a GUI to do this for you, and that would also hide this inconsistency. -- A.M. Kuchling http://starship.python.net/crew/amk/ Well, there are these two people here, Sir. The man says he drank wine with you somewhere called Babylon, and the lady... she's making little frogs. -- The receptionist, in SANDMAN #43: "Brief Lives:3" From Jeff.Johnson@icn.siemens.com Mon Mar 15 16:02:12 1999 From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com) Date: Mon, 15 Mar 1999 11:02:12 -0500 Subject: [XML-SIG] Newbie question not answered? (was "RE: [XML-SIG] XML-Howto update--->Where?") Message-ID: <85256735.005823EE.00@li01.lm.ssc.siemens.com> >The question is where to find the most recent complete XML package? >The original note (below) detailed the places I tried >unsuccessfully. To get the latest code you can use CVS. 
CVS is free too. The zip files on the HTML page are usually a month or two old. This page gives the CVS commands to get the latest code. http://www.python.org/sigs/xml-sig/anon-cvs.html The instructions neglect to tell us to set the environment variable "HOME" to the directory that you want the xml package copied to. Once you do that, it should work. After you get the xml package you should copy xml\windows\*.dll to xml\parsers. You should also copy xml\expat\bin\*.dll to a directory in your path (I just throw them in my Python directory). From akuchlin@cnri.reston.va.us Mon Mar 15 17:33:40 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 15 Mar 1999 12:33:40 -0500 (EST) Subject: [XML-SIG] Newbie question not answered? In-Reply-To: <000201be6e55$9c4fbee0$69510018@c79145-a.almda1.sfba.home.com> References: <000201be6e55$9c4fbee0$69510018@c79145-a.almda1.sfba.home.com> Message-ID: <14061.17373.429111.548078@amarok.cnri.reston.va.us> Wes Rishel writes: >The question is where to find the most recent complete XML package? >The original note (below) detailed the places I tried >unsuccessfully. The Web pages on python.org were greatly out of date; I've fixed them now to point to the latest tree. In general, all the files are available from http://www.python.org/sigs/xml-sig/files/ . In the next little while, there should be a 0.5.1 release, which will make it easier for people who don't have CVS. -- A.M. Kuchling http://starship.python.net/crew/amk/ Counting is the most simple and primitive of narratives -- 1 2 3 4 5 6 7 8 9 10 -- a tale with a beginning, a middle and an end and a sense of progression -- arriving at a finish of two digits -- a goal attained, a dénouement reached. 
-- Peter Greenaway, _Fear of Drowning By Numbers_ (1988) From wunder@infoseek.com Mon Mar 15 17:59:17 1999 From: wunder@infoseek.com (Walter Underwood) Date: Mon, 15 Mar 1999 09:59:17 -0800 Subject: [XML-SIG] DTD for recipes In-Reply-To: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dia lup.rcn.com> Message-ID: <3.0.5.32.19990315095917.00b93370@corp> At 11:27 PM 3/11/99 -0500, A.M. Kuchling wrote: >I've taken a first whack at a DTD for storing recipes. It's nowhere >near complete, and there are a bunch of open issues. Please see > > http://starship.python.net/crew/amk/recipe/ > >for the DTD and a little sample document. As we did with XBEL, we can >discuss the DTD here on the XML-SIG list. Good start. Here's a search engine perspective on metadata: References: <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dia lup.rcn.com> <3.0.5.32.19990315095917.00b93370@corp> Message-ID: <14061.20278.704882.394878@amarok.cnri.reston.va.us> Walter Underwood writes: >I recommend using the Dublin Core elements as needed, with the >formats of the content specified. These three are particularly >important for search engines, since they are the ones that are >shown in the results pages: Good suggestion. Do you have any opinion about whether the metadata elements should be left as children of the 'recipe' element, as opposed to introducing a new mid-level element? That is, from: Finally, I think that Dublin Core is perfectly adequate for >this application. RDF is too complex (or perhaps inadequately >explained) for the average author or developer, and the HTML >conventions (title, keywords, description) are just not enough >information. Dublin Core is the "Goldilocks" solution, just right. Hmm... but Dublin Core doesn't seem to be extensible at all, and the primary problem is that someone may want to track suitability for diabetics, or degree-of-vegan-ness, and we can't think of all the possibilities ahead of time. 
I'm not happy about requiring knowledge of RDF, but don't see another option. -- A.M. Kuchling http://starship.python.net/crew/amk/ Instead, you and I became the cornerstones of the Brotherhood of Evil! An empire of crime such as I'd dreamed of back in the old school, when the other children used to laugh at me because I was a brain in a tank. -- The Brain, in DOOM PATROL #34 From wunder@infoseek.com Mon Mar 15 20:18:08 1999 From: wunder@infoseek.com (Walter Underwood) Date: Mon, 15 Mar 1999 12:18:08 -0800 Subject: [XML-SIG] DTD for recipes In-Reply-To: <14061.20278.704882.394878@amarok.cnri.reston.va.us> References: <3.0.5.32.19990315095917.00b93370@corp> <199903120427.XAA01295@207-172-113-225.s225.tnt5.ann.va.dia lup.rcn.com> <3.0.5.32.19990315095917.00b93370@corp> Message-ID: <3.0.5.32.19990315121808.00bc3eb0@corp> At 02:16 PM 3/15/99 -0500, Andrew M. Kuchling wrote: > Good suggestion. Do you have any opinion about whether the >metadata elements should be left as children of the 'recipe' element, >as opposed to introducing a new mid-level element? Either way, as long as they are near the beginning of the doc. Things can get confusing if there is a for the document and also <title> elements in a bibliography. > Hmm... but Dublin Core doesn't seem to be extensible at all, >and the primary problem is that someone may want to track suitability >for diabetics, or degree-of-vegan-ness, and we can't think of all the >possibilities ahead of time. I'm not happy about requiring knowledge >of RDF, but don't see another option. We could also represent the entire ingredients list in RDF, but what would it buy? My suggestion is to use standard representations for *common* metadata. Parts of the data model specific to recipies probably should be represented with specific structure. Now, it sure wouldn't hurt to put things in <subject> elements like "vegetarian", "vegan", "low-fat", "kosher-dairy", "cookies", "kids cooking", etc. 
Off the top of my head, I'd start with the Library of Congress categories under TX643 to TX840 (Cookery). Of course, then someone will want to find wheat-free recipies or anything containing both chocolate and garlic, which is why we still have full-text search ... Hmm, this is an interesting sample case for search, too. Maybe I can use this to demonstrate the advanced XML support in our next release. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 From akuchlin@cnri.reston.va.us Tue Mar 16 00:11:45 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 15 Mar 1999 19:11:45 -0500 (EST) Subject: [XML-SIG] Newbie question not answered? In-Reply-To: <85256735.005823EE.00@li01.lm.ssc.siemens.com> References: <85256735.005823EE.00@li01.lm.ssc.siemens.com> Message-ID: <14061.41323.448090.710884@amarok.cnri.reston.va.us> Jeff.Johnson@icn.siemens.com writes: >The instructions neglect to tell us to set the environment variable "HOME" >to the directory that you want the xml package copied to. Once you do >that, it should work. Hm? That's not necessary on Unix; won't CVS on Windows place the files relative to the current directory when you do a checkout? -- A.M. Kuchling http://starship.python.net/crew/amk/ The only imperfection in life then was that we didn't really have much money. -- Tom Baker, in his autobiography From jody@ldeo.columbia.edu Tue Mar 16 00:24:08 1999 From: jody@ldeo.columbia.edu (Jody Winston) Date: Mon, 15 Mar 1999 19:24:08 -0500 (EST) Subject: [XML-SIG] Newbie question not answered? 
In-Reply-To: <14061.41323.448090.710884@amarok.cnri.reston.va.us> (akuchlin@cnri.reston.va.us) Message-ID: <199903160024.TAA08596@hog> >>>>> "Andrew" == Andrew M Kuchling <akuchlin@cnri.reston.va.us> writes: Andrew> Jeff.Johnson@icn.siemens.com writes: >> The instructions neglect to tell us to set the environment >> variable "HOME" to the directory that you want the xml package >> copied to. Once you do that, it should work. Andrew> Hm? That's not necessary on Unix; won't CVS on Andrew> Windows place the files relative to the current directory Andrew> when you do a checkout? cvs on windows works the same way that cvs on unix. Jody From Jeff.Johnson@icn.siemens.com Tue Mar 16 15:33:07 1999 From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com) Date: Tue, 16 Mar 1999 10:33:07 -0500 Subject: [XML-SIG] Newbie question not answered? Message-ID: <85256736.005578AE.00@li01.lm.ssc.siemens.com> >>>>> "Andrew" == Andrew M Kuchling <akuchlin@cnri.reston.va.us> writes: Andrew> Jeff.Johnson@icn.siemens.com writes: >> The instructions neglect to tell us to set the environment >> variable "HOME" to the directory that you want the xml package >> copied to. Once you do that, it should work. Andrew> Hm? That's not necessary on Unix; won't CVS on Andrew> Windows place the files relative to the current directory Andrew> when you do a checkout? Jody>cvs on windows works the same way that cvs on unix. After Jody confirmed that CVS works on Windows I tried an experiment. CVS works without 'HOME' if I run it from my C: drive. If I run it from an NT network share, it fails. I don't know if this is a bug or not. Sorry for not testing this on a local drive earlier. 
For the benefit of bug-cvs@gnu.org I shall repeat the error message: Here's the output without 'HOME' set, 'F:' is a network share from NT 4.0:
F:\cvs>cvs -d :pserver:xmlcvs@cvs.python.org:/projects/cvsroot login
(Logging in to xmlcvs@cvs.python.org)
CVS.EXE [login aborted]: could not find out home directory: No such file or directory
F:\cvs>cvs -z3 -d :pserver:xmlcvs@cvs.python.org:/projects/cvsroot co xml
CVS.EXE [checkout aborted]: could not find out home directory
Regards, Jeff From Fred L. Drake, Jr." <fdrake@acm.org Tue Mar 16 16:58:28 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Tue, 16 Mar 1999 11:58:28 -0500 (EST) Subject: [XML-SIG] XBEL revision possibility? Message-ID: <14062.36276.434308.107504@weyr.cnri.reston.va.us> I happened to be at Norm Walsh's Web site the other day, and was reminded that he's exporting a couple of XBEL files. This made me think about things I still want to add to Grail's XBEL support (export w/o access information, export of part of a bookmarks list, and import of an XBEL instance into an existing bookmarks list). Then I thought, "import" isn't always what's wanted; I'd like to be able to use XLink to refer to a published XBEL file. An application such as Grail can then load the linked instance as needed, presenting it as a subtree of the containing tree. Looking at the XBEL 1.0 DTD to see how to go about doing this, I didn't see a way to do this and still have a valid XBEL 1.0 document. What is missing is an element that can be used as the link element; none of the existing elements really make sense in that role anyway. I propose the addition of an element <link> which provides a simple link and supports all the XLink attributes and those from %node.att;. This element is added to %nodes.mix; and is defined by the DTD fragment appended below my signature. Comments? -Fred -- Fred L. Drake, Jr.
<fdrake@acm.org> Corporation for National Research Initiatives
<!--=================== Link ======================================-->
<!-- <link> is intended to be used as an anchor for a link to another XBEL instance; I don't know of any way to specify this constraint. Hopefully I have all the XLink stuff in the right way. The parameter entities are taken directly from an XLink working draft. -->
<!ENTITY % locator.att "href CDATA #REQUIRED" >
<!ENTITY % link-semantics.att "inline (true|false) 'true'
                               role CDATA #IMPLIED" >
<!ENTITY % simple-link-semantics.att "inline (true|false) 'true'" >
<!ENTITY % remote-resource-semantics.att "role CDATA #IMPLIED
                                          title CDATA #IMPLIED
                                          show (embed|replace|new) #IMPLIED
                                          actuate (auto|user) #IMPLIED
                                          behavior CDATA #IMPLIED" >
<!ENTITY % local-resource-semantics.att "content-role CDATA #IMPLIED
                                         content-title CDATA #IMPLIED" >
<!ELEMENT link (#PCDATA)>
<!ATTLIST link
          %node.att;
          xml:link CDATA #FIXED "simple"
          %locator.att;
          %remote-resource-semantics.att;
          %local-resource-semantics.att;
          %simple-link-semantics.att; >
From akuchlin@cnri.reston.va.us Wed Mar 17 22:36:11 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Wed, 17 Mar 1999 17:36:11 -0500 (EST) Subject: [XML-SIG] Major upcoming DOM changes in CVS Message-ID: <199903172236.RAA15924@amarok.cnri.reston.va.us> Last night I embarked on a fairly extensive restructuring of the xml.dom.core module, to fix the outstanding problem with parent pointers; there are many cases where the existing code is unable to figure out the parent of a node. This makes it difficult to walk from a node back to the root, which is unfortunately just what's required for implementing namespaces. This restructuring has been tested, but it has the potential to destabilize the DOM code, so I didn't want to commit it without warning. If you follow the CVS tree, be aware that the DOM might become buggy with the next update that affects xml/dom/core.py.
On the other hand, for tonight I'm planning to finally work on a DOM test suite, and hopefully the test suite will be comprehensive enough to ensure that the code works reasonably. I won't commit the changes to core.py until I believe they're reasonably stable; that may be tonight, or it may take until the weekend. BTW, once the DOM code has settled down again, I'd like to implement namespace handling for it, but I don't think anyone's proposed what that interface should look like. Anyone have suggestions? Also BTW, check out LMG's plumbo.py module for finding circular references; ooh, is it ever useful! -- A.M. Kuchling http://starship.python.net/crew/amk/ About ten days later, it being the time of year when the National collected down and outs to walk on and understudy I arrived at the head office of the National Theatre in Aquinas Street in Waterloo. -- Tom Baker, in his autobiography From larsga@ifi.uio.no Thu Mar 18 07:27:48 1999 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 18 Mar 1999 08:27:48 +0100 Subject: [XML-SIG] Major upcoming DOM changes in CVS In-Reply-To: <199903172236.RAA15924@amarok.cnri.reston.va.us> References: <199903172236.RAA15924@amarok.cnri.reston.va.us> Message-ID: <wk7lsfmf4b.fsf@ifi.uio.no> * Andrew M. Kuchling | | On the other hand, for tonight I'm planning to finally work on a DOM | test suite, May I humbly request that another feature be added? A user-settable factory for creating nodes would be very useful, especially for PyPointers, which need ID support to be of much use. The main reason why they haven't been ported to the new DOM is the lack of this feature. (If somebody has added this lately, please tell me.) I think 4DOM has a factory, and if so it would be nice if both DOMs used the same interface for this. --Lars M. 
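To make the request above concrete, here is a minimal sketch of what a user-settable node factory could look like. It is written in present-day Python and is not the PyDOM or 4DOM interface: every class name (Node, Element, DefaultFactory, Document, IdElement, IdFactory) and the extra owner-document argument are assumptions made up for illustration.

```python
# Hypothetical sketch of a user-settable node factory; none of these
# classes correspond to the actual xml.dom.core or 4DOM code.

class Node:
    def __init__(self, owner_document):
        self.ownerDocument = owner_document
        self.childNodes = []


class Element(Node):
    def __init__(self, owner_document, tag_name):
        super().__init__(owner_document)
        self.tagName = tag_name
        self.attributes = {}

    def setAttribute(self, name, value):
        self.attributes[name] = value


class DefaultFactory:
    """Creates plain nodes; a Document uses this unless told otherwise."""

    def createElement(self, owner_document, tag_name):
        return Element(owner_document, tag_name)


class Document:
    def __init__(self, factory=None):
        # The factory is the user-settable part.
        self.factory = factory if factory is not None else DefaultFactory()

    def createElement(self, tag_name):
        # The standard-looking call delegates to whatever factory is set.
        return self.factory.createElement(self, tag_name)


class IdElement(Element):
    """Element that records itself in an index when an 'id' is set."""

    def __init__(self, owner_document, tag_name, id_index):
        super().__init__(owner_document, tag_name)
        self._id_index = id_index

    def setAttribute(self, name, value):
        super().setAttribute(name, value)
        if name == "id":  # assumption: the ID attribute is literally named 'id'
            self._id_index[value] = self


class IdFactory(DefaultFactory):
    """Factory that hands out ID-aware elements sharing one index."""

    def __init__(self):
        self.ids = {}

    def createElement(self, owner_document, tag_name):
        return IdElement(owner_document, tag_name, self.ids)
```

With this shape, application code keeps calling doc.createElement(...) and only the constructor argument changes, e.g. Document(IdFactory()) to get elements that can be looked up by ID.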
From mike.olson@fourthought.com Thu Mar 18 07:57:20 1999 From: mike.olson@fourthought.com (Mike Olson) Date: Thu, 18 Mar 1999 01:57:20 -0600 Subject: [XML-SIG] Major upcoming DOM changes in CVS References: <199903172236.RAA15924@amarok.cnri.reston.va.us> <wk7lsfmf4b.fsf@ifi.uio.no> Message-ID: <36F0B1E0.6BDE90A9@fourthought.com> This is a cryptographically signed message in MIME format. Lars Marius Garshol wrote: > > I think 4DOM has a factory, and if so it would be nice if both DOMs > used the same interface for this. > 4DOM does have a Factory setup. It comes with two, a local one and an ORB-based server, but either can be replaced with anything that meets the interface and returns nodes that meet the node interface (or element interface...). The interface is very similar to that of the document's factory functions, except all nodes add an extra parameter, the owner document. It also adds methods for creating things that documents cannot, i.e. Documents, DOMImplementations, NodeLists and such. We've even gone one step further in that a document uses one of these factories to implement its factory functions. You can then create a document with a local factory, an ORB-based factory, or your own, so that calls to document.createElement will create the correct type of node. The idea is that you can use the standard DOM interface for creating your elements and text, then move them around the ORB by changing the factory used in the document. Your application should not have to change. There is an IDL for the Factory in DOM/Ext/Factory/NodeFactory.idl if you want to see the interface we've used. Mike > > --Lars M.
> > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://www.python.org/mailman/listinfo/xml-sig -- Mike Olson Member Consultant FourThought LLC http://www.fourthought.com http://opentechnology.org --- "No program is interesting in itself to a programmer. It's only interesting as long as there are new challenges and new ideas coming up." --- Linus Torvalds
From gstein@lyra.org Thu Mar 18 07:57:37 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 17 Mar 1999 23:57:37 -0800 Subject: [XML-SIG] Major upcoming DOM changes in CVS References: <199903172236.RAA15924@amarok.cnri.reston.va.us> Message-ID: <36F0B1F1.29FB4624@lyra.org> Andrew M. Kuchling wrote: > Last night I embarked on a fairly extensive restructuring of the > xml.dom.core module, to fix the outstanding problem with parent > pointers; there are many cases where the existing code is unable to > figure out the parent of a node. This makes it difficult to walk from > a node back to the root, which is unfortunately just what's required > for implementing namespaces. Actually, I use a stack to deal with namespace processing (in a post-DOM-construction walk of the DOM tree). Each element can define any number of prefixes, so my stack is a list (one item per element depth) of dictionaries (prefix to URI mapping). When an element "starts", I push an empty namespace map on, and when the elem "ends" I pop the stack. On each node, I record just the URI. The prefix is not relevant and must actually be *removed*. It is very difficult to look for the "multistatus" element in the "DAV:" namespace when the element name could be "whatever-the-hell-namespace:multistatus". Anyhow... once the DOM is constructed, the parent nodes are not required. Oh... crap. Just thought of something. I'm viewing it from the "parse XML into DOM" angle rather than the "build DOM to generate XML" angle. I have no ideas for the latter...
you actually would need some notion of where a namespace is scoped and what prefix should be used. icky. (no ideas... I don't think I would ever use a DOM to generate XML, so I've never thought on it) >... > BTW, once the DOM code has settled down again, I'd like to implement > namespace handling for it, but I don't think anyone's proposed what > that interface should look like. Anyone have suggestions? I just inserted a "namespace" attribute onto each _nodeData instance. It did mean that I had to "know the insides" to insert and retrieve the value, though. I also ran into a *big* problem. _nodeData uses a mapping for the attributes. The key is the attribute name. The key really should be a namespace/name tuple. For the "namespace" attribute, I use None to mean "no namespace" or a string holding the namespace URI. During parsing, there are three types of prefixes: None, empty string, non-empty string. None means no prefix and no default namespace was defined; empty string means no prefix, which refers to a defined, default namespace; non-empty means a prefix which refers to a defined namespace. Since my namespace processing was done during a post-construction walk, the code might not be helpful. Let me know if you'd like it, tho, and I'll post it. Cheers, -g p.s. I looked on the www-dom mailing list a couple days ago to see if they had any suggestions for namespace APIs on the DOM... nada. not on the list or in the draft. sigh. -- Greg Stein, http://www.lyra.org/ From uche.ogbuji@fourthought.com Thu Mar 18 13:43:38 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Thu, 18 Mar 1999 06:43:38 -0700 Subject: [XML-SIG] Major upcoming DOM changes in CVS In-Reply-To: Your message of "Wed, 17 Mar 1999 17:36:11 EST." 
<199903172236.RAA15924@amarok.cnri.reston.va.us> Message-ID: <199903181343.GAA31075@malatesta.local> > BTW, once the DOM code has settled down again, I'd like to implement > namespace handling for it, but I don't think anyone's proposed what > that interface should look like. Anyone have suggestions? Ha! Wouldn't it serve us all well if the flipping W3C DOM WG would get its act together. I just got through venting sulphur on another group about the W3C's having published a new level 2 spec with a heading for Namespaces, but no content. Yet they made a raft of incoherent changes to the obscure area of node iterator fix-ups. I only glanced through even the CSS and Event model sections, but they looked to have received a lot of work as well. Once again, it seems the DOM WG is entirely in the service of browser vendors. We keep hoping to wait for the W3C to define an interface before implementing NS in 4DOM. We already use them in several projects (including 4XSL, of course), and we're OK if we avoid any defaulting. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From co@daisybytes.su.uunet.de Thu Mar 18 14:13:03 1999 From: co@daisybytes.su.uunet.de (Carsten Oberscheid) Date: Thu, 18 Mar 1999 15:13:03 +0100 Subject: [XML-SIG] (Py)DOM: Character References Message-ID: <01BE7151.D3021010.co@daisybytes.su.uunet.de> ------ =_NextPart_000_01BE7151.D30396B0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Hi all, I applied a little patch to xml.dom.writer.XmlWriter to have it replace non-ASCII characters by character references. To make this optional I've added a flag doCharRef to the class. Additionally my version of XmlWriter takes care of doublequotes in attribute values, by changing the attribute value delimiters to single quotes if necessary. 
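As an illustration of the two behaviours just described (optional character references for non-ASCII characters, and switching the attribute delimiter when the value contains double quotes), here is a small sketch in present-day Python. It is not the code from the attached patch: the function names escape_text and quote_attribute and the do_char_ref parameter are assumptions chosen to mirror the doCharRef flag mentioned above.

```python
# Hypothetical sketch; not the actual XmlWriter or xml.utils code.

_MARKUP = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def escape_text(text, do_char_ref=True):
    """Escape markup characters and, optionally, non-ASCII characters."""
    out = []
    for ch in text:
        if ch in _MARKUP:
            out.append(_MARKUP[ch])
        elif do_char_ref and ord(ch) > 127:
            # Decimal character reference, e.g. U+00E9 becomes &#233;
            out.append("&#%d;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def quote_attribute(value, do_char_ref=True):
    """Return the attribute value with delimiters, ready to be written."""
    escaped = escape_text(value, do_char_ref)
    if '"' in escaped and "'" not in escaped:
        # Double quotes inside the value: delimit with single quotes.
        return "'" + escaped + "'"
    # Otherwise use double quotes and escape any that remain inside.
    return '"' + escaped.replace('"', "&quot;") + '"'
```

When a value contains both kinds of quote, the sketch falls back to double-quote delimiters and &quot;, which is one possible choice and not something the message specifies.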
The charref replacement is done by the new function xml.utils.escape2() (it's in xml/utils/__init__.py, sorry for the silly name...). I send the sources along, just in case someone finds this useful. Can anybody tell why character references are not modeled explicitly in the DOM? In XML they have their own identity, explicitly distinct from entity references. Just wondering... Regards .co. +------------------------------------------------------- daisy bytes! --------+ Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.pweb.de/daisybytes.su electronic publishing [attachment: charrefs.zip, application/x-zip-compressed, base64 data omitted]
l7v7e9a8LxyL8rqD1S8Zwx2hDA5xu83iPhKqN2RJS5rkXMTZRJkHx1G8aLjgMRysKl4X550GHg1w 182bfx/aBzMtNQg1OysqnG5UFONhaDjXO7OFEzvRu/vLzievXxwjDrdw1CH75jdQSwECFQMUAAAA CAB2eXImrvr9IxoHAABsGgAADQAMAAAAAAABAABAtoEAAAAAZG9tL3dyaXRlci5weVVYCAC0CPE2 oAnxNlBLAQIVAxQAAAAIABhzcibNDAxO9QEAAJ0EAAARAAwAAAAAAAEAAEC2gVUHAAB1dGlscy9f X2luaXRfXy5weVVYCAC0CPE2oP7wNlBLBQYAAAAAAgACAJIAAACJCQAAAAA= ------ =_NextPart_000_01BE7151.D30396B0-- From Fred L. Drake, Jr." <fdrake@acm.org Thu Mar 18 15:31:30 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Thu, 18 Mar 1999 10:31:30 -0500 (EST) Subject: [XML-SIG] (Py)DOM: Character References In-Reply-To: <01BE7151.D3021010.co@daisybytes.su.uunet.de> References: <01BE7151.D3021010.co@daisybytes.su.uunet.de> Message-ID: <14065.7250.923412.478739@weyr.cnri.reston.va.us> Carsten Oberscheid writes: > Can anybody tell why character references are not modeled explicitely in the > DOM? In XML they have their own identity, explicitely distinct from entity Carsten, Good question. I don't know why character references need explicit nodes in the DOM; I'm not terribly interested in knowing that something was encoded as "+" or "+". I would like to be able to have this: <!DOCTYPE thing> <thing>&foo;</thing> provide a reference to &foo; as a child of the <thing> node. Here's what I get now: >>> buffer = '<!DOCTYPE thing>\n<thing>&foo;</thing>' >>> import xml.dom.utils >>> reader = xml.dom.utils.FileReader() >>> import cStringIO >>> sio = cStringIO.StringIO(buffer) >>> dom = reader.readStream(sio) >>> dom.documentElement <Element 'thing'> >>> len(dom.documentElement.childNodes) 0 And here's a bug ;-) : >>> dom.documentElement.childNodes <NodeList]> -Fred -- Fred L. Drake, Jr. 
<fdrake@acm.org> Corporation for National Research Initiatives From co@daisybytes.su.uunet.de Thu Mar 18 16:31:45 1999 From: co@daisybytes.su.uunet.de (Carsten Oberscheid) Date: Thu, 18 Mar 1999 17:31:45 +0100 Subject: [XML-SIG] (Py)DOM: Character References Message-ID: <01BE7165.31EFFE80.co@daisybytes.su.uunet.de> > > Carsten Oberscheid writes: > > Can anybody tell why character references are not modeled explicitely in > > the > > DOM? In XML they have their own identity, explicitely distinct from entity > > > > Carsten, > Good question. I don't know why character references need explicit > nodes in the DOM; I'm not terribly interested in knowing that > something was encoded as "+" or "+". OK, since charrefs encode only characters from the document's base character set (Unicode for XML, ASCII for SGML -- is that right?), it would be unnecessary overhead to create a distinct DOM node for each charref. Forget that, I should have thought before I wrote... > I would like to be able to > have this: > > <!DOCTYPE thing> > <thing>&foo;</thing> > > provide a reference to &foo; as a child of the <thing> node. Here's > what I get now: > > >>> buffer = '<!DOCTYPE thing>\n<thing>&foo;</thing>' > >>> import xml.dom.utils > >>> reader = xml.dom.utils.FileReader() > >>> import cStringIO > >>> sio = cStringIO.StringIO(buffer) > >>> dom = reader.readStream(sio) > >>> dom.documentElement > <Element 'thing'> > >>> len(dom.documentElement.childNodes) > 0 That's OK (unless you have a DTD for doctype "thing" which declares "&foo;") -- in well-formed XML, only some default entities (&amp;, &lt;, &gt;) are allowed -- replace &foo; by &amp; and it works. > > And here's a bug ;-) : > > >>> dom.documentElement.childNodes > <NodeList]> I'm not sure, but this could be caused by the last line of xml.dom.core.SingleParentNodeList.__repr__(). I guess "-2" should be "-1"... > > -Fred > .co. +------------------------------------------------------- daisy bytes!
--------+ Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.pweb.de/daisybytes.su electronic publishing From akuchlin@cnri.reston.va.us Thu Mar 18 16:50:36 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Thu, 18 Mar 1999 11:50:36 -0500 (EST) Subject: [XML-SIG] (Py)DOM: Character References In-Reply-To: <01BE7165.31EFFE80.co@daisybytes.su.uunet.de> References: <01BE7165.31EFFE80.co@daisybytes.su.uunet.de> Message-ID: <14065.11644.300639.849814@amarok.cnri.reston.va.us> Carsten Oberscheid writes: >> >>> len(dom.documentElement.childNodes) >> 0 I think this might be because sax_builder doesn't do anything for entity references; I'll look into it tonight. >I'm not sure, but this could be caused by the last line of >xml.dom.core.SingleParentNodeList.__repr__(). I guess "-2" should be "-1"... If that's the case, this bug is already fixed, since SingleParentNodeList disappeared in the restructuring. BTW, I now have a 250-line test suite written, which the DOM code passes; this gives me a bit more confidence that the new code won't break things. -- A.M. Kuchling http://starship.python.net/crew/amk/ He who wonders discovers that this in itself is wonder. -- M.C. Escher From jody@ldeo.columbia.edu Thu Mar 18 17:04:52 1999 From: jody@ldeo.columbia.edu (Jody Winston) Date: Thu, 18 Mar 1999 12:04:52 -0500 (EST) Subject: [XML-SIG] XML and GUI Message-ID: <199903181704.MAA19583@hog> I'm looking for XML tools help to build graphical user interfaces, such as XPToolkit (http://www.mozilla.org/xpfe/) from Mozilla. I've started to look at using XPToolkit, but it may be difficult to extract what I need out of Mozilla. Jody From Fred L. Drake, Jr." <fdrake@acm.org Thu Mar 18 18:42:08 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. 
Drake) Date: Thu, 18 Mar 1999 13:42:08 -0500 (EST) Subject: [XML-SIG] Major upcoming DOM changes in CVS In-Reply-To: <wk7lsfmf4b.fsf@ifi.uio.no> References: <199903172236.RAA15924@amarok.cnri.reston.va.us> <wk7lsfmf4b.fsf@ifi.uio.no> Message-ID: <14065.18688.948267.392438@weyr.cnri.reston.va.us> Lars Marius Garshol writes: > May I humbly request that another feature be added? A user-settable > factory for creating nodes would be very useful, especially for > PyPointers, which need ID support to be of much use. The main reason I agree; this would be really nice. -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives From Fred L. Drake, Jr." <fdrake@acm.org Thu Mar 18 19:26:43 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Thu, 18 Mar 1999 14:26:43 -0500 (EST) Subject: [XML-SIG] XBEL revision possibility? In-Reply-To: <14062.36276.434308.107504@weyr.cnri.reston.va.us> References: <14062.36276.434308.107504@weyr.cnri.reston.va.us> Message-ID: <14065.21363.277063.506787@weyr.cnri.reston.va.us> Fred L. Drake writes: [... discussion of potential XBEL revision ...] I've only received one reply (which wasn't sent to the list); is this revision uninteresting, or has XBEL already died? -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives From akuchlin@cnri.reston.va.us Thu Mar 18 22:01:21 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Thu, 18 Mar 1999 17:01:21 -0500 (EST) Subject: [XML-SIG] Major upcoming DOM changes in CVS In-Reply-To: <199903181343.GAA31075@malatesta.local> References: <199903172236.RAA15924@amarok.cnri.reston.va.us> <199903181343.GAA31075@malatesta.local> Message-ID: <14065.29921.236963.137663@amarok.cnri.reston.va.us> uche.ogbuji@fourthought.com writes: > I just got through venting sulphur on another group about the >W3C's having published a new level 2 spec with a heading for Namespaces, but >no content. 
Yet they made a raft of incoherent changes to the obscure area of >node iterator fix-ups. Hearty agreement here; with some new W3C specs making use of namespaces, it seems obvious that you'd want to have standards for using namespaces. Otherwise, every DOM implementor will invent their own nonstandard way of handling things, which is just what the W3C is intended to prevent! -- A.M. Kuchling http://starship.python.net/crew/amk/ I claim complete innocence and ignorance! It must have been Tim. I wouldn't know a Trondheim Hammer if it fell on my foot! -- Steve Majewski, 10 Jan 1995 From akuchlin@cnri.reston.va.us Thu Mar 18 22:19:55 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Thu, 18 Mar 1999 17:19:55 -0500 (EST) Subject: [XML-SIG] Major upcoming DOM changes in CVS In-Reply-To: <36F0B1F1.29FB4624@lyra.org> References: <199903172236.RAA15924@amarok.cnri.reston.va.us> <36F0B1F1.29FB4624@lyra.org> Message-ID: <14065.31144.635092.167056@amarok.cnri.reston.va.us> Greg Stein writes: >post-DOM-construction walk of the DOM tree). Each element can define any >number of prefixes, so my stack is a list (one item per element depth) >of dictionaries (prefix to URI mapping). When an element "starts", I I could do that, keeping a dictionary on each _nodeData instance. Finding the namespace for a given prefix is then proportional to the height of the DOM tree, because you have to start at the node and scan back toward the root. A common operation is likely to be "find attribute X in namespace with URI Y", and that would be terribly slow; scan back until you find a namespace declaration with URI Y, and then check for an attribute with that prefix. That's O(height of tree * # of attributes), but I can't think of a better way. 
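A minimal sketch of the lookup cost described above, written in present-day Python. The Node class and both function names are made up for illustration and are not PyDOM's API; this variant resolves each attribute's prefix by walking toward the root, which gives the same O(height of tree * # of attributes) behaviour:

```python
# Hypothetical sketch: none of these names come from PyDOM.
class Node:
    def __init__(self, name, parent=None, ns_decls=None, attrs=None):
        self.name = name
        self.parent = parent
        self.ns_decls = ns_decls or {}   # prefix -> URI declared on this element
        self.attrs = attrs or {}         # "prefix:local" or "local" -> value


def resolve_prefix(node, prefix):
    """Walk from node toward the root; cost is proportional to tree height."""
    while node is not None:
        if prefix in node.ns_decls:
            return node.ns_decls[prefix]
        node = node.parent
    return None


def find_attribute(node, uri, local_name):
    """Find attribute local_name in namespace uri on node."""
    for qname, value in node.attrs.items():
        prefix, sep, local = qname.partition(":")
        if not sep:
            continue  # unprefixed attributes are in no namespace
        if local == local_name and resolve_prefix(node, prefix) == uri:
            return value
    return None
```

Every attribute lookup repeats the walk toward the root, which is where the height-of-tree factor comes from.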
It would obviously be better to store a cumulative map on each node, reducing the height-of-tree factor to a constant, but I'm frightened of that approach, fearing it'll make changing the tree expensive or difficult, since you'd either have to recompute the maps on an entire subtree every time you change an attribute or move something around (expensive), or use smart updating to save CPU time (difficult, and potentially a source of bugs from complicated updating logic). In a recent xml-dev posting, David Megginson mentioned that some implementors are turning the element names into longer, "URI-prefix tagName" strings, like "http://www.w3.org/RDF rdf". This is apparently of dubious legality, but it gets their job done. I think it's an ugly hack, myself... -- A.M. Kuchling http://starship.python.net/crew/amk/ REMIND ME AGAIN, he said, HOW THE LITTLE HORSE-SHAPED ONES MOVE. -- Death on symbolic last games, in Terry Pratchett's _Small Gods_ From gstein@lyra.org Thu Mar 18 22:33:49 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 18 Mar 1999 14:33:49 -0800 Subject: [XML-SIG] Major upcoming DOM changes in CVS References: <199903172236.RAA15924@amarok.cnri.reston.va.us> <36F0B1F1.29FB4624@lyra.org> <14065.31144.635092.167056@amarok.cnri.reston.va.us> Message-ID: <36F17F4D.1A513181@lyra.org> Andrew M. Kuchling wrote: > > Greg Stein writes: > >post-DOM-construction walk of the DOM tree). Each element can define any > >number of prefixes, so my stack is a list (one item per element depth) > >of dictionaries (prefix to URI mapping). When an element "starts", I > > I could do that, keeping a dictionary on each _nodeData > instance. Finding the namespace for a given prefix is then > proportional to the height of the DOM tree, because you have to start > at the node and scan back toward the root.
A common operation is > likely to be "find attribute X in namespace with URI Y", and that > would be terribly slow; scan back until you find a namespace > declaration with URI Y, and then check for an attribute with that > prefix. That's O(height of tree * # of attributes), but I can't think > of a better way. As I mentioned: store attributes as (URI, name) pairs for the key. The lookup will be quite fast then. Remember: the prefixes are only important when you parse or "render" the XML. While you're operating on the DOM, those prefixes are meaningless/bogus. In fact, I might posit that preparing the prefixes is the job of the XmlWriter class. (and the toxml() method is no longer as useful) > It would obviously be better to store a cumulative map on each > node, reducing the height-of-tree factor to a constant, but I'm > frightened of that approach, fearing it'll make changing the tree > expensive or difficult, since you'd either have to recompute the maps > on an entire subtree every time you change an attribute or move > something around (expensive), or use smart updating to saveCPU time > (difficult, and potentially a source of bugs from complicated updating > logic). The map between prefix and URI is only used at parse/render time. In this case, I think the idea of a cumulative map works great. Normally, I associate a set of all (used) namespaces with the document. It becomes very easy to know ahead of time how many namespaces there are, define prefixes as ns%d, declare them on the document element, and then use them throughout the doc. It would be a bit harder for the DOM to do this, but (logically) a DOM has a set of namespaces. If the XmlWriter knew those ahead of time, then the generation part would be easy. The alternative is to simply insert xmlns: declarations everywhere, as they occur (and reuse parent URI<->prefix mappings). 
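A rough sketch of the two ideas above, in present-day Python: attributes keyed by (URI, name) pairs, and a writer that collects the namespaces up front, names the prefixes ns%d, and declares them all on the document element. Element, collect_uris and to_xml are illustrative names only, not the PyDOM or XmlWriter interfaces, and text nodes are left out to keep it short:

```python
# Hypothetical sketch: Element, collect_uris and to_xml are made-up names.
from xml.sax.saxutils import quoteattr


class Element:
    def __init__(self, uri, name, attrs=None, children=None):
        self.uri, self.name = uri, name
        self.attrs = attrs or {}          # (uri, local name) -> value
        self.children = children or []


def collect_uris(elem, seen=None):
    """Gather every namespace URI used in the tree, in first-seen order."""
    if seen is None:
        seen = []
    for uri in [elem.uri] + [u for (u, _) in elem.attrs]:
        if uri is not None and uri not in seen:
            seen.append(uri)
    for child in elem.children:
        collect_uris(child, seen)
    return seen


def to_xml(root):
    """Serialise the tree, declaring prefixes ns0, ns1, ... on the root."""
    prefixes = {uri: "ns%d" % i for i, uri in enumerate(collect_uris(root))}

    def qname(uri, name):
        return name if uri is None else "%s:%s" % (prefixes[uri], name)

    def render(elem, decls=""):
        attrs = "".join(" %s=%s" % (qname(u, n), quoteattr(v))
                        for (u, n), v in elem.attrs.items())
        tag = qname(elem.uri, elem.name)
        body = "".join(render(child) for child in elem.children)
        return "<%s%s%s>%s</%s>" % (tag, decls, attrs, body, tag)

    decls = "".join(" xmlns:%s=%s" % (p, quoteattr(u))
                    for u, p in prefixes.items())
    return render(root, decls)
```

With this layout an attribute lookup is a single dictionary access, e.g. elem.attrs[(uri, "about")], and prefixes only appear when the tree is written out.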
> In a recent xml-dev posting, David Megginson mentioned that > some implementors are turning the element names into longer, > "URI-prefix tagName" strings, like "http://www.w3.org/RDF rdf". This > is apparently of dubious legality, but it gets their job done. I > think it's an ugly hack, myself... If they actually remap the element name... yes, I'd say that is a hack. However, the URI/name pair is, by definition, the actual value of the element or attribute. We Pythoneers can easily deal with the pair, so we don't need to resort to the hacks. Cheers, -g -- Greg Stein, http://www.lyra.org/ From tony.mcdonald@ncl.ac.uk Thu Mar 18 22:33:00 1999 From: tony.mcdonald@ncl.ac.uk (Tony McDonald) Date: Thu, 18 Mar 1999 22:33:00 +0000 Subject: [XML-SIG] Optimising/strategies for DOM/XSL lookup/parsing Message-ID: <199903182236.WAA24267@cheviot.ncl.ac.uk> Hi, Thanks to the help from this list I've managed to use Dieter's XSL parser and the DOM routines to do some searching/extraction of element 'chunks' from my XML documents. Many thanks for that! One thing that I've noticed is that the initial DOM 'parsing' is slow relative to the XSL pattern matching. On my iMac 266, DOM parsing a 76k XML file took 4-5 seconds (utils.FileReader), whilst the XSL pattern matching took 1.8 seconds (tp = Parser(pattern), topics = tp.select(reader.document)). By the way, is there any way of telling how much memory a DOM tree is occupying? The way I think things are likely to happen is that there will be large numbers of XSL queries and very few DOM creations. However, there are something like 140 documents that need to be 'available' for XSL querying and subsequent transformation into HTML/RTF. In addition, there will be times when an XSL query across all 140 documents will definitely happen. Would one strategy be to load up all 140 documents into memory on startup, do the DOM processing then and then when an XSL query comes along, 'route' it to the appropriate DOM tree (now in memory)?
If this isn't possible, is it possible to 'save' a DOM tree to an external file and re-read it in once a relevant XSL query is ready to be acted upon? Now I'm not going to be serving my XML docs from my iMac!...but if at all possible I'd like to limit the DOM parsing as much as possible. Am I being naive and missing something obvious here? Any thoughts would be appreciated, thanks Tone From larsga@ifi.uio.no Fri Mar 19 08:07:05 1999 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 19 Mar 1999 09:07:05 +0100 Subject: [XML-SIG] XBEL revision possibility? In-Reply-To: <14065.21363.277063.506787@weyr.cnri.reston.va.us> References: <14062.36276.434308.107504@weyr.cnri.reston.va.us> <14065.21363.277063.506787@weyr.cnri.reston.va.us> Message-ID: <wkyaktlx7a.fsf@ifi.uio.no> * Fred L. Drake | | I've only received one reply (which wasn't sent to the list); is | this revision uninteresting, or has XBEL already died? I find it interesting, at least, and I've seen some glimmers of interest on comp.text.xml although the person in question didn't use it because of the lack of non-Python software. Also, I use XBEL for publishing a couple of sets of URLs and find it very nice for manual maintenance of URL lists. However, I'm also terribly busy now and find it hard to involve myself in much beyond simple questions when online. --Lars M. From larsga@ifi.uio.no Fri Mar 19 08:15:30 1999 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 19 Mar 1999 09:15:30 +0100 Subject: [XML-SIG] (Py)DOM: Character References In-Reply-To: <01BE7165.31EFFE80.co@daisybytes.su.uunet.de> References: <01BE7165.31EFFE80.co@daisybytes.su.uunet.de> Message-ID: <wkww0dlwt9.fsf@ifi.uio.no> * Carsten Oberscheid | | Ok, since charrefs encode only characters from the document's base | character set (Unicode for XML, ASCII for SGML -- is that right?) No. XML uses Unicode, but since XML is SGML (an SGML application profile, to be correct), it follows that this isn't true. 
And in fact SGML as a meta-language does not have a fixed document character set. In fact, the SGML declaration allows you to define your own character set in terms of well-known character sets. So, SGML can use Unicode/ISO 10646, as for example HTML 4.0 does[1], but it can also use any other character set which consists of well-known characters. It also has standard ways of handling characters that are not in the character sets. However, I don't think it can handle every character encoding, but I might be wrong. --Lars M. [1] <URL:http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html> From co@daisybytes.su.uunet.de Fri Mar 19 09:52:18 1999 From: co@daisybytes.su.uunet.de (Carsten Oberscheid) Date: Fri, 19 Mar 1999 10:52:18 +0100 Subject: [XML-SIG] (Py)DOM: Character References Message-ID: <01BE71F6.8F46D4F0.co@daisybytes.su.uunet.de> > > * Carsten Oberscheid > | > | Ok, since charrefs encode only characters from the document's base > | character set (Unicode for XML, ASCII for SGML -- is that right?) > > No. XML uses Unicode, but since XML is SGML (an SGML application > profile, to be correct), it follows that this isn't true. And in fact > SGML as a meta-language does not have a fixed document character set. > In fact, the SGML declaration allows you to define your own character > set in terms of well-known character sets. All right, I should have said "SGML according to the standard declaration a.k.a. reference concrete syntax" ;^) > > So, SGML can use Unicode/ISO 10646, as for example HTML 4.0 does[1], > but it can also use any other character set which consists of > well-known characters. It also has standard ways of handling > characters that are not in the character sets. However, I don't think > it can handle every character encoding, but I might be wrong. But that leads me back to my original train of thought. Suppose I'm processing SGML/XML/HTMLx.x documents on a system that can't cope with the documents' full character set, e.g. it can display ASCII only.
Since the source and the target systems are not limited that way, I don't want to restrict the character set itself. I just want, in my intermediate processing, to consistently represent the non-ASCII characters as character references. As far as I can see from my zen level (I'm down hee-eeere!!), the DOM doesn't know about charrefs, and PyDOM expects them to be resolved (which xmlproc, for example, silently does). All I can do is to tell the XML lineariser to translate certain characters back to charrefs on output. But as I type this (learning by chatting away, hope you don't mind...) I see that this should be ok, since, to be XML (or SGML) conformant, my system (and the DOM implementation and the parser and so on) MUST be able to cope with the full charset internally. Hope I got this right now in my small brain, and thanks for making me think about it again. > > --Lars M. .co. +------------------------------------------------------- daisy bytes! --------+ Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.pweb.de/daisybytes.su electronic publishing From gstein@lyra.org Sat Mar 20 01:48:30 1999 From: gstein@lyra.org (Greg Stein) Date: Fri, 19 Mar 1999 17:48:30 -0800 Subject: [XML-SIG] Uche's XML article at LinuxWorld Message-ID: <36F2FE6E.461550D2@lyra.org> Hey everybody... If you haven't seen it yet, our own Uche has written an extensive article for LinuxWorld. It's about XML standards and the future world and present state of XML on Linux. It's quite a good article. And yes... he mentions DAV's use of XML and my mod_dav module (thanx Uche!
:-) The URL is: http://www.linuxworld.com/linuxworld/lw-1999-03/lw-03-xml.html Cheers, -g -- Greg Stein, http://www.lyra.org/ From dieter@handshake.de Sun Mar 21 14:42:20 1999 From: dieter@handshake.de (Dieter Maurer) Date: Sun, 21 Mar 1999 14:42:20 +0000 (/etc/localtime) Subject: [XML-SIG] Optimising/strategies for DOM/XSL lookup/parsing In-Reply-To: <199903182236.WAA24267@cheviot.ncl.ac.uk> References: <199903182236.WAA24267@cheviot.ncl.ac.uk> Message-ID: <14069.339.532658.966877@lindm.dm> Hello Tony > One thing that I've noticed is that the initial DOM 'parsing' is slow > relative to the XSL pattern matching. On my iMac 266, DOM parsing a 76k XML > file took 4-5 seconds (utils.FileReader), whilst the XSL pattern matching > took 1.8 seconds (tp = Parser(pattern), topics = > tp.select(reader.document)). By the way, is there any way of telling how > much memory a DOM tree is occupying? There are several parsers, differing (e.g.) in speed. For example, "pyexpat" is a Python interface to James Clark's "expat" parser, written in "C". It should be quite fast. On the other hand, "xmlproc" is written purely in Python. One would expect this parser to be considerably slower. Which one are you using? > The way I think things are likely to happen is that there will be large > numbers of XSL queries and very few DOM creations. However, there are > something like 140 documents that need to be 'available' for XSL querying > and subsequent transformation into HTML/RTF. In addition, there will be > times when an XSL query across all 140 documents will definitely happen. > > Would one strategy be to load up all 140 documents into memory on startup, > do the DOM processing then and then when an XSL query comes along, 'route' > it to the appropriate DOM tree (now in memory)? Neither Python nor PyDOM will prevent you from doing this; only your Mac (or other computer, but Macs are especially sensitive wrt high memory consumption) may feel uncomfortable.
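The "parse once, keep the tree in memory, route queries to it" strategy could look roughly like the sketch below. It is only an illustration in present-day Python: it uses the standard library's xml.dom.minidom rather than the PyDOM FileReader from this thread, and DomCache and its loader argument are invented names:

```python
# Hypothetical sketch: DomCache is a made-up helper, and minidom stands in
# for the PyDOM reader discussed in the thread.
from xml.dom import minidom


class DomCache:
    """Parse each document at most once and keep the tree in memory."""

    def __init__(self, loader):
        self._loader = loader   # callable: document name -> XML text
        self._trees = {}        # document name -> parsed DOM tree
        self.parses = 0         # how many real parses have happened

    def get(self, name):
        if name not in self._trees:
            self._trees[name] = minidom.parseString(self._loader(name))
            self.parses += 1
        return self._trees[name]
```

A query for one document would call cache.get(name) and run its pattern against the returned tree; a query across all documents would loop over the names. Memory use grows with the number of cached trees, which is the concern raised above.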
> If this isn't possible, is it possible to 'save' a DOM tree to an external > file and re-read it in once a relevant XSL query is ready to be acted upon? You could try to [c]Pickle the documents. [c]Pickle is a standard module that allows you to serialize (i.e. write/read) complex Python objects ("cPickle" is implemented in "C" and much faster than "pickle"). However, objects must obey some restrictions to be picklable. I am not sure whether DOM objects fulfill them; just try it. - Dieter From gstein@lyra.org Sun Mar 21 15:11:36 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 21 Mar 1999 07:11:36 -0800 Subject: [XML-SIG] any IE5 users out there? Message-ID: <36F50C28.39DCC3E8@lyra.org> Hey gang, I just discovered something cool as hell. IE5 has a fancy shmancy viewer for XML documents. The CDIN project is using XML for its communication, so this is way cool. For an example, point your IE5 browser at: http://www.cdin.org/server/cdin.cgi/1,ae0d9e0d The CD info will be fetched from the database, assembled into an XML response, returned to the browser, and IE5 will display it quite nicely! Even has little +/- markers to expand/collapse the XML. Heh. Cheers, -g -- Greg Stein, http://www.lyra.org/ From akuchlin@cnri.reston.va.us Sun Mar 21 19:33:06 1999 From: akuchlin@cnri.reston.va.us (A.M. Kuchling) Date: Sun, 21 Mar 1999 14:33:06 -0500 Subject: [XML-SIG] New DOM code checked in Message-ID: <199903211933.OAA16149@207-172-38-202.s202.tnt8.ann.va.dialup.rcn.com> Just a heads-up: after fixing some minor bugs, I'm now confident enough to commit the revised DOM code to the CVS repository. If you can afford to risk it, please check out a copy, try it with your DOM-using code, and let me know -- loudly -- if something breaks. -- A.M. Kuchling http://starship.python.net/crew/amk/ I'm sure the reason such young nitwits are produced in our schools is because they have no contact with anything of any use in everyday life.
-- Petronius, _The Satyricon_ From gstein@lyra.org Sun Mar 21 22:12:05 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 21 Mar 1999 14:12:05 -0800 Subject: [XML-SIG] New DOM code checked in References: <199903211933.OAA16149@207-172-38-202.s202.tnt8.ann.va.dialup.rcn.com> Message-ID: <36F56EB5.1BD7ABB@lyra.org> A.M. Kuchling wrote: > > Just a heads-up: after fixing some minor bugs, I'm now confident > enough to commit the revised DOM code to the CVS repository. If you > can afford to risk it, please check out a copy, try it with your > DOM-using code, and let me know -- loudly -- if something breaks. I would posit that you are too tentative with releases. Drop a 0.6 if you've made a bunch of code changes. Nothing to be afraid of. If you think that users not-in-the-know would get burned, then alter the web page to list a "semi-stable" and "development" version (0.5 and 0.6 respectively). In the past, you've also suggested a 0.5.1 ... probably no reason for that since you're pre-1.0 anyhow. Just bump to 0.6. "release early, release often" For myself, I drop a release once add'l functionality has been added. This allows everybody to test it, rather than those who set up CVS and pull a copy down. Cheers, -g -- Greg Stein, http://www.lyra.org/ From Cepl@fpm.cz Mon Mar 22 08:36:48 1999 From: Cepl@fpm.cz (Matěj Cepl) Date: Mon, 22 Mar 1999 08:36:48 -0000 Subject: [XML-SIG] RE: XML-SIG digest, Vol 1 #246 - 1 msg Message-ID: <1318D78C9072D11195C9006094EA98A723A5E4@ocesrv> The only comment on that article is that XML _is not_ a language like SGML. I am afraid that such comments on XML only support M$ in providing their "XML supporting" word processors. :-( Have a nice day Matthew > -----Original Message----- > From: xml-sig-admin@python.org [SMTP:xml-sig-admin@python.org] > Sent: 20 March 1999 7:02
> To: xml-sig@python.org > Subject: XML-SIG digest, Vol 1 #246 - 1 msg > > > Today's Topics: > > 1. Uche's XML article at LinuxWorld (Greg Stein) > > --__--__-- > > Message: 1 > Date: Fri, 19 Mar 1999 17:48:30 -0800 > From: Greg Stein <gstein@lyra.org> > To: xml-sig@python.org > Subject: [XML-SIG] Uche's XML article at LinuxWorld > > Hey everybody... > > If you haven't seen it yet, our own Uche has written an extensive > article for LinuxWorld. It's about XML standards and the future world > and present state of XML on Linux. It's quite a good article. And > yes... > he mentions DAV's use of XML and my mod_dav module (thanx Uche! :-) > > The URL is: > http://www.linuxworld.com/linuxworld/lw-1999-03/lw-03-xml.html > > Cheers, > -g > > -- > Greg Stein, http://www.lyra.org/ > > > > --__--__---- > > End of XML-SIG Digest From fredrik@pythonware.com Mon Mar 22 09:16:28 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Mon, 22 Mar 1999 10:16:28 +0100 Subject: [XML-SIG] RE: XML-SIG digest, Vol 1 #246 - 1 msg References: <1318D78C9072D11195C9006094EA98A723A5E4@ocesrv> Message-ID: <007301be7444$ac3e8850$f29b12c2@pythonware.com> > The only comment on that article is that XML _is not_ a language like > SGML. I am afraid that such comments on XML only support M$ in > providing their "XML supporting" word processors. :-( what on earth is this supposed to mean? trolls on the xml-sig?
/F From uche.ogbuji@fourthought.com Mon Mar 22 13:18:09 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 22 Mar 1999 06:18:09 -0700 Subject: [XML-SIG] ANN: 4XSL 0.6.0 Message-ID: <199903221318.GAA02043@malatesta.local> 4XSL is an XSL processor written in Python, using 4DOM. This is really an alpha-level release, although we have used it successfully to render our Web site (www.FourThought.com), which quite thoroughly exercises the features. We expect to round out the list of supported templates and release a beta soon. You can download 4XSL file from ftp:///starship.python.net/pub/crew/uche/4XSL/ See the README in the archive to get started. Feedback welcome (to 4Web@fourthought.com). A list of the templates supported is in docs/API.html. The full set of patterns should be supported. Thanks for all the interest from this group. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Mar 22 13:29:33 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 22 Mar 1999 06:29:33 -0700 Subject: [XML-SIG] Re: ANN: 4XSL 0.6.0 In-Reply-To: Your message of "Mon, 22 Mar 1999 06:18:09 MST." Message-ID: <199903221329.GAA02083@malatesta.local> Oops. I bloody forgot to package the license. 
For this release, the license is similar to the Python License (we always plan to release under an open-source license): """ Copyright 1999 by FourThought LLC, U.S.A All Rights Reserved Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of FourThought LLC not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. FOURTHOUGHT LLC DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL FOURTHOUGHT BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. """ Have a nice day. --Uche > 4XSL is an XSL processor written in Python, using 4DOM. > > This is really an alpha-level release, although we have used it successfully > to render our Web site (www.FourThought.com), which quite thoroughly exercises > the features. We expect to round out the list of supported templates and > release a beta soon. > > You can download 4XSL file from > > ftp:///starship.python.net/pub/crew/uche/4XSL/ > > See the README in the archive to get started. Feedback welcome (to > 4Web@fourthought.com). A list of the templates supported is in docs/API.html. > The full set of patterns should be supported. > > Thanks for all the interest from this group. 
> > -- > Uche Ogbuji > FourThought LLC, IT Consultants > uche.ogbuji@fourthought.com (970)481-0805 > Software engineering, project management, Intranets and Extranets > http://FourThought.com http://OpenTechnology.org > > From akuchlin@cnri.reston.va.us Mon Mar 22 14:41:49 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 22 Mar 1999 09:41:49 -0500 (EST) Subject: [XML-SIG] New DOM code checked in In-Reply-To: <36F56EB5.1BD7ABB@lyra.org> References: <199903211933.OAA16149@207-172-38-202.s202.tnt8.ann.va.dialup.rcn.com> <36F56EB5.1BD7ABB@lyra.org> Message-ID: <14070.21124.428148.234835@amarok.cnri.reston.va.us> Greg Stein writes: >A.M. Kuchling wrote: >I would posit that you are too tentative with releases. Drop a 0.6 if Yeah, probably; I'm going to try to make releases more frequent from now on. That said, I want to delay a new release just a tiny bit longer. There are now only 2 major items left on the TODO list for the package: * Namespace support for DOM This should only take an evening or two to implement, once we've come up with an interface for it. I want to do this before releasing a 0.5.1. Namespace support for SAX can wait until the SAX2 discussions on xml-dev converge on a specification. * Integrate widestring support with the PyExpat module (major thing) Now, this is a knotty problem. I'd really like to be able to handle Unicode strings, but Unicode support for Python is only on the roadmap for 1.6, and the form of this support isn't known yet. A while back Martin's wstring module was added to the XML package. Subsequently, the String-SIG briefly awoke from its torpor and considered Unicode once more; in that round, Fredrik wrote "yet another Unicode string class" (http://www.pythonware.com/madscientist/). Now I have no idea what to do; switch to using Fredrik's code and adapt PyExpat to it, stick with Martin's module, or what? I need to go whine at Guido... 
I want to knock the DOM/namespace issue off the list; the Unicode question will have to wait. (Or be ignored until Python 1.6, though that's an unappealing prospect.) The rest of the items on the TODO list are smaller -- documentation, re-indenting some files, etc. Anyway, I'm hoping to do a 0.5.1 release later this week, which will probably be followed fairly quickly by a 0.5.2 release to fix any installation glitches. -- A.M. Kuchling http://starship.python.net/crew/amk/ For a moment he hesitates, memories stirring. Another life, another person. Suntans, lies and half-truths, a wife, a mistress, the clandestine clinic for cocaine dependency... A package arriving one morning containing a see-through, a brutually honest see-through body stocking... -- The Truth falters, in ENIGMA #2: "The Truth" From larsga@ifi.uio.no Mon Mar 22 15:23:14 1999 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 22 Mar 1999 16:23:14 +0100 Subject: [XML-SIG] New DOM code checked in In-Reply-To: <14070.21124.428148.234835@amarok.cnri.reston.va.us> References: <199903211933.OAA16149@207-172-38-202.s202.tnt8.ann.va.dialup.rcn.com> <36F56EB5.1BD7ABB@lyra.org> <14070.21124.428148.234835@amarok.cnri.reston.va.us> Message-ID: <wkd821o8f1.fsf@ifi.uio.no> * Andrew M. Kuchling | | * Namespace support for DOM Shouldn't this wait for the next DOM WD in the hope that this part of the DOM will be fleshed out then? It seems pointless to me to implement this now only to have it obsoleted by the next draft. | Namespace support for SAX can wait until the SAX2 discussions on | xml-dev converge on a specification. Yup. I'll initiate a discussion here on the Python mapping once the Java interfaces seem more or less stable. When agreement has been reached here I'll go off and implement it. | [Unicode] Now I have no idea what to do; switch to using Fredrik's | code and adapt PyExpat to it, stick with Martin's module, or what? I'm in a similar quandary. 
I'd very much like to add Unicode support to xmlproc, but to do that I need support in RE. Any feedback on what Guido thinks and what the others here think would be welcome. Anyway, we should think about this and consider which in-memory representation we want (UTF-8 or UCS-2), whether we want to always use it and so on... | The rest of the items on the TODO list are smaller -- documentation, | re-indenting some files, etc. What about the factory I proposed here earlier? It would be nice to know whether that is on the list, and if so, where. | Anyway, I'm hoping to do a 0.5.1 release later this week, which will | probably be followed fairly quickly by a 0.5.2 release to fix any | installation glitches. I'm still hoping to do xmlproc 0.61 this week and it still seems possible. Maybe Wednesday... --Lars M. From akuchlin@cnri.reston.va.us Mon Mar 22 15:36:12 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Mon, 22 Mar 1999 10:36:12 -0500 (EST) Subject: [XML-SIG] New DOM code checked in In-Reply-To: <wkd821o8f1.fsf@ifi.uio.no> References: <199903211933.OAA16149@207-172-38-202.s202.tnt8.ann.va.dialup.rcn.com> <36F56EB5.1BD7ABB@lyra.org> <14070.21124.428148.234835@amarok.cnri.reston.va.us> <wkd821o8f1.fsf@ifi.uio.no> Message-ID: <14070.24995.128078.6216@amarok.cnri.reston.va.us> Lars Marius Garshol writes: >Shouldn't this wait for the next DOM WD in the hope that this part of >the DOM will be fleshed out then? It seems pointless to me to >implement this now only to have it obsoleted by the next draft. The next DOM WD is probably some time away, since the March 4 draft doesn't mention namespaces at all and WDs come out fairly slowly. There are technologies like RDF and XSL that pretty much require namespaces *now*, so I think we can't sit on our hands here. >I'm in a similar quandary. I'd very much like to add Unicode support >to xmlproc, but to do that I need support in RE. 
Any feedback on what >Guido thinks and what the others here think would be welcome. Unicode in REs? <ulp> I may be ill... <blargh> >What about the factory I proposed here earlier? It would be nice to >know whether that is on the list, and if so, where. Yes, it's definitely on the list, but I haven't yet had time to look at 4DOM's interface. Anyone want to propose how this should look for PyDOM and save me the trouble? >I'm still hoping to do xmlproc 0.61 this week and it still seems >possible. Maybe Wednesday... Don't rush; I don't expect 0.5.1 will have a long life, so xmlproc could be updated in 0.5.2. -- A.M. Kuchling http://starship.python.net/crew/amk/ I never realized it before, but having looked that over I'm certain I'd rather have my eyes burned out by zombies with flaming dung sticks than work on a conscientious Unicode regex engine. -- Tim Peters, 3 Dec 1998 From Fred L. Drake, Jr." <fdrake@acm.org Mon Mar 22 16:10:48 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Mon, 22 Mar 1999 11:10:48 -0500 (EST) Subject: [XML-SIG] any IE5 users out there? In-Reply-To: <36F50C28.39DCC3E8@lyra.org> References: <36F50C28.39DCC3E8@lyra.org> Message-ID: <14070.27528.868156.27149@weyr.cnri.reston.va.us> Greg Stein writes: > http://www.cdin.org/server/cdin.cgi/1,ae0d9e0d > > The CD info will be fetched from the database, assembled into an XML > response, returned to the browser, and IE5 will display it quite nicely! > Even has little +/- markers to expand/collapse the XML. Heh. Greg, This is cool! I hope you're not expecting the neat collapsing-XML trick from Grail 0.6 unless you're willing to send me a filetype handler ASAP. ;-) -Fred -- Fred L. Drake, Jr. 
<fdrake@acm.org> Corporation for National Research Initiatives From digitome@iol.ie Mon Mar 22 16:29:56 1999 From: digitome@iol.ie (Sean Mc Grath) Date: Mon, 22 Mar 1999 16:29:56 +0000 Subject: [XML-SIG] Python Tutorial at WWW8, Toronto, May Message-ID: <3.0.6.32.19990322162956.009c0040@gpo.iol.ie> <alert type="yellow"> I have heard from the organizers of WWW8 (http://www.www8.org) that the half day Python tutorial is well underbooked at the moment. Unless they get more bookings, they will not run the tutorial. Anyone out there who knows anyone out there who wants an overview of Python for HTML/XML/CGI/HTTP work, please let them know about the tutorial. </alert> thanks, Sean <Sean uri="http://www.digitome.com/sean.htm"/> From dieter@handshake.de Mon Mar 22 18:01:10 1999 From: dieter@handshake.de (Dieter Maurer) Date: Mon, 22 Mar 1999 18:01:10 +0000 (/etc/localtime) Subject: [XML-SIG] New DOM code checked in In-Reply-To: <wkd821o8f1.fsf@ifi.uio.no> References: <199903211933.OAA16149@207-172-38-202.s202.tnt8.ann.va.dialup.rcn.com> <wkd821o8f1.fsf@ifi.uio.no> Message-ID: <14070.32485.514325.793772@lindm.dm> Lars Marius Garshol writes: > | [Unicode] Now I have no idea what to do; switch to using Fredrik's > | code and adapt PyExpat to it, stick with Martin's module, or what? > > I'm in a similar quandary. I'd very much like to add Unicode support > to xmlproc, but to do that I need support in RE. Any feedback on what > Guido thinks and what the others here think would be welcome. I have started to extend "pcre" for wide character string handling (strings may consist (uniformly) of either 1, 2 or 4 byte units; thus this is a fixed width (UCS-2) rather than multibyte (UTF-8) approach). Currently, I require that all RE metacharacters are still ASCII; e.g. in "{n,m}" I would only recognize ASCII digits but not ARABIC-INDIC digits. There is, of course, no restriction with respect to the characters that match themselves.
Things like canonical mapping and canonical writing direction are not handled (it is more wide-character support than Unicode support for PCRE). Of course, I will announce the module as soon as it reaches alpha. I am, however, making only slow progress. - Dieter From mike.olson@fourthought.com Mon Mar 22 20:55:30 1999 From: mike.olson@fourthought.com (Mike Olson) Date: Mon, 22 Mar 1999 14:55:30 -0600 Subject: [XML-SIG] 4XSL 0.6.0 References: <199903221329.GAA02083@malatesta.local> Message-ID: <36F6AE42.E118104B@fourthought.com> This is a cryptographically signed message in MIME format. --------------ms8F670C7D5D453A4D3CDFAA90 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Just wanted to let everyone know that there was a typo in the URL in the previous posting. The other URL should work but some people seem to be having problems with it. This URL does work; however, I needed to try downloading three times, as starship seems to be having a cranky day. ftp://starship.python.net/pub/crew/uche/4XSL/ Later -- Mike Olson Member Consultant FourThought LLC http://www.fourthought.com http://opentechnology.org --- "No program is interesting in itself to a programmer. It's only interesting as long as there are new challenges and new ideas coming up."
--- Linus Torvalds
--------------ms8F670C7D5D453A4D3CDFAA90-- From krussll@cc.UManitoba.CA Mon Mar 22 22:37:00 1999 From: krussll@cc.UManitoba.CA (Kevin Russell) Date: Mon, 22 Mar 1999 16:37:00 -0600 Subject: [XML-SIG] Optimising/strategies for DOM/XSL lookup-parsing Message-ID: <36F6C60C.AF24B818@cc.umanitoba.ca> >> If this isn't possible, is it possible to 'save' a DOM tree to an external >> file and re-read it in once a relevant XSL query is ready to be acted upon? >You could try to [c]pickle the documents. [c]pickle is >a standard module that allows you to serialize (e.g. write/read) >complex python objects ("cpickle" is implemented in "C" >and much faster than "pickle"). However, objects must obey >some restrictions to be picklable. I am not sure, whether >DOM objects fulfill them, just try it. I actually tried this once (though only with pickle, not cpickle). There's a class reference somewhere near the top of the DOM tree that will prevent pickling. Just del it from the appropriate dictionary, and you're fine. BUT: unpickling the file took *considerably* longer than just reading in and parsing the XML in the first place. Unless cpickle is a helluva lot faster, there won't be much saved time. -- Kevin From Fred L. Drake, Jr." <fdrake@acm.org Mon Mar 22 22:36:58 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Mon, 22 Mar 1999 17:36:58 -0500 (EST) Subject: [XML-SIG] Optimising/strategies for DOM/XSL lookup-parsing In-Reply-To: <36F6C60C.AF24B818@cc.umanitoba.ca> References: <36F6C60C.AF24B818@cc.umanitoba.ca> Message-ID: <14070.50698.129053.147352@weyr.cnri.reston.va.us> Kevin Russell writes: > BUT: unpickling the file took *considerably* longer than just > reading in and parsing the XML in the first place. Unless > cpickle is a helluva lot faster, there won't be much saved time. cPickle is a helluva lot faster. Some tests have reported as much as 1000 times faster. 
Using "binary" pickles should help a lot in either case, since the DOM data structures are heavily biased toward string data, which is a lot faster in the binary format. The best combo is: * cPickle if it's available (otherwise just use pickle), and * binary format, in either case. -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives From uche.ogbuji@fourthought.com Tue Mar 23 04:41:23 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 22 Mar 1999 21:41:23 -0700 Subject: [XML-SIG] New DOM code checked in In-Reply-To: Your message of "Mon, 22 Mar 1999 10:36:12 EST." <14070.24995.128078.6216@amarok.cnri.reston.va.us> Message-ID: <199903230441.VAA02661@malatesta.local> > Lars Marius Garshol writes: > >Shouldn't this wait for the next DOM WD in the hope that this part of > >the DOM will be fleshed out then? It seems pointless to me to > >implement this now only to have it obsoleted by the next draft. > > The next DOM WD is probably some time away, since the March 4 > draft doesn't mention namespaces at all and WDs come out fairly > slowly. There are technologies like RDF and XSL that pretty much > require namespaces *now*, so I think we can't sit on our hands here. Actually, last week's DOM Level 2 WD _does_ mention namespaces, but does no more. > >What about the factory I proposed here earlier? It would be nice to > >know whether that is on the list, and if so, where. > > Yes, it's definitely on the list, but I haven't yet had time > to look at 4DOM's interface. Anyone want to propose how this should > look for PyDOM and save me the trouble? Here is the IDL for 4DOM's NodeFactory. You'll probably only need part of it, though: we enforce very strict factory usage because the node objects _could_ be remote. 
#pragma prefix "fourthought.com" #include "../../DOM.idl" #include "../../HTML/HTML.idl" module NodeFactoryIF { typedef sequence<DOMIF::Node> listofnodes; interface NodeFactory { //The user should only call these four methods HTMLIF::HTMLDocument createHTMLDocument(); DOMIF::Document createDocument(); HTMLIF::HTMLElement createHTMLElement(in HTMLIF::HTMLDocument parent,in string tag); void releaseNode(in DOMIF::Node node); //Non public interface: user shouldn't call these //All require the ownerDocument, but when called from //Document.py, this is provided for the user DOMIF::DOMImplementation createDOMImplementation(in string feature, in string version); DOMIF::NodeList createNodeList(in listofnodes nodes); DOMIF::NamedNodeMap createNamedNodeMap(); DOMIF::Element createElement(in DOMIF::Document ownerDocument, in string tagName); DOMIF::DocumentFragment createDocumentFragment(in DOMIF::Document ownerDocument); DOMIF::DocumentType createDocumentType(in DOMIF::Document ownerDocument, in string name, in DOMIF::NamedNodeMap entities, in DOMIF::NamedNodeMap notations); DOMIF::Text createTextNode(in DOMIF::Document ownerDocument, in string data); DOMIF::Comment createComment(in DOMIF::Document ownerDocument, in string data); DOMIF::CDATASection createCDATASection(in DOMIF::Document ownerDocument, in string data); DOMIF::ProcessingInstruction createProcessingInstruction(in DOMIF::Document ownerDocument, in string target, in string data); DOMIF::Attr createAttribute(in DOMIF::Document ownerDocument, in string name); DOMIF::Entity createEntity(in DOMIF::Document ownerDocument, in string publicId, in string systemId, in string notationName); DOMIF::EntityReference createEntityReference(in DOMIF::Document ownerDocument,in string name); DOMIF::Notation createNotation(in DOMIF::Document ownerDocument, in string publicId, in string systemId, in string name); DOMIF::NodeIterator createNodeIterator(in DOMIF::Node start_node); DOMIF::NodeIterator createSelectiveNodeIterator(in 
DOMIF::Node start_node, in unsigned short what_to_show); DOMIF::NodeIterator createFilteredNodeIterator(in DOMIF::Node start_node, in DOMIF::NodeFilter filter); DOMIF::NodeIterator createSelectiveFilteredNodeIterator(in DOMIF::Node start_node, in unsigned short what_to_show, in DOMIF::NodeFilter filter); HTMLIF::HTMLCollection createHTMLCollection(in listofnodes nodes); }; }; -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Tue Mar 23 04:46:04 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 22 Mar 1999 21:46:04 -0700 Subject: [XML-SIG] Uche's XML article at LinuxWorld In-Reply-To: Your message of "Fri, 19 Mar 1999 17:48:30 PST." <36F2FE6E.461550D2@lyra.org> Message-ID: <199903230446.VAA02675@malatesta.local> > If you haven't seen it yet, our own Uche has written an extensive > article for LinuxWorld. It's about XML standards and the future world > and present state of XML on Linux. It's quite a good article. And yes... > he mentions DAV's use of XML and my mod_dav module (thanx Uche! :-) > > The URL is: > http://www.linuxworld.com/linuxworld/lw-1999-03/lw-03-xml.html Thanks for the plug, Greg. Minor disclaimer: I wrote that article in December (even Web zines have lead times, it appears), and that's something like a full Internet year, so some info might be a bit out of date. And lo! I went into my time machine to return the favor, I have an even stronger plug for your work on WebDAV in my PythonJournal column, which I submitted a while back (I'm hoping the new issue will be published soon because it too is slowly gathering rust). 
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From grove@infotek.no Tue Mar 23 07:46:56 1999 From: grove@infotek.no (Geir Ove Grønmo) Date: 22 Mar 1999 08:46:56 -2300 Subject: [XML-SIG] ANN: tmproc 0.10, a Topic Map implementation Message-ID: <GROVE-82iubswsun.fsf@pc-grove.infotek.no> Hello, I'm pleased to announce the first release of tmproc, a Topic Map processor. This release is meant to be a technology preview. Enjoy! Geir O. -------------------------------------------------------------------------- Title: tmproc Version: 0.10 Released: March 23rd 1999 Author: Geir O. Grønmo, grove@infotek.no Homepage: http://www.infotek.no/~grove/software/tmproc/index.html Requirements: - Python 1.5.1 or newer [1] - An SGML/XML parser with a SAX driver - SAX for Python [2] - xmlarch 0.25, optional unless architectural processing is needed [3] - -- >>> What is tmproc? tmproc is an implementation of the new international standard ISO/IEC 13250 Topic Maps[4]. tmproc is written in Python, and it should work on any platform to which Python has been ported[2]. tmproc is a set of classes that represent a framework for doing topic map processing in Python. The current release includes the following set of classes: o classes for representing topic map objects like TopicMap, Topic, TopicName, Occurrence, Locator, Association, AssociationRole, Facet and FacetValue. o a factory class for creating topic map objects. o a class for importing topic maps, TMImporter. It listens to SAX events and uses a factory class and interfaces to build a Topic Map. o an export class, TMExporter, that emits SAX events in the topic map interchange format so that any SAX document handler may be used for export. o statistical and information printing classes, TMUtils and TMStats. A command line utility is also included in the distribution.
The implementation is currently based on a draft released some time before the final ballot. Some deviations from the - soon to be released - final standard are expected. Currently only an in-memory implementation is available. A relational database implementation has also been written, but is not available in the distribution because it is a bit crude at the moment. Fortunately tmproc has been written in a way that makes it easy to do additional implementations. - -- >>> Some of the features are: o Import, export, query and manipulation of topic maps. o Full set of extensible topic map classes with clearly defined interfaces. Association, AssociationRole, Facet, FacetValue, Locator, Occurrence, Topic, TopicMap, TopicMapFactory and TopicName. o Access to data in topic map objects using getter and setter methods. o Get types including transitive types of topics, associations and facets. o Get objects [e.g. topics, associations and facets] that are of given types or more specific types. o Get objects [e.g. associations] that exist in a scope or in any of the scopes' subscopes. o Optional architectural processing [requires xmlarch]. o Introduction and reference documentation. Suggestions and bug reports should be sent to: grove@infotek.no - -- [1] http://www.python.org/ [2] http://www.stud.ifi.uio.no/~larsga/download/python/xml/saxlib.html [3] http://www.infotek.no/~grove/software/xmlarch/index.html [4] Final CD Text for ISO/IEC 13250, Topic Navigation Maps, http://www.ornl.gov/sgml/sc34/document/0008.htm <P><A HREF="http://www.infotek.no/~grove/software/tmproc/index.html">tmproc 0.10</A> - an implementation of the new international standard ISO/IEC 13250 Topic Maps.
(22-Mar-99) -- ================== Geir Ove Grønmo ================== | STEP Infotek as, Gjerdrumsvei 12, 0486 Oslo, Norway | | grove@infotek.no http://www.infotek.no/ | ------------------------------------------------------- From grove@infotek.no Tue Mar 23 07:56:47 1999 From: grove@infotek.no (Geir Ove Grønmo) Date: 22 Mar 1999 08:56:47 -2300 Subject: [XML-SIG] ANN: xmlarch 0.25, an XML architectural forms processor Message-ID: <GROVE-823e2wwse8.fsf@pc-grove.infotek.no> xmlarch: An XML architectural forms processor written in Python Version: 0.25 Released: March 23rd 1999 Author: Geir Ove Grønmo Email: grove@infotek.no Homepage: http://www.infotek.no/~grove/software/xmlarch/index.html --- What is xmlarch? The xmlarch module contains an XML architectural forms processor written in Python. It allows you to process XML architectural forms using any parser that uses the SAX interfaces. The module allows you to process several architectures in one parse-pass. Architectural document events for an architecture can even be broadcast to multiple DocumentHandlers. The main reason for releasing this version is to be able to support architectural processing with tmproc[1]. Topic Map processing relies heavily on the existence of the #GI mapping. What's new? - Added support for the new #GI mapping token. - Added a method called get_current_element_name() to the ArchDocHandler class, so that you can easily keep track of the original generic identifier. Fixes: - Bug related to the mapping between attributes and content. - Some minor ones. [1] http://www.infotek.no/~grove/software/tmproc/index.html --- Enjoy!
Geir Ove Grønmo -- ================== Geir Ove Grønmo ================== | STEP Infotek as, Gjerdrumsvei 12, 0486 Oslo, Norway | | grove@infotek.no http://www.infotek.no/ | ------------------------------------------------------- From co@daisybytes.su.uunet.de Tue Mar 23 18:26:38 1999 From: co@daisybytes.su.uunet.de (Carsten Oberscheid) Date: Tue, 23 Mar 1999 19:26:38 +0100 Subject: [XML-SIG] DOM Nodes: Identity test Message-ID: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> What is the "official" way to check if two DOM Node objects are the same? The PyDOM implementation, e.g. in the Element.get_parentNode() method, always returns a "fresh" Node object encapsulating the real data in a _nodeData object (right?), so when I do get_parentNode() twice on the same node, I get two different Node objects. Am I supposed to compare the _node members? I don't think I'm talking about an ID attribute problem here. .co. +------------------------------------------------------- daisy bytes! --------+ Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.pweb.de/daisybytes.su electronic publishing From spepping@scaprea.hobby.nl Mon Mar 22 09:06:46 1999 From: spepping@scaprea.hobby.nl (Simon Pepping) Date: Mon, 22 Mar 1999 10:06:46 +0100 Subject: [XML-SIG] Building a DOM tree Message-ID: <19990322100646.A684@scaprea.hobby.nl> Hello, I have made a study of some of the inner workings of the DOM implementation. One thing that strikes me is that building the DOM tree is rather slow. Others have remarked on this too. With that in mind it strikes me that the DOM building process builds many objects which are discarded almost immediately. The data structure of a DOM tree consists of a tree of _nodeData instances, which I call the backbone. These objects refer to each other by their children attribute. These _nodeData instances are never presented to the user.
Whenever the user approaches a node by one of the DOM methods, an instance of a subclass of Node (Document, Element, Text, Attribute) is created, whose _node attribute refers to the _nodeData instance. This object implements all DOM attributes and methods; for the data it refers to its backbone counterpart, the _nodeData instance. During construction of a DOM tree both objects are created for each node. The backbone remains in existence. During construction the Element instances of the current element and its ancestors remain in existence. Each Element instance is deleted at the close of the element in the XML text. At the end of the construction only the Document instance remains. It keeps the backbone alive by holding a reference to its top, i.e., its _node attribute. So I wonder: would it not be much faster to build the backbone directly, without the Node instances? I say this because I think building the DOM tree lasts really too long to be of practical use. For those interested, I put my notes on my home page: http://www.hobby.nl/~scaprea/XML/DOMnotes.txt. Regards, Simon Pepping -- Simon Pepping email: spepping@scaprea.hobby.nl From Fred L. Drake, Jr." <fdrake@acm.org Tue Mar 23 20:02:24 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Tue, 23 Mar 1999 15:02:24 -0500 (EST) Subject: [XML-SIG] Building a DOM tree In-Reply-To: <19990322100646.A684@scaprea.hobby.nl> References: <19990322100646.A684@scaprea.hobby.nl> Message-ID: <14071.62288.21003.237272@weyr.cnri.reston.va.us> Simon Pepping writes: > I say this because I think building the DOM tree lasts really too long > to be of practical use. I've been building DOM trees from ESIS events (generated from LaTeX, of course! ;). While I don't think the speed is a problem for my application, I think for most applications it would be. 
Anything with an interactive aspect needs to be able to respond a little more quickly; using threads can maintain interactivity of a UI, but can't get the results any faster. The latter is what is really needed in a lot of contexts, including many interesting ones (like an editor). -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives From akuchlin@cnri.reston.va.us Tue Mar 23 21:22:46 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Tue, 23 Mar 1999 16:22:46 -0500 (EST) Subject: [XML-SIG] Building a DOM tree In-Reply-To: <19990322100646.A684@scaprea.hobby.nl> References: <19990322100646.A684@scaprea.hobby.nl> Message-ID: <14072.34.559711.63456@amarok.cnri.reston.va.us> Simon Pepping writes: >The data structure of a DOM tree consists of a tree of _nodeData >instances, which I call the backbone. These objects refer to each >other by their children attribute. Correct. >These _nodeData instances are never presented to the user. Whenever >the user approaches a node by one of the DOM methods, an instance of a >subclass of Node (Document, Element, Text, Attribute) is created, Also correct. This convoluted structure is required in order to avoid creating cycles of references, because CPython's garbage collection can't handle cycles. (JPython is dependent on the Java VM for garbage collection, and I'd imagine that most Java VMs can correctly collect garbage cycles.) The problem is parent pointers: if you have a tree of Node instances, and each node holds a reference to its parent, you have to deliberately break the cycle before it can be garbage collected, so you'd have to call some .destroy() or .close() method on nodes before they go out of scope. Forget this, and your program would leak memory. Fewer cycles can be created if every node has a reference to the Document node, but that doesn't avoid the problem completely; you'd still have to explicitly destroy Document nodes.
If anyone can suggest a solution that avoids cycles and avoids having to remember to destroy things, please post! We can change the Builder class to deal with _nodeData instances directly, instead of creating proxy nodes and then adding them; that will help software that relies on Builder, but won't speed up software that builds DOM trees on its own. -- A.M. Kuchling http://starship.python.net/crew/amk/ A pig can learn more tricks than a dog, but has too much sense to want to do it. -- Robertson Davies, _The Table Talk of Samuel Marchbanks_ From Fred L. Drake, Jr." <fdrake@acm.org Tue Mar 23 22:34:46 1999 From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake) Date: Tue, 23 Mar 1999 17:34:46 -0500 (EST) Subject: [XML-SIG] Building a DOM tree In-Reply-To: <14072.34.559711.63456@amarok.cnri.reston.va.us> References: <19990322100646.A684@scaprea.hobby.nl> <14072.34.559711.63456@amarok.cnri.reston.va.us> Message-ID: <14072.5894.589152.999922@weyr.cnri.reston.va.us> Andrew M. Kuchling writes: > We can change the Builder class to deal with _nodeData > instances directly, instead of creating proxy nodes and then adding > them; that will help software that relies on Builder, but won't speed > up software that builds DOM trees on its own. This should be sufficient for most uses, and I certainly wouldn't mind the speedup. ;-) -Fred -- Fred L. Drake, Jr. <fdrake@acm.org> Corporation for National Research Initiatives From akuchlin@cnri.reston.va.us Wed Mar 24 14:48:08 1999 From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling) Date: Wed, 24 Mar 1999 09:48:08 -0500 (EST) Subject: [XML-SIG] DOM Nodes: Identity test In-Reply-To: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> Message-ID: <14072.64080.409375.326302@amarok.cnri.reston.va.us> Carsten Oberscheid writes: >What is the "official" way to check if a two DOM Node objects are the same? > >The PyDOM implementation, e.g. 
in the Element.get_parentNode() method, always >returns a "fresh" Node object encapsulating the real data in a _nodeData object >(right?), so when I do get_parentNode() twice on the same node, I get two >different Node objects. Am I supposed to compare the _node members? I've been wondering about that myself. We don't want to overload __cmp__ to do this, because Python comparisons are usually done by value; to check if two objects are the same, you'd use the 'is' operator, but we can't overload that operator. It's probably best to have a method of Node objects that returns the value of 'self._node is other._node'. Anyone care to suggest a name? .equals() offers itself, but that may confuse Java users, because .equals() is by value and == is by identity in Java. -- A.M. Kuchling http://starship.python.net/crew/amk/ I like to believe it was only the cold that made me shiver, only a strand of fog in my throat that caused me to catch my breath. Robert walked away across the moor and I never saw him again. Since that time I have walked with less comfort in cities. -- From SANDMAN #51: "A Tale of Two Cities" From paul@prescod.net Wed Mar 24 16:22:13 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 24 Mar 1999 10:22:13 -0600 Subject: [XML-SIG] DOM Nodes: Identity test References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> <14072.64080.409375.326302@amarok.cnri.reston.va.us> Message-ID: <36F91135.62EFB469@prescod.net> "Andrew M. Kuchling" wrote: > > I've been wondering about that myself. We don't want to > overload __cmp__ to do this, because Python comparisons are usually > done by value; We could agree that the only by-value check that makes sense is the identity check. This is the same as saying that the location of a node in the tree is part of its value. I note that the default "==" behaviour for new Python classes is to use the identity so there are probably dozens of Python classes in the standard distribution with this behaviour. >>> class A: ... pass ...
>>> A()==A() 0 -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html From co@daisybytes.su.uunet.de Thu Mar 25 10:58:26 1999 From: co@daisybytes.su.uunet.de (Carsten Oberscheid) Date: Thu, 25 Mar 1999 11:58:26 +0100 Subject: [XML-SIG] Building a DOM tree In-Reply-To: <14072.34.559711.63456@amarok.cnri.reston.va.us> References: <19990322100646.A684@scaprea.hobby.nl> <19990322100646.A684@scaprea.hobby.nl> Message-ID: <3.0.5.32.19990325115826.009509f0@kelly> At 16:22 23.03.99 -0500, Andrew M. Kuchling wrote: > Fewer cycles can be created if every node has a reference to >the Document node, but that doesn't avoid the problem completely; >you'd still have to explicitly destroy Document nodes. If anyone can >suggest a solution that avoids cycles and avoids having to remember to >destroy things, please post! <proposal appearance="UGLY"> Assuming that each Node object can be a member only of one single DOM tree, wouldn't it be possible to replace the _parent_relation member of the document element by one global _parent_relation dictionary on module level? xml.dom.core._parent_relation == { id(childNode): parentNode, ... } This would make the reference to the document node obsolete. get_parentNode() returns xml.dom.core._parent_relation[id(self)], insertChild() and removeChild() must take care of the global dictionary as they do with the document element's dictionary now. </proposal> Beauty aside, could this work or did I miss something? .co. ---------------------------------------------- daisy bytes! 
--------- Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.peb.de/daisybytes.su electronic publishing From MHammond@skippinet.com.au Thu Mar 25 11:23:29 1999 From: MHammond@skippinet.com.au (Mark Hammond) Date: Thu, 25 Mar 1999 22:23:29 +1100 Subject: [XML-SIG] New DOM code checked in In-Reply-To: <14070.24995.128078.6216@amarok.cnri.reston.va.us> Message-ID: <000601be76b1$e9364710$0801a8c0@bobcat> Andrew M. Kuchling writes on the xml-sig: > Lars Marius Garshol writes: > >I'm in a similar quandary. I'd very much like to add Unicode support > >to xmlproc, but to do that I need support in RE. Any feedback on what > >Guido thinks and what the others here think would be welcome. The string-sig did pick up on this a while ago. There has been a long, protracted discussion there, and Guido has chimed in now and then - even supporting a specific proposal for implementation. The basis of this is that there will be a standard Unicode type which stores UCS-2 characters. Python will also define stream based decoders/encoders to go to and from the new type - with one built-in encoder - UTF-8. As mentioned, Fredrik has begun implementation of this type, but the encoders haven't really been tackled. This UCS-2 decision would seem to make Dieter's pcre work quite relevant too... I've been trying to drive this on the string-sig, primarily for the Windows CE port. The Windows CE platform is Unicode based, and coming up with reasonable patches to the Python core really depends on integrated Unicode support. So I intend attempting to take the first steps towards Unicode integration on Windows CE. I'm trying to get agreement on concepts on the string-sig, and hopefully a decent CE environment would help towards acceptance for 1.6. The whole point of this is really to recommend that new Unicode experiments move towards Fredrik's implementation.
If you need better encoders, then we could move towards the string-sig proposal (possibly using Martin's work as a base), as it has in-principle support from our benevolent dictator ;-)

Mark.

From akuchlin@cnri.reston.va.us Thu Mar 25 14:31:16 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Thu, 25 Mar 1999 09:31:16 -0500 (EST)
Subject: [XML-SIG] Building a DOM tree
In-Reply-To: <3.0.5.32.19990325115826.009509f0@kelly>
References: <19990322100646.A684@scaprea.hobby.nl> <14072.34.559711.63456@amarok.cnri.reston.va.us> <3.0.5.32.19990325115826.009509f0@kelly>
Message-ID: <14074.17922.614618.229455@amarok.cnri.reston.va.us>

Carsten Oberscheid writes:
>At 16:22 23.03.99 -0500, Andrew M. Kuchling wrote:
>Assuming that each Node object can be a member only of one single DOM tree,
>wouldn't it be possible to replace the _parent_relation member of the
>document element by one global _parent_relation dictionary on module level?
>
> xml.dom.core._parent_relation == { id(childNode): parentNode, ... }

Hmm... hmmm... no, I can't think of any reason that wouldn't work. Nodes can only have a single parent, and you can't mix nodes from two different document trees (unless you're Fred Drake), so key collisions aren't possible. That would mean there's a single dictionary with lots of keys, testing Python's dictionary code a bit more, but dictionaries are supposed to handle that sort of thing, so it shouldn't cause any problems. Shouldn't cause any problems for threading, either. Hmmm...

--
A.M. Kuchling http://starship.python.net/crew/amk/
We of Faerie are of the wild magic. We are not creatures of spells and grimoires. We *are* spells, and we are written of in grimoires.
-- From SANDMAN #52: "Cluracan's Tale"

From Jeff.Johnson@icn.siemens.com Thu Mar 25 16:07:23 1999
From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com)
Date: Thu, 25 Mar 1999 11:07:23 -0500
Subject: [XML-SIG] Building a DOM tree
Message-ID: <8525673F.005890F9.00@li01.lm.ssc.siemens.com>

[Carsten Oberscheid]
>>Assuming that each Node object can be a member only of one single DOM tree,
>>wouldn't it be possible to replace the _parent_relation member of the
>>document element by one global _parent_relation dictionary on module level?
>>
>> xml.dom.core._parent_relation == { id(childNode): parentNode, ... }
>>
>>This would make the reference to the document node obsolete.
>>get_parentNode() returns xml.dom.core._parent_relation[id(self)],
>>insertChild() and removeChild() must take care of the global dictionary as
>>they do with the document element's dictionary now.

[A.M. Kuchling]
> Hmm... hmmm... no, I can't think of any reason that wouldn't
>work. Nodes can only have a single parent, and you can't mix nodes
>from two different document trees (unless you're Fred Drake), so key
>collisions aren't possible. That would mean there's a single
>dictionary with lots of keys, testing Python's dictionary code a bit
>more, but dictionaries are supposed to handle that sort of thing, so
>it shouldn't cause any problems. Shouldn't cause any problems for
>threading, either. Hmmm...

I don't know that much about the inner working of Python so this may be a dumb question. How and when would the global dictionary be released? Is removeChild called for all nodes when I dereference a DOM document node? I call appendChild a lot but I usually don't call removeChild, I just throw away the whole tree. It seems to me that if I were to call the following code, I would run out of memory with the proposed dictionary.

def processAlotOfFiles():
    fr = FileReader()
    while 1:
        dom = fr.readFile('test.xml')

Did I miss something?
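
The concern raised here can be reproduced in a few lines. What follows is only a minimal sketch in present-day Python 3, not the xml.dom.core code under discussion: the Node class, its appendChild() method, and the module-level _parent_relation table are hypothetical stand-ins for the proposal, and the weakref module is used purely as a probe to observe whether the root object is freed.

```python
import gc
import weakref

# Hypothetical stand-in for the proposed module-level table.
# It maps id(child) -> parent and holds a strong reference to each parent.
_parent_relation = {}


class Node:
    """Toy node: children are stored on the node, the parent in the global table."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def appendChild(self, child):
        self.children.append(child)
        _parent_relation[id(child)] = self


def build_tree():
    root = Node("root")
    root.appendChild(Node("child"))
    return root


tree = build_tree()
probe = weakref.ref(tree)  # observes the root without keeping it alive

# Throw the whole tree away without calling removeChild, as in the loop above.
del tree
gc.collect()

root_survived = probe() is not None  # the global table still references the root
entries_left = len(_parent_relation)

# Only explicit cleanup of the table lets the tree go.
_parent_relation.clear()
gc.collect()
root_freed_after_cleanup = probe() is None
```

In this sketch the root stays alive after the application drops its last reference, because the table entry for the child still points at it; the memory is released only once the entry is removed explicitly.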

Also, I have gotten bitten several times by the fact that I can't move a node from one tree to another. I figured cloneNode() would allow it but it won't. Could we come up with a function to move a node (or copy it) from one tree to another? One simple example of why I need to do this is when I have to break up a large HTML file into smaller files. One big tree --> many small trees. I do it now by writing the HTML, HEAD and BODY tags as plain text to a file and inserting HtmlLineariser.linearise() between them.

Not the most elegant solution. While I'm complaining, is there a good reason that HtmlWriter closes the file passed to it? Because of that I have to build the HTML string in memory and write it to the file.

    def writeStack(self,stack,head,body,fileName):
        f = open(fileName,'w')
        f.write('<HTML>\n')
        util2.writeHtmlNode(head,f)
        # Should copy the body node to get the attributes of the original.
        #for a,v in self.getAttributes()
        # self.body
        # attributes
        f.write('<BODY>\n') # Take the easy way for now...
        for node in stack:
            util2.writeHtmlNode(node,f)
        f.write('</BODY>\n')
        f.write('</HTML>\n')
        f.close()

def write_html(document, stream=sys.stdout):
    "Given a DOM document, write the HTML to stream."
    w = HtmlWriter(stream)
    w.write(document)

def writeHtmlNode(node, stream=sys.stdout):
    """HtmlWriter closes the stream which is not always desirable, this
    won't but it is probably slower because it builds a big string."""
    l = HtmlLineariser()
    stream.write(l.linearise(node))

From akuchlin@cnri.reston.va.us Thu Mar 25 17:15:45 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Thu, 25 Mar 1999 12:15:45 -0500 (EST)
Subject: [XML-SIG] Building a DOM tree
In-Reply-To: <8525673F.005890F9.00@li01.lm.ssc.siemens.com>
References: <8525673F.005890F9.00@li01.lm.ssc.siemens.com>
Message-ID: <14074.26231.579500.755196@amarok.cnri.reston.va.us>

Jeff.Johnson@icn.siemens.com writes:
>I don't know that much about the inner working of Python so this may be a
>dumb question.
How and when would the global dictionary be released? Is
>removeChild called for all nodes when I dereference a DOM document node?

Ooh, good catch, Jeff. With the current CVS implementation, the dictionary is discarded when the document node gets collected, but cleaning out the global dictionary is harder. DocumentFragments and Documents would need a __del__ method that walked over their children and deleted the dictionary entries, slowing down destruction. Will the speedup from removing proxies compensate for it? More hmmm...

>Also, I have gotten bitten several times by the fact that I can't move a
>node from one tree to another. I figured cloneNode() would allow it but it

A good observation; the DOM Level 1 doesn't actually say whether the result of cloneNode() is associated with the same document or not. I'd suspect the answer is 'yes', but have written to the www-dom list asking about this.

Moving things between DOM document trees doesn't seem possible with the DOM Level 1 specification; all the functions to add and remove children from a node can raise a WRONG_DOCUMENT exception, which is described as:

    WRONG_DOCUMENT_ERR: Raised if newChild was created from a different document than the one that created this node.

DocumentFragments should be used for this, I think. Create lots of DocumentFragments, and build your various small trees underneath them. To output them, make a copy of the document fragment (to avoid it being destroyed) and set it as the root element.

# Clone the DocumentFragment
df2 = docfragment.cloneNode( deep = 1)

# Remove all the children of the root
for child in document.childNodes:
    document.removeChild( child )

# Add the cloned DocFrag to the root.
document.appendChild( df2 )

You could now call document.cloneNode(deep = 1) to get a new document tree containing just that tree, or print document.toxml(), and repeat the process with a new fragment. A bit clumsy, but there seems no way around it.

>Not the most elegant solution.
While I'm complaining, is there a good
>reason that HtmlWriter closes the file passed to it? Because of that I

Probably not; I've been annoyed at other functions which close files on you, so the .close() should be removed.

--
A.M. Kuchling http://starship.python.net/crew/amk/
To call such persons "humorists", a loose-fitting and ugly word, is to miss the nature of their dilemma and the dilemma of their nature. The little wheels of their invention are set in motion by the damp hand of melancholy.
-- James Thurber, "Preface to A Life", in _The Thurber Carnival_

From larsga@ifi.uio.no Thu Mar 25 17:24:47 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 25 Mar 1999 18:24:47 +0100
Subject: [XML-SIG] adr_parse update
Message-ID: <wkpv5xeb34.fsf@ifi.uio.no>

For various reasons I did an Opera bookmark to XBEL conversion today and had to update adr_parse to do it. It now handles adr files correctly even when parsing under Unix, and also has much better date handling.

--Lars M.

#!/usr/bin/env python
"""
Small utility to parse Opera bookmark files.

Written by Lars Marius Garshol
"""

import string,bookmark,time

# --- Constants

short_months={"Jan":"01","Feb":"02","Mar":"03","Apr":"04","May":"05",
              "Jun":"06","Jul":"07","Aug":"08","Sep":"09","Oct":"10",
              "Nov":"11","Dec":"12"}

# --- Parsing exception

class OperaParseException(Exception):
    pass

# --- Methods

def readfield(infile,fieldname):
    line=string.rstrip(infile.readline())
    pos=string.find(line,fieldname+"=")
    if pos==-1:
        raise OperaParseException("Field '%s' missing" % fieldname)

    return line[pos+len(fieldname)+1:]

def swallow_rest(infile):
    "Reads input until first blank line."
    while 1:
        line=infile.readline()
        if line=="" or line=="\n" or line=="\015\012": break

def parse_date(date):
    # CREATED=904923783 (Fri Sep 04 17:43:03 1998)
    # VISITED=0 (?)
    if date=="": return None

    lp=string.find(date,"(")
    rp=string.find(date,")")
    if lp==-1 or rp==-1:
        if string.find(date," ")!=-1:
            raise OperaParseException("Can't handle this date: %s" % `date`)

        t=time.localtime(string.atoi(date))
        return "%s%s%s" % (t[0],string.zfill(t[1],2),string.zfill(t[2],2))

    if date[lp:rp+1]=="(?)": return None

    month=short_months[date[lp+5:lp+8]]
    day=date[lp+9:lp+11]
    year=date[rp-4:rp]

    return "%s%s%s" % (year,month,day)

def parse_adr(filename):
    bms=bookmark.Bookmarks()

    infile=open(filename)
    version=infile.readline()

    while 1:
        line=infile.readline()
        if line=="": break
        line=string.rstrip(line)

        if line=="#FOLDER":
            print "FOLDER"
            name=readfield(infile,"NAME")
            created=parse_date(readfield(infile,"CREATED"))
            parse_date(readfield(infile,"VISITED")) # Just throw this away
            order=readfield(infile,"ORDER")
            swallow_rest(infile)
            bms.add_folder(name,created)
        elif line=="#URL":
            name=readfield(infile,"NAME")
            url=readfield(infile,"URL")
            created=parse_date(readfield(infile,"CREATED"))
            visited=parse_date(readfield(infile,"VISITED"))
            order=readfield(infile,"ORDER")
            swallow_rest(infile)
            bms.add_bookmark(name,created,visited,None,url)
        elif line=="-":
            bms.leave_folder()
        else:
            print `line`

    return bms

# --- Test-program

if __name__ == '__main__':
    import sys

    if len(sys.argv)<2 or len(sys.argv)>3:
        print
        print "A simple utility to convert Opera bookmarks to XBEL."
        print
        print "Usage: "
        print "  adr_parse.py <adr-file> [<xbel-file>]"
        sys.exit(1)

    bms=parse_adr(sys.argv[1])

    if len(sys.argv)==3:
        out=open(sys.argv[2],"w")
        bms.dump_xbel(out)
        out.close()
    else:
        bms.dump_xbel()

# Done

From akuchlin@cnri.reston.va.us Thu Mar 25 19:29:18 1999
From: akuchlin@cnri.reston.va.us (Andrew M.
Kuchling)
Date: Thu, 25 Mar 1999 14:29:18 -0500 (EST)
Subject: [XML-SIG] DOM Nodes: Identity test
In-Reply-To: <36F91135.62EFB469@prescod.net>
References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> <14072.64080.409375.326302@amarok.cnri.reston.va.us> <36F91135.62EFB469@prescod.net>
Message-ID: <14074.28689.599615.707562@amarok.cnri.reston.va.us>

Paul Prescod writes:
>We could agree that the only by-value check that makes sense is the
>identity check. This is the same as saying that the location of a node in
>the tree is part of its value.

I just remembered that we discussed this last year; see the thread starting at http://lists.w3.org/Archives/Public/www-dom/1998OctDec/0330.html . At the time, we concluded that no reasonable meaning for == could be chosen, and if a future DOM Level N draft specifies some different behaviour, we're in trouble, so it would be best to have a completely different method for such comparisons.

So, how about isSameNode()? node1.isSameNode(node2) returns true iff node1 and node2 refer to the same underlying node in the tree.

--
A.M. Kuchling http://starship.python.net/crew/amk/
Returned to his place, he sits quite still, pretending he doesn't exist, which is a harder game than you might imagine.
-- The Enigma kills yet more time, in ENIGMA #4: "And Then What?"

From Fred L. Drake, Jr." <fdrake@acm.org Thu Mar 25 19:46:23 1999
From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake)
Date: Thu, 25 Mar 1999 14:46:23 -0500 (EST)
Subject: [XML-SIG] DOM Nodes: Identity test
In-Reply-To: <14074.28689.599615.707562@amarok.cnri.reston.va.us>
References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> <14072.64080.409375.326302@amarok.cnri.reston.va.us> <36F91135.62EFB469@prescod.net> <14074.28689.599615.707562@amarok.cnri.reston.va.us>
Message-ID: <14074.37519.442067.101168@weyr.cnri.reston.va.us>

Andrew M. Kuchling writes:
 > So, how about isSameNode()?
node1.isSameNode(node2) returns
 > true iff node1 and node2 refer to the same underlying node in the

This would be sufficient.

-Fred

--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives

From spepping@scaprea.hobby.nl Thu Mar 25 20:01:10 1999
From: spepping@scaprea.hobby.nl (Simon Pepping)
Date: Thu, 25 Mar 1999 21:01:10 +0100
Subject: [XML-SIG] DOM Nodes: Identity test
In-Reply-To: <36F91135.62EFB469@prescod.net>; from Paul Prescod on Wed, Mar 24, 1999 at 10:22:13AM -0600
References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> <14072.64080.409375.326302@amarok.cnri.reston.va.us> <36F91135.62EFB469@prescod.net>
Message-ID: <19990325210110.A483@scaprea.hobby.nl>

On Wed, Mar 24, 1999 at 10:22:13AM -0600, Paul Prescod wrote:
> "Andrew M. Kuchling" wrote:
> >
> > I've been wondering about that myself. We don't want to
> > overload __cmp__ to do this, because Python comparisons are usually
> > done by value;
>
> We could agree that the only by-value check that makes sense is the
> identity check. This is the same as saying that the location of a node in
> the tree is part of its value.
>
> I note that the default "==" behaviour for new Python classes is to use
> the identity so there are probably dozens of Python classes in the
> standard distribution with this behaviour.
>
> >>> class A:
> ...     pass
> ...
> >>> A()==A()
> 0

The question is: which comparison is used to check dictionary keys? This is what the Python reference (2.1.6) says:

    A mapping object maps values of one type (the key type) to arbitrary objects. Mappings are mutable objects. There is currently only one standard mapping type, the dictionary. A dictionary's keys are almost arbitrary values. The only types of values not acceptable as keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity. Numeric types used for keys obey the normal rules for numeric comparison: if two numbers compare equal (e.g.
    1 and 1.0) then they can be used interchangeably to index the same dictionary entry.

That seems to endorse Paul's proposal.

--
Simon Pepping
email: spepping@scaprea.hobby.nl

From dieter@handshake.de Thu Mar 25 20:46:24 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Thu, 25 Mar 1999 20:46:24 +0000 (/etc/localtime)
Subject: [XML-SIG] Building a DOM tree
In-Reply-To: <14074.26231.579500.755196@amarok.cnri.reston.va.us>
References: <8525673F.005890F9.00@li01.lm.ssc.siemens.com> <14074.26231.579500.755196@amarok.cnri.reston.va.us>
Message-ID: <14074.40670.140802.349420@lindm.dm>

Andrew M. Kuchling writes:
 > Jeff.Johnson@icn.siemens.com writes:
 > > ....
 > A good observation; the DOM Level 1 doesn't actually say
 > whether the result of cloneNode() is associated with the same document
 > or not. I'd suspect the answer is 'yes', but have written to the
 > www-dom list asking about this.

Part 1.1.2 (Memory Management) of WD-DOM/level-one-core.html says:

    In the DOM Level 1, objects implementing some interface "X" are created by a "createX()" method on the Document interface; this is because all DOM objects live in the context of a specific Document.

Thus, the cloned nodes must belong to *SOME* document. Because there is no explicit parameter for a second document, they must belong to the document of the source nodes.

 > >Not the most elegant solution. While I'm complaining, is there a good
 > >reason that HtmlWriter closes the file passed to it? Because of that I

The most elegant way, probably, would be a visitor copying a subtree of a source tree into a destination tree.
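
Such a visitor might look roughly like the sketch below. It is written against the xml.dom.minidom module of today's Python standard library, not the xml.dom.core module discussed in this thread, and it is only an illustration: copy_subtree() is a hypothetical helper name, and it handles just element and text nodes. Every copied node is created through the destination document's own createX() factory methods, so the copy belongs to the destination document from the start.

```python
from xml.dom import Node, minidom


def copy_subtree(source, dest_doc):
    """Rebuild `source` inside `dest_doc` using dest_doc's factory methods.

    Only element and text nodes are handled in this sketch; any other
    node type is skipped and reported as None.
    """
    if source.nodeType == Node.TEXT_NODE:
        return dest_doc.createTextNode(source.data)
    if source.nodeType == Node.ELEMENT_NODE:
        copy = dest_doc.createElement(source.tagName)
        for name, value in source.attributes.items():
            copy.setAttribute(name, value)
        for child in source.childNodes:
            child_copy = copy_subtree(child, dest_doc)
            if child_copy is not None:
                copy.appendChild(child_copy)
        return copy
    return None


# One big tree --> one of many small trees.
big = minidom.parseString(
    '<html><body><p class="a">one</p><p>two</p></body></html>')
small = minidom.getDOMImplementation().createDocument(None, "body", None)

first_p = big.getElementsByTagName("p")[0]
small.documentElement.appendChild(copy_subtree(first_p, small))

result = small.documentElement.toxml()
```

The source tree is left untouched, which fits the use case of splitting one large HTML tree into several small ones. Present-day minidom also offers Document.importNode() for the same purpose; the explicit visitor is shown only because it mirrors the suggestion above.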

- Dieter

From dieter@handshake.de Thu Mar 25 20:20:35 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Thu, 25 Mar 1999 20:20:35 +0000 (/etc/localtime)
Subject: [XML-SIG] Building a DOM tree
In-Reply-To: <14074.17922.614618.229455@amarok.cnri.reston.va.us>
References: <3.0.5.32.19990325115826.009509f0@kelly> <14074.17922.614618.229455@amarok.cnri.reston.va.us>
Message-ID: <14074.38527.144937.11793@lindm.dm>

Andrew M. Kuchling writes:
 > Carsten Oberscheid writes:
 > >At 16:22 23.03.99 -0500, Andrew M. Kuchling wrote:
 > >Assuming that each Node object can be a member only of one single DOM tree,
 > >wouldn't it be possible to replace the _parent_relation member of the
 > >document element by one global _parent_relation dictionary on module level?
 > >
 > > xml.dom.core._parent_relation == { id(childNode): parentNode, ... }
 >
 > Hmm... hmmm... no, I can't think of any reason that wouldn't
 > work. Nodes can only have a single parent, and you can't mix nodes
 > from two different document trees (unless you're Fred Drake), so key
 > collisions aren't possible. That would mean there's a single
 > dictionary with lots of keys, testing Python's dictionary code a bit
 > more, but dictionaries are supposed to handle that sort of thing, so
 > it shouldn't cause any problems. Shouldn't cause any problems for
 > threading, either. Hmmm...

Unfortunately, it would not solve the primary problem: safe garbage collection of unused DOM nodes.

Suppose you remove the last (application) reference to a DOM tree. Then this DOM tree should be garbage collected. It is not, however, because the child "c" of the root has an association "id(c) : root" in the global parent_relation dictionary.

You still remember "WeakDict"s (URL:http://www.handshake.de/~dieter/pyprojects/weakdict.html)? They would remove problems with cycles and parent pointers.
However, in some rare cases, the upper context of a node might be lost prematurely (because parent and document owner references are not reference counted, a reference to an internal node does not protect its upper context).

- Dieter

From paul@prescod.net Thu Mar 25 21:46:14 1999
From: paul@prescod.net (Paul Prescod)
Date: Thu, 25 Mar 1999 15:46:14 -0600
Subject: [XML-SIG] DOM Nodes: Identity test
References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> <14072.64080.409375.326302@amarok.cnri.reston.va.us> <36F91135.62EFB469@prescod.net> <14074.28689.599615.707562@amarok.cnri.reston.va.us>
Message-ID: <36FAAEA6.9963CCBB@prescod.net>

"Andrew M. Kuchling" wrote:
> Note that the discussion last year started out on a different basis:
> Briefly, what should 'node1 == node2' do? In Python, object
> identity is tested using the 'is' operator, so 'node1 is node2'
> returns true iff node1 and node2 are actually the same object.

But that doesn't work, right? So this doesn't follow:

> 'node1 == node2' should therefore test for equal values of the node.

So you started down the path of trying to figure out value equality. And as you've pointed out there is no decent definition for it except maybe "nodes in the same place, in the same tree, with the same attributes and the same children" -- i.e. identical to identity.

> At
> the time, we concluded that no reasonable meaning for == could be
> chosen, and if a future DOM Level N draft specifies some different
> behaviour, we're in trouble, so it would be best to have a completely
> different method for such comparisons.

I don't see how the DOM could specify something for such a language-specific concept. The "==" syntax has a different behaviour in C++, Java and Python. It doesn't even exist in Scheme. Therefore we should expect to choose our own behaviour based on Python semantics.

Actually, isSameNode seems like something that they are more likely to specify.
After all, method calls are language independent but operators are not. So the danger of conflict there is higher, not lower.

> So, how about isSameNode()? node1.isSameNode(node2) returns
> true iff node1 and node2 refer to the same underlying node in the
> tree.

I could live with isSameNode but I prefer "==".

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From akuchlin@cnri.reston.va.us Fri Mar 26 14:43:41 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 26 Mar 1999 09:43:41 -0500 (EST)
Subject: [XML-SIG] DOM Nodes: Identity test
In-Reply-To: <36FAAEA6.9963CCBB@prescod.net>
References: <01BE7563.131CCFE0.co@daisybytes.su.uunet.de> <14072.64080.409375.326302@amarok.cnri.reston.va.us> <36F91135.62EFB469@prescod.net> <14074.28689.599615.707562@amarok.cnri.reston.va.us> <36FAAEA6.9963CCBB@prescod.net>
Message-ID: <14075.40143.904957.158341@amarok.cnri.reston.va.us>

Paul Prescod writes:
>I don't see how the DOM could specify something for such a
>language-specific concept. The "==" syntax has a different behaviour in
>C++, Java and Python. It doesn't even exist in Scheme. Therefore we should
>expect to choose our own behaviour based on Python semantics.

Good point, and one which GvR agreed with when I asked him about it. I bow to Paul and GvR, therefore; a __cmp__ method is now checked into the CVS tree.

--
A.M. Kuchling http://starship.python.net/crew/amk/
Andy and Flo live in the past, and when faced with something they don't like or understand, they do the sensible thing -- ignore it.
-- Reg Smythe

From co@daisybytes.su.uunet.de Fri Mar 26 14:31:40 1999
From: co@daisybytes.su.uunet.de (Carsten Oberscheid)
Date: Fri, 26 Mar 1999 15:31:40 +0100
Subject: [XML-SIG] xml.dom.core bugfix
Message-ID: <3.0.5.32.19990326153140.00969220@kelly>

Hi Andrew,

there was a little bug in xml.dom.core.replaceChild, in the "newChild == DOCUMENT_FRAGMENT" part. Hope the fix is ok...

Best regards

.co.

[Attachment: core.py.gz (application/octet-stream, base64-encoded); binary content not reproduced here]

----------------------------------------------
daisy bytes!
--------- Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.peb.de/daisybytes.su electronic publishing

--=====================_922455100==_--

From larsga@ifi.uio.no Fri Mar 26 15:50:46 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 26 Mar 1999 16:50:46 +0100
Subject: [XML-SIG] SAX2 discussion
Message-ID: <wk1zicl06h.fsf@ifi.uio.no>

The design and functionality of SAX2 is currently being thrashed out on the xml-dev mailing list, and I'd like to get a parallel discussion started here so that we can be ready for a Python SAX2 release around the same time that the Java version is released.

Also, it is quite likely that people here will have suggestions and ideas that could influence the overall direction of SAX2, and so we had better start now, in good time before SAX2 is finalized.

Generally, what has been decided so far is that SAX2 should have:

 - an extended parser interface, offering registration of handlers by globally unique IDs, getting and setting of properties (again by ID) and also turning features on/off (also by ID)

 - a lexical handler for lexical events (comments, CDATA marked sections, event boundaries etc)

 - a DTD handler for DTD information (attribute declarations, entity declarations etc)

 - a namespace handler

The idea is that SAX2 is to get its own Java package and be an optional and 100% backward-compatible extension to SAX 1.0.

I'll try to start separate discussion threads here on the following subjects (in this order):

 - general ideas for the mapping and extra features in the Python version
 - fixes to the previous version
 - the extended parser interface
 - the list of features
 - the list of properties
 - the list of handlers
 - the lexical handler interface
 - the DTD handler interface
 - the namespace handler interface
 - what to do about filters

I'll be starting these threads as soon as I can get together starting points for the various topics.
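
As a rough illustration of what registration "by globally unique ID" could look like: the following is a sketch only, not the SAX2 interface under discussion. The class name IdKeyedParser, the exception name NotSupportedError and the example.org IDs are all made up for the example. The idea is simply that features and handlers each live in a table keyed by a URL-style identifier, and that an unknown ID is rejected.

```python
class NotSupportedError(Exception):
    """Raised for an ID this toy parser does not know (hypothetical name)."""


class IdKeyedParser:
    """Toy parser front end; it does no parsing, it only keeps the tables."""

    # Hypothetical IDs, chosen for the example only.
    KNOWN_FEATURES = {"http://example.org/features/validation"}
    KNOWN_HANDLERS = {"http://example.org/handlers/lexical"}

    def __init__(self):
        self.features = {}   # feature ID -> bool
        self.handlers = {}   # handler ID -> handler object

    def set_feature(self, feature_id, state):
        # Reject IDs the parser has never heard of.
        if feature_id not in self.KNOWN_FEATURES:
            raise NotSupportedError(feature_id)
        self.features[feature_id] = bool(state)

    def set_handler(self, handler_id, handler):
        if handler_id not in self.KNOWN_HANDLERS:
            raise NotSupportedError(handler_id)
        self.handlers[handler_id] = handler
```

Because the IDs are plain strings, a parser-specific extension only needs a new ID rather than a new method on the interface, which is presumably the attraction of the scheme.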
Once some kind of agreement has been reached (and the Java version has been finalized) I'll put together saxlib 2.0 and a new driver package, both of which should probably be in beta for a while.

If anyone thinks this discussion will swamp the list completely and should be discussed elsewhere, or has other opinions about this procedure, please come forward with them now.

Also, when saxlib 1.0 was defined there was hardly any discussion at all (until too late), which was unfortunate since it meant that the interface was frozen already when most people started looking at it. So _please_ don't let that happen again. Please at least take some time to ensure that the design at least makes a little bit of sense to you.

--Lars M. (off to the pub :)

From akuchlin@cnri.reston.va.us Fri Mar 26 15:51:20 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 26 Mar 1999 10:51:20 -0500 (EST)
Subject: [XML-SIG] Re: xml.dom.core bugfix
In-Reply-To: <3.0.5.32.19990326153140.00969220@kelly>
References: <3.0.5.32.19990326153140.00969220@kelly>
Message-ID: <14075.41494.524456.347195@amarok.cnri.reston.va.us>

Carsten Oberscheid writes:
>there was a little bug in xml.dom.core.replaceChild, in the
>"newChild == DOCUMENT_FRAGMENT" part. Hope the fix is ok...

Thanks for the changes. Could you show me the testcase that demonstrated why this is broken? Your patch is:

*** core.py Mon Mar 22 09:15:04 1999
--- /tmp/core.py Fri Mar 26 10:01:44 1999
***************
*** 496,501 ****
              if newChild._node.type == DOCUMENT_FRAGMENT_NODE:
                  L[i:i+1] = newChild._node.children
!                 for child in newchild._node.children:
!                     self._set_parentdict(id(newChild._node), self._node)
                  newChild._node.children = []
              else:
--- 495,501 ----
              if newChild._node.type == DOCUMENT_FRAGMENT_NODE:
                  L[i:i+1] = newChild._node.children
!                 for child in newChild._node.children:
!                     self._node.children.append(child)
!                     self._set_parentdict(id(child), self._node)
                  newChild._node.children = []
              else:

The change to the _set_parentdict() call is correct; that's definitely a bug. But I don't see why the self._node.children.append() call is required, because the L[i:i+1] should be adding the new children to the node; L has been set to self._node.children just above the patch location.

--
A.M. Kuchling http://starship.python.net/crew/amk/
Consumers are like roaches -- you spray them and they get immune after a while.
    -- David Lubars

From co@daisybytes.su.uunet.de Fri Mar 26 16:08:24 1999
From: co@daisybytes.su.uunet.de (Carsten Oberscheid)
Date: Fri, 26 Mar 1999 17:08:24 +0100
Subject: [XML-SIG] Re: xml.dom.core bugfix
In-Reply-To: <14075.41494.524456.347195@amarok.cnri.reston.va.us>
References: <3.0.5.32.19990326153140.00969220@kelly> <3.0.5.32.19990326153140.00969220@kelly>
Message-ID: <3.0.5.32.19990326170824.00963920@kelly>

[me]
>>there was a little bug in xml.dom.core.replaceChild, in the
>>"newChild == DOCUMENT_FRAGMENT" part. Hope the fix is ok...

Oops, sorry, the original mail wasn't intended to go to the list... changed to another mail program, don't have it under control yet :-(

[AMK]
>
> Thanks for the changes. Could you show me the testcase that
>demonstrated why this is broken? Your patch is:

I'll try to isolate it, the actual code won't be too helpful at the moment.

>
>The change to the _set_parentdict() call is correct; that's definitely
>a bug. But I don't see why the self._node.children.append() call is
>required, because the L[i:i+1] should be adding the new children to
>the node; L has been set to self._node.children just above the patch
>location.

Yes, you are completely right, I just found this out myself. I was too fast again, next time I'll look closer and test better BEFORE I post.

There's-always-room-for-improvement-but-does-it-have-to-be-that-much-ly yours

.co.

---------------------------------------------- daisy bytes!
--------- Carsten Oberscheid co@daisybytes.su.uunet.de digital document processing http://www.peb.de/daisybytes.su electronic publishing From uche.ogbuji@fourthought.com Fri Mar 26 19:03:39 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Fri, 26 Mar 1999 12:03:39 -0700 Subject: [XML-SIG] SAX2 discussion In-Reply-To: Your message of "26 Mar 1999 16:50:46 +0100." <wk1zicl06h.fsf@ifi.uio.no> Message-ID: <199903261903.MAA07222@malatesta.local> Lars Marius Garshol <larsga@ifi.uio.no>: > If anyone thinks this discussion will swamp the list completely and > should be discussed elsewhere, or has other opinions about this > procedure, please come forward with them now. I definitely think this discussion applies to this list. I've been lagging behind on the ever-burgeoning xml-dev (last I read, the LexicalHandler interface had been described), but I intend to catch up today, so I can contribute better, here and in xml-dev. > Also, when saxlib 1.0 was defined there was hardly any discussion at > all (until too late), which was unfortunate since it meant that the > interface was frozen already when most people started looking at it. > So _please_ don't let that happen again. Please at least take some > time to ensure that the design at least makes a little bit of sense to > you. I wasn't around back then, but of course it's clear that even without any feedback you did a great job. I think the list has grown, as has XML, and I expect you'll get more discussion this time. I, for one, shall be sure to comment. > --Lars M. (off to the pub :) Could yer brings me back a pint o' bitter? 
--
Uche Ogbuji FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From larsga@ifi.uio.no Sat Mar 27 18:18:59 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 27 Mar 1999 20:18:59 +0200
Subject: [XML-SIG] SAX2 discussion
In-Reply-To: <199903261903.MAA07222@malatesta.local>
References: <199903261903.MAA07222@malatesta.local>
Message-ID: <wkiubm7q3w.fsf@ifi.uio.no>

* uche ogbuji
|
| I think the list has grown, as has XML, and I expect you'll get more
| discussion this time. I, for one, shall be sure to comment.

I'm glad to hear this, since these issues very likely do need some discussion to be thrashed out properly. But I think you're right that both the list and the number of users and their involvement have grown sufficiently that we might actually get much more discussion this time.

Let's start, then.

* Lars Marius Garshol
|
| --Lars M. (off to the pub :)

* uche ogbuji
|
| Could yer brings me back a pint o' bitter?

Ach, too late now that I'm back. If I could I would have sent you a virtual pint of Leffe Brun.

--Lars M.

From larsga@ifi.uio.no Sat Mar 27 18:20:46 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 27 Mar 1999 20:20:46 +0200
Subject: [XML-SIG] SAX2: General issues
Message-ID: <wkhfr67q0x.fsf@ifi.uio.no>

The basic philosophy of SAX (as David Megginson defines it) is to be as low-level as possible, to enable as many different kinds of services as possible to be built on top of SAX. If this means that SAX becomes a bit awkward to use directly, that's less important; the higher-level services are supposed to take care of that.

So far, we've followed this philosophy in the Python version as well, and I think we should continue that way.

Also, so far we've mimicked the Java interface almost 100%, even down to naming. I think the naming was a mistake (since we need a PySAX driver for use with JPython anyway), but that once we've made it we should stick with it.

I think we should try to stay as close to the Java interface as possible, to make PySAX easy to use in JPython and also to ensure that when a C/C++ version happens, PySAX can easily use it.

Last time, we added some extra things to the Python version, and I think we might well do so this time around as well. I think that whatever JavaSAX does we should have some kind of defined support for parser filters. Beyond that I haven't really got any ideas for really new features at the moment. Does anyone else have things they'd like to see added?

The last question is, which package should we place the new stuff in? xml.sax2? xml.sax?

--Lars M.

From larsga@ifi.uio.no Sat Mar 27 18:22:28 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 27 Mar 1999 20:22:28 +0200
Subject: [XML-SIG] SAX2: saxlib 1.0 fixes
Message-ID: <wkg16q7py3.fsf@ifi.uio.no>

 - We haven't defined what kind of argument setLocale should take, partly because no parsers actually used it at the time. However, now xmlproc supports localization and expat has what it needs to start doing so (but at the moment only supports English and seems to have no way of setting the locale (at least the version in the CVS tree)).

   We need to take the following into account here:

    - xmlproc: Uses case-insensitive ISO 639 language codes

    - Java locales: Uses a combination of lower-case ISO 639 language codes and upper-case ISO 3166 country codes

    - Python locale module: Uses the ANSI C library, which seems to be missing in my copy of K&R, and the module documentation doesn't explicitly say what the locale name is, but from the examples it appears to be either ISO 3166 or ISO 639 codes.

    - expat: How this will work, and if it ever will, is unclear.

   With this in mind I propose that setLocale should take a pair of case-insensitive ISO 639 and ISO 3166 codes. xmlproc will then ignore the country code, the JPython driver will do case conversion and pass a java.util.Locale object to JavaSAX and hopefully this will work with expat as well.

 - The InputSource interface has been left out of PySAX, and should be introduced now, as that solves a problem with EntityResolver and also since Python will be getting Unicode support soon. If we agree on this I'll make a draft and start a separate thread on this.

 - The PySAX EntityResolver returns a system identifier, whereas the Java version returns an InputSource. (Thanks to Paul Prescod for pointing out the problems with this.) For consistency, and also to enable users to provide custom protocol support, I think PySAX should follow suit. Is anyone against this?

 - The get method on AttributeList is missing if AttributeList is to behave exactly as built-in dictionaries. Should be added. (Thanks to Neelakantan Krishnaswami for spotting this one.)

--Lars M.

From uche.ogbuji@fourthought.com Sun Mar 28 00:47:51 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sat, 27 Mar 1999 17:47:51 -0700
Subject: [XML-SIG] SAX2: General issues
In-Reply-To: Your message of "27 Mar 1999 20:20:46 +0200." <wkhfr67q0x.fsf@ifi.uio.no>
Message-ID: <199903280047.RAA09559@malatesta.local>

> The last question is, which package should we place the new stuff in?
> xml.sax2? xml.sax?

Well, I know that on xml-dev, there's a lot of talk about not stomping all over SAX 1.0, but IMO, once the drivers are ported, there are not likely to be a lot of people depending on SAX 1.0, and even for those who don't want to break things by changing, they can always just stick to the older XML packages.

In other words, I think we should use xml.sax even for SAX2.
--
Uche Ogbuji FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From larsga@ifi.uio.no Sun Mar 28 14:30:20 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 28 Mar 1999 16:30:20 +0200
Subject: [XML-SIG] SAX2: Extended parser interface
Message-ID: <wk90chvg8z.fsf@ifi.uio.no>

This is the proposed extended parser interface. It uses globally unique IDs for features, handlers and properties, based on URLs (just like namespaces do). The IDs will be dealt with separately.

class Parser2(Parser):

    def setFeature(self, featureID, state):
        """This turns on or off (depending on whether state is true or
        false) support for a particular feature (like namespaces,
        validation etc). The parser can raise SAXNotSupportedException
        if it doesn't support the feature, or its subclass
        SAXUnrecognizedException."""

    def setHandler(self, handlerID, handler):
        """This registers an event handler with the parser
        (LexicalHandler, NamespaceHandler or maybe some special
        parser-defined handler). The parser can raise
        SAXNotSupportedException if it doesn't support the handler, or
        its subclass SAXUnrecognizedException."""

    def set(self, propertyID, value):
        """This sets the value of a parser property (such as the
        namespace separator string or something parser-defined.) The
        parser can raise SAXNotSupportedException if it doesn't support
        the property, or its subclass SAXUnrecognizedException."""

    def get(self, propertyID):
        """This returns the value of a property. The parser can raise
        SAXNotSupportedException if it doesn't support the property, or
        its subclass SAXUnrecognizedException."""

From larsga@ifi.uio.no Sun Mar 28 14:32:18 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 28 Mar 1999 16:32:18 +0200
Subject: [XML-SIG] SAX2: Parser features
Message-ID: <wk7ls1vg5p.fsf@ifi.uio.no>

The list below is copied directly from David Megginson's latest proposal. I think we should just include these. Does anyone have additions they'd like to see, or arguments against any of these?

Note that all features are optional.

http://xml.org/sax/features/validation
    Validate (true) or don't validate (false).

http://xml.org/sax/features/external-general-entities
    Expand external general entities (true) or don't expand (false).

http://xml.org/sax/features/external-parameter-entities
    Expand external parameter entities including the external DTD subset (true) or don't expand (false).

http://xml.org/sax/features/namespaces
    Preprocess namespaces (true) or don't preprocess (false). See also the http://xml.org/sax/properties/namespace-sep property.

http://xml.org/sax/features/normalize-text
    Ensure that all consecutive text is returned in a single callback to DocumentHandler.characters or DocumentHandler.ignorableWhitespace (true) or explicitly do not require it (false).

http://xml.org/sax/features/use-locator
    Provide a Locator using the DocumentHandler.setDocumentLocator callback (true), or explicitly do not provide one (false).

From gstein@lyra.org Sun Mar 28 14:46:23 1999
From: gstein@lyra.org (Greg Stein)
Date: Sun, 28 Mar 1999 06:46:23 -0800
Subject: [XML-SIG] quick speed test
Message-ID: <36FE40BF.50C03339@lyra.org>

This is a multi-part message in MIME format.
--------------3BCDD7BF6676916C1CF1ED3C
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hey gang,

I added some parsing in my DAV client for the server responses. My test script then started running horribly slow :-(

To do some quick performance testing, I whipped up the attached script. The "Parser" class in there is essentially a direct translation of the C code in mod_dav. It interfaces with Expat and handles xml:lang and namespace processing. Of course, Python has different/better data structures, so it is quite a bit simpler than the C equivalent.

My testing shows that the Parser class is about 12 times faster than going thru the DOM code. Some post-processing of the DOM adds another 50%.
The post-processing does the namespace handling (no xml:lang handling or handling of the reserved "xml" prefix). The post-process *does* do some data extraction which I haven't written for the Parser thing yet. I figure it would balance out to the Parser being about 15x the DOM version.

Regardless of the obscure details, the main point is that this script demonstrates a much faster mechanism for translating Expat output into a useful tree-based structure, while also performing namespace processing and miscellaneous XML conformance stuff. There is also a sample function for dumping the output tree.

To get this to run on your system, you may need to drop the "import davlib" from the top. It isn't really used. A couple other DAV remnants are in there, but hey. Exercise for the reader :-)

I'm posting this mostly as an example or aid, in that somebody may find it useful. It isn't intended to universally replace the DOM stuff.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

--------------3BCDD7BF6676916C1CF1ED3C
Content-Type: text/plain; charset=us-ascii; name="xmlperf.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="xmlperf.py"

#
# do performance tests on XML parsing variants
#

import xml.sax.saxexts
import xml.dom.sax_builder
import StringIO
import davlib
from xml.parsers import pyexpat
import string
import time

msr = '''\
<?xml version="1.0"?>
<multistatus xmlns="DAV:">
<response>
<href>/dav/foo.cgi</href>
<propstat>
<prop>
<creationdate>1999-03-16T20:06:16Z</creationdate>
<getcontentlength>17</getcontentlength>
<getlastmodified>Tue, 16 Mar 1999 20:06:16 GMT</getlastmodified>
<resourcetype/></prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<response>
<href>/dav/file1</href>
<propstat>
<prop>
<creationdate>1999-03-16T20:06:17Z</creationdate>
<getcontentlength>14</getcontentlength>
<getlastmodified>Tue, 16 Mar 1999 20:06:17 GMT</getlastmodified>
<resourcetype/></prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<response>
<href>/dav/testdata/</href>
<propstat>
<prop>
<creationdate>1999-03-16T20:06:18Z</creationdate>
<getlastmodified>Tue, 16 Mar 1999 20:06:18 GMT</getlastmodified>
<resourcetype><collection/></resourcetype>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<response>
<href>/dav/newdir/</href>
<propstat>
<prop>
<creationdate>1999-03-28T12:28:29Z</creationdate>
<getlastmodified>Sun, 28 Mar 1999 12:28:29 GMT</getlastmodified>
<resourcetype><collection/></resourcetype>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<response>
<href>/dav/foo/</href>
<propstat>
<prop>
<creationdate>1999-03-16T13:26:07Z</creationdate>
<getlastmodified>Tue, 16 Mar 1999 13:26:07 GMT</getlastmodified>
<resourcetype><collection/></resourcetype>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<response>
<href>/dav/</href>
<propstat>
<prop>
<creationdate>1999-03-28T12:28:29Z</creationdate>
<getlastmodified>Sun, 28 Mar 1999 12:28:29 GMT</getlastmodified>
<resourcetype><collection/></resourcetype>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
</multistatus>
'''

def use_parser():
    parser = xml.sax.saxexts.make_parser()
    handler = xml.dom.sax_builder.SaxBuilder()
    parser.setDocumentHandler(handler)
    parser.parseFile(StringIO.StringIO(msr))
    return handler.document
    return davlib.MultiStatusResponse(handler.document)

class blank:
    pass

DAV_NS_XML = -10

class Parser:
    def __init__(self):
        self.reset()

    def reset(self):
        self.doc = doc = blank()
        doc.root = None
        doc.namespaces = [ 'DAV:' ]
        self.cur_elem = None
        self.no_namespace_id = None
        self.error = None

    def find_prefix(self, prefix):
        elem = self.cur_elem
        while elem:
            if elem.ns_scope.has_key(prefix):
                return elem.ns_scope[prefix]
            elem = elem.parent

        if prefix == '':
            if self.no_namespace_id is None:
                self.no_namespace_id = len(self.doc.namespaces)
                self.doc.namespaces.append('')
            return self.no_namespace_id

        return -1

    def process_prefix(self, ob):
        idx = string.find(ob.name, ':')
        if idx == -1:
            ob.ns_id = self.find_prefix('')
        elif string.lower(ob.name[:3]) == 'xml':
            ob.ns_id = DAV_NS_XML    # name is reserved by XML
        else:
            ob.ns_id = self.find_prefix(ob.name[:idx])
            ob.name = ob.name[idx+1:]
            if ob.ns_id == -1:
                self.error = 'namespace prefix not found'
                return

    def start(self, name, attrs):
        if self.error:
            return

        elem = blank()
        elem.name = name
        elem.lang = None
        elem.parent = None
        elem.children = [ ]
        elem.ns_scope = { }
        elem.attrs = [ ]
        elem.first_cdata = ''
        elem.following_cdata = ''

        if self.cur_elem:
            elem.parent = self.cur_elem
            elem.parent.children.append(elem)
            self.cur_elem = elem
        else:
            self.cur_elem = self.doc.root = elem

        # scan for namespace declarations
        for i in range(0, len(attrs), 2):
            name = attrs[i]
            value = attrs[i+1]
            if name == 'xmlns' or name[:6] == 'xmlns:':
                if name == 'xmlns':
                    prefix = ''
                else:
                    prefix = name[6:]
                try:
                    id = self.doc.namespaces.index(value)
                except ValueError:
                    id = len(self.doc.namespaces)
                    self.doc.namespaces.append(value)
                elem.ns_scope[prefix] = id
            elif name == 'xml:lang':
                elem.lang = value
            else:
                attr = blank()
                attr.name = name
                attr.value = value
                elem.attrs.append(attr)

        # inherit xml:lang from parent
        if elem.lang is None and elem.parent:
            elem.lang = elem.parent.lang

        # process prefix of the element name
        self.process_prefix(elem)

        # process attributes' namespace prefixes
        map(self.process_prefix, elem.attrs)

    def end(self, name):
        if self.error:
            return
        parent = self.cur_elem.parent
        del self.cur_elem.ns_scope
        del self.cur_elem.parent
        self.cur_elem = parent

    def cdata(self, data):
        if self.error:
            return
        elem = self.cur_elem
        if elem.children:
            last = elem.children[-1]
            last.following_cdata = last.following_cdata + data
        else:
            elem.first_cdata = elem.first_cdata + data

    def parse(self, s):
        p = pyexpat.ParserCreate()
        p.StartElementHandler = self.start
        p.EndElementHandler = self.end
        p.CharacterDataHandler = self.cdata
        rv = p.Parse(s, 1)
        if rv == 0:
            raise 'expat parsing error'
        doc = self.doc
        self.reset()
        return doc

def use_expat():
    p = Parser()
    return p.parse(msr)

def dump(f, doc, elem=None, dump_ns=0):
    if elem is None:
        f.write('<?xml version="1.0"?>\n')
        dump(f, doc, doc.root, 1)
    else:
        if elem.ns_id == DAV_NS_XML:
            f.write('<' + elem.name)
        else:
            f.write('<ns%d:%s' % (elem.ns_id, elem.name))
        for attr in elem.attrs:
            if attr.ns_id == DAV_NS_XML:
                f.write(' %s="%s"' % (attr.name, attr.value))
            else:
                f.write(' ns%d:%s="%s"' % (attr.ns_id, attr.name, attr.value))
        if dump_ns:
            for i in range(len(doc.namespaces)):
                f.write(' xmlns:ns%d="%s"' % (i, doc.namespaces[i]))
        if elem.children or elem.first_cdata:
            f.write('>' + elem.first_cdata)
            for child in elem.children:
                dump(f, doc, child)
                f.write(child.following_cdata)
            if elem.ns_id == DAV_NS_XML:
                f.write('</%s>' % elem.name)
            else:
                f.write('</ns%d:%s>' % (elem.ns_id, elem.name))
        else:
            f.write('/>')

def timing(n1=10, n2=200):
    l1 = range(n1)
    l2 = range(n2)

    t = time.time()
    for i in l1:
        use_parser()
    t1 = time.time() - t
    print "time=%.4f each=%.4f" % (t1, t1/n1)

    t = time.time()
    for i in l2:
        use_expat()
    t2 = time.time() - t
    print "time=%.4f each=%.4f" % (t2, t2/n2)

--------------3BCDD7BF6676916C1CF1ED3C--

From mike.olson@fourthought.com Mon Mar 29 18:15:23 1999
From: mike.olson@fourthought.com (Mike Olson)
Date: Mon, 29 Mar 1999 12:15:23 -0600
Subject: [XML-SIG] Building a DOM tree
References: <3.0.5.32.19990325115826.009509f0@kelly> <14074.17922.614618.229455@amarok.cnri.reston.va.us> <14074.38527.144937.11793@lindm.dm>
Message-ID: <36FFC33B.25A97BCE@fourthought.com>

This is a cryptographically signed message in MIME format.
--------------msCCE8F064DFEA8A9D1DB15A01
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

4DOM went through many iterations of how to handle the garbage collection as well. We tried the global dictionary idea. In the end you needed to call a function regularly to handle trees that were just left. Then I tried a hack of Python itself that added a new core object called a DOM instance that acted like an instance.
This was smart enough to know that its reference count included all external reference counts, plus the number of children it had. This worked well, but required the user to patch Python, and in general it slowed down Python by 10-15% (Py_DECREF now had to check to see if the object was a DOM Instance when checking references).

We tried using SWIG and storing the reference to a child's parent as a CPointer, but we ran into the problem of comparison, and of having to reconstruct nodes when needed. We also had a proxy implementation that ran into similar problems.

We came to the general conclusion that we cannot handle every case elegantly, so we went back to the simple "ReleaseNode" function. This will recursively "chop" a tree so all nodes can be collected on their own. It does require the user to call this function, but we figured it is better to have the user know what is going on, and when, than to have parent nodes suddenly disappearing. It also allows us to keep a reference to a node's parent so 2 calls to getParentNode will return the same node.

Mike

Dieter Maurer wrote:

> Andrew M. Kuchling writes:
> > Carsten Oberscheid writes:
> > >At 16:22 23.03.99 -0500, Andrew M. Kuchling wrote:
> > >Assuming that each Node object can be a member only of one single DOM tree,
> > >wouldn't it be possible to replace the _parent_relation member of the
> > >document element by one global _parent_relation dictionary on module level?
> > >
> > > xml.dom.core._parent_relation == { id(childNode): parentNode, ... }
> >
> > Hmm... hmmm... no, I can't think of any reason that wouldn't
> > work. Nodes can only have a single parent, and you can't mix nodes
> > from two different document trees (unless you're Fred Drake), so key
> > collisions aren't possible. That would mean there's a single
> > dictionary with lots of keys, testing Python's dictionary code a bit
> > more, but dictionaries are supposed to handle that sort of thing, so
> > it shouldn't cause any problems. Shouldn't cause any problems for
> > threading, either. Hmmm...
> Unfortunately, it would not solve the primary problem: safe
> garbage collection of unused DOM nodes.
>
> Suppose, you remove the last (application) reference to a
> DOM tree. Then, this DOM tree should be garbage collected.
> It is not, however, because the child "c" of the root
> has an association "id(c) : root" in the global parent_relation
> dictionary.
>
> You still remember "WeakDict"s
> (URL:http://www.handshake.de/~dieter/pyprojects/weakdict.html)?
> They would remove problems with cycles and parent pointers.
> However, in some rare cases, the upper context of a node
> might be lost prematurely (because parent and document owner references
> are not reference counted, a reference to an internal node
> does not protect its upper context).
>
> - Dieter
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig

--
Mike Olson
Member Consultant
FourThought LLC
http://www.fourthought.com
http://opentechnology.org
---
"No program is interesting in itself to a programmer. It's only interesting as long as there are new challenges and new ideas coming up."
--- Linus Torvalds
--------------msCCE8F064DFEA8A9D1DB15A01--
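A minimal sketch of the garbage-collection point quoted in the message above, under stated assumptions: the Node class and the build_tree helper are hypothetical, and the standard library's weakref.WeakValueDictionary (which postdates this thread) stands in for the WeakDict package mentioned in the quote, whose actual interface may differ. It shows a global "id(c) : root" table keeping an unreferenced tree alive, a weak-valued table letting it go, and the caveat that a reference to an inner node alone does not protect its upper context.

```python
import gc
import weakref


class Node:
    """Hypothetical DOM-like node: children only, no parent pointer."""

    def __init__(self, name):
        self.name = name
        self.children = []


def build_tree(parent_relation):
    """Build root -> c and record the parent of c under the key id(c)."""
    root = Node("root")
    child = Node("c")
    root.children.append(child)
    parent_relation[id(child)] = root
    return root


# Case 1: an ordinary dict holds a strong reference to the root.
strong_parents = {}
root = build_tree(strong_parents)
probe = weakref.ref(root)   # observes the root without keeping it alive
del root                    # drop the last application reference
gc.collect()
strong_kept_alive = probe() is not None

# Case 2: a weak-valued dict does not keep the root alive.
weak_parents = weakref.WeakValueDictionary()
root = build_tree(weak_parents)
probe = weakref.ref(root)
del root
gc.collect()
weak_kept_alive = probe() is not None
weak_entries_left = len(weak_parents)

# Case 3: the caveat. Holding only an inner node does not protect
# its parent, so the upper context is lost.
weak_parents = weakref.WeakValueDictionary()
root = build_tree(weak_parents)
inner = root.children[0]
del root
gc.collect()
parent_of_inner = weak_parents.get(id(inner))

print(strong_kept_alive, weak_kept_alive, weak_entries_left, parent_of_inner)
```

With the plain dict, the tree survives even though the application no longer references it. With the weak-valued table, the entry vanishes together with the root, which also means the lookup in the third case finds no parent for the inner node.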