From uche.ogbuji@fourthought.com Sun Dec 5 02:22:01 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sat, 04 Dec 1999 19:22:01 -0700
Subject: [XML-SIG] (no subject)
Message-ID: <199912050222.TAA20179@localhost.localdomain>

I have made an RPM of version 0.25 of Geir O. Grønmo's GPS (Groves and
Property Sets for Python).

You can get it at our Python/RPM site:

ftp://FourThought.com/pub/mirrors/python4linux/redhat/

The GPS home page is at

Homepage: http://www.infotek.no/~grove/software/gps/index.html

Great package, Geir! Thanks.

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From aa8vb@yahoo.com Wed Dec 8 22:07:39 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Wed, 8 Dec 1999 17:07:39 -0500
Subject: [XML-SIG] Building xmldist
Message-ID: <19991208170739.A463904@vislab.epa.gov>

Per instructions, I grabbed:

    http://starship.python.net/crew/da/xmldists/xml_08_12_99.tar.gz

Then:

    cd xml/extensions     (presumed; Makefile.pre.in is here)
    make -f Makefile.pre.in boot
    make

Finally:

    make install
    ERROR>> gmake: *** No rule to make target `install'. Stop.

libinstall looked like the only likely makefile target, so:

    make libinstall
    ...
    Creating directory ...
    Creating directory ...
    for i in `find *.py arch dom marshal ... -name '*.py' -print` ; do \
    ... fi; \
    done
    Cannot stat *.py No such file or directory
    Cannot stat arch No such file or directory
    Cannot stat dom No such file or directory
    Cannot stat marshal No such file or directory
    ...
    gmake: *** [libinstall] Error 2

Any suggestions?
Randall

From aa8vb@yahoo.com Wed Dec 8 22:14:50 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Wed, 8 Dec 1999 17:14:50 -0500
Subject: [XML-SIG] Building xmldist
In-Reply-To: <19991208170739.A463904@vislab.epa.gov>
References: <19991208170739.A463904@vislab.epa.gov>
Message-ID: <19991208171450.A7614@vislab.epa.gov>

Randall Hopper:
|
| http://starship.python.net/crew/da/xmldists/xml_08_12_99.tar.gz
...
| ERROR>> gmake: *** No rule to make target `install'. Stop.

http://www.python.org/sigs/xml-sig/files/xml051pre1.zip

from back in March builds and installs cleanly. I'll give that a try
pending further advice.

--
Randall Hopper
aa8vb@yahoo.com

From fdrake@acm.org Wed Dec 8 22:23:13 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 8 Dec 1999 17:23:13 -0500 (EST)
Subject: [XML-SIG] Building xmldist
In-Reply-To: <19991208171450.A7614@vislab.epa.gov>
References: <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
Message-ID: <14414.55889.448668.339585@weyr.cnri.reston.va.us>

Randall Hopper writes:
 > from back in March builds and installs cleanly. I'll give that a try
 > pending further advice.

I work from the CVS directly, but that shouldn't make a difference.
The current approach is to run:

    python ./setup.py install

from the top of the tree; that seems to be fine.

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From aa8vb@yahoo.com Thu Dec 9 13:39:38 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Thu, 9 Dec 1999 08:39:38 -0500
Subject: [XML-SIG] Re: Building xmldist
In-Reply-To: <14414.55889.448668.339585@weyr.cnri.reston.va.us>
References: <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
 <14414.55889.448668.339585@weyr.cnri.reston.va.us>
Message-ID: <19991209083938.B9138@vislab.epa.gov>

Fred L. Drake, Jr.:
|Randall Hopper writes:
| > from back in March builds and installs cleanly. I'll give that a try
| > pending further advice.
|
| I work from the CVS directly, but that shouldn't make a difference.
|The current approach is to run:
|
| python ./setup.py install
|
|from the top of the tree; that seems to be fine.

Thanks, that did the trick. Yesterday's CVS is installed and working well
with a few of the demos.

Say, what's a good forum to pose general XML questions (state-of-XML; not
necessarily tied to Python XML)? The reason I ask is I picked up an XML
book to read on the plane last weekend (_Applied XML, A Toolkit for
Programmers_; Ceponkus, Hoodbhoy). It seems to be a good first read for my
intended XML uses (app-to-app data exchange). ...but as with most things
XML it was doubtless behind the times when it hit the shelves.

As I read through XML syntax, DTDs, the XML DOM API, etc. I'm wondering
things like, have DTDs been superseded by a new mechanism that supports
datatyping, is more flexible, and less rigid (XML Data and XML DCDs are
mentioned), etc.

Thanks,

Randall

--
Randall Hopper
aa8vb@yahoo.com

From akuchlin@mems-exchange.org Thu Dec 9 14:59:15 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 9 Dec 1999 09:59:15 -0500 (EST)
Subject: [XML-SIG] Re: Building xmldist
In-Reply-To: <19991209083938.B9138@vislab.epa.gov>
References: <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
 <14414.55889.448668.339585@weyr.cnri.reston.va.us>
 <19991209083938.B9138@vislab.epa.gov>
Message-ID: <14415.50115.39397.647788@amarok.cnri.reston.va.us>

Randall Hopper writes:
>Say, what's a good forum to pose general XML questions (state-of-XML; not
>necessarily tied to Python XML)? The reason I ask is I picked up an

Probably XML-DEV, especially now that the endlessly tedious SML threads
have gone away to another mailing list. comp.text.xml is another
alternative, though it gets a higher concentration of newbie questions
("how do I do X in XSL with IE5?").

--
A.M.
Kuchling http://starship.python.net/crew/amk/
If you don't hurry up and let life know what you want, life will damned
soon show you what you'll get.
    -- Robertson Davies, _Fifth Business_

From edd@usefulinc.com Thu Dec 9 14:08:32 1999
From: edd@usefulinc.com (Edd Dumbill)
Date: Thu, 9 Dec 1999 09:08:32 -0500
Subject: [XML-SIG] Re: Building xmldist
References: <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
 <14414.55889.448668.339585@weyr.cnri.reston.va.us>
 <19991209083938.B9138@vislab.epa.gov>
Message-ID: <014a01bf4258$3eb3f6c0$8fbdfea9@heddley.com>

Randall Hopper:
> Say, what's a good forum to pose general XML questions (state-of-XML; not
> necessarily tied to Python XML)? The reason I ask is I picked up an XML
> book to read on the plane last weekend (_Applied XML, A Toolkit for
> Programmers_; Ceponkus, Hoodbhoy). It seems to be a good first read for my
> intended XML uses (app-to-app data exchange). ...but as with most things
> XML it was doubtless behind the times when it hit the shelves.

You could try the XML-L list, which is a good place for general questions
that aren't deeply technical. It's linked from the front page of
XMLhack.com along with several others.

Also you could read XML.com which (in my not so humble opinion) is a good
source of tutorial and analysis on XML development.

> As I read through XML syntax, DTDs, the XML DOM API, etc. I'm wondering
> things like, have DTDs been superceded by a new mechanism that supports
> datatyping, is more flexible, and less rigid (XML Data and XML DCDs are
> mentioned), etc.

There is no W3C-sanctioned alternative to DTDs yet. There is work ongoing
on XML Schemas (which the XML Data, DCD etc. work feeds into) but no
recommendation yet from the W3C. Expect that sometime in Q1 2000. A new
working draft should be out soon.

My advice would be to stick with DTDs a while. An article we just published
a couple of weeks back on XML.com by Simon St.Laurent surveys the
DTDs/Schemas issue.
Edd Dumbill
Managing Editor, XML.com

From fdrake@acm.org Thu Dec 9 16:19:49 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 9 Dec 1999 11:19:49 -0500 (EST)
Subject: [XML-SIG] Re: Building xmldist
In-Reply-To: <19991209083938.B9138@vislab.epa.gov>
References: <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
 <14414.55889.448668.339585@weyr.cnri.reston.va.us>
 <19991209083938.B9138@vislab.epa.gov>
Message-ID: <14415.54949.62347.21663@weyr.cnri.reston.va.us>

Randall Hopper writes:
 > Say, what's a good forum to pose general XML questions (state-of-XML; not
 > necessarily tied to Python XML)? The reason I ask is I picked up an XML
 > book to read on the plane last weekend (_Applied XML, A Toolkit for
 > Programmers_; Ceponkus, Hoodbhoy). It seems to be a good first read for my
 > intended XML uses (app-to-app data exchange). ...but as with most things

I used to follow XML-DEV, but stopped due to the volume. The content
quality was pretty good at the time. That list was more for people
developing specifications than applications, though, and I'm really not
sure where to go for the latter.

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From aa8vb@yahoo.com Thu Dec 9 17:38:40 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Thu, 9 Dec 1999 12:38:40 -0500
Subject: [XML-SIG] Re: Re: Building xmldist
In-Reply-To: <014a01bf4258$3eb3f6c0$8fbdfea9@heddley.com>
References: <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
 <14414.55889.448668.339585@weyr.cnri.reston.va.us>
 <19991209083938.B9138@vislab.epa.gov>
 <14415.54949.62347.21663@weyr.cnri.reston.va.us>
 <19991208170739.A463904@vislab.epa.gov>
 <19991208171450.A7614@vislab.epa.gov>
 <14414.55889.448668.339585@weyr.cnri.reston.va.us>
 <19991209083938.B9138@vislab.epa.gov>
 <014a01bf4258$3eb3f6c0$8fbdfea9@heddley.com>
Message-ID: <19991209123840.A11092@vislab.epa.gov>

Edd Dumbill:
|You could try the XML-L list...you could read XML.com

Fred L.
Drake, Jr.:
| I used to follow XML-DEV, but stopped due to the volume. The
|content quality was pretty good at the time...

Thanks for the suggestions.

Edd Dumbill:
|> have DTDs been superceded by a new mechanism that supports datatyping,...
|
|There is no W3C-sanctioned alternative to DTDs yet. There is work ongoing on
|XML Schemas...
|
|My advice would be to stick with DTDs a while. An article we just published
|a couple of weeks back on XML.com by Simon St.Laurent surveys the
|DTDs/Schemas issue.

Ok. I'll look for it.

Thanks,

--
Randall Hopper
aa8vb@yahoo.com

From aa8vb@yahoo.com Fri Dec 10 18:06:59 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Fri, 10 Dec 1999 13:06:59 -0500
Subject: [XML-SIG] To .document or not to .document
Message-ID: <19991210130659.A45283@vislab.epa.gov>

--DocE+STaALJfprDB
Content-Type: text/plain; charset=us-ascii

When xml.dom.utils.FileReader is used to read in an XML DOM, the DOM may or
may not have a leading .document component depending on whether the XML was
read from a file or a stream. Why the difference?

For example, the attached script reads a simple XML script from the XML
variable via StringIO. If you change:

    reader = utils.FileReader().readStream( stream )

to:

    reader = utils.FileReader( "elements.xml" )

then the script doesn't work unless you change:

    doc = reader

to:

    doc = reader.document

There's an extra DOM component here.

Thanks for any insight.
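The asymmetry in the message above is what one would see if a constructor
call hands back a reader object while readStream() hands back the parsed
document. The following is only a minimal sketch of that shape, written
against today's standard-library xml.dom.minidom rather than the 1999
xml.dom.utils code. The class SketchReader, its methods, and the sample XML
are hypothetical stand-ins, not the PyXML implementation.

```python
import io
import os
import tempfile
from xml.dom import minidom


class SketchReader:
    """Hypothetical stand-in for a FileReader-style class (not PyXML code)."""

    def __init__(self, filename=None):
        # Constructing with a filename parses the file and stores the result
        # on the reader, so callers must go through the .document attribute.
        self.document = None
        if filename is not None:
            with open(filename, "rb") as fh:
                self.document = minidom.parseString(fh.read())

    def readStream(self, stream):
        # Parsing a stream returns the document itself, not the reader.
        self.document = minidom.parseString(stream.read())
        return self.document


XML = "<root><a/><b/></root>"  # made-up sample data

# Stream form: the return value is already the document.
doc_from_stream = SketchReader().readStream(io.StringIO(XML))

# File form: the constructor returns the reader; the document hangs off it.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as tmp:
    tmp.write(XML)
reader = SketchReader(tmp.name)
doc_from_file = reader.document
os.unlink(tmp.name)

print(doc_from_stream.documentElement.tagName)
print(doc_from_file.documentElement.tagName)
```

In this sketch both routes end at the same kind of Document object; the only
difference is whether the caller is holding the reader or the document.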
--
Randall Hopper
aa8vb@yahoo.com

--DocE+STaALJfprDB
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="elements.py"

#!/usr/bin/env python

from xml.dom import utils,core
import string, sys, StringIO

XML = """\
"""

stream = StringIO.StringIO( XML )
reader = utils.FileReader().readStream( stream )
doc = reader

rootNode = doc.documentElement
rootNode2 = doc.childNodes.item(0)
print rootNode
print rootNode2

element1 = rootNode.childNodes.item(0)
element2 = rootNode.childNodes.item(1)
element3 = rootNode.childNodes.item(2)
print element1, element2, element3

--DocE+STaALJfprDB
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="elements.xml"

--DocE+STaALJfprDB--

From aa8vb@yahoo.com Fri Dec 10 18:27:35 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Fri, 10 Dec 1999 13:27:35 -0500
Subject: [XML-SIG] Ignoring whitespace with DOM
Message-ID: <19991210132735.A49666@vislab.epa.gov>

--jRHKVT23PllUwdXP
Content-Type: text/plain; charset=us-ascii

I'm new at this so please be gentle.

When building a DOM, all whitespace sequences like newlines and spaces are
turned into nodes in the tree (even those peer to elements). In Dejanews, I
read that this is required behavior for the parser.

With the Python DOM, is there a supported method to configure whitespace
parsing? (Possibly something like SAX's ignorableWhitespace which I saw
mentioned.)

Or, after parsing, are there methods to "filter out" whitespace nodes?

Or is it expected that you will just selectively ignore certain Text nodes
whenever you are traversing the DOM tree?
Thanks,

--
Randall Hopper
aa8vb@yahoo.com

--jRHKVT23PllUwdXP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="elements.py"

#!/usr/bin/env python

from xml.dom import utils,core
import string, sys, StringIO

XML = """\
Data 1
"""

stream = StringIO.StringIO( XML )
reader = utils.FileReader().readStream( stream )
doc = reader

rootNode = doc.documentElement
rootNode2 = doc.childNodes.item(0)
print rootNode
print rootNode2

element1 = rootNode.childNodes.item(0)
element2 = rootNode.childNodes.item(1)
element3 = rootNode.childNodes.item(2)
print element1, element2, element3

--jRHKVT23PllUwdXP--

From aa8vb@yahoo.com Fri Dec 10 18:33:51 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Fri, 10 Dec 1999 13:33:51 -0500
Subject: [XML-SIG] Re: Ignoring whitespace with DOM
In-Reply-To: <19991210132735.A49666@vislab.epa.gov>
References: <19991210132735.A49666@vislab.epa.gov>
Message-ID: <19991210133351.A48635@vislab.epa.gov>

Randall Hopper:
|With the Python DOM, is there a supported method to configure whitespace
|parsing? (Possibly something like SAX's ignorableWhitespace which saw
|mentioned.)
...
|
| Data 1
|
| ...
|element1 = rootNode.childNodes.item(0)
|element2 = rootNode.childNodes.item(1)

In case it helps, what prompted me to ask is that I'm trying XML DOM
examples listed in this book I'm reading. I'm pulling out simple DOM code
and XML data island snippets in MSIE5 Javascript and keying them into
Python to try.

It appears that MSIE5's XML DOM is somehow filtering out the whitespace.
For example, item(1) refers to element2 in the XML and doesn't seem to be
thrown off by spaces and line breaks in the XML data.

Randall

From sean@digitome.com Sun Dec 12 18:40:11 1999
From: sean@digitome.com (Sean McGrath)
Date: Sun, 12 Dec 1999 18:40:11 +0000
Subject: [XML-SIG] Pyxie - An open source XML Processing Library for Python
Message-ID: <3.0.6.32.19991212184011.009b8ca0@gpo.iol.ie>

All,

I finally put Pyxie on the Web just now.
Hope it is of use to some people.

The book that hatched the Pyxie library is now in production at Prentice
Hall. "XML Processing with Python" should hit the shelves around
February, 2000.

regards,

http://www.pyxie.org - an Open Source XML Processing library for Python

From l.szyster@ibm.net Mon Dec 13 10:33:05 1999
From: l.szyster@ibm.net (Laurent Szyster)
Date: Mon, 13 Dec 1999 11:33:05 +0100
Subject: [XML-SIG] Pyxie - An open source XML Processing Library for Python
References: <3.0.6.32.19991212184011.009b8ca0@gpo.iol.ie>
Message-ID: <3854CB61.7E69F969@ibm.net>

Sean,

All I get is a "403 Forbidden" error :-(

Sean McGrath wrote:
>
> http://www.pyxie.org - an Open Source XML Processing library for
> Python

Laurent Szyster

From sean@digitome.com Mon Dec 13 10:42:34 1999
From: sean@digitome.com (Sean McGrath)
Date: Mon, 13 Dec 1999 10:42:34 +0000
Subject: [XML-SIG] Pyxie - An open source XML Processing Library for Python
In-Reply-To: <3854CB61.7E69F969@ibm.net>
References: <3.0.6.32.19991212184011.009b8ca0@gpo.iol.ie>
Message-ID: <3.0.6.32.19991213104234.009a32b0@gpo.iol.ie>

At 11:33 13/12/99 +0100, Laurent Szyster wrote:
>Sean,
>
>All I get is a "403 Forbidden" error :-(
>

Use this link instead:

    http://www.digitome.com/pyxie.html

pyxie.org is currently a redirect until we find a proper home for it.

Sorry for the hassle.

regards,
Sean

From fredrik@pythonware.com Mon Dec 13 11:14:23 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 13 Dec 1999 12:14:23 +0100
Subject: [XML-SIG] Pyxie - An open source XML Processing Library for Python
References: <3.0.6.32.19991212184011.009b8ca0@gpo.iol.ie>
 <3854CB61.7E69F969@ibm.net>
Message-ID: <01d401bf455b$367dd560$f29b12c2@secret.pythonware.com>

> All I get is a "403 Forbidden" error :-(

try:

    http://www.digitome.com/pyxie.html

(which is where I ended up when I pointed my browser to www.pyxie.org).

From akuchlin@mems-exchange.org Mon Dec 13 23:54:55 1999
From: akuchlin@mems-exchange.org (Andrew M.
Kuchling)
Date: Mon, 13 Dec 1999 18:54:55 -0500 (EST)
Subject: [XML-SIG] Developer's Day
Message-ID: <199912132354.SAA10101@amarok.cnri.reston.va.us>

Is there anything for the SIG to discuss at the IPC8 Developer's Day?
What's needed for a 1.0 release, a general BoF, anything?

--amk

From a.eyre@optichrome.com Tue Dec 14 10:12:41 1999
From: a.eyre@optichrome.com (Adrian Eyre)
Date: Tue, 14 Dec 1999 10:12:41 -0000
Subject: [XML-SIG] Ignoring whitespace with DOM
In-Reply-To: <19991210132735.A49666@vislab.epa.gov>
Message-ID: <003a01bf461b$c1cfddf0$3acbd9c2@peridot.optichrome.com>

> Or, after parsing, are there methods to "filter out" whitespace nodes?

In the current xml dist, in dom.utils, there's a handy function called
"strip_whitespace" which should do what you need.

On a similar topic, I noticed that many people have posted functions
to add whitespace to format the DOM in a tree-like way. Are there any
plans to incorporate these into the main xml dist?

--------------------------------------------
Adrian Eyre
Optichrome Computer Solutions Ltd
Maybury Road, Woking, Surrey, GU21 5HX, UK
Tel: +44 1483 740 233 Fax: +44 1483 760 644
http://www.optichrome.com
--------------------------------------------

From aa8vb@yahoo.com Tue Dec 14 11:53:48 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Tue, 14 Dec 1999 06:53:48 -0500
Subject: [XML-SIG] Re: Ignoring whitespace with DOM
In-Reply-To: <003a01bf461b$c1cfddf0$3acbd9c2@peridot.optichrome.com>
References: <19991210132735.A49666@vislab.epa.gov>
 <003a01bf461b$c1cfddf0$3acbd9c2@peridot.optichrome.com>
Message-ID: <19991214065348.A96831@vislab.epa.gov>

Adrian Eyre:
|> Or, after parsing, are there methods to "filter out" whitespace nodes?
|
|In the current xml dist, in dom.utils, there's a handy function called
|"strip_whitespace" which should do what you need.

Thanks. That's what I was looking for.
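For readers who want to see the effect being discussed, here is a minimal
sketch of filtering whitespace-only Text nodes out of a DOM tree after
parsing. It uses the standard-library xml.dom.minidom, not the
xml.dom.utils module named in this thread, and it makes no claim about how
that module's strip_whitespace is written. The helper name
strip_whitespace_nodes and the sample XML are made up for the example.

```python
from xml.dom import minidom, Node

# Made-up sample data: two elements separated by newlines and indentation.
XML = """<root>
  <element1>Data 1</element1>
  <element2>Data 2</element2>
</root>"""


def strip_whitespace_nodes(node):
    """Recursively remove Text children that contain only whitespace."""
    # Iterate over a copy because children are removed during the loop.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


doc = minidom.parseString(XML)
root = doc.documentElement

# Before filtering, whitespace Text nodes sit between the elements.
before = len(root.childNodes)
strip_whitespace_nodes(root)
after = len(root.childNodes)

print(before, after)
print(root.childNodes.item(1).tagName)
```

After filtering, childNodes.item(1) is the second element, which matches the
indexing behaviour described for MSIE5 earlier in the thread.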
--
Randall Hopper
aa8vb@yahoo.com

From paul@prescod.net Tue Dec 14 13:25:20 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 14 Dec 1999 05:25:20 -0800
Subject: [XML-SIG] Does 4XPath requires lex?
Message-ID: <38564540.8E7D3E74@prescod.net>

Is it really the case that 4XPath requires lex code to be compiled? And
if so, is that the case just because of the necessity to parse XPaths? I
thought that someone had done a Python-based XPath parser before. This
is a killer distribution issue, especially for Windows users. XPaths are
probably not much harder to parse than regular expressions.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
Three things to be wary of: A new kid in his prime
A man who knows the answers, and code that runs first time
http://www.geezjan.org/humor/computers/threes.html

From larsga@garshol.priv.no Tue Dec 14 15:31:31 1999
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 14 Dec 1999 16:31:31 +0100
Subject: [XML-SIG] Does 4XPath requires lex?
In-Reply-To: <38564540.8E7D3E74@prescod.net>
References: <38564540.8E7D3E74@prescod.net>
Message-ID:

* Paul Prescod
|
| Is it really the case that 4XPath requires lex code to be compiled?
| And if so, is that the case just because of the necessity to parse
| XPaths? I thought that someone had done a Python-based XPath parser
| before.

Dieter Maurer has an all-Python package:

--Lars M.

From aa8vb@yahoo.com Tue Dec 14 16:40:56 1999
From: aa8vb@yahoo.com (Randall Hopper)
Date: Tue, 14 Dec 1999 11:40:56 -0500
Subject: [XML-SIG] Re: To .document or not to .document
In-Reply-To: <19991210130659.A45283@vislab.epa.gov>
References: <19991210130659.A45283@vislab.epa.gov>
Message-ID: <19991214114056.A107393@vislab.epa.gov>

Randall Hopper:
|When xml.dom.utils.FileReader is used to read in an XML DOM, the DOM may or
|may not have a leading .document component depending on whether the XML was
|read from a file or a stream.
...
| reader = utils.FileReader().readStream( stream )
| reader = utils.FileReader( "elements.xml" )

I see my mistake. Though both forms read the XML stream, they don't return
the same object. The first returns the stream; the second gives you back
the parser.

--
Randall Hopper
aa8vb@yahoo.com

From fdrake@acm.org Tue Dec 14 18:42:57 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 14 Dec 1999 13:42:57 -0500 (EST)
Subject: [XML-SIG] Ignoring whitespace with DOM
In-Reply-To: <003a01bf461b$c1cfddf0$3acbd9c2@peridot.optichrome.com>
References: <19991210132735.A49666@vislab.epa.gov>
 <003a01bf461b$c1cfddf0$3acbd9c2@peridot.optichrome.com>
Message-ID: <14422.36785.950131.520609@weyr.cnri.reston.va.us>

Adrian Eyre writes:
 > On an similar topic, I noticed that many people have posted functions
 > to add whitespace to format the DOM in a tree-like way. Are there any
 > plans to incorporate these into the main xml dist?

You could use xml.sax.writer.PrettyPrinter; just walk over the DOM
instance to generate SAX events using a PrettyPrinter instance as the
document handler (which should be pretty much all you need).

Hmm.. I suppose xml.dom.walker could include a SAXGenerator class to
make this easier! I'll think about that the next time I can find a
little time. ;)

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From paul@prescod.net Tue Dec 14 20:24:28 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 14 Dec 1999 12:24:28 -0800
Subject: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
Message-ID: <3856A77C.3A4D9F00@prescod.net>

I think that we need to make an "XML Developer's Package" with minimal
overlap and few advanced features. Our current offering is overwhelming
in its array of sometimes incompatible and overlapping offerings.
That's fine for a generalized distribution but we need to develop
something clean enough to go in the Python 1.6 standard library and the
assortment of stuff we have now is NOT it.

Ideally we would have one (or at most two!) implementation of each of
the major specs:

XML
SAX
Unicode
XPath
XPointer
XSLT
DOM

Paul Prescod

"Andrew M. Kuchling" wrote:
>
> Is there anything for the SIG to discuss at the IPC8 Developer's Day?
> What's needed for a 1.0 release, a general BoF, anything?
>
> --amk
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig

--
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
Three things to be wary of: A new kid in his prime
A man who knows the answers, and code that runs first time
http://www.geezjan.org/humor/computers/threes.html

From guido@CNRI.Reston.VA.US Tue Dec 14 23:26:34 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Tue, 14 Dec 1999 18:26:34 -0500
Subject: [XML-SIG] Don't forget to register for the Python conference!
Message-ID: <199912142326.SAA00634@eric.cnri.reston.va.us>

We know that the Python conference isn't until the next millennium.
You still have THREE WHOLE WEEKS to register and qualify for the early
bird registration. However, at least one of those weeks you will have
partying and family gatherings on your mind, and when that week's
over, recovery from the partying and gathering will probably take
priority over registering for the conference, and as a result you
might be PAYING FULL PRICE! (The horror!) That is, if your payment
isn't received by January 5, 2000.

So, be smart and register *before* Christmas. That's still more than
ten days -- plenty of time to make travel arrangements, register for
the conference, and present your boss with the bill (in that order).

Our motto, due to Bruce Eckel, is: "Life's better without braces."
Some highlights from the conference program:

- 8 tutorials on topics ranging from JPython to Fnorb;
- a keynote by Open Source evangelist Eric Raymond;
- another by Randy Pausch, father of the Alice Virtual Reality project;
- a separate track for Zope developers and users;
- live demonstrations of important Python applications;
- refereed papers, and short talks on current topics;
- a developers' day where the feature set of Python 2.0 is worked out.

Come and join us at the Key Bridge Marriott in Rosslyn (across the
bridge from Georgetown), January 24-27 in 2000. Make the Python
conference the first conference you attend in the new millennium!

The early bird registration deadline is January 5.

More info: http://www.python.org/workshops/2000-01/

--Guido van Rossum (home page: http://www.python.org/~guido/)

From uche.ogbuji@fourthought.com Wed Dec 15 15:56:47 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Wed, 15 Dec 1999 08:56:47 -0700
Subject: [XML-SIG] Does 4XPath requires lex?
In-Reply-To: Your message of "Tue, 14 Dec 1999 05:25:20 PST."
 <38564540.8E7D3E74@prescod.net>
Message-ID: <199912151556.IAA03674@localhost.localdomain>

> Is it really the case that 4XPath requires lex code to be compiled? And
> if so, is that the case just because of the necessity to parse XPaths? I
> thought that someone had done a Python-based XPath parser before. This
> is a killer distribution issue, especially for Windows users. XPaths are
> probably not much harder to parse than regular expressions.

Actually, when we analyzed the performance of the first, all-Python
versions of 4XSL, the pattern-parsing was by a huge margin the greatest
bar on performance. XPaths are not terribly complex, but they are just
complex enough to make it difficult to do so efficiently with regular
expressions.

Note that we shall be putting resources into compiling a version of
4Suite for Windows for distribution. We expect to have it ready in
January.
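To give a feel for what regular-expression-based lexing of an XPath
involves, here is a toy tokenizer written with the standard-library re
module. It is a sketch under loud assumptions: it is not the 4XPath or
4XSL lexer, it covers only a small, made-up subset of XPath (slashes,
names, a few punctuation tokens, literals and numbers), and it says nothing
about the performance figures mentioned above. The names TOKEN_RE and
tokenize are invented for the example.

```python
import re

# One alternation with a named group per token type (toy XPath subset only).
TOKEN_RE = re.compile(
    r"""
    (?P<DSLASH>//)
  | (?P<SLASH>/)
  | (?P<LBRACK>\[)
  | (?P<RBRACK>\])
  | (?P<AT>@)
  | (?P<STAR>\*)
  | (?P<EQ>=)
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<LITERAL>'[^']*'|"[^"]*")
  | (?P<NAME>[A-Za-z_][\w.-]*)
  | (?P<WS>\s+)
    """,
    re.VERBOSE,
)


def tokenize(path):
    """Split a path expression into (token_type, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(path):
        match = TOKEN_RE.match(path, pos)
        if match is None:
            raise ValueError("unexpected character at offset %d" % pos)
        if match.lastgroup != "WS":  # whitespace is skipped, not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("/doc/chapter[@id='c1']//para")
for token in tokens:
    print(token)
```

Even this subset needs one regular-expression match per token, and a real
XPath grammar adds axes, functions, operators and context-dependent tokens
on top of it.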
--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From akuchlin@mems-exchange.org Wed Dec 15 16:19:32 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 11:19:32 -0500 (EST)
Subject: [XML-SIG] Developer's Day
In-Reply-To: <3856A77C.3A4D9F00@prescod.net>
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
 <3856A77C.3A4D9F00@prescod.net>
Message-ID: <14423.49044.143333.790752@amarok.cnri.reston.va.us>

Paul Prescod writes:
>I think that we need to make an "XML Developer's Package" with minimal
>overlap and few advanced features. Our current offering is overwhelming
>in its array of sometimes incompatible and overlapping offerings. That's

Huh? There's obviously a good deal of stuff in there, some of it
perhaps too esoteric, but I don't see where there's overlap. Or are
you talking about Python tools in general, where there are 3 DOM
implementations? (PyDOM, 4DOM, and ZDOM hiding inside Zope.)

>fine for a generalized distribution but we need to develop something
>clean enough to go in the Python 1.6 standard library and the assortment
>of stuff we have now is NOT it.

I lean against shoveling more stuff into 1.6; better to get the
Distutils widely used, which makes it easier to install *all* Python
extensions.

>Ideally we would have one (or at most two!) implementation of each of
>the major specs:
>XML
>SAX
>Unicode
>XPath
>XPointer
>XSLT
>DOM

Do you mean "one implementation of each in a single package", or "one
implementation existing for Python, distributed separately"?

We need to come up with a position paper for developer's day, stating
what needs to be discussed. Suggestions? I'd propose focusing on
getting the XML-SIG package to 1.0, but that's just an idea.

--
A.M. Kuchling http://starship.python.net/crew/amk/
"Don't try anything."
"It's all right; I won't hurt you."
    -- Unnamed thug and Liz Shaw, in "The Ambassadors of Death"

From paul@prescod.net Wed Dec 15 17:24:00 1999
From: paul@prescod.net (Paul Prescod)
Date: Wed, 15 Dec 1999 09:24:00 -0800
Subject: [XML-SIG] Developer's Day
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
 <3856A77C.3A4D9F00@prescod.net>
 <14423.49044.143333.790752@amarok.cnri.reston.va.us>
Message-ID: <3857CEB0.C29C5F24@prescod.net>

"Andrew M. Kuchling" wrote:
>
> Huh? There's obviously a good deal of stuff in there, some of it
> perhaps too esoteric, but I don't see where there's overlap.

Well, there are several parsers and parser wrappers. How is a user
supposed to choose? And there is PyDOM, Minidom and qp_dom.

> Or are
> you talking about Python tools in general, where there are 3 DOM
> implementations? (PyDOM, 4DOM, and ZDOM hiding inside Zope.)

That too.

> I lean against shoveling more stuff into 1.6; better to get the
> Distutils widely used, which makes it easier to install *all* Python
> extensions.

I don't think that XML is any more of an "add-on" to a modern scripting
language than URL support or regular expression support. I'm in the
"batteries included" camp for this and several other reasons:

* standard Python libraries may soon need XML support. If WebDAV takes
off then there should be a libWebDAV right alongside libftp and libhttp.
And libWebDAV will require XML

* there is a difference between theory and practice. In theory,
distutils will be done soon and everything will be easy. In practice, it
is the end of 1999 and at every conference I have to install the XML sig
package on the machines of several people who haven't been able to get
it going themselves. In practice, we can't wait for distutils because
people are choosing their XML tools now.

> >Ideally we would have one (or at most two!)
implementation of each of
> >the major specs:
> >XML >SAX >Unicode >XPath >XPointer >XSLT >DOM
>
> Do you mean "one implementation of each in a single package", or "one
> implementation existing for Python, distributed separately"?

With the possible exception of XSLT, one implementation of each *in
Python 1.6*.

> We need to come up with a position paper for developer's day, stating
> what needs to be discussed. Suggestions? I'd propose focusing on
> getting the XML-SIG package to 1.0, but that's just an idea.

I don't see how the XML-SIG package can ever get to 1.0. Anybody can
contribute code at anytime and thus far we've been totally flexible
about putting it in. I think that's great. It just won't ever lead to a
stable, carefully maintained, tightly interoperable package. Some of the
maintainers of the individual pieces have probably lost interest and
there is probably nobody that understands it all enough to integrate it
nicely.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
Three things to be wary of: A new kid in his prime
A man who knows the answers, and code that runs first time
http://www.geezjan.org/humor/computers/threes.html

From akuchlin@mems-exchange.org Wed Dec 15 18:45:06 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Wed, 15 Dec 1999 13:45:06 -0500 (EST)
Subject: [XML-SIG] Developer's Day
In-Reply-To: <3857CEB0.C29C5F24@prescod.net>
References: <199912132354.SAA10101@amarok.cnri.reston.va.us>
 <3856A77C.3A4D9F00@prescod.net>
 <14423.49044.143333.790752@amarok.cnri.reston.va.us>
 <3857CEB0.C29C5F24@prescod.net>
Message-ID: <14423.57778.131798.776845@amarok.cnri.reston.va.us>

Paul Prescod writes:
>I don't think that XML is any more of an "add-on" to a modern scripting
>language than URL support or regular expression support. I'm in the
>"batteries included" camp for this and several other reasons:

Good arguments, and you've shaken my convictions enough that I
forwarded your posting to python-dev to get some reactions.
However, thinking about it more, I still lean against inclusion.
(This is all subject to Guido's say-so, of course; if he says
something should go on, I'll bow to his decision.) Some observations:

* Python revisions come out slowly, once every year or two. XML
standards have been revolving faster, and we don't want to wait
until 1.7 for SAX2, or DOM Level2, or other new revisions. Keeping
the modules out of the core lets them be updated at their own pace.
A counterargument is that the XML specs are slowing down -- add
namespace support to SAX, and finalize DOM Level 2, and I don't
think any other standards are very important to basic XML
programming.

* We really want a C-based parser to be commonly available. sgmlop is
the only reasonable choice for this, because I'd be against
including Expat. To replay some arguments I made against including
the zlib library in 1.6, what if a C extension requires a newer
version of the library? Symbol conflicts if you're lucky,
hard-to-debug problems if you're not.

* We can drop various marginal bits of the CVS tree; the xmlarch
support is probably not of very wide interest, for example.

>is the end of 1999 and at every conference I have to install the XML sig
>package on the machines of several people who haven't been able to get
>it going themselves. In practice, we can't wait for distutils because
>people are choosing their XML tools now.

I think I'm on the record as saying that Python's major problems now
aren't language-related, but are with the development environment.
Language changes (from minor, like 'for i in 1..9', to major, like
fixing the type/class dichotomy or adding static types) aren't going
to bring in piles of new users, useful though they might be to
experienced Pythoneers, large projects, or some other specific
application. Instead, the problem areas are having documentation for
everything, finding Python extensions, installing them, distributing
applications, and so forth.
These problems affect all Python programmers, and we can't duck them forever by shoveling things into the base distribution. If installing things is a problem, then we need to buckle down and finish the distutils. So, overall, I'd still vote against inclusion in 1.6. >I don't see how the XML-SIG package can ever get to 1.0. Anybody can No, it's *got* to reach 1.0. The point of the package is that it's exactly *one* thing to install that gives basic XML tools; you don't need to chase down the SAX modules from Lars' page, PyExpat from ftp.cwi.nl, sgmlop from pythonware.com, and so forth. If the Distutils made it as easy as: python fetchpackage.py SAX PyExpat DOM sgmlop etc... then much of the need for a single package goes away, but, as you point out, that isn't currently the case. -- A.M. Kuchling http://starship.python.net/crew/amk/ We were here before any other city that now stands. And we will sing the funeral songs that are sung for cities for them when they die. -- The role of the Necropolis Litharge, in SANDMAN #55: "Cerements" From Dan.Zimmer@icn.siemens.com Wed Dec 15 20:24:39 1999 From: Dan.Zimmer@icn.siemens.com (Dan.Zimmer@icn.siemens.com) Date: Wed, 15 Dec 1999 15:24:39 -0500 Subject: [XML-SIG] Newbie question Message-ID: <85256848.006FF5C1.00@li01.lm.ssc.siemens.com> Hello all, I am relatively new to python so don't laugh at me too much... I inherited 2 PCs with python already installed, but the programs that I am running actually reside on a server. I am attempting to install and run the software on 2 more PCs for back-up purposes. I noticed that I had to add the python directory to the PATH before it would find the executable, and I can now run some basic python programs. Now I find that the program is crashing with an attribute error when __getattr__ is called for wdDoNotSaveChanges.
(__init__.py) (One of the systems I inherited already had this problem as well) I know there is something simplistic that I am just overlooking, and its starting to drive me nuts, so any helpful hints would be much appreciated. Thanks, Dan From gstein@lyra.org Thu Dec 16 03:38:53 1999 From: gstein@lyra.org (Greg Stein) Date: Wed, 15 Dec 1999 19:38:53 -0800 (PST) Subject: [XML-SIG] Developer's Day In-Reply-To: <3857CEB0.C29C5F24@prescod.net> Message-ID: On Wed, 15 Dec 1999, Paul Prescod wrote: >... > > We need to come up with a position paper for developer's day, stating > > what needs to be discussed. Suggestions? I'd propose focusing on > > getting the XML-SIG package to 1.0, but that's just an idea. > > I don't see how the XML-SIG package can ever get to 1.0. Anybody can > contribute code at anytime and thus far we've been totally flexible > about putting it in. I think that's great. It just won't ever lead to a > stable, carefully maintained, tightly interoperable package. Some of the > maintainers of the individual pieces have probably lost interest and > there is probably nobody that understands it all enough to integrate it > nicely. Flexible? That's not entirely true. I've offered up qp_xml.py for inclusion twice now. The second time, I didn't even hear a single reply about putting the thing into the distribution. Sure, I have root access to the CVS repository. I can get it into the XML distro :-) But that would be wrong. So I post here, and get nothing but silence. Who *is* responsible for determining yes/no for inclusion? And what are the rules? If qp_xml is not valid for inclusion, then why not? thx, -g -- Greg Stein, http://www.lyra.org/ From tpassin@idsonline.com Thu Dec 16 04:37:31 1999 From: tpassin@idsonline.com (Thomas B. 
Passin) Date: Wed, 15 Dec 1999 23:37:31 -0500 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> Message-ID: <003d01bf477f$44b69ba0$0101a8c0@tomshp> ----- Original Message ----- Paul wrote: > "Andrew M. Kuchling" wrote: > > > > Huh? There's obviously a good deal of stuff in there, some of it > > perhaps too esoteric, but I don't see where there's overlap. > > Well, there are several parsers and parser wrappers. How is a user > supposed to choose? And there is PyDOM, Minidom and qp_dom. > > > Or are > > you talking about Python tools in general, where there are 3 DOM > > implementations? (PyDOM, 4DOM, and ZDOM hiding inside Zope.) > > That too. > > > I lean against shoveling more stuff into 1.6; better to get the > > Distutils widely used, which makes it easier to install *all* Python > > extensions. > > I don't think that XML is any more of an "add-on" to a modern scripting > language than URL support or regular expression support. I'm in the > "batteries included" camp for this and several other reasons: > Hear, Hear! For another scripting language example, look at Rebol. Very small executable code, and libraries for just about all key Internet protocols are built in with no imports needed. Loading a url or ftp-ing a file are trivial. Shallow parsing is a bit wierd but easy. Has a built-in CGI mode. Standard Python should be no less capable. > * standard Python libraries may soon need XML support. If WebDAV takes > off then there should be a libWebDAV right alongside libftp and libhttp. > And libWebDAV will require XML > > * there is a difference between theory and practice. In theory, > distutils will be done soon and everything will be easy. 
In practice, it > is the end of 1999 and at every conference I have to install the XML sig > package on the machines of several people who haven't been able to get > it going themselves. In practice, we can't wait for distutils because > people are choosing their XML tools now. > Yes, I had a lot of trouble getting XML-SIG to run on my Windows95 machine, especially the expat wrapper. It's working now, at last. And I never did get sgmlop working - it complained that the C API was a more advanced version than the one on my Python 1.5.2 distribution. For that matter, I couldn't get tkinter working until I copied several tcl/tk DLLs into the python DLL folder. Clean installations are of PRIME importance! Regards, Tom Passin From fredrik@pythonware.com Thu Dec 16 11:16:52 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 16 Dec 1999 12:16:52 +0100 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> Message-ID: <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> > I don't think that XML is any more of an "add-on" to a modern scripting > language than URL support or regular expression support. one could say the same (or even more so) for the GUI, but that hasn't exactly helped... > I'm in the "batteries included" camp for this and several other reasons: but some batteries *are* include: "import xmllib" works just fine in 1.5.2. (the 1.6 version will support unicode, and will probably also be much faster, thanks to the new regular expression engine). > * standard Python libraries may soon need XML support. If WebDAV takes > off then there should be a libWebDAV right alongside libftp and libhttp. > And libWebDAV will require XML (shouldn't that be "webdavlib.py" or maybe just "davlib.py" ? :-) the same thing can be said about XML-RPC and SOAP.
and for some reason, xmlrpclib.py and soaplib.py can both use xmllib.py if necessary... From aa8vb@yahoo.com Thu Dec 16 12:49:44 1999 From: aa8vb@yahoo.com (Randall Hopper) Date: Thu, 16 Dec 1999 07:49:44 -0500 Subject: [XML-SIG] Re: Developer's Day In-Reply-To: <14423.57778.131798.776845@amarok.cnri.reston.va.us> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> Message-ID: <19991216074944.A156414@vislab.epa.gov> Andrew M. Kuchling: |However, thinking about it more, I still lean against inclusion... | | * Python revisions come out slowly, once every year or two. XML | standards have been revolving faster , and we don't want to wait | until 1.7 for SAX2, or DOM Level2, or other new revisions. | Keeping the modules out of the core lets them be updated at their | own pace. I guess I don't follow. xmllib is there, though xml is updated at its own pace. I wouldn't think if xml was there it would hold xml dev back, unless the APIs are going to change radically. However, aren't SAX and DOM standard? I'd vote for inclusion to give folks something to work with. Personally, I try to avoid depending on add-ons when possible because my scripts won't "just work" for anyone that has Python installed. There's the: 1) how do I get your version of the package, 2) how do I install it, and 3) them deciding if it's really worth the effort that they have to go through. There's no "import xmllib.dom" or "import xmllib.sax" today, and those are standard APIs, aren't they? I think that alone is strong reason to push for inclusion in 1.6.
With those standards out there, I'm less inclined to build on xmllib. Perhaps a compromise solution would be to add xml to 1.6 and only export (document) the XML interfaces that are standards-based and aren't expected to change. -- Randall Hopper aa8vb@yahoo.com From akuchlin@mems-exchange.org Thu Dec 16 15:20:02 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Thu, 16 Dec 1999 10:20:02 -0500 (EST) Subject: [XML-SIG] qp_xml (was Developer's Day) In-Reply-To: References: <3857CEB0.C29C5F24@prescod.net> Message-ID: <14425.802.985142.3545@amarok.cnri.reston.va.us> Greg Stein writes: >Who *is* responsible for determining yes/no for inclusion? And what are >the rules? If qp_xml is not valid for inclusion, then why not? If most of the SIG members think it's useful, then it should go in. I can't remember the reactions to qp_xml.py, whether favorable or unfavorable. So what does everyone think? To take a look at it, see http://www.lyra.org/greg/python/qp_xml.py . (Greg, correct me if there's a better URL.) -- A.M. Kuchling http://starship.python.net/crew/amk/ Can I have more water, please? My hair drank most of it. -- Lyta Hall, in SANDMAN #61: "The Kindly Ones:5" From larsga@garshol.priv.no Thu Dec 16 15:28:57 1999 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 16 Dec 1999 16:28:57 +0100 Subject: [XML-SIG] qp_xml (was Developer's Day) In-Reply-To: <14425.802.985142.3545@amarok.cnri.reston.va.us> References: <3857CEB0.C29C5F24@prescod.net> <14425.802.985142.3545@amarok.cnri.reston.va.us> Message-ID: * Andrew M. Kuchling | | If most of the SIG members think it's useful, then it should go in. | I can't remember the reactions to qp_xml.py, whether favorable or | unfavorable. So what does everyone think? Personally, I think qp_xml is a good idea. I doubt that I'll ever use the DOM myself, mainly because it's so big, complex and ugly and un-Pythonic. (This is not a complaint. 
Something designed to be mappable to all programming languages is unlikely to fit well in a language like Python.) Something like qp_xml looks a lot more attractive to me and for tree-based processing I'll probably use qp_xml, Pyxie or simply some dead-simple module that I write myself. Anyway, my opinion is that it should go in. --Lars M. From paul@prescod.net Thu Dec 16 13:31:54 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 05:31:54 -0800 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> Message-ID: <3858E9CA.FCB3DD9B@prescod.net> Fredrik Lundh wrote: > > one could say the same (or even more so) for the GUI, > but that hasn't exactly helped... GUI is a special case because there are no standards there. > but some batteries *are* include: "import xmllib" works > just fine in 1.5.2. xmllib is not, as far as I know, a legal XML processor and it certainly does not support "modern" advances like tree processing, validation, defaulted attributes, XML namespaces or SAX. So it isn't legal and it isn't modern. If people compare that to Perl or Java's Project X, to Microsoft's XML COM objects it will be, er, embarrassing. Sjoerd did a great job for the day, but it is way outdated now. By the time Python 2.0 comes out, xmllib will be quite primitive compared to what everyone else is doing. 
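For readers wondering what the "XML namespaces or SAX" support in the list above looks like in practice, here is a minimal sketch. It assumes the xml.sax package from present-day Python standard libraries (nothing like it shipped with 1.5.2), and the names ElementCounter and count_elements are made up for illustration; it is not how any of the posters' own tools work.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class ElementCounter(ContentHandler):
    """Count start tags, keyed by (namespace URI, local name)."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElementNS(self, name, qname, attrs):
        # With namespace processing enabled, `name` is a
        # (namespace URI, local name) tuple rather than a raw tag string.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(document):
    """Parse a bytes document and return the per-element counts."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)  # resolve prefixes to URIs
    handler = ElementCounter()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(document))
    return handler.counts
```

Fed a small WebDAV-style document, the handler sees elements under the "DAV:" namespace URI whatever prefix the sender chose, which is the kind of bookkeeping a bare tokenizer leaves to the caller.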
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Thu Dec 16 18:02:55 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 16 Dec 1999 10:02:55 -0800 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> Message-ID: <3859294F.138FF398@prescod.net> "Andrew M. Kuchling" wrote: > > * Python revisions come out slowly, once every year or two. XML > standards have been revolving faster , and we don't want to wait > until 1.7 for SAX2, or DOM Level2, or other new revisions. > Keeping the modules out of the core lets them be updated at their > own pace. A counterargument is that the XML specs are slowing > down -- add namespace support to SAX, and finalize DOM > Level 2, and I don't think any other standards are very important > to basic XML programming. I agree with your counterargument. :) Anyhow, isn't there a logical fallacy in your original argument? Why can't we offer a DOM 3 module or extension after Python ships with DOM 2? > * We really want a C-based parser to be commonly available. > sgmlop is the only reasonable choice for this, because I'd be > against including Expat. To replay some arguments I made against > including the zlib library in 1.6, what if a C extension requires > a newer version of the library? Symbol conflicts if you're lucky, > hard-to-debug problems if you're not. I don't understand this issue. Why would a C extension build on sgmlop which is designed to make XML information available to *Python* programmers? 
> * We can drop various marginal bits of the CVS tree; the xmlarch > support is probably not of very wide interest, for example. How about "expat", "mac", "pyexpat", "utils", "windows". There is just too much stuff there! And I daresay that alot of it has not been "quality controlled" to the level that we would expect if it were a part of the real Python library. In other words, there is no single place to go to get only XML-processing software that works well and works together. > I think I'm on the record as saying that Python's major problems now > aren't language-related, but are with the development environment. > Language changes (from minor, like 'for i in 1..9', to major, like > fixing the type/class dichotomy or adding static types) aren't going > to bring in piles of new users, useful though they might be to > experienced Pythoneers, large projects, or some other specific > application. (irrelevant aside: I agree 100% that making things easier to install will actually improve newbies experience more than (e.g.) static type checking but I do not agree that it is a better "sales tool". Most people are sold based on the language and its libraries before they start trying to install extensions.) > If installing things is a problem, then we need to > buckle down and finish the distutils. So, overall, I'd still vote > against inclusion in 1.6. So are you saying that Python 2 might have only five packages and everything else must be downloaded? No httplib, no pickle, no random or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? When people download Python and go to the library documentation that impressive array of BUILT-IN-FEATURES is part of what sells them on Python. Hell, I can download all of that stuff for Scheme but what makes Python beautiful is that I don't have to download it for Python. It's just there. 
But if an XML person comes to Python after hearing us rant about how great it is for processing XML and all they find is xmllib...they will be underwhelmed. > No, it's *got* to reach 1.0. The point of the package is that it's > exactly *one* thing to install that gives basic XML tools; you don't > need to chase down the SAX modules from Lars' page, PyExpat from > ftp.cwi.nl, sgmlop from pythonware.com, and so forth. If the > Distutils made it as easy as: > > python fetchpackage.py SAX PyExpat DOM sgmlop > > > > etc... > > then much of the need for a single package goes away, but, as you > point out, that isn't currently the case. I'm a little lost here. We need xmllib to continue because distutils doesn't do what we need yet but we don't need to put the stuff in the Python library because distutils will work well enough soon. But there is an important issue that distutils will not solve. One of the beautiful things about the Python library is that everything is at the same version level. When you install it you know that everything works together or else it WILL in the next patch level if you report the incompatibility. When the xml package gets versioned incompatibly with the Python library you don't have that safe feeling. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Thu Dec 16 19:06:24 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 16 Dec 1999 11:06:24 -0800 (PST) Subject: [XML-SIG] Developer's Day In-Reply-To: <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> Message-ID: On Thu, 16 Dec 1999, Fredrik Lundh wrote: >... > > * standard Python libraries may soon need XML support. If WebDAV takes > > off then there should be a libWebDAV right alongside libftp and libhttp.
> > And libWebDAV will require XML > > (shouldn't that be "webdavlib.py" or maybe just "davlib.py" ? :-) :-) http://www.lyra.org/greg/python/ davlib.py httplib.py qp_xml.py davlib requires the new httplib (for HTTP/1.1) and qp_xml. Cheers, -g -- Greg Stein, http://www.lyra.org/ From jcw@equi4.com Thu Dec 16 19:09:42 1999 From: jcw@equi4.com (Jean-Claude Wippler) Date: Thu, 16 Dec 1999 20:09:42 +0100 Subject: [Python-Dev] Re: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <14423.57778.131798.776845@amarok.cnri.reston.va.us> <3859294F.138FF398@prescod.net> Message-ID: <385938F6.C4164756@equi4.com> Paul Prescod wrote: [...] > (irrelevant aside: [...] Most people are sold based on the language > and its libraries before they start trying to install extensions.) > > [AMK] > > If installing things is a problem, then we need to > > buckle down and finish the distutils. So, overall, I'd still vote > > against inclusion in 1.6. > > So are you saying that Python 2 might have only five packages and > everything else must be downloaded? No httplib, no pickle, no random > or math, no calendar, pwd, grp, imaplib, nntplib, mailbox or rexec? > > When people download Python and go to the library documentation that > impressive array of BUILT-IN-FEATURES is part of what sells them on > Python. Hell, I can download all of that stuff for Scheme but what > makes Python beautiful is that I don't have to download it for Python. > It's just there. But if an XML person comes to Python after hearing us > rant about how great it is for processing XML and all they find is > xmllib...they will be underwhelmed. (Nodding in agreement) Could this perhaps be solved with a large batteries-included standard distribution, plus a real easy/effective way to strip Python down and wrap things up for deployment? 
In other words, aim for two very distinct goals: everything within easy reach for development + fully signed-sealed-delivered products. The first goal can evolve to do fancy net-borne distribution, even if it is a brittle process, because this is for Python developers. They want it all, so open the floodgate to give it all to them. The second becomes a matter of pruning down and wrapping up. All the way down to a single installation-less executable, if possible. I may well be wrong (and I'm not tracking distutils), but might it not be simpler to focus on 1) power users + 2) production-grade deployment, instead of trying to streamline a tangled-web-of-module-dependencies into a distribution system which tries to meet a wide range of needs? > [...] One of the beautiful things about the Python library is that > everything is at the same version level. When you install it you know > that everything works together or else it WILL in the next patch level > if you report the incompatibility. [...] More nods. So why not allow the Python distribution to become very large - with every release moving to a better-tuned combination of all the different parts (occasional mishaps can quickly be fixed)? Plus some tools to dist(ut)il(l) a turnkey solution from this big soup. Sort-of-from-violin-to-quartet-all-the-way-to-symphony-orchestra... -- Jean-Claude From tpassin@idsonline.com Fri Dec 17 04:23:55 1999 From: tpassin@idsonline.com (Thomas B. Passin) Date: Thu, 16 Dec 1999 23:23:55 -0500 Subject: [XML-SIG] Developer's Day Message-ID: <002501bf4846$897dd920$42fbb1cd@tomshp> David Niergarth wrote: > > Thanks for the pointer to REBOL -- I hadn't heard of it before. In your > post to the XML-SIG you mentioned > Actually, REBOL looks very interesting. There isn't enough documentation as yet so the learning curve is on the steep side.
At the risk of being off-topic (and off-Python), I'm including a REBOL script - my only one - that retrieves a URL, and extracts a particular section from the html. The section REBOL[...] is essentially a comment.
----------------------------------------------------------------------------
REBOL [
    Title: "Zone Forecast Extractor"
    File: %zone.r
    Purpose: {Extract the Virginia Zone Forecast and display it.}
]
zone: read http://iwin.nws.noaa.gov/iwin/va/zone.html
print ""
{parse zone [thru copy result to ]
print result}
print "Current Fairfax County Zone Forecast"
fairfax: find zone "Fairfax"
parse fairfax [thru "..." copy forecast to "$$"]
print forecast
----------------------------------------------------------------------------
My point is not to urge anyone to switch from Python to REBOL, but to illustrate how simple it can be. Getting a url in Python isn't much more involved if you ignore error handling. My point is that the REBOL folks have decided that their system will support standard network operations in a built-in way, just like file handling is built in. Whether Python does it with standard libraries or built-in functions and types, basic url and xml handling should come included and easy to use. > > Shallow parsing is a bit wierd but easy. > > I'm curious what you mean by "shallow parsing". It's a topic I haven't > seen mentioned on the list except for a posting I made a while back (by > me) pointing out an article by Robert D. Cameron related to "shallow > parsing" XML with regular expressions ( ftp://fas.sfu.ca/pub/cs Yes, I got the phrase from your post and the link was very helpful - thank you very much. > TR/1998/CMPT1998-17.html ). More generally shallow parsing seems to be > mentioned in the context of parsing natural languages. I'd be interested > in understanding what you mean by it or in what domain you've used it.
I take "shallow parsing" to mean getting the elements and their content but not the nested hierarchical structure. For example, I have a case where a spreadsheet is translated into xml (yes, really!). Each row becomes an element, and each cell in the row becomes a child element of the row. In this case each row is independent and on a separate line in the file. So line by line I just extract each named element using regular expressions, knowing in advance that each element has no children of its own. Fast and easy - shallow parsing. > Tom Passin From fredrik@pythonware.com Fri Dec 17 08:05:25 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 17 Dec 1999 09:05:25 +0100 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> Message-ID: <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> Paul Prescod wrote: > > one could say the same (or even more so) for the GUI, > > but that hasn't exactly helped... > > GUI is a special case because there are no standards there. oh, there are standards. Motif is one, for example. didn't help. people are still building new stuff on the fundamental components (xlib, win32 gdi, etc), just like people will keep on building new XML stuff on top of simple XML tokenizers (xmllib, expat, sgmlop). > > but some batteries *are* include: "import xmllib" works > > just fine in 1.5.2. > > xmllib is not, as far as I know, a legal XML processor and it certainly > does not support "modern" advances like tree processing, validation, > defaulted attributes, XML namespaces or SAX. So it isn't legal and it > isn't modern. "isn't legal"? now that's an interesting case of XML snobbery... "you're under arrest for using an illegal parser." 
"but sir, my tools work, they're fast as hell, and hundreds of people download them every week" "we don't care -- we're the XML police, so we don't have to". I'm out of here. From l.szyster@ibm.net Fri Dec 17 13:34:05 1999 From: l.szyster@ibm.net (Laurent Szyster) Date: Fri, 17 Dec 1999 14:34:05 +0100 Subject: [XML-SIG] qp_xml (was Developer's Day) References: <3857CEB0.C29C5F24@prescod.net> <14425.802.985142.3545@amarok.cnri.reston.va.us> Message-ID: <385A3BCD.6E839E1A@ibm.net> "Andrew M. Kuchling" wrote: > > If most of the SIG members think it's useful, then it should go in. > I can't remember the reactions to qp_xml.py, whether favorable or > unfavorable. So what does everyone think? I think it should go in. Laurent Szyster From aa8vb@yahoo.com Fri Dec 17 18:27:50 1999 From: aa8vb@yahoo.com (Randall Hopper) Date: Fri, 17 Dec 1999 13:27:50 -0500 Subject: [XML-SIG] Re: Developer's Day In-Reply-To: <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> Message-ID: <19991217132750.A184035@vislab.epa.gov> Fredrik Lundh: |Paul Prescod wrote: |> > one could say the same (or even more so) for the GUI, |> > but that hasn't exactly helped... |> |> GUI is a special case because there are no standards there. | |oh, there are standards. Motif is one, for example. |didn't help. people are still building new stuff on the |fundamental components (xlib, win32 gdi, etc), just |like people will keep on building new XML stuff on |top of simple XML tokenizers (xmllib, expat, sgmlop). Fredrik, generally I agree with what you have to say. But this is way out there. So what's your point? 
Don't need standard APIs like DOM or SAX since you can theoretically do all your parsing with the Python specific xmllib? Motif helped and continues to. Consider industry (we're not just talking Linux world money-is-no-object development here). No, it wasn't a magic bullet -- neither are DOM or SAX for that matter. But standard APIs help encourage cross-platform and cross-language support and promote developer skills portability. |> xmllib is not, as far as I know, a legal XML processor and it certainly |> does not support "modern" advances like tree processing, validation, |> defaulted attributes, XML namespaces or SAX. So it isn't legal and it |> isn't modern. | |"isn't legal"? | |now that's an interesting case of XML snobbery... Admitted, "isn't standards compliant" might have been a more specific choice of words. But I think we knew what he meant. -- Randall Hopper aa8vb@yahoo.com From xml-sig@teleo.net Fri Dec 17 23:02:56 1999 From: xml-sig@teleo.net (Patrick Phalen) Date: Fri, 17 Dec 1999 15:02:56 -0800 Subject: [XML-SIG] Fwd: RE: REBOL and XML Message-ID: <99121715050503.02059@quadra.teleo.net> I'm forwarding this from XML-DEV, pertaining to recent REBOL discussions here. ---------- ## Forwarded Message ## ---------- Subject: RE: REBOL and XML Date: Fri, 17 Dec 1999 17:21:31 -0500 From: Gavin McKenzie Yes...I've been hooked on REBOL for a couple of months now. It truly is a different (in the good sense) and very powerful scripting language. That, plus the rebol mailing list and developers have been very responsive. However...and this is a biggie...it doesn't do XML like you would expect, or at least like I expected. The people at REBOL will tell you that REBOL has a built-in XML parser. True enough, it does have the capability to parse XML -- and it has some nifty features for composing HTML or XML, and a truly great type system where things like URIs and XML tags are first class datatypes built into the language. 
But, the result of parsing an XML file is that it is loaded into a tree structure in memory. No, it isn't loaded into a DOM, it is loaded into a REBOL 'block structure' which I've not found very easy to use. Blocks are just simple nested lists, and they're easy enough to deal with on their own...but, trying to work on an XML document that has been put into a block is very non-intuitive and tedious. I've asked the REBOL folks whether they are considering an add-on or another flavour of REBOL that exposes either a real SAX-style callback interface or a real DOM to the scripter. They have said that they are aware of the requirement, and do plan to build it...alas I expect they have *a lot* of other work on their plate. Gavin. > -----Original Message----- > From: Simon St.Laurent [mailto:simonstl@simonstl.com] > Sent: Friday, December 17, 1999 1:18 PM > To: XML-Dev Mailing list > Subject: REBOL and XML > > > Is anyone doing work using REBOL[1] with XML? The > 'everything is data' > approach of REBOL seems both like a good fit and perhaps a > conflict with > the XML approach. I'm just getting started with this, but if > anyone has > opinions or stories, I'd love to hear them. > > [1] - http://www.rebol.com (and yes, it's pronounced like rebel) > > Simon St.Laurent > XML: A Primer, 2nd Ed. > Building XML Applications > Inside XML DTDs: Scientific and Technical > Sharing Bandwidth / Cookies > http://www.simonstl.com > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and > on CD-ROM/ISBN 981-02-3594-1 > To unsubscribe, mailto:majordomo@ic.ac.uk the following message; > unsubscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the > following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
------------------------------------------------------- From ruebe@aachen.heimat.de Sat Dec 18 15:22:11 1999 From: ruebe@aachen.heimat.de (Christian Scholz) Date: Sat, 18 Dec 1999 16:22:11 +0100 (CET) Subject: [XML-SIG] Escaping text in DOM Message-ID: <19991218152211.0B5DD610DB@aachen.heimat.de> Hi! I am just working on my dav implementation and thus using the DOM implementation for parsing and creating XML request/responses. I now have a problem with german umlauts. If these are included in a test string the actual DOM implementation does not escape it and when I escape it by hand (&uuml;) I get an escaped & (getting &amp;uuml;). I looked at the implementation and found the escape method in xml.utils which expects an optional list of entities to convert as parameter. Unfortunately escape() is used without this parameter in core.py of the DOM implementation. The question is now what to do about it and how to change it to make it work. One solution would be maybe to use htmlentitydefs.py of the standard distribution directly in xml.utils.escape() Or will that create any problems? regards, Christian From ruebe@aachen.heimat.de Sat Dec 18 15:34:00 1999 From: ruebe@aachen.heimat.de (Christian Scholz) Date: Sat, 18 Dec 1999 16:34:00 +0100 (CET) Subject: [XML-SIG] 4DOM link? Message-ID: <19991218153400.D1841610DB@aachen.heimat.de> Hi! I just clicked on the link to the 4DOM page on the XML SIG webpage It said Not Found The requested URL /opentech/projects/4DOM/ was not found on this server. Can somebody fix this? And maybe send me the correct link?
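On the umlaut question raised in the "Escaping text in DOM" post above, a minimal sketch of one way to avoid the double escaping, written against present-day Python rather than the 1999 xml.utils code. The idea is to escape the markup characters exactly once and then emit any non-ASCII character as a numeric character reference, which needs no entity declaration (HTML names such as uuml are not predefined in XML). The helper name to_ascii_xml_text is an assumption made for this example.

```python
from xml.sax.saxutils import escape


def to_ascii_xml_text(text):
    """Return text that is safe as XML character data in a pure-ASCII document."""
    # Step 1: escape &, < and > once, before any references are generated,
    # so the & of a later character reference is never escaped again.
    escaped = escape(text)
    # Step 2: replace every non-ASCII character with &#NNN; via the
    # built-in "xmlcharrefreplace" error handler.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

The alternative is to leave the characters alone and declare an encoding that can carry them in the XML declaration, at the cost of every consumer having to honour that declaration.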
:) Thanks, Christian From gstein@lyra.org Sat Dec 18 19:44:43 1999 From: gstein@lyra.org (Greg Stein) Date: Sat, 18 Dec 1999 11:44:43 -0800 (PST) Subject: [XML-SIG] Escaping text in DOM In-Reply-To: <19991218152211.0B5DD610DB@aachen.heimat.de> Message-ID: On Sat, 18 Dec 1999, Christian Scholz wrote: >... > I now have a problem with german umlauts. If these are included in > a test string the actual DOM implementation does not escape it and > when I escape it by hand (&uuml;) I get an escaped & (getting > &amp;uuml;). I think you need to examine full UTF-8 encoding, rather than escaping specific characters. You can't escape only an umlaut and not escape chr(ord(c)+1). The other alternative would be to explicitly specify the encoding of the request/response. An umlaut shouldn't be in the request unless the encoding allowed for that. [ encoding as in: ] Have you seen my davlib.py, for the client side, yet? Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Sat Dec 18 18:21:15 1999 From: paul@prescod.net (Paul Prescod) Date: Sat, 18 Dec 1999 12:21:15 -0600 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> Message-ID: <385BD09B.A4224AEA@prescod.net> What's the point of standards if implementors violate them willy nilly? It is totally wrong to only support the parts of standards that you feel like and it is the sort of thing that makes me want to go and rip Bill Gates' head off (when, as is often the case) Microsoft is the perpetrator. I would be a total hypocrite if I held my friends and favorite languages to a lower standard.
"You told us to use Python for this million dollar system but halfway through its second day of operation someone fed us a well-formed XML document that crashed it." -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things never trust in: That's the vendor's final bill The promises your boss makes, and the customer's good will http://www.geezjan.org/humor/computers/threes.html From akuchlin@mems-exchange.org Sat Dec 18 22:26:09 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Sat, 18 Dec 1999 17:26:09 -0500 (EST) Subject: [XML-SIG] Developer's Day In-Reply-To: <385BD09B.A4224AEA@prescod.net> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> <385BD09B.A4224AEA@prescod.net> Message-ID: <14428.2561.504543.286406@amarok.cnri.reston.va.us> Paul Prescod writes: >"You told us to use Python for this million dollar system but halfway >through its second day of operation someone fed us a well-formed XML >document that crashed it." No FUD, please. Are there valid XML documents that xmllib chokes on? Have these bugs been reported? (Clearly xmllib doesn't implement SAX or some other _de facto_ standard, but that isn't a problem, as long as people are aware of that.) -- A.M. Kuchling http://starship.python.net/crew/amk/ A day without a pun is a day without sunshine; there is gloom for improvement. -- John S. Crosbie From akuchlin@mems-exchange.org Sun Dec 19 02:43:19 1999 From: akuchlin@mems-exchange.org (A.M. 
Kuchling) Date: Sat, 18 Dec 1999 21:43:19 -0500 Subject: [XML-SIG] Disposition of C extensions and packages Message-ID: <199912190243.VAA03844@207-172-57-180.s180.tnt2.ann.va.dialup.rcn.com> [Crossposted to xml-sig, distutils-sig] I'm working on getting the XML-SIG's CVS tree to install using the current version of the Distutils. Right now there are two C extensions, sgmlop.so and pyexpat.so, and they're installed under xml/parsers/ . It's hard to handle this case using the distutils code as it stands, because it expects to put extensions into a build/platlib/ directory, from which they'll be installed into site-packages. I can coerce setup.py into installing them into xml/parsers/, by subclassing the BuildExt command and setting build_dir myself:

    from distutils.command.build_ext import BuildExt
    class XMLBuildExt(BuildExt):
        def set_default_options (self):
            BuildExt.set_default_options( self )
            self.build_dir = 'lib/build/xml/parser'

    setup (name = "PyXML", cmdclass = {'build_ext':XMLBuildExt}, ...)

You also have to subclass the Install command and set build_dir there; I've trimmed that code. It's really clunky. Note that this scheme will break once there are C modules that need to be installed anywhere other than xml/parsers/, because build_dir is being hardwired without knowledge of what module is being compiled. Questions: 1) A general Python question about packaging style: Is mixing C extensions and Python modules in one package tree a bad idea? It makes the whole tree platform-dependent, which is probably annoying for sites maintaining Python installation for different architectures. 2) Another general question, this time o: how should this be handled? Should C extensions always be effectively top-level, and therefore go into site-packages? Should there be an xml package holding .py files, and an X package holding all the C extensions?
(X = 'plat_xml', 'xml_binary', or something like that) 3) XML-SIG question: should I go ahead and change it (since I first changed it to use xml.parsers.sgmlop)? 4) Distutils question: is this a problem with the Distutils code that needs fixing? I suspect not; if the tools make it difficult to do stupid things like mix .py and .so files, that's a good thing. -- A.M. Kuchling http://starship.python.net/crew/amk/ The Kappamaki, a whaling research ship, was currently researching the question: How many whales can you catch in one week? -- Terry Pratchett & Neil Gaiman, _Good Omens_ From tpassin@idsonline.com Sun Dec 19 03:43:11 1999 From: tpassin@idsonline.com (Thomas B. Passin) Date: Sat, 18 Dec 1999 22:43:11 -0500 Subject: [XML-SIG] Disposition of C extensions and packages References: <199912190243.VAA03844@207-172-57-180.s180.tnt2.ann.va.dialup.rcn.com> Message-ID: <000701bf49d3$2cae22c0$9f2a08d1@tomshp> A.M. Kuchling wrote: > [Crossposted to xml-sig, distutils-sig] > > I'm working on getting the XML-SIG's CVS tree to install using the > current version of the Distutils. Right now there are two C > extensions, sgmlop.so and pyexpat.so, and they're installed under > xml/parsers/ . It's hard to handle this case using the distutils code > as it stands, because it expects to put extensions into a > build/platlib/ directory, from which they'll be installed into > site-packages. > > I can coerce setup.py into installing them into xml/parsers/, by > subclassing the BuildExt command and setting build_dir myself: > > from distutils.command.build_ext import BuildExt > class XMLBuildExt(BuildExt): > def set_default_options (self): > BuildExt.set_default_options( self ) > self.build_dir = 'lib/build/xml/parser' > > setup (name = "PyXML", cmdclass = {'build_ext':XMLBuildExt}, ...) > > You also have to subclass the Install command and set build_dir > there; I've trimmed that code. 
It's really clunky.\ > > Note that this scheme will break once there are C modules that need to > be installed anywhere other than xml/parsers/, because build_dir is > being hardwired without knowledge of what module is being compiled. > > Questions: > > 1) A general Python question about packaging style: Is mixing > C extensions and Python modules in one package tree a bad > idea? It makes the whole tree platform-dependent, which is > probably annoying for sites maintaining Python installation > for different architectures. > > 2) Another general question, this time o: how should this be > handled? Should C extensions always be effectively > top-level, and therefore go into site-packages? Should > there be an xml package holding .py files, and an X package > holding all the C extensions? (X = 'plat_xml', > 'xml_binary', or something like that) > Don't forget Windows! Windows users need a working package with binaries that installs easily. In Windows there is a common directory for DLLs, and no path changes are needed if the binaries go there. for unix/linux, why not put binaries wherever the python .so files goes (I'm assuming there is one, but I seem to remember seeing one on a linux installation)? How are others planning to handle this (other SIGs and future Python releases)? It's going to keep coming up, why not have a common solution? Of course, this distribution might need to come out before there is an agreement...., I do agree that the binaries shouldn't be mixed into the .py files in the tree. > 3) XML-SIG question: should I go ahead and change it (since I > first changed it to use xml.parsers.sgmlop)? > > 4) Distutils question: is this a problem with the Distutils > code that needs fixing? I suspect not; if the tools make > it difficult to do stupid things like mix .py and .so > files, that's a good thing. > > -- Let's thank A. M. for doing all this work! 
Tom Passin From paul@prescod.net Sun Dec 19 09:20:03 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 03:20:03 -0600 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> <385BD09B.A4224AEA@prescod.net> <14428.2561.504543.286406@amarok.cnri.reston.va.us> Message-ID: <385CA343.9AEDDA58@prescod.net> "Andrew M. Kuchling" wrote: > > Paul Prescod writes: > >"You told us to use Python for this million dollar system but halfway > >through its second day of operation someone fed us a well-formed XML > >document that crashed it." > > No FUD, please. Are there valid XML documents that xmllib chokes on? Sure. Lots. ]> &foo; Support for this construct is not optional. And there are other, similar constructs that xmllib does not support. Another (well known, not xmllib specific) issue is Unicode. > Have these bugs been reported? I didn't see it as a bug. xmllib wasn't designed to be a full-fledged XML parser. It ignores the whole DOCTYPE which makes it impossible to conform to the XML spec. Had I pointed out that obvious fact through a series of bug reports it would have been interpreted (even by me) as the disingenuous rantings of an XML elitist. I always saw xmllib as a stop-gap until we finished real XML parsers like xmlproc and pyexpat. xmllib's virtue was that it came out incredibly early in the XML standard's development process. Python was probably the first languge to support XML in some form in its standard library. Now I want to push on and keep improving that legacy. Let's get at least Unicode, a validating XML parser, SAX and DOM in there. 
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html
From sean@digitome.com Sun Dec 19 11:57:40 1999 From: sean@digitome.com (Sean Mc Grath) Date: Sun, 19 Dec 1999 11:57:40 +0000 Subject: [XML-SIG] Developer's Day In-Reply-To: <385BD09B.A4224AEA@prescod.net> References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> Message-ID: <3.0.6.32.19991219115740.009c6640@gpo.iol.ie> At 12:21 PM 12/18/99 -0600, Paul Prescod wrote: >What's the point of standards if implementors violate them willy nilly? > >It is totally wrong to only support the parts of standards that you feel >like and it is the sort of thing that makes me want to go and rip Bill >Gates' head off (when, as is often the case) Microsoft is the >perpetrator. I would be a total hypocrite if I held my friends and >favorite languages to a lower standard. > Pauls position is completely legitimate but so to is Fredriks. The problem, as I have said many times on xml-dev, is that XML on the face of it, is darned simple. You have start-tags, attributes, end-tags and character data. We have all seen "XML applications" and "XML parsers" which handle this gang-of-four concepts. There is a large audience out there who thinks that this is what XML *is*. Now we can peer over the parapet and shout "your parser smells of elderberries" or "I wave my mixed content at your ankles", as long as we like but the simple gang-of-four base apps will not go way.
This dichotomy plays into the hands of those who would embrace and extend XML. XML is at risk of becoming not so much a "standard" but a state of mind. Interoperability between XML documents and XML applications will be the casualty. I believe we should acknowledge that variations on XML exist in the real world and step in to avoid chaos developing. On xml-dev before XML'99 I suggested the idea of an XML "features manifest". A structured document that declares what features of XML 1.0 parser X or app. Y supports/uses. I received about 6 e-mails saying it was a good idea. Hardly an avalanche! I would like to propose that the XML-SIG takes on board the task of producing an "XML features manifest (XFM)" and then producing XFM declarations for all the applications that make up the XML-SIG distribution. We can then make this part of the distribution so that all the bits are candidly describing what they can and cannot do. Once it is in shape, I suggest we give this to the XML world at large - especially the NIST folks who are grappling with the XML conformance issues. Thoughts? Sean, From sabren@manifestation.com Sun Dec 19 16:02:01 1999 From: sabren@manifestation.com (Michal Wallace (sabren)) Date: Sun, 19 Dec 1999 11:02:01 -0500 (EST) Subject: [XML-SIG] example code? Message-ID: Hello, I've got a total newbie question here: I've been writing a simple little template system for web scripts. The syntax for the files is XML-based, and I've been using xmllib... I've read the docs, but I've yet to find any decent example code, and right now almost all of my processing is being done through overriding the unknown_starttag() and unknown_endtag() functions. What should I be doing instead? The docs say to override the "elements" dictionary, but:

    def elements["sometag"][0]():
        pass

gives an error, and:

    def start_sometag(blahblah):
        pass

    elements["sometag"][0] = start_sometag

just seems like the wrong thing to be doing.. What's the "right" way to do this?
Does anyone have an example program I could borrow? Thanks, - Michal ------------------------------------------------------------------------- http://www.manifestation.com/ http://www.linkwatcher.com/metalog/ ------------------------------------------------------------------------- From bsb@winnegan.de Sun Dec 19 17:17:20 1999 From: bsb@winnegan.de (Siggy Brentrup) Date: 19 Dec 1999 18:17:20 +0100 Subject: [XML-SIG] example code? In-Reply-To: "Michal Wallace's message of "Sun, 19 Dec 1999 11:02:01 -0500 (EST)" References: Message-ID: <874sdeg89r.fsf@baal.winnegan.de> "Michal Wallace (sabren)" writes: [...] > The docs say to override the "elements" dictionary, but: > > def elements["sometag"][0](): > pass > > gives an error, and: > > def start_sometag(blahblah): > pass > > elements["sometag"][0] = start_sometag > > just seems like the wrong thing to be doing.. What's the "right" > way to do this? Does anyone have an example program I could borrow? I'm an XML newbie myself, so there's NO WARRANTY whatsoever. As from what I read from the xmllib docs you have to derive a class along the following lines: class MY_XML_PARSER(xmllib.XMLParser): attributes = { 'SOME_TAG' : { 'ATTR' : 'DEFAULT_VALUE', 'NODFLT' : None, }, } elements = { # Docs say 'function' try with bound method first # if you get a wrong argument count error, drop # "self.". 
If the latter succeeds, report documentation # bug after crosschecking newest version of docs 'SOME_TAG' : (self.start_SOME_TAG, self.end_SOME_TAG), # repeat for other tags } def start_SOME_TAG(self, attr_dict): """document start_SOME_TAG here""" # do sth usefull pass # just to be syntactically correct :) def end_SOME_TAG(self): """document end_SOME_TAG here""" # do sth usefull pass # just to be syntactically correct :) Note: this is completely untested, but hopefully isn't completely wrong :) not-quite-Cato-ly-yoUr's Siggy -- Siggy Brentrup - bsb@winnegan.de - http://www.winnegan.de/ ****** ceterum censeo javascriptum esse restrictam ******* From sabren@manifestation.com Sun Dec 19 17:40:54 1999 From: sabren@manifestation.com (Michal Wallace (sabren)) Date: Sun, 19 Dec 1999 12:40:54 -0500 (EST) Subject: [XML-SIG] example code? In-Reply-To: <874sdeg89r.fsf@baal.winnegan.de> Message-ID: On 19 Dec 1999, Siggy Brentrup wrote: > As from what I read from the xmllib docs you have to derive a class > along the following lines: Thanks, Siggy. Yup, that works. It seems kind of odd to me that your code for handling one tag is spread out over four different places.. But I guess that's just me. Thanks. 
:) Cheers, - Michal ------------------------------------------------------------------------- http://www.manifestation.com/ http://www.linkwatcher.com/metalog/ ------------------------------------------------------------------------- From fredrik@pythonware.com Sun Dec 19 17:50:32 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Sun, 19 Dec 1999 18:50:32 +0100 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us><3856A77C.3A4D9F00@prescod.net><14423.49044.143333.790752@amarok.cnri.reston.va.us><3857CEB0.C29C5F24@prescod.net><00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com><3858E9CA.FCB3DD9B@prescod.net><006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> <3.0.6.32.19991219115740.009c6640@gpo.iol.ie> Message-ID: <012101bf4a49$8c369a80$f29b12c2@secret.pythonware.com> Sean Mc Grath wrote: > At 12:21 PM 12/18/99 -0600, Paul Prescod wrote: > >What's the point of standards if implementors violate them willy nilly? professional software development has always been (and will always be) about making the right tradeoffs. some examples: the "xmllib" tradeoff is "if you have python, it's there. cannot handle everything, so it's best to use in cases where you know the source". the "sgmlop" tradeoff is "like xmllib, but much faster." the "SXP" tradeoff (this is our upcoming sgmlop replacement) is "like sgmlop, but usually faster, fully supports utf-8 and unicode, and is written in pure python 1.6 (!)" I don't use the xml-sig distribution, but several people have described it as "huge, bloated, slow, but mostly compliant". good enough for some, in other words. but hardly for anyone. "You told us to use Python for this million dollar system but halfways through its second day of operation, we realized that the production XML files were large enough to bring the server back- bone to its knees. We now have several gigabytes sitting in the input queue, and no way to catch up. The system simply isn't fast enough." 
> I would like to propose that the XML-SIG > takes on board the task of producing an > "XML features manifest (XFM)" and then > producing XFM declarations for all the > applications that make up the XML-SIG > distribution. exactly what's needed to make the right trade- off... after all, most of us are likely to be pro- gramming professionals *and* XML amateurs at the same time. if you need a certain kind of programmer to be able to successfully use XML in a project, the technology is DOA. From paul@prescod.net Sun Dec 19 18:00:04 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 12:00:04 -0600 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us><3856A77C.3A4D9F00@prescod.net><14423.49044.143333.790752@amarok.cnri.reston.va.us><3857CEB0.C29C5F24@prescod.net><00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com><3858E9CA.FCB3DD9B@prescod.net><006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> <3.0.6.32.19991219115740.009c6640@gpo.iol.ie> <012101bf4a49$8c369a80$f29b12c2@secret.pythonware.com> Message-ID: <385D1D24.29400607@prescod.net> Fredrik Lundh wrote: > > professional software development has always been > (and will always be) about making the right tradeoffs. James Clark's incredibly fast parsers for both XML and (the much, much, harder) SGML written in both C and Java show that there is no need to trade-off anything. > "You told us to use Python for this million dollar > system but halfways through its second day of > operation, we realized that the production XML > files were large enough to bring the server back- > bone to its knees. We now have several gigabytes > sitting in the input queue, and no way to catch > up. The system simply isn't fast enough." Expat can chew through 17 megabytes in 7 seconds on my laptop working under the crippling presence of Windows NT. The only trick is getting the Python binding fast enough. 
I'm not clear on why, of all the scripting language communities, we Python people are the only ones with an antipathy towards expat. I mean rather than debate about the various tradeoffs we could get standards conformance, performance and reduce our maintenance burden by sharing maintenance with the other users of expat: * Mozilla * Perl * TCL * Javascript Anyhow, let me ask whether in the *standard library* it is more important to support the XML specification properly or to be able to handle the gigabyte documents that most people are unlikely to ever encounter. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sun Dec 19 18:25:14 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 10:25:14 -0800 (PST) Subject: [XML-SIG] expat (was: Developer's Day) In-Reply-To: <385D1D24.29400607@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: >... > Expat can chew through 17 megabytes in 7 seconds on my laptop working > under the crippling presence of Windows NT. The only trick is getting > the Python binding fast enough. I'm not clear on why, of all the > scripting language communities, we Python people are the only ones with > an antipathy towards expat. I mean rather than debate about the various Not me! > tradeoffs we could get standards conformance, performance and reduce our > maintenance burden by sharing maintenance with the other users of expat: > > * Mozilla > * Perl > * TCL > * Javascript * Apache Apache 1.3.9 and later *include* Expat because I checked it in there myself! I mentioned this somewhere recently: we should absolutely include pyexpat into 1.6, regardless of what other XML bits we might include. I do not think we include Expat itself, tho(!). 
Cheers, -g -- Greg Stein, http://www.lyra.org/ From bsb@winnegan.de Sun Dec 19 18:31:59 1999 From: bsb@winnegan.de (Siggy Brentrup) Date: 19 Dec 1999 19:31:59 +0100 Subject: [XML-SIG] example code? In-Reply-To: "Michal Wallace's message of "Sun, 19 Dec 1999 12:40:54 -0500 (EST)" References: Message-ID: <87u2le7peo.fsf@baal.winnegan.de> "Michal Wallace (sabren)" writes: > On 19 Dec 1999, Siggy Brentrup wrote: > > > As from what I read from the xmllib docs you have to derive a class > > along the following lines: > > Thanks, Siggy. > > Yup, that works. Thanks for testing :) > It seems kind of odd to me that your code for handling > one tag is spread out over four different places.. But I guess that's > just me. Thanks. :) That's quite easy with Python, class XMLTag: """abstract base class for XML tags""" def __init__(self, name, parser): parser.attributes[name] = self.attributes parser.elements[name] = self.handle_start, self.handle_end class SOME_TAG(XMLTag): attributes = { 'SOME_ATTR' : 'default value', # ... } def __init__(self, parser, *OTHER_ARGS): XMLTag.__init__(self, self.__class__.__name__, parser) def handle_start(self, attr_dict): # do sth usefull pass def handle_end(self): # do sth usefull with all that stuff pass class MY_XML_PARSER(xmllib.XMLParser): attributes = {} elements = {} def __init__(self, *OTHER_ARGS): apply(SOME_TAG, (self,)+OTHER_ARGS) As before, NO WARRANTY whatsoever. not-quite-Cato-ly-yoUr's Siggy -- Siggy Brentrup - bsb@winnegan.de - http://www.winnegan.de/ ****** ceterum censeo javascriptum esse restrictam ******* From tpassin@idsonline.com Sun Dec 19 18:38:49 1999 From: tpassin@idsonline.com (Thomas B. Passin) Date: Sun, 19 Dec 1999 13:38:49 -0500 Subject: [XML-SIG] example code? References: Message-ID: <001901bf4a50$4b6347e0$172a08d1@tomshp> Michal Wallace wrote: > > Hello, > > I've got a total newbie question here: > > I've been writing a simple little template system for web scripts. 
> The syntax for the files is XML-based, and I've been using xmllib... > I've read the docs, but I've yet to find any decent example code, and > right now almost all of my processing is being done through overriding > the unknown_starttag() and unknown_endtag() functions. What should > I be doing instead? > > The docs say to override the "elements" dictionary, but: > > def elements["sometag"][0](): > pass > > gives an error, and: > > def start_sometag(blahblah): > pass > > elements["sometag"][0] = start_sometag > > just seems like the wrong thing to be doing.. What's the "right" > way to do this? Does anyone have an example program I could borrow? > > Thanks, > > - Michal If you know all the element names you will use, do something easy like this, which handles an element named "specialTag":

    import xmllib
    """A simple class to demonstrate how to handle your own elements"""
    class bareBones(xmllib.XMLParser):
        def __init__(self):
            xmllib.XMLParser.__init__(self)

        # Your element is called "specialTag"
        def start_specialTag(self, attrs):
            print "Start specialTag"
            print "element name:", attrs
            self.handle_data=self.do_data #invoke your content handler

        def end_specialTag(self):
            print "End specialTag"
            self.handle_data=self.null_data #reset content handler

        # A minimal data handler
        def do_data(self,data):
            print "===============\n",data,"\n==============="

        def null_data(self,data):pass

    doc=""" This element won't be reported This one will """

    if __name__=="__main__":
        parser=bareBones()
        parser.feed(doc)
        parser.close()

Tom Passin From tpassin@idsonline.com Sun Dec 19 19:37:39 1999 From: tpassin@idsonline.com (Thomas B. Passin) Date: Sun, 19 Dec 1999 14:37:39 -0500 Subject: [XML-SIG] example code?
References: <87u2le7peo.fsf@baal.winnegan.de> Message-ID: <001701bf4a58$8380d860$96fbb1cd@tomshp> Tom Passin posted this code sample: If you know all the element names you will use, do something easy like this, which handles an element named "specialTag": import xmllib """A simple class to demonstate how to handle your own elements""" class bareBones(xmllib.XMLParser): def __init__(self): xmllib.XMLParser.__init__(self) # Your element is called "specialTag" def start_specialTag(self, attrs): print "Start specialTag" print "element name:", attrs self.handle_data=self.do_data #invoke your content handler def end_specialTag(self): print "End specialTag" self.handle_data=self.null_data #reset content handler # A minimal data handler def do_data(self,data): print "===============\n",data,"\n===============" def null_data(self,data):pass doc=""" This element won't be reported This one will """ if __name__=="__main__": parser=bareBones() parser.feed(doc) parser.close() ------------------------------------------------------ A minor error - I forgot to change the string printed in start_specialTag() was: print "element name:", attrs should be: print "attributes", attrs Tom Passin From paul@prescod.net Sun Dec 19 21:03:07 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 15:03:07 -0600 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us> <3856A77C.3A4D9F00@prescod.net> <14423.49044.143333.790752@amarok.cnri.reston.va.us> <3857CEB0.C29C5F24@prescod.net> <00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com> <3858E9CA.FCB3DD9B@prescod.net> <006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> <3.0.6.32.19991219115740.009c6640@gpo.iol.ie> Message-ID: <385D480B.BAB5D74A@prescod.net> Sean Mc Grath wrote: > > ... > I believe we should acknowledge that > variations on XML exist in the real world > and step in to avoid chaos developing. > On xml-dev before XML'99 I suggested the > idea of an XML "features manifest". 
A > structured document that declares what > features of XML 1.0 parser X or app. Y > supports/uses. > > I received about 6 e-mails saying it was > a good idea. Hardly an avalanche! I think that your idea is good to a point. The XML specification has certain "optional features" and of course there are optional Unicode encodings. It would be good to have documentation of those. But XML itself is not too hard to parse in its entirety. There is no reason to wave the white flag. If you are programming in C++, Java, Javascript or TCL it takes no more effort to parse all well-formed documents than it takes to parse a simpler subset. The urge to simplify XML is aesthetic, not practical. Practically speaking a mess of subsets -- even well-documented subsets -- is still a mess. As an old SGML'er I feel like "Been there. Done that. Let's not do it again." In the SGML days there was an excuse because performance was an issue but modern XML processors work at IO speeds. There is no excuse for non-conformance. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 21:03:45 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 15:03:45 -0600 Subject: [XML-SIG] Developer's Day References: <199912132354.SAA10101@amarok.cnri.reston.va.us><3856A77C.3A4D9F00@prescod.net><14423.49044.143333.790752@amarok.cnri.reston.va.us><3857CEB0.C29C5F24@prescod.net><00fc01bf47b7$0ecbae80$f29b12c2@secret.pythonware.com><3858E9CA.FCB3DD9B@prescod.net><006101bf4865$7a73f620$f29b12c2@secret.pythonware.com> <3.0.6.32.19991219115740.009c6640@gpo.iol.ie> <012101bf4a49$8c369a80$f29b12c2@secret.pythonware.com> Message-ID: <385D4831.740E08F1@prescod.net> Fredrik Lundh wrote: > >... 
> > "You told us to use Python for this million dollar > system but halfways through its second day of > operation, we realized that the production XML > files were large enough to bring the server back- > bone to its knees. We now have several gigabytes > sitting in the input queue, and no way to catch > up. The system simply isn't fast enough." To be accurate, on my computer, xmlwf, sgmlop and "copy" are all about the same speed. Obviously the limiting factor is my hard disk and has nothing to do with overhead of XML processors. Therefore parsing performance should be the least of our concerns. The real issue is the performance of the binding. PyExpat seems to expect the whole document at once (or at least that's what the SAX driver does): if not self.parser.Parse(fileobj.read(),1): self.__report_error() Obviously this is going to be slow for large documents. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From paul@prescod.net Sun Dec 19 21:34:37 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 15:34:37 -0600 Subject: [XML-SIG] expat (was: Developer's Day) References: Message-ID: <385D4F6D.A8E023EF@prescod.net> Greg Stein wrote: > > I mentioned this somewhere recently: we should absolutely include pyexpat > into 1.6, regardless of what other XML bits we might include. I do not > think we include Expat itself, tho(!). I think we should. IIRC, when compiled to be of minimal size, expat is only about 50K. Unfortunately I can't find the email where James Clark original described this feature so I can't recall the performance hit in doing so. I'm willing to bet that it doesn't affect ASCII parsing speed at all. This machine doesn't have a compiler on it right now so I can't benchmark. 
:( If anyone wants to try it out, you can read about the minimization option here: http://www.jclark.com/xml/expatfaq.html Anyhow, that option would make expat much smaller than (for example) bsddb.pyd. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sun Dec 19 21:59:00 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 13:59:00 -0800 (PST) Subject: [XML-SIG] expat (was: Developer's Day) In-Reply-To: <385D4F6D.A8E023EF@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > I mentioned this somewhere recently: we should absolutely include pyexpat > > into 1.6, regardless of what other XML bits we might include. I do not > > think we include Expat itself, tho(!). > > I think we should. > > IIRC, when compiled to be of minimal size, expat is only about 50K. Euh... separate issue, right? We aren't bundling Expat into Python, just pyexpat... Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Sun Dec 19 22:30:14 1999 From: paul@prescod.net (Paul Prescod) Date: Sun, 19 Dec 1999 16:30:14 -0600 Subject: [XML-SIG] expat (was: Developer's Day) References: Message-ID: <385D5C76.E5640E9@prescod.net> Greg Stein wrote: > > > IIRC, when compiled to be of minimal size, expat is only about 50K. > > Euh... separate issue, right? We aren't bundling Expat into Python, just > pyexpat... I'm suggesting we should bundle expat, as Apache, Mozilla, Perl etc. do. I see no good reason not to. 
-- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Three things see no end: A loop with exit code done wrong A semaphore untested, and the change that comes along http://www.geezjan.org/humor/computers/threes.html From gstein@lyra.org Sun Dec 19 22:55:40 1999 From: gstein@lyra.org (Greg Stein) Date: Sun, 19 Dec 1999 14:55:40 -0800 (PST) Subject: [XML-SIG] expat (was: Developer's Day) In-Reply-To: <385D5C76.E5640E9@prescod.net> Message-ID: On Sun, 19 Dec 1999, Paul Prescod wrote: > Greg Stein wrote: > > > IIRC, when compiled to be of minimal size, expat is only about 50K. > > > > Euh... separate issue, right? We aren't bundling Expat into Python, just > > pyexpat... > > I'm suggesting we should bundle expat, as Apache, Mozilla, Perl etc. do. > I see no good reason not to. It is under the MPL. Dunno if that is an issue, but there ya go. Speaking from my Linux-centric world, I think Expat is becoming common enough that I'd rather avoid bundling it. Also, Guido just recently mentioned a reluctance to bundle something like zlib (partly due to possible version problems between Python's version and the installed version); I would guess that he would have the same reluctance towards Expat as he has towards zlib. Cheers, -g -- Greg Stein, http://www.lyra.org/ From mal@lemburg.com Sun Dec 19 23:06:58 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 20 Dec 1999 00:06:58 +0100 Subject: [XML-SIG] Re: [Distutils] Disposition of C extensions and packages References: <199912190243.VAA03844@207-172-57-180.s180.tnt2.ann.va.dialup.rcn.com> Message-ID: <385D6512.23665CAC@lemburg.com> "A.M. Kuchling" wrote: > > Questions: > > 1) A general Python question about packaging style: Is mixing > C extensions and Python modules in one package tree a bad > idea? It makes the whole tree platform-dependent, which is > probably annoying for sites maintaining Python installation > for different architectures. 
I have been using that setup for two years now with all of my mx extensions and so far it has been working great. If you maintain packages with C extensions for several platforms, you can simply install the packages under the platform subdirs in /usr/local/lib/python1.5 -- one copy for every platform. Disk space is no argument anymore nowadays. > 2) Another general question, this time o: how should this be > handled? Should C extensions always be effectively > top-level, and therefore go into site-packages? Should > there be an xml package holding .py files, and an X package > holding all the C extensions? (X = 'plat_xml', > 'xml_binary', or something like that) Just leave them in the package. I use a separate subpackage for the C extension which the packages modules then import. This makes mixed Python + C extensions and prototyping of C APIs in Python very simple and straight forward. > 4) Distutils question: is this a problem with the Distutils > code that needs fixing? I suspect not; if the tools make > it difficult to do stupid things like mix .py and .so > files, that's a good thing. I wouldn't like this; for a very simple reason: if someone wants to provide a Python rewrite of a C module which works as dropin replacement, the only way to handle this is by having a .so file and a .py file with the same name in the same directory. mxDateTime uses such a setup, for example. Note that .so files are found before .py files, thus if someone does have the .so file, Python will use the C module and not the Python one. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 13 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From tpassin@idsonline.com Sun Dec 19 23:57:32 1999 From: tpassin@idsonline.com (Thomas B. 
Passin) Date: Sun, 19 Dec 1999 18:57:32 -0500 Subject: [XML-SIG] expat (was: Developer's Day) References: Message-ID: <001001bf4a7c$d1a1b860$0101a8c0@tomshp> Greg Stein wrote: > On Sun, 19 Dec 1999, Paul Prescod wrote: > > Greg Stein wrote: > > > > IIRC, when compiled to be of minimal size, expat is only about 50K. > > > > > > Euh... separate issue, right? We aren't bundling Expat into Python, just > > > pyexpat... > > > > I'm suggesting we should bundle expat, as Apache, Mozilla, Perl etc. do. > > I see no good reason not to. > > It is under the MPL. Dunno if that is an issue, but there ya go. > > Speaking from my Linux-centric world, I think Expat is becoming common > enough that I'd rather avoid bundling it. Also, Guido just recently > mentioned a reluctance to bundle something like zlib (partly due to > possible version problems between Python's version and the installed > version); I would guess that he would have the same reluctance towards > Expat as he has towards zlib. > This is actually a reason to bundle it with the Python distribution - you know you have a version that works. I've not been able to get sgmlop working because of C API version differences. This is an example of the non-benefits of non-bundling. Tom Passin From uche.ogbuji@fourthought.com Mon Dec 20 04:16:49 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Sun, 19 Dec 1999 21:16:49 -0700 Subject: [XML-SIG] 4DOM link? In-Reply-To: Your message of "Sat, 18 Dec 1999 16:34:00 +0100." <19991218153400.D1841610DB@aachen.heimat.de> Message-ID: <199912200416.VAA01140@localhost.localdomain> > I just clicked on the link to the 4DOM page on the XML SIG webpage > > It said > > Not Found > > The requested URL /opentech/projects/4DOM/ was not found on this server. > > > Can somebody fix this? And maybe send me the correct link? :) Hmm. That's a _very_ old link. The current link is http://FourThought.com/4Suite/4DOM Could someone with write-access to the xml-sig page correct this? 
Note that 4DOM, 4XSLT and 4XPath can all be got from the main 4Suite page http://FourThought.com/4Suite Note that I am working on packaging new releases of all 4Suite components, which should be available tonight or so. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Dec 20 07:52:26 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Mon, 20 Dec 1999 00:52:26 -0700 Subject: [XML-SIG] ANN: 4DOM 0.9.0 Message-ID: <385DE03A.474D01FD@fourthought.com> FourThought LLC (http://FourThought.com) announces the release of 4DOM 0.9.0 ----------------------- An XML/HTML Python library using the Document Object Model interface 4DOM is a Python library for XML and HTML processing and manipulation using the W3C's Document Object Model as its interface. 4DOM implements DOM Core level 2, HTML level 2 and Level 2 Document Traversal. 4DOM should work on all platforms supported by Python. If you have any problems with a particular platform, please e-mail the authors. 4DOM is designed to allow developers to rapidly design applications that read, write or manipulate HTML and XML. News ---- Changes: - Major re-write to match the general consensus DOM binding for Python. Code formerly in the form "node.getChildNodes()" is now to be used in the form "node._get_childNodes()" or simply "node.childNodes". Similarly "text.setData("spam")" becomes "text._set_data("spam")" or text.data = "spam" - Update to full Level 2 support in core and HTML, including namespace-support. - Many bug-fixes More info and Obtaining 4DOM ---------------------------- Please see http://FourThought.com/4Suite/4DOM Or you can download 4DOM from ftp://FourThought.com/pub/4Suite/4DOM 4DOM is distributed under a license similar to that of Python.
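The attribute-style binding described in the change list above can be illustrated with a minimal sketch. It assumes the standard library's xml.dom.minidom as a stand-in, since that module follows the same style of Python DOM binding; it is not 4DOM itself, and the sample document and variable names are made up for illustration.

```python
from xml.dom.minidom import parseString

# A tiny made-up document, used only for illustration.
doc = parseString("<greeting>hello<b>world</b></greeting>")
root = doc.documentElement

# Attribute-style access, where older code would call getChildNodes().
children = root.childNodes
names = [child.nodeName for child in children]

# Attribute assignment, where older code would call setData("spam").
text = root.firstChild
text.data = "spam"

result = root.toxml()
print(names)
print(result)
```

The point of the attribute form is that reading and writing node properties looks like ordinary Python attribute access, while the accessor-method form stays closer to the CORBA-style mapping.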
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Dec 20 08:12:16 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Mon, 20 Dec 1999 01:12:16 -0700 Subject: [XML-SIG] ANN: 4XSLT 0.8.0 and 4XPath 0.8.0 Message-ID: <385DE4E0.FF966D9D@fourthought.com> FourThought LLC (http://FourThought.com) announces the release of 4XSLT and 4XPath 0.8.0 ---------------------- A Python implementation of the W3C's XSLT language 4XSLT is an XML transformation processor based on the W3C's specification for the XSLT transform language. 4XPath implements the W3C XPath language for indicating and selecting XML document components. http://www.w3.org/TR/xslt 4XPath implements the full XPath recommendation except for the 'lang' core function. Currently, 4XSLT supports a sub-set of the XSLT recommendation including the following: Full expression support and attribute-value template expansion xsl:include xsl:import xsl:template xsl:apply-imports xsl:apply-templates xsl:copy xsl:call-template xsl:if xsl:for-each xsl:choose xsl:element xsl:when xsl:attribute xsl:otherwise xsl:text xsl:message xsl:value-of xsl:variable xsl:processing-instruction xsl:param xsl:comment xsl:with-param xsl:strip-space xsl:key xsl:preserve-space xsl:copy-of xsl:output and, of course, xsl:stylesheet, xsl:transform, literal elements and text Using the xml output method, 4XSLT produces the result tree by throwing events from the emerging SAX 2 standard to a handler, so it can be easily modified to supply results to any SAX 2 consumer. For the 'html' and 'text' output methods special SAX consumers produce HTML DOM nodes and plain text respectively.
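The idea of a "SAX consumer" receiving result events can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the xml.sax module from the Python standard library rather than 4XSLT's own handler interface (which is not shown in this announcement), the TextCollector class is hypothetical, and the events come from parsing a small made-up document instead of from an XSLT processor.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical consumer: records element names and character data."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.chunks = []

    def startElement(self, name, attrs):
        # Called once for every start tag in the event stream.
        self.elements.append(name)

    def characters(self, content):
        # Called for each run of character data.
        self.chunks.append(content)


collector = TextCollector()
# A parser stands in for the event source here; any producer of SAX
# events could drive the same handler.
xml.sax.parseString(b"<out><p>Merry</p><p>Xmas</p></out>", collector)
text = "".join(collector.chunks)
print(collector.elements, text)
```

Because the handler only sees events, the same class works whether the events describe XML, HTML or plain text output; only the producer changes.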
News ---- Changes in 0.8.0 ---------------- - 4XSLT implements xsl:output - Fixes to namespace handling - support literal element as entire style-sheet - update to latest 4DOM interface - many bug-fixes and more extensive testing More info and Obtaining 4XSLT ----------------------------- Please see http://FourThought.com/4Suite/4XSLT Or you can download 4XSLT from ftp://FourThought.com/pub/4Suite/4XSLT 4XSLT is distributed under a license similar to that of Python. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Dec 20 08:18:18 1999 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Mon, 20 Dec 1999 01:18:18 -0700 Subject: [XML-SIG] Re: 4DOM etc. Message-ID: <385DE64A.4DBDBD72@fourthought.com> Just to note that RPM Linux binaries for 4Suite-base, 4DOM, 4XSLT and 4XPath are available at ftp://ftp.fourthought.com/pub/mirrors/python4linux/redhat/ along with many other Python-related RPMs maintained by the python4linux group. We expect to complete Windows support for the 4Suite in January. Thank you. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Dec 20 08:58:28 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 20 Dec 1999 01:58:28 -0700 Subject: [XML-SIG] Merry Xmas Courtesy 4XSLT Message-ID: <199912200858.BAA04099@localhost.localdomain> This is a multipart MIME message. --==_Exmh_11745643960 Content-Type: text/plain; charset=us-ascii OK, it's somewhat silly, but I can't resist. Who says XSLT is only for juggling XML? The attachment works with 4XSLT 0.9.0 Gaudete!
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org --==_Exmh_11745643960 Content-Type: text/plain; name="xslt-xmas.py"; charset=us-ascii Content-Description: xslt-xmas.py Content-Disposition: attachment; filename="xslt-xmas.py" #Demonstrates some basic programming from Ft.Xslt.Processor import Processor import Ft.Dom.Ext from Ft.Dom.Ext.Reader import Sax2 sheet_str_2 = """ '****Merry Xmas!****' * """ xml_source = """15""" def test(): processor = Processor() sheet = Sax2.FromXml(sheet_str_2) xml_dom = Sax2.FromXml(xml_source) processor.appendStylesheetNode(sheet) result = processor.run(xml_dom, ignorePis=1) print print result if __name__ == '__main__': test() --==_Exmh_11745643960-- From akuchlin@mems-exchange.org Mon Dec 20 15:54:12 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Mon, 20 Dec 1999 10:54:12 -0500 (EST) Subject: [XML-SIG] XML 0.5.2 release Message-ID: <199912201554.KAA12883@amarok.cnri.reston.va.us> Incited by Greg Stein's recent rants about releasing code more frequently, I've cut a new release of the CVS tree, version 0.5.2. I won't be announcing this release widely, because it might be broken in some way. Bugs will be fixed in new releases issued; I'm going to try to make new releases fairly often. http://www.python.org/sigs/xml-sig/files/PyXML-0.5.2.tar.gz I still have to go over the CVS logs and pull out a list of new features. -- A.M. Kuchling http://starship.python.net/crew/amk/ We may be in the sewer, but there's absolutely no need for that kind of gutter profanity. -- Sebastian, in SEBASTIAN O #2 From akuchlin@mems-exchange.org Mon Dec 20 16:37:33 1999 From: akuchlin@mems-exchange.org (Andrew M. 
Kuchling) Date: Mon, 20 Dec 1999 11:37:33 -0500 (EST) Subject: [XML-SIG] Future plans Message-ID: <199912201637.LAA12992@amarok.cnri.reston.va.us> For most of 1999 I've been incompetent in fostering the XML-SIG's work; things have been left to drift while I was distracted by other things, and now it's time to get things moving again. The 0.5.2 release is part of this, and I'm going to try to issue new versions fairly often (no slower than monthly) from now on. Some things to do: * I propose dropping the wstrop and xmlarch code from the CVS tree: wstrop because Python 1.6 will have built-in Unicode support of some stripe, and xmlarch because architectural forms are fairly rarely used, and don't need to be in the core. * What about namespace support in SAX -- what's the status of SAX2? * The DOM needs more work. I've fixed the braindamaged __getattr__ that was terribly slow; building a DOM tree of "The Winter's Tale" now takes around 25 seconds, not 80. (4DOM takes around 12 seconds for the same job.) The accessor methods need to be renamed to match the Python CORBA mapping, and the code needs to be brought up to DOM Level 2, which will add namespace support. I'm still wondering if it's worth maintaining a second DOM implementation in parallel with 4DOM. 4DOM 0.9.0 already implements Core Level 2 according to Uche's recent announcement. I worry about 4DOM requiring you to do ReleaseNode() in order to break cycles, envisioning tricky bugs where you forget to release a node and end up leaking memory, or where you release too soon. Perhaps this is just paranoia; for most common applications, maybe you just create a DOM tree, rearrange it, write it out, and then release the whole tree -- no tricky business about keeping some nodes and releasing others. Any 4DOM users/developers want to comment on this? -- A.M.
Kuchling http://starship.python.net/crew/amk/ Posting an article with a malformed address so that mail bounces when people reply: Poster and/or their admin are sent back to kindergarten. -- Kibo, in the Happynet Manifesto From l.szyster@ibm.net Mon Dec 20 16:51:54 1999 From: l.szyster@ibm.net (Laurent Szyster) Date: Mon, 20 Dec 1999 17:51:54 +0100 Subject: [XML-SIG] Future plans References: <199912201637.LAA12992@amarok.cnri.reston.va.us> Message-ID: <385E5EAA.E9B046E4@ibm.net> "Andrew M. Kuchling" wrote: > > * The DOM needs more work. I've fixed the braindamaged __getattr__ > that was terribly slow; building a DOM tree of "The Winter's > Tale" now takes around 25 seconds, not 80. (4DOM takes around > 12 seconds for the same job.) On what kind of computer (CPU model, clock speed)? Laurent Szyster From akuchlin@mems-exchange.org Mon Dec 20 17:09:32 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Mon, 20 Dec 1999 12:09:32 -0500 (EST) Subject: [XML-SIG] Future plans In-Reply-To: <385E5EAA.E9B046E4@ibm.net> References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <385E5EAA.E9B046E4@ibm.net> Message-ID: <14430.25292.88714.618110@amarok.cnri.reston.va.us> Laurent Szyster writes: >"Andrew M. Kuchling" wrote: >> that was terribly slow; building a DOM tree of "The Winter's >> Tale" now takes around 25 seconds, not 80. (4DOM takes around >> 12 seconds for the same job.) >On what kind of computer (CPU model, clock speed)? My 266MHz Linux box. The code still needs more looking at; I'm not sure where the bottleneck is at the moment: the DOM, SAX, PyExpat, ...? -- A.M. Kuchling http://starship.python.net/crew/amk/ What is now proved was once only imagined. 
-- William Blake, "The Marriage of Heaven and Hell" From paul@prescod.net Mon Dec 20 17:42:23 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 20 Dec 1999 11:42:23 -0600 Subject: [XML-SIG] Future plans References: <199912201637.LAA12992@amarok.cnri.reston.va.us> Message-ID: <385E6A7F.35F71734@prescod.net> "Andrew M. Kuchling" wrote: > > ... > * I propose dropping the wstrop and xmlarch code from the CVS > tree: wstrop because Python 1.6 will have built-in Unicode > support of some strip, and xmlarch because architectual forms > are fairly rarely used, and don't need to be in the core. Fine with me. > * What about namespace support in SAX -- what's the status of SAX2? Under active discussion. Should be done in a matter of weeks or months. > * The DOM needs more work. I've fixed the braindamaged __getattr__ > that was terribly slow; building a DOM tree of "The Winter's > Tale" now takes around 25 seconds, not 80. (4DOM takes around > 12 seconds for the same job.) Is there any virtue in embedding C code in Python 1.6 to build a C DOM that can be used by qp_xml, 4DOM and PyDOM (if they all survive)? Obviously the object building would be much faster but the proxying might kill the performance gains. > I worry about 4DOM requiring you to do ReleaseNode() in order > to break cycles, envisioning tricky bugs where you forget to > release a node and end up leaking memory, or where you release > too soon. Isn't there a way to build a proxy system *on top of* 4DOM that calls ReleaseNode when appropriate? Then users can use the slow interface or reach down and get at the fast node objects if they are willing to take the responsibility to ReleaseNode() when they are done. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself The occasional act of disrespect for the American flag creates but a flickering insult to the values of democracy -- unless it provokes America into limiting the freedoms that are its hallmark. -- Paul Tash, executive editor of the St.
Petersburg Times From uche.ogbuji@fourthought.com Mon Dec 20 17:50:43 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 20 Dec 1999 10:50:43 -0700 Subject: [XML-SIG] 4DOM CORBA-free Message-ID: <199912201750.KAA06509@localhost.localdomain> I forgot to mention that 4DOM is now 100% CORBA-free. We shall be releasing add-ons for CORBA support, but they shall have no effect on those who don't need distributed DOM. This also means that there is no need for "make" and no problem with any Python-supported platform. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From l.szyster@ibm.net Mon Dec 20 17:48:21 1999 From: l.szyster@ibm.net (Laurent Szyster) Date: Mon, 20 Dec 1999 18:48:21 +0100 Subject: [XML-SIG] Future plans References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <385E5EAA.E9B046E4@ibm.net> <14430.25292.88714.618110@amarok.cnri.reston.va.us> Message-ID: <385E6BE5.F50A7634@ibm.net> "Andrew M. Kuchling" wrote: > > Laurent Szyster writes: > >"Andrew M. Kuchling" wrote: > >> that was terribly slow; building a DOM tree of "The Winter's > >> Tale" now takes around 25 seconds, not 80. (4DOM takes around > >> 12 seconds for the same job.) > >On what kind of computer (CPU model, clock speed)? > > My 266MHz Linux box. The code still needs more looking at; I'm not > sure where the bottleneck is at the moment: the DOM, SAX, PyExpat, ...? Most probably the DOM, surely not PyExpat.
On my 233MHz Linux box (HP NetServer E45), using PyExpat, a modified qp_xml.py core with additional layers for Namespace and XPath support, my figures are:

                               Time            % of DOM   % of 4DOM
------------------------------------------------------------------
PyExpat with no callback       0.37 secs         1.48%      3.08%
base qp_xml like parser        1.64 secs (0)     6.68%     13.68%
Enhanced pythonic DOM          1.97 secs (1)     7.88%     16.42%
XML Namespace support added    2.55 secs (2)    10.02%     21.25%
XPath capable DOM              4.64 secs (3)    18.56%     38.67%

(0) experience showed that it's node object instantiation that makes the biggest performance difference between this parser and a PyExpat with no callback.
(1) each element object instance __dict__ is modified so that you can access the nth occurrence of its child 'type' as element.type[i] or its attribute 'name' as element.name
(2) namespace lookup: maps element types to class instances based on the namespace info and applies functions for the namespace's attributes (obviously, win_tale.xml only tests the overhead of no namespace processing ;-)
(3) builds additional kjSets and kjGraph data structures for fast XPath operation on the DOM (note: performance degrades with number of elements and attributes in excess of around 5.000).

Laurent Szyster
From l.szyster@ibm.net Mon Dec 20 17:51:45 1999 From: l.szyster@ibm.net (Laurent Szyster) Date: Mon, 20 Dec 1999 18:51:45 +0100 Subject: [XML-SIG] Future plans References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <385E6A7F.35F71734@prescod.net> Message-ID: <385E6CB1.60ED6782@ibm.net> Paul Prescod wrote: > >"Andrew M. Kuchling" wrote: >> >> * The DOM needs more work. I've fixed the braindamaged __getattr__ >> that was terribly slow; building a DOM tree of "The Winter's >> Tale" now takes around 25 seconds, not 80. (4DOM takes around >> 12 seconds for the same job.) > > I s there any virtue in embedding C code in Python 1.6 to build a C > DOM that can be used by qp_xml, 4DOM and PyDOM (if they all survive).
I saw some kind of code for this in the latest sgmlop release. And it looked a lot more like qp_xml.py DOM structures than anything else. Laurent Szyster From uche.ogbuji@fourthought.com Mon Dec 20 18:09:53 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 20 Dec 1999 11:09:53 -0700 Subject: [XML-SIG] Future plans In-Reply-To: Your message of "Mon, 20 Dec 1999 11:37:33 EST." <199912201637.LAA12992@amarok.cnri.reston.va.us> Message-ID: <199912201809.LAA06565@localhost.localdomain> > * I propose dropping the wstrop and xmlarch code from the CVS > tree: wstrop because Python 1.6 will have built-in Unicode > support of some strip, and xmlarch because architectual forms > are fairly rarely used, and don't need to be in the core. Agreed, although we might want to wait until Python 1.6 is out before dropping the former. > * What about namespace support in SAX -- what's the status of SAX2? Lars published a SAX2 module that pretty much covers the ground of the current status. I've been cajoling the folks on XML-DEV to finish the SAX2 spec, and things are coming about slowly. 4DOM comes with a pretty complete SAX2 -> DOM reader, which is used by 4XSLT. > * The DOM needs more work. I've fixed the braindamaged __getattr__ > that was terribly slow; building a DOM tree of "The Winter's > Tale" now takes around 25 seconds, not 80. (4DOM takes around > 12 seconds for the same job.) The accessor methods > need to be renamed to match the Python CORBA mapping, and the > code needs to be brought up to DOM Level 2, which will add > namespace support. DOM Level 2 also updates Document factories so that most nodes can be created in a standard way. The way they use DocumentType for the purpose is a bit dodgy, IMHO, but it's there. DOM Level 2 also for practical reasons occupies an odd position between straight XML 1.0 behavior and XML with Namespaces behavior, but it does the job of adding namespace support. 
> I'm still wondering if it's worth maintaining a second DOM > implementation in parallel with 4DOM. 4DOM 0.9.0 already > implements Core Level 2 according to Uche's recent > announcement. I used to say that PyDOM and 4DOM addressed different users, but over time, changes to 4DOM have made this less clear. 4DOM is now far more pythonic and light-weight than when it started out. > I worry about 4DOM requiring you to do ReleaseNode() in order > to break cycles, envisioning tricky bugs where you forget to > release a node and end up leaking memory, or where you release > too soon. Perhaps this is just paranoia; for most common > applications, maybe you just create a DOM tree, rearrange it, > write it out, and then release the whole tree -- no tricky > business about keeping some nodes and releasing others. > Any 4DOM users/developers want to comment on this? Yes, the ReleaseNode could be the source of memory leaks. We experimented with a few automatic methods, and were very unsatisfied with all of them. We haven't ever had any difficulty hunting down memory leaks when they occurred because of a missing "ReleaseNode", even without plumbo or cyclops. We have installed 4DOM into some rather long-running systems, and some at rather large scale. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From sanner@scripps.edu Mon Dec 20 18:16:57 1999 From: sanner@scripps.edu (Michel Sanner) Date: Mon, 20 Dec 1999 10:16:57 -0800 Subject: [XML-SIG] Re: [Distutils] Disposition of C extensions and packages In-Reply-To: "M.-A. Lemburg" "Re: [Distutils] Disposition of C extensions and packages" (Dec 20, 12:06am) References: <199912190243.VAA03844@207-172-57-180.s180.tnt2.ann.va.dialup.rcn.com> <385D6512.23665CAC@lemburg.com> Message-ID: <991220101657.ZM496939@noah.scripps.edu> On Dec 20, 12:06am, M.-A. 
Lemburg wrote: > Subject: Re: [Distutils] Disposition of C extensions and packages > "A.M. Kuchling" wrote: > > > > Questions: > > > > 1) A general Python question about packaging style: Is mixing > > C extensions and Python modules in one package tree a bad > > idea? It makes the whole tree platform-dependent, which is > > probably annoying for sites maintaining Python installation > > for different architectures. > > I have been using that setup for two years now with all of > my mx extensions and so far it has been working great. > > If you maintain packages with C extensions for several > platforms, you can simply install the packages under > the platform subdirs in /usr/local/lib/python1.5 -- one > copy for every platform. Disk space is no argument anymore > nowadays. > Agreed, disk space is not an issue, but file duplication is not very nice. I think this is a real weakness in Python: on one hand we have platform independent extensions which are great on the other hand we have native extensions when performance is required. This is, at least for me (in scientific computing) what makes Python such a great tool, BUT Python does not provide a mechanism to split a single package into platform dependent and platform independent part. We maintain Python and a number of packages for IRIX, SunOS, Dec Alpha OSF, Win32 and linux ... and having to update 5 .py files at every bug fix is a real pain. -Michel -- ----------------------------------------------------------------------- >>>>>>>>>> AREA CODE CHANGE <<<<<<<<< we are now 858 !!!!!!! Michel F. Sanner Ph.D. The Scripps Research Institute Assistant Professor Department of Molecular Biology 10550 North Torrey Pines Road Tel. (858) 784-2341 La Jolla, CA 92037 Fax. 
(858) 784-2860 sanner@scripps.edu http://www.scripps.edu/sanner ----------------------------------------------------------------------- From wunder@infoseek.com Mon Dec 20 18:26:57 1999 From: wunder@infoseek.com (Walter Underwood) Date: Mon, 20 Dec 1999 10:26:57 -0800 Subject: [XML-SIG] expat (was: Developer's Day) In-Reply-To: <385D5C76.E5640E9@prescod.net> References: Message-ID: <3.0.5.32.19991220102657.00be0520@corp.infoseek.com> At 04:30 PM 12/19/99 -0600, Paul Prescod wrote: >Greg Stein wrote: >> >> > IIRC, when compiled to be of minimal size, expat is only about 50K. >> >> Euh... separate issue, right? We aren't bundling Expat into Python, just >> pyexpat... > >I'm suggesting we should bundle expat, as Apache, Mozilla, Perl etc. do. >I see no good reason not to. I agree. We switched to Expat for our Python-based product, and ship it on Solaris, NT, Linux, and HP-UX. wunder -- Walter R. Underwood Senior Staff Engineer Infoseek Software GO Network, part of The Walt Disney Company wunder@infoseek.com http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 From gstein@lyra.org Mon Dec 20 18:42:25 1999 From: gstein@lyra.org (Greg Stein) Date: Mon, 20 Dec 1999 10:42:25 -0800 (PST) Subject: [XML-SIG] XML 0.5.2 release In-Reply-To: <199912201554.KAA12883@amarok.cnri.reston.va.us> Message-ID: This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. Send mail to mime@docserver.cac.washington.edu for more info. --1658348780-448052300-945715345=:16305 Content-Type: TEXT/PLAIN; charset=US-ASCII Woo hoo! Thanx, Andrew! On Mon, 20 Dec 1999, Andrew M. Kuchling wrote: > Incited by Greg Stein's recent rants about releasing code more > frequently, I've cut a new release of the CVS tree, version 0.5.2. I > won't be announcing this release widely, because it might be broken in > some way. 
Bugs will be fixed in new releases issued; I'm going to try > to make new releases fairly often. > > http://www.python.org/sigs/xml-sig/files/PyXML-0.5.2.tar.gz > > I still have to go over the CVS logs and pull out a list of new > features. I've attached a script that I use for yanking change logs from CVS. Feed the thing a symbol and it finds changes *after* that symbol. The regular CVS "log" command has a hard time with this "after" concept. It doesn't look like I hard-coded mod_dav into the script, so it ought to work fine for you. Cheers, -g -- Greg Stein, http://www.lyra.org/ --1658348780-448052300-945715345=:16305 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="gethistory.py" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="gethistory.py" IyEvdXNyL2Jpbi9lbnYgcHl0aG9uDQojDQojIEdldCB0aGUgaGlzdG9yeSBy ZWNvcmRzIGZvciBlYWNoIHJldmlzaW9uIHNpbmNlIGEgc3BlY2lmaWVkIHRh ZyBuYW1lLg0KIw0KIyBVU0FHRTogZ2V0aGlzdG9yeS5weSB0YWduYW1lDQoj DQoNCmltcG9ydCBvcw0KaW1wb3J0IHN5cw0KaW1wb3J0IHJlDQppbXBvcnQg c3RyaW5nDQoNCnRhZyA9IHN5cy5hcmd2WzFdDQoNCl9yZV9nZXR2c24gPSBy ZS5jb21waWxlKCdeXHQlczogKC4qKVxuJCcgJSB0YWcpDQoNCnAgPSBvcy5w b3BlbignY3ZzIGxvZyAyPiAvZGV2L251bGwnLCAncicpDQoNCmN1cmZpbGUg PSBjdXJ2c24gPSBoZWFkID0gTm9uZQ0KaXNfYXR0aWMgPSAwDQpsb2d1bnRp bCA9IE5vbmUNCg0Kd2hpbGUgMToNCiAgbGluZSA9IHAucmVhZGxpbmUoKQ0K ICBpZiBub3QgbGluZToNCiAgICBicmVhaw0KICBpZiBsb2d1bnRpbDoNCiAg ICBpZiBsaW5lID09IGxvZ3VudGlsOg0KICAgICAgbG9ndW50aWwgPSBOb25l DQogICAgZWxzZToNCiAgICAgIHByaW50IGxpbmVbOi0xXQ0KICAgIGNvbnRp bnVlDQogIGlmIGxpbmVbOjldID09ICdSQ1MgZmlsZTonOg0KICAgIGlzX2F0 dGljID0gc3RyaW5nLmZpbmQobGluZSwgJ0F0dGljJykgIT0gLTENCiAgICBj b250aW51ZQ0KICBpZiBsaW5lWzoxM10gPT0gJ1dvcmtpbmcgZmlsZTonOg0K ICAgIGN1cmZpbGUgPSBsaW5lWzE0Oi0xXQ0KICAgIGNvbnRpbnVlDQogIGlm IGxpbmVbOjVdID09ICdoZWFkOic6DQogICAgaGVhZCA9IGxpbmVbNjotMV0N CiAgICBjb250aW51ZQ0KICBtYXRjaCA9IF9yZV9nZXR2c24ubWF0Y2gobGlu ZSkNCiAgaWYgbWF0Y2g6DQogICAgdnNuID0gbWF0Y2guZ3JvdXAoMSkNCiAg 
ICBjb250aW51ZQ0KICBpZiBsaW5lWzoxMl0gPT0gJ2Rlc2NyaXB0aW9uOic6 DQogICAgaWYgaXNfYXR0aWM6DQogICAgICBjdXJmaWxlID0gY3VydnNuID0g Tm9uZQ0KICAgICAgY29udGludWUNCiAgICBpZiBub3QgY3VyZmlsZSBvciBu b3QgaGVhZDoNCiAgICAgIHByaW50ICdFUlJPUjogYXQgZGVzY3JpcHRpb24s IGJ1dCBubyB3b3JraW5nIGZpbGUgYW5kL29yIGhlYWQgZm91bmQuJw0KICAg ICAgYnJlYWsNCiAgICBpZiBub3QgdnNuOg0KICAgICAgcHJpbnQgY3VyZmls ZSwgJzogLi4nLCBoZWFkLCAnIChub3QgdGFnZ2VkIHdpdGggJXMpJyAlIHRh Zw0KICAgIGVsc2U6DQogICAgICB2c25wYXJ0cyA9IG1hcChpbnQsIHN0cmlu Zy5zcGxpdCh2c24sICcuJykpDQogICAgICB2c25wYXJ0c1stMV0gPSB2c25w YXJ0c1stMV0gKyAxDQogICAgICBoZWFkcGFydHMgPSBtYXAoaW50LCBzdHJp bmcuc3BsaXQoaGVhZCwgJy4nKSkNCiAgICAgIGlmIHZzbnBhcnRzID4gaGVh ZHBhcnRzOg0KICAgICAgICBwcmludCBjdXJmaWxlLCAnOiBubyBjaGFuZ2Vz IHNpbmNlJywgdGFnDQogICAgICBlbHNlOg0KICAgICAgICBuZXd2c24gPSBz dHJpbmcuam9pbihtYXAoc3RyLCB2c25wYXJ0cyksICcuJykNCiAgICAgICAg cHJpbnQNCiAgICAgICAgcHJpbnQgJz09PT09ICVzOiAlcyAuLiAlcyA9PT09 PScgJSAoY3VyZmlsZSwgbmV3dnNuLCBoZWFkKQ0KICAgICAgICBsb2d1bnRp bCA9ICdyZXZpc2lvbiAnICsgdnNuICsgJ1xuJw0KICAgICAgICBwLnJlYWRs aW5lKCkJIyBza2lwIG9uZSBsaW5lDQogICAgY3VyZmlsZSA9IGN1cnZzbiA9 IGhlYWQgPSBOb25lDQogICAgY29udGludWUNCg== --1658348780-448052300-945715345=:16305-- From ken@bitsko.slc.ut.us Mon Dec 20 19:41:49 1999 From: ken@bitsko.slc.ut.us (Ken MacLeod) Date: 20 Dec 1999 13:41:49 -0600 Subject: [XML-SIG] Future plans In-Reply-To: Paul Prescod's message of "Mon, 20 Dec 1999 11:42:23 -0600" References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <385E6A7F.35F71734@prescod.net> Message-ID: Paul Prescod writes: > Is there any virtue in embedding C code in Python 1.6 to build a C > DOM that can be used by qp_xml, 4DOM and PyDOM (if they all > survive). Obviously the object building would be much faster but > the proxying might kill the performance gains. I would be very interested in collaborating with someone on building a C-based library that supports proxying and can be easily embedded into multiple languages or used directly in C/C++. 
-- Ken MacLeod ken@bitsko.slc.ut.us From akuchlin@mems-exchange.org Mon Dec 20 21:05:36 1999 From: akuchlin@mems-exchange.org (Andrew M. Kuchling) Date: Mon, 20 Dec 1999 16:05:36 -0500 (EST) Subject: [XML-SIG] Future plans In-Reply-To: References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <385E6A7F.35F71734@prescod.net> Message-ID: <14430.39456.629097.819673@amarok.cnri.reston.va.us> Ken MacLeod writes: >Paul Prescod writes: >I would be very interested in collaborating with someone on building a >C-based library that supports proxying and can be easily embedded into >multiple languages or used directly in C/C++. What about Xerces (xml.apache.org)? The Web page says "A Perl wrapper is provided for the C++ version of Xerces, which allows access to a fully validating DOM XML parser from Perl," which implies that a wrapped version of the library is usable from a scripting language. I've never looked at the Xerces code; anyone have any experience with it? -- A.M. Kuchling http://starship.python.net/crew/amk/ You mustn't kill me. You don't love me. You d-don't even know me. -- The Furies kill Abel, in SANDMAN #66: "The Kindly Ones:10" From mal@lemburg.com Mon Dec 20 18:55:36 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Mon, 20 Dec 1999 19:55:36 +0100 Subject: [XML-SIG] Re: [Distutils] Disposition of C extensions and packages References: <199912190243.VAA03844@207-172-57-180.s180.tnt2.ann.va.dialup.rcn.com> <385D6512.23665CAC@lemburg.com> <991220101657.ZM496939@noah.scripps.edu> Message-ID: <385E7BA8.9BDC7A6@lemburg.com> Michel Sanner wrote: > > On Dec 20, 12:06am, M.-A. Lemburg wrote: > > Subject: Re: [Distutils] Disposition of C extensions and packages > > "A.M. Kuchling" wrote: > > > > > > Questions: > > > > > > 1) A general Python question about packaging style: Is mixing > > > C extensions and Python modules in one package tree a bad > > > idea? 
It makes the whole tree platform-dependent, which is
> > >    probably annoying for sites maintaining Python installation
> > >    for different architectures.
> >
> > I have been using that setup for two years now with all of
> > my mx extensions and so far it has been working great.
> >
> > If you maintain packages with C extensions for several
> > platforms, you can simply install the packages under
> > the platform subdirs in /usr/local/lib/python1.5 -- one
> > copy for every platform. Disk space is no argument anymore
> > nowadays.
> >
>
> Agreed, disk space is not an issue, but file duplication is not very nice. I
> think this is a real weakness in Python: on one hand we have platform
> independent extensions which are great on the other hand we have native
> extensions when performance is required. This is, at least for me (in
> scientific computing) what makes Python such a great tool, BUT Python does not
> provide a mechanism to split a single package into platform dependent and
> platform independent part. We maintain Python and a number of packages for
> IRIX, SunOS, Dec Alpha OSF, Win32 and linux ... and having to update 5 .py
> files at every bug fix is a real pain.

One way to solve this is by editing the __init__.py module of the
package containing the C extension and tweaking the __path__ global
so that the correct shared modules for the importing platform
are found. I've never tried this, but it should work...

Note that at least Linux .so files and Windows .pyd files can
live happily side-by-side in one directory.

With distutils in place this issue should basically disappear
altogether, since then the whole building process would be
automated -- not sure whether distutils allows having different
setups for different architectures, but given that it is written
in Python, it should be possible to automate all aspects of
multi-platform installations.
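
A minimal sketch of the __path__ idea above, written for a current Python rather than 1.5 and not tested against any real package. The helper names (platform_subdir, extend_package_path) and the "plat-<platform>" directory layout are assumptions made for illustration only; the platform string comes from the standard sysconfig module.

```python
import os
import sysconfig


def platform_subdir(package_dir):
    """Return the (assumed) per-platform directory holding compiled modules.

    The "plat-<platform>" naming is hypothetical; sysconfig.get_platform()
    returns strings such as 'linux-x86_64' or 'win-amd64'.
    """
    return os.path.join(package_dir, "plat-" + sysconfig.get_platform())


def extend_package_path(path, package_dir):
    """Append the platform directory to a package's __path__ list.

    The directory is added only if it exists and is not already listed,
    so the shared .py files stay in one place and only the compiled
    extensions are looked up per platform.
    """
    subdir = platform_subdir(package_dir)
    if os.path.isdir(subdir) and subdir not in path:
        path.append(subdir)
    return path
```

Inside a package's __init__.py this would presumably be called as extend_package_path(__path__, os.path.dirname(__file__)), so that a single copy of the pure-Python modules can be shared while each architecture finds its own shared objects.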
-- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 11 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From ken@bitsko.slc.ut.us Mon Dec 20 22:09:11 1999 From: ken@bitsko.slc.ut.us (Ken MacLeod) Date: 20 Dec 1999 16:09:11 -0600 Subject: [XML-SIG] Future plans In-Reply-To: "Andrew M. Kuchling"'s message of "Mon, 20 Dec 1999 16:05:36 -0500 (EST)" References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <385E6A7F.35F71734@prescod.net> <14430.39456.629097.819673@amarok.cnri.reston.va.us> Message-ID: "Andrew M. Kuchling" writes: > Ken MacLeod writes: > > >I would be very interested in collaborating with someone on > >building a C-based library that supports proxying and can be easily > >embedded into multiple languages or used directly in C/C++. > > What about Xerces (xml.apache.org)? The Web page says "A Perl > wrapper is provided for the C++ version of Xerces, which allows > access to a fully validating DOM XML parser from Perl," which > implies that a wrapped version of the library is usable from a > scripting language. I've never looked at the Xerces code; anyone > have any experience with it? I'd like to find or build a library that is not DOM-based at the bottom, something that could easily support either or both a grove and DOM interface on top of it. The library should also be usable for data sets that have been transformed from XML into application objects (i.e. non-XML objects). Now that I'm writing another message and thinking about it more it occurred to me, ``OK, what is the difference between these new objects and existing Python objects?'' I recall now that the answer was given earlier in the CORBA and getters/setters thread: Python needs a specialized (and optimized) version of __getattr__ and __setattr__ that can handle generated properties as well as data validation. 
I don't recall who proposed that (the thread was too long to look for
it again), I guess I'm buying into helping implement that ;-)

This library, then, should have an object implementation that can tell
when it needs to call a getter/setter or can just present the
underlying field directly.

This library should also optimize the "parent" proxy implementation as
well.

-- 
  Ken MacLeod
  ken@bitsko.slc.ut.us

From a.eyre@optichrome.com Wed Dec 22 10:39:27 1999
From: a.eyre@optichrome.com (Adrian Eyre)
Date: Wed, 22 Dec 1999 10:39:27 -0000
Subject: [XML-SIG] 4DOM CORBA-free
In-Reply-To: <199912201750.KAA06509@localhost.localdomain>
Message-ID: <000201bf4c68$d22a4ac0$3acbd9c2@peridot.optichrome.com>

> I forgot to mention that 4DOM is now 100% CORBA-free.

Just out of interest, what was the reason for 4DOM using CORBA beforehand?

--------------------------------------------
Adrian Eyre
http://www.optichrome.com

From guido@CNRI.Reston.VA.US Wed Dec 22 20:00:19 1999
From: guido@CNRI.Reston.VA.US (Guido van Rossum)
Date: Wed, 22 Dec 1999 15:00:19 -0500
Subject: [XML-SIG] Two weeks Till Python Conference Early Bird Registration Deadline!
Message-ID: <199912222000.PAA17264@eric.cnri.reston.va.us>

We know that the Python conference isn't until the next millennium.
You have exactly two weeks left to register and qualify for the early
bird registration. Since most of that time most people are taking
off for the holidays, it's really NOW OR NEVER! If you haven't
registered and paid by January 5, you will be paying full price...

So, be smart and register NOW.

Also don't forget to book your hotel room by January 3.
Some highlights from the conference program:

- 8 tutorials on topics ranging from JPython to Fnorb;
- a keynote by Open Source evangelist Eric Raymond;
- another by Randy Pausch, father of the Alice Virtual Reality project;
- a separate track for Zope developers and users;
- live demonstrations of important Python applications;
- refereed papers, and short talks on current topics;
- a developers' day where the feature set of Python 2.0 is worked out.

Our motto, due to Bruce Eckel, is: "Life's better without braces."

Come and join us at the Key Bridge Marriott in Rosslyn (across the
bridge from Georgetown), January 24-27 in 2000. Make the Python
conference the first conference you attend in the new millennium!

The early bird registration deadline is January 5.

More info: http://www.python.org/workshops/2000-01/

The program is now complete with the titles of all presentations.
There is still space in the demo session and in the short talks session.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From walter@data.franken.de Wed Dec 22 16:25:53 1999
From: walter@data.franken.de (Walter Doerwald)
Date: Wed, 22 Dec 1999 18:25:53 +0200
Subject: [XML-SIG] Developer's Day
In-Reply-To: <012101bf4a49$8c369a80$f29b12c2@secret.pythonware.com>
Message-ID: <45919654@data.franken.de>

On Sun, 19 Dec 1999 18:50:32 +0100 Fredrik Lundh wrote:

> Sean Mc Grath wrote:
> > At 12:21 PM 12/18/99 -0600, Paul Prescod wrote:
> > > What's the point of standards if implementors violate them willy nilly?
>
> professional software development has always been
> (and will always be) about making the right tradeoffs.
> some examples:
>
> the "xmllib" tradeoff is "if you have python, it's
> there. cannot handle everything, so it's best
> to use in cases where you know the source".
>
> the "sgmlop" tradeoff is "like xmllib, but much
> faster."
>
> the "SXP" tradeoff (this is our upcoming sgmlop
> replacement) is "like sgmlop, but usually faster,
> fully supports utf-8 and unicode, and is written
                           ^^^^^^^
Does this mean UCS-2/UTF-16 encoded unicode like NotePad does on WinNT?

> in pure python 1.6 (!)"

When?

> [...]

Servus...
   Walter
-- 
Walter Dörwald · walter@data.franken.de · Kommunikationnetz Franken e.V.

From xml-sig@teleo.net Fri Dec 24 07:05:44 1999
From: xml-sig@teleo.net (Patrick Phalen)
Date: Thu, 23 Dec 1999 23:05:44 -0800
Subject: [XML-SIG] Fwd: Re: WISH! Open Source XML Editor [was Re: psgml namespaces andschemas]
Message-ID: <99122323203301.00933@quadra.teleo.net>

I've just downloaded the Swish XML editor, which is now Open Source. It
looks quite nice. It's written in TCL/Tk; so there should be some
potential for this to play nice with Python?

The announcement, forwarded from xml-dev:

---------- ## Forwarded Message ## ----------
Subject: Re: WISH! Open Source XML Editor
Date: Fri, 24 Dec 1999 16:35:24 +1100
From: Steve Ball

Peter Murray-Rust wrote:

Wish no longer... an Open Source XML Editor is here right now: Swish!

My company, Zveno, recently decided to release our editor as an
Open Source Software project. We haven't announced it yet,
because we've been pretty busy with other projects, but the
source code is available right now at:

ftp://ftp.zveno.com/swish/Swish-1.0b5.tar.gz
or
ftp://ftp.zveno.com/swish/Swish-1.0b5.zip

You need to have Tcl/Tk 8.1 (or better) installed to run it.

> There is anyway a
> shortage of editors at present, and those that do exist are (not
> unreasonably) usually tied to a single author's point of view (e.g.
> streamed text, hierarchical content, etc.) As far as I know, none of them
> are easily extensible at API level, and those that do have APIs will differ
> enormously from each other.

All quite true. I'll mention my points-of-view and then discuss
the Swish project.

Firstly, Swish is written in Tcl/Tk. Some people like Tcl, some don't...
but one has to choose an implementation language and Tcl provides a number of practical advantages - simplicity, good GUI toolkit, extensibility. One of the most important advantages is packaging; it is quite easy to get Tcl/Tk installed on a platform, and it is relatively easy to create a single executable for folks to download. If you want to argue the relative merits of Tcl/Tk, then perhaps it would be best to email me offline from the list. Another feature of Tcl is that it plays nicely with other languages. For example, using Tcl Blend (the Tcl interface to Java) we could incorporate calls to Java classes such as Java (validating) parsers, XSL processors, etc. The idea here is that Tcl provides a high-level glue language for components provided in Java (or Python, or C++, or...) As far as design choices go, I have modelled some simple UIs (tree view, XML source view) but I am very keen to explore alternatives. That's a major reason for making this package open source. Swish has a plugin system to allow for this kind of extension. APIs are very, very important and it is early days for Swish. Developing a comprehensive plugin API is on my TODO list. Perhaps the Tcl Plugin API should have a corresponding Java API? > The lack of an API for an editor effectively makes it impossible for people > to develop a modular approach. Many of the "non-textual" DTD/schemas will > require specialist editors (my own interest is chemistry, but Math, > Geography/maps, SVG, etc are all similar). We need to be able to > concentrate *just* on the domain-specific parts of our subject, and not to > be concerned with general structural or technical editing. I have recognised that application/domain-specific UIs are going to be extremely important. The interfaces that I have supplied (as for other existing XML editors) are too general-purpose. Again, the plugin mechanism is there to cater for new interfaces. 
> I have raised this subject from time to time over the last year or two and > haven't found it easy to get interest. Now that XML is really here, editing > is a key requirement for creating documents. For example, I know that there > is pressure to create graduate theses in electronic form - but this is not > easy in XML if there is anything other than text in the thesis. Are there > other readers of this list that understand the problem and do we have a way > forward? Well, I'm "putting my code where my mouth is" to find a way forward. Anyone who is interested is very welcome to contact me. We here at Zveno are working hard (despite the Summer holidays) to get the website updated to support Swish's new status. Please bear with us while we catch up on our workload! Have a great Christmas everyone, and a New Year that's a blast! Cheers, Steve Ball -- Steve Ball | Swish XML Editor | Training & Seminars Zveno Pty Ltd | Web Tcl Complete | XML XSL http://www.zveno.com/ | TclXML TclDOM | Tcl, Web Development Steve.Ball@zveno.com +-----------------------+--------------------- Ph. +61 2 6242 4099 | Mobile (0413) 594 462 | Fax +61 2 6242 4099 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To unsubscribe, mailto:majordomo@ic.ac.uk the following message; unsubscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) ------------------------------------------------------- From akuchlin@mems-exchange.org Sun Dec 26 18:16:40 1999 From: akuchlin@mems-exchange.org (Andrew M. 
Kuchling) Date: Sun, 26 Dec 1999 13:16:40 -0500 (EST) Subject: [XML-SIG] Future plans In-Reply-To: <199912201809.LAA06565@localhost.localdomain> References: <199912201637.LAA12992@amarok.cnri.reston.va.us> <199912201809.LAA06565@localhost.localdomain> Message-ID: <14438.23432.224508.779220@amarok.cnri.reston.va.us> uche.ogbuji@fourthought.com writes: >I used to say that PyDOM and 4DOM addressed different users, but over time, >changes to 4DOM have made this less clear. 4DOM is now far more pythonic and >light-weight than when it started out. I'd really like to discuss this issue more, because I've started implementing DOM2 for PyDOM, but I'd hate to expend unneeded effort on such a significant bit of work. Some pros and cons: Pros: * An existing DOM Level 2 implementation. * Maintainers use it actively for real work; PyDOM maintainer has short attention span. * Has XPath and XSLT tools built on top of it. (Paul Prescod wrote a few weeks ago that "Ideally we would have one (or at most two!) implementation of each of the major specs: XML, SAX, Unicode, XPath, XPointer, XSLT, DOM"; if you take 4DOM + 4XSL + 4Path, this would mean that Unicode is the only missing piece.) * Faster than PyDOM * Potential for CORBA support by adding some extra bits Cons: * Does anyone other than the maintainers have any experience with it? Any comments? (If you don't want to slag it off publicly, you can send me unfavorable comments privately, and I'll preserve your anonymity.) * Uses Ft.Dom package name, not xml.dom * Potential incompatibilities with existing code, Sean's book, etc. (But probably a bit of glue code will let us smooth over such problems.) * Requires releasing nodes explicitly * Licensing OK? (Currently it's Python-style, but the 4DOM TODO list implies that this may be reconsidered -- only free software licences are listed as candidates, so I'm not worried about FourThought turning evil.) 
* Requires that 4Suite base be added to XML-SIG distribution
  (But the only dependency, at least in the DOM, seems to be on
  Ft.Lib.TraceOut.)

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Worlds may freeze and suns may perish, but there stirs something within us now
that can never die again.
    -- H.G. Wells

From uche.ogbuji@fourthought.com Mon Dec 27 00:26:18 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sun, 26 Dec 1999 17:26:18 -0700
Subject: [XML-SIG] Future plans
In-Reply-To: Your message of "Sun, 26 Dec 1999 13:16:40 EST." <14438.23432.224508.779220@amarok.cnri.reston.va.us>
Message-ID: <199912270026.RAA03421@localhost.localdomain>

> uche.ogbuji@fourthought.com writes:
> * Does anyone other than the maintainers have any experience with it?
>   Any comments? (If you don't want to slag it off publicly, you can
>   send me unfavorable comments privately, and I'll preserve your
>   anonymity.)

I'll be straightforward: I haven't heard of any other serious use of 4DOM
besides our internal uses. I know of several small-scale and incidental
cases, but nothing more. This could be because of 4DOM's youth, because it
changes so often, because DOM is so inevitably unpythonic, or because it is of
poor quality. Whatever the reason, I don't expect that you'll hear from a
large chorus of major 4DOM users.

> * Uses Ft.Dom package name, not xml.dom
> * Potential incompatibilities with existing code, Sean's book, etc.
>   (But probably a bit of glue code will let us smooth over such
>   problems.)

We've already changed the package name once. It is a pain, but we could do it
again.

> * Requires releasing nodes explicitly
>
> * Licensing OK? (Currently it's Python-style, but the 4DOM TODO list
>   implies that this may be reconsidered -- only free software licences
>   are listed as candidates, so I'm not worried about FourThought
>   turning evil.)

I really need to take out that TODO item. We wrote it when it was still LGPL
and we were looking for better.
Now that we picked the Python license it will stay that way. > * Requires that 4Suite base be added to XML-SIG distribution > (But the only dependency, at least in the DOM, seems to be on > Ft.Lib.TraceOut.) Yes, it's only TraceOut. We were considering a tool for stripping TraceOut calls in cases where the millisecond of performance was a concern, but haven't done so. An obvious solution is to remove the trace statements, although I'll sorely miss them while debugging. We could also leave them in and strip them before packaging, or we could move Lib.Traceout into Dom.Ext. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From hannu@tm.ee Mon Dec 27 09:16:29 1999 From: hannu@tm.ee (Hannu Krosing) Date: Mon, 27 Dec 1999 11:16:29 +0200 Subject: [XML-SIG] Future plans References: <199912270026.RAA03421@localhost.localdomain> Message-ID: <38672E6D.C2551A43@tm.ee> uche.ogbuji@fourthought.com wrote: > > > * Requires that 4Suite base be added to XML-SIG distribution > > (But the only dependency, at least in the DOM, seems to be on > > Ft.Lib.TraceOut.) > > Yes, it's only TraceOut. We were considering a tool for stripping TraceOut > calls in cases where the millisecond of performance was a concern, but haven't > done so. An obvious solution is to remove the trace statements, although I'll > sorely miss them while debugging. We could also leave them in and strip them > before packaging, or we could move Lib.Traceout into Dom.Ext. 
AFAIK the canonical pythonic way would be

try:
    from Ft.Lib import TraceOut
except:
    from fake_ft_lib import TraceOut

----------------
Hannu

From paul@prescod.net Mon Dec 27 12:29:35 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 27 Dec 1999 07:29:35 -0500
Subject: [XML-SIG] Future plans
References: <199912270026.RAA03421@localhost.localdomain>
Message-ID: <38675BAF.E0B1ED7F@prescod.net>

uche.ogbuji@fourthought.com wrote:
>
> I'll be straightforward: I haven't heard of any other serious use of 4DOM
> besides our internal uses. I know of several small-scale, and incidental
> cases, but nothing more. This could be because of 4DOM's youth, because it
> changes so often, because DOM is so inevitable unpythonic, or because it is of
> poor quality. Whatever the reason, I don't expect that you'll hear from a
> large chorus of major 4DOM users.

In my case it is because:

 * I got PyDOM "for free" with the rest of the XML-SIG's package
 * Historically, PyDOM has been much easier to install.
 * I have had no need for PyDOM's extra features.
 * I actually use groves more than the DOM.

 Paul Prescod

From umesh@itsoft.net Mon Dec 27 14:33:00 1999
From: umesh@itsoft.net (umesh singh)
Date: Mon, 27 Dec 1999 20:03:00 +0530
Subject: [XML-SIG] help on install
Message-ID: <3867789C.BA6CD99D@itsoft.net>

hi !
help needed for installation...

tried
make -f Makefile.pre.in boot
< make: *** No rule to make target `Makefile.pre.in'. Stop. >

make -f Makefile.pre.in Makefile VERSION=1.5 installdir=xml
< make: Makefile.pre.in: No such file or directory
make: *** No rule to make target `Makefile.pre.in'. Stop. >

os -redhat
user-new

---------------------------------------------------
umesh
http://www.itsoft.net

From tyrsted@daimi.au.dk Mon Dec 27 15:32:18 1999
From: tyrsted@daimi.au.dk (Michael Tyrsted)
Date: Mon, 27 Dec 1999 16:32:18 +0100
Subject: [XML-SIG] Pyexpat
Message-ID: <001301bf507f$9023d7c0$b811e182@daimi.au.dk>

Hi,

We are a couple of students that are trying to use webDAV in a project.
For that reason we need an XML parser in Python. We have downloaded
PyXML-0.5.2.tar.gz found at http://www.python.org/sigs/xml-sig/files/ .
When we try to build the parser we get the following message:

poseidon:~...extensions% make -f Makefile.pre.in boot
rm -f *.o *~
rm -f `find . -name '*.pyc'`
rm -f `find . -name '*.o'`
rm -f `find . -name '*~'`
cd expat ; make clean
make[1]: Entering directory `/users/tyrsted/public_html/scripts/PyXML-0.5.2/extensions/expat'
rm -f xmltok/xmltok.o xmltok/xmlrole.o xmlwf/xmlwf.o xmlwf/xmlfile.o xmlwf/codepage.o xmlparse/xmlparse.o xmlparse/hashtable.o xmlwf/unixfilemap.o xmlwf/xmlwf
make[1]: Leaving directory `/users/tyrsted/public_html/scripts/PyXML-0.5.2/extensions/expat'
rm -f *.a tags TAGS config.c Makefile.pre python sedscript
rm -f *.so *.sl so_locations
cd expat ; make clobber
make[1]: Entering directory `/users/tyrsted/public_html/scripts/PyXML-0.5.2/extensions/expat'
rm -f xmltok/xmltok.o xmltok/xmlrole.o xmlwf/xmlwf.o xmlwf/xmlfile.o xmlwf/codepage.o xmlparse/xmlparse.o xmlparse/hashtable.o xmlwf/unixfilemap.o xmlwf/xmlwf
rm -f libexpat.a
make[1]: Leaving directory `/users/tyrsted/public_html/scripts/PyXML-0.5.2/extensions/expat'
VERSION=`python -c "import sys; print sys.version[:3]"`; \
installdir=`python -c "import sys; print sys.prefix"`; \
exec_installdir=`python -c "import sys; print sys.exec_prefix"`; \
make -f ./Makefile.pre.in VPATH=. srcdir=. \
        VERSION=$VERSION \
        installdir=$installdir \
        exec_installdir=$exec_installdir \
        Makefile
make[1]: Entering directory `/users/tyrsted/public_html/scripts/PyXML-0.5.2/extensions'
sed -n \
 -e '1s/.*/1i\\/p' \
 -e '2s%.*%# Generated automatically from Makefile.pre.in by sedscript.%p' \
 -e '/^VERSION=/s/^VERSION=[ ]*\(.*\)/s%@VERSION[@]%\1%/p' \
 -e '/^CC=/s/^CC=[ ]*\(.*\)/s%@CC[@]%\1%/p' \
 -e '/^CCC=/s/^CCC=[ ]*\(.*\)/s%#@SET_CCC[@]%CCC=\1%/p' \
 -e '/^LINKCC=/s/^LINKCC=[ ]*\(.*\)/s%@LINKCC[@]%\1%/p' \
 -e '/^SGI_ABI=/s/^SGI_ABI=[ ]*\(.*\)/s%@SGI_ABI[@]%\1%/p' \
 -e '/^OPT=/s/^OPT=[ ]*\(.*\)/s%@OPT[@]%\1%/p' \
Can't open
make[1]: *** [sedscript] Error 1
make[1]: Leaving directory `/users/tyrsted/public_html/scripts/PyXML-0.5.2/extensions'
make: *** [boot] Error 2
poseidon:~...extensions%

What are we doing wrong? We are trying to build the parser on a SGI Irix
machine, is that a problem?

Thanks in advance

Michael Tyrsted

UNIVERSITY OF AARHUS
DEPARTMENT OF COMPUTER SCIENCE

From uche.ogbuji@fourthought.com Mon Dec 27 16:04:41 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon, 27 Dec 1999 09:04:41 -0700
Subject: [XML-SIG] Future plans
In-Reply-To: Your message of "Mon, 27 Dec 1999 11:16:29 +0200." <38672E6D.C2551A43@tm.ee>
Message-ID: <199912271604.JAA00961@localhost.localdomain>

> uche.ogbuji@fourthought.com wrote:
> > > * Requires that 4Suite base be added to XML-SIG distribution
> > >   (But the only dependency, at least in the DOM, seems to be on
> > >   Ft.Lib.TraceOut.)
> >
> > Yes, it's only TraceOut. We were considering a tool for stripping TraceOut
> > calls in cases where the millisecond of performance was a concern, but haven't
> > done so. An obvious solution is to remove the trace statements, although I'll
> > sorely miss them while debugging. We could also leave them in and strip them
> > before packaging, or we could move Lib.Traceout into Dom.Ext.
>
> AFAIK the canonical pythonic way would be
>
> try:
>     from Ft.Lib import TraceOut
> except:
>     from fake_ft_lib import TraceOut

Well, if TraceOut is moved to, say, xml.dom.ext, we can just use that
throughout 4DOM. To be sure, I'd like to see TraceOut or something like it in
the standard Python library. It is _really_ nice for debugging to be able to
turn tracing on and off at the command line by manipulating environment
variables, and we've used the TIMER option as a cheap profiler when the
official Python profiler was giving dodgy results.
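
For readers who have not seen this style of tracing, here is a rough sketch of the general idea: trace calls that are switched on and off through environment variables, with an optional timing prefix. This is not the actual Ft.Lib.TraceOut interface; the variable names DEMO_TRACE and DEMO_TRACE_TIMER and the trace() function are made up for illustration, and the code targets a current Python.

```python
import os
import sys
import time

# Hypothetical variable names; the real TraceOut may well use others.
TRACE_ENV = "DEMO_TRACE"
TIMER_ENV = "DEMO_TRACE_TIMER"

_START = time.time()


def _flag(environ, name):
    """Treat a variable as 'on' unless it is unset, empty or '0'."""
    return environ.get(name, "") not in ("", "0")


def trace(message, environ=None, stream=None):
    """Write message to stream only if tracing is enabled.

    Returns True when something was written, False otherwise, so the
    cost of a disabled trace call is a dictionary lookup or two.
    """
    environ = os.environ if environ is None else environ
    if not _flag(environ, TRACE_ENV):
        return False
    stream = sys.stderr if stream is None else stream
    if _flag(environ, TIMER_ENV):
        # Prefix with seconds elapsed since this module was loaded,
        # which is enough for crude "cheap profiler" style timing.
        message = "[%8.3fs] %s" % (time.time() - _START, message)
    stream.write(message + "\n")
    return True
```

With something like this in place, running "DEMO_TRACE=1 python script.py" would show the trace lines on stderr, while a plain "python script.py" stays silent without any change to the code.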
-- = Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Dec 27 16:18:25 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 27 Dec 1999 09:18:25 -0700 Subject: [XML-SIG] Future plans In-Reply-To: Your message of "Mon, 27 Dec 1999 07:29:35 EST." <38675BAF.E0B1ED7F@prescod.net> Message-ID: <199912271618.JAA01004@localhost.localdomain> Paul Prescod: > * I actually use groves more than the DOM. BTW, do you use the GPS package for your groves? We have actually been taking a hard look at groves and GPS ever since your groves-advocacy post a few months ago on XMLDEV and since Geir's recent GPS announcement. I think we need to become clearer on a few matters before we proceed more confidently, though. Have you considered adding a few paragraphs on inheritance (in the pure sense, and not funked by C++'s confusions) to your groves short tutorial? We use a lot of data-polymorphism in our designs and we've had a hard time figuring out how to model that in the grove view. About groves in general, I think it's a wonderful model, and quite natural, if not for all the very odd terminology and inscrutable language in the standards documents. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Mon Dec 27 16:19:09 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Mon, 27 Dec 1999 09:19:09 -0700 Subject: [XML-SIG] Future plans In-Reply-To: Your message of "Mon, 27 Dec 1999 07:29:35 EST." 
<38675BAF.E0B1ED7F@prescod.net> Message-ID: <199912271619.JAA01019@localhost.localdomain> > Uche Ogbuji: > > I'll be straightforward: I haven't heard of any other serious use of 4DOM > > besides our internal uses. I know of several small-scale, and incidental > > cases, but nothing more. This could be because of 4DOM's youth, because it > > changes so often, because DOM is so inevitable unpythonic, or because it is of > > poor quality. Whatever the reason, I don't expect that you'll hear from a > > large chorus of major 4DOM users. Paul Prescod: > In my case it is because: > > * I got PyDOM "for free" with the rest of the XML-SIG's package > * Historically, PyDOM has been much easier to install. > * I have had no need for PyDOM's extra features. > * I actually use groves more than the DOM. Well, this is good to know. Note that 4DOM is now almost as easy to install as PyDOM (just untar/unzip to your PYTHONPATH both 4Suite-base and 4DOM), although you have already noted the platform-bias of 4XSLT and 4XPath, which we expect to address in a few weeks by distributing Windows binaries. Other than that, your reasons are, I expect, typical (besides the groves part). If 4DOM does become the core Python DOM I expect its use to increase quite a bit. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From dieter@handshake.de Mon Dec 27 16:20:45 1999 From: dieter@handshake.de (Dieter Maurer) Date: Mon, 27 Dec 1999 17:20:45 +0100 (CET) Subject: [XML-SIG] Future plans In-Reply-To: <14438.23432.224508.779220@amarok.cnri.reston.va.us> References: <199912201809.LAA06565@localhost.localdomain> <14438.23432.224508.779220@amarok.cnri.reston.va.us> Message-ID: <14439.36351.242089.334220@lindm.dm> Andrew M. 
Kuchling writes: > I'd really like to discuss this issue more, because I've started > implementing DOM2 for PyDOM, but I'd hate to expend unneeded effort on > such a significant bit of work. Some pros and cons: I did not yet use 4DOM. But I decided to go for it the next time I will seriously work with XML/XSL. I have effectively discontinued work on my PyXPath implementation on top of PyDOM, because only very serious reasons justify the duplication of effort. As for the explicite memory management necessary for 4DOM: It is very easy to introduce cycles in Python's data structures. I would expect that many non-trivial Python programs contain such cycles and leak memory -- without dramatic effect. I can live with explicite memory management for DOM trees, even though they tend to be rather large structures. For short running programs, this will not be a problem. When it becomes a problem, I will find the most pressing leaks. - Dieter From gstein@lyra.org Tue Dec 28 22:26:31 1999 From: gstein@lyra.org (Greg Stein) Date: Tue, 28 Dec 1999 14:26:31 -0800 (PST) Subject: [XML-SIG] Pyexpat In-Reply-To: <001301bf507f$9023d7c0$b811e182@daimi.au.dk> Message-ID: On Mon, 27 Dec 1999, Michael Tyrsted wrote: >... > We are a couple of students that are trying to use webDAV in a > project. For that reason we need a XML-parser in Python. We have > downloaded PyXML-0.5.2.tar.gz found at > http://www.python.org/sigs/xml-sig/files/ . When we try to build the > parser we get the following message: It looks like you do not have the Python source on your machine. You need that in order to build modules. I see you're on SGI Irix, but on my RedHat box, I have installed the "python" and "python-devel" packages. The latter package adds /usr/lib/python1.5/config/ to my system. That directory contains the necessary libraries, config files, and makefiles to build additional Python extension modules (such as pyexpat). 
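A quick way to check whether those build files are present is sketched below. It assumes a present-day Python, where the standard sysconfig module exists (it was not part of Python 1.5), and the helper name build_files_present() is made up for this illustration.

```python
import os
import sysconfig


def build_files_present():
    """Map the interpreter's Makefile and pyconfig.h paths to whether they exist."""
    makefile = sysconfig.get_makefile_filename()
    config_h = sysconfig.get_config_h_filename()
    return {
        makefile: os.path.exists(makefile),
        config_h: os.path.exists(config_h),
    }


if __name__ == "__main__":
    for path, found in build_files_present().items():
        print("%-8s %s" % ("found" if found else "MISSING", path))
```

If either file is reported missing, the development package for that Python is probably not installed.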
Cheers, -g -- Greg Stein, http://www.lyra.org/ From paul@prescod.net Wed Dec 29 13:00:43 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 08:00:43 -0500 Subject: [XML-SIG] Future plans References: <199912201809.LAA06565@localhost.localdomain> <14438.23432.224508.779220@amarok.cnri.reston.va.us> <14439.36351.242089.334220@lindm.dm> Message-ID: <386A05FB.E7EB1ABD@prescod.net> Dieter Maurer wrote: > > I have effectively discontinued work > on my PyXPath implementation on top of PyDOM, because only > very serious reasons justify the duplication of effort. Dieter, is there any chance that your work could turn into a Python parser for XPaths that could be used in place of the 4thought one? I'm concerned about the portability of their C code. Even once they compile for Windows, there is still the Mac and other funky stuff. It might be nice to have a Python "backup" as with Pickle/CPickle, xmllib/sgmlop and so forth. > As for the explicite memory management necessary for 4DOM: > It is very easy to introduce cycles in Python's data structures. > I would expect that many non-trivial Python programs > contain such cycles and leak memory -- without dramatic > effect. I can live with explicite memory management for DOM trees, > even though they tend to be rather large structures. I would like to suggest again the idea that for simple uses we could use a proxy and that sophisticated users could "ask for" a fast version and get back an unproxied version that they must release. My concern is for newbies. They are thinking "XML documents" not "cycles." The default should be safe but a little slow. Paul Prescod From paul@prescod.net Wed Dec 29 14:08:08 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 29 Dec 1999 09:08:08 -0500 Subject: [XML-SIG] Groves References: <199912271618.JAA01004@localhost.localdomain> Message-ID: <386A15C8.A3657F80@prescod.net> uche.ogbuji@fourthought.com wrote: > > Paul Prescod: > > > * I actually use groves more than the DOM. 
> > BTW, do you use the GPS package for your groves? No, I used groves long before GPS came about and GPS has one big difference from the three implementations I've used in the past. It uses item syntax for fetching properties instead of attr syntax. Geir says that this is because Zope doesn't like __getattr__ overrides. Nobody has had time to do the research necessary to resolve this issue. > Have you considered adding a few paragraphs on inheritance (in the pure sense, > and not funked by C++'s confusions) to your groves short tutorial? We use a > lot of data-polymorphism in our designs and we've had a hard time figuring out > how to model that in the grove view. Currently the grove view is flat. Groves have no data modelling capabilities that do not directly relate to addressing. In my humble opinion someone should probably take the best ideas from groves and put them on top of a richer data modelling language like RDF schemas or OMG object definition language. There are too many ways to spell "integer" and "attribute" in the world. > About groves in general, I think it's a wonderful model, and quite natural, if > not for all the very odd terminology and inscrutable language in the standards > documents. At least the specification is extremely formal and precise! Paul Prescod From jim@digicool.com Wed Dec 29 15:01:47 1999 From: jim@digicool.com (Jim Fulton) Date: Wed, 29 Dec 1999 15:01:47 +0000 Subject: [XML-SIG] Groves References: <199912271618.JAA01004@localhost.localdomain> <386A15C8.A3657F80@prescod.net> Message-ID: <386A225B.E8111DC1@digicool.com> Paul Prescod wrote: > > uche.ogbuji@fourthought.com wrote: > > > > Paul Prescod: > > > > > * I actually use groves more than the DOM. > > > > BTW, do you use the GPS package for your groves? > > No, I used groves long before GPS came about and GPS has one big > difference from the three implementations I've used in the past. It uses > item syntax for fetching properties instead of attr syntax. 
Geir says > that this is because Zope doesn't like __getattr__ overrides. Nobody has > had time to do the research necessary to resolve this issue.

Right, __getattr__ is already overridden by Zope. It might be possible to use computed attributes to achieve this. This is what we plan to do (have done?) in Zope's internal DOM implementation. If the attribute names are fixed and fetching is all that's needed, then computed attributes should do the trick:

  from ComputedAttribute import ComputedAttribute
  from Persistence import Persistent

  class Foo(Persistent): # Any ExtensionClass base will do

      # a name attribute that actually uses __name__
      name=ComputedAttribute(lambda self: self.__name__)

I haven't looked at GPS yet :(, but I suspect it uses acquisition to handle parent links without creating cycles. If it does, then it wants to use computed attributes that are acquisition aware:

  from ComputedAttribute import ComputedAttribute
  from Persistence import Persistent
  from Acquisition import Explicit

  class Foo(Persistent, Explicit):

      # a parent attribute that uses self.aq_parent
      parent=ComputedAttribute(
          lambda self: self.aq_parent, # the implementation
          1 # a bit of magic that makes this work with acquisition,
            # Unfortunately, this makes it require acquisition....
          )

Jim

-- Jim Fulton mailto:jim@digicool.com Technical Director (888) 344-4332 Python Powered! Digital Creations http://www.digicool.com http://www.python.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats. From uche.ogbuji@fourthought.com Wed Dec 29 15:06:36 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Wed, 29 Dec 1999 08:06:36 -0700 Subject: [XML-SIG] Future plans In-Reply-To: Your message of "Wed, 29 Dec 1999 08:00:43 EST."
<386A05FB.E7EB1ABD@prescod.net> Message-ID: <199912291506.IAA02594@localhost.localdomain> > Dieter Maurer wrote: > > I have effectively discontinued work > > on my PyXPath implementation on top of PyDOM, because only > > very serious reasons justify the duplication of effort. Paul Prescod: > Dieter, is there any chance that your work could turn into a Python > parser for XPaths that could be used in place of the 4thought one? I'm > concerned about the portability of their C code. Even once they compile > for Windows, there is still the Mac and other funky stuff. It might be > nice to have a Python "backup" as with Pickle/CPickle, xmllib/sgmlop and > so forth. Multiple versions are always nice if the maintainers don't mind duplicating the work: choice is good as any of the users of the 8 or 9 XML libraries for Java or even XSLT processors for Java will attest. However, I hardly think that sheer portability will be any more of an obstacle for 4XPath as it is for any other C-based technology. Python itself uses C/Bison and is portable to multiple platforms, including "the MAC and other funky stuff". I don't see why 4XSLT should be any different. It uses no Posix commands, just the basic C library. For platforms where Bison is hard to come by, the obvious solution is to generate the C parser source in Unix or Windows and compile them in the other platform. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Wed Dec 29 15:41:15 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Wed, 29 Dec 1999 08:41:15 -0700 Subject: [XML-SIG] Groves In-Reply-To: Your message of "Wed, 29 Dec 1999 09:08:08 EST." 
<386A15C8.A3657F80@prescod.net> Message-ID: <199912291541.IAA02695@localhost.localdomain> > uche.ogbuji@fourthought.com wrote: > > Have you considered adding a few paragraphs on inheritance (in the pure sense, > > and not funked by C++'s confusions) to your groves short tutorial? We use a > > lot of data-polymorphism in our designs and we've had a hard time figuring out > > how to model that in the grove view. > > Currently the grove view is flat. Groves have no data modelling > capabilities that do not directly relate to addressing. In my humble > opinion someone should probably take the best ideas from groves and put > them on top of a richer data modelling language like RDF schemas or OMG > object definition language. There are too many ways to spell "integer" > and "attribute" in the world. Thanks. Unfortunately, this probably puts practical use of the pure grove model on hold for us. We are actually working heavily with RDF in a current project, using internal Python/RDF tools that the Python community will probably see soon in OSS form, and we'll probably continue to use RDF directly. I shall continue to study the grove model, however, as a means of thinking as clearly as possible about data. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From mclay@nist.gov Wed Dec 29 18:59:40 1999 From: mclay@nist.gov (Michael McLay) Date: Wed, 29 Dec 1999 13:59:40 -0500 (EST) Subject: [XML-SIG] XMI question In-Reply-To: <386A15C8.A3657F80@prescod.net> References: <199912271618.JAA01004@localhost.localdomain> <386A15C8.A3657F80@prescod.net> Message-ID: <14442.23068.97593.547798@fermi.eeel.nist.gov> Paul Prescod writes: > > Currently the grove view is flat. Groves have no data modelling > capabilities that do not directly relate to addressing.
In my humble > opinion someone should probably take the best ideas from groves and put > them on top of a richer data modelling language like RDF schemas or OMG > object definition language. There are too many ways to spell "integer" > and "attribute" in the world. It would be helpful to here about the pros and cons of various object modeling notations. The DTD notation is not sufficient when modeling engineering data that is to be encoded using XML. This maillist has had references to XML Schema, DTDs, adn RDF, but the focus of the discussion hasn't made it clear which is the right choice for representing engineering design data. The lead candidate for me is the XML Metadata Interchange (XMI) format. It provides an XML based notation for spelling "integer" and "attribute". My concern with making this choice is that I haven't seen any discussion of XMI on this list. I would expect XMI to be a good candidate for use in the Python XML toolkit considering the pedigree of XMI? The OMG is developing the XMI for exchanging model information between UML modeling tools. IBM has a tool[1] for moving data from Rational Rose to XMI and back. There web site is a bit out of date but they have an active mailing list. Have I missed out on something that is wrong with XMI. It looks promising, but the Python XML experts haven't touched it as far as I can tell. The XMI format seems to be catching on in the UML community. ObjectDomain has it high on their list as an import/export format and the Argo/UML project at http://www.ics.uci.edu/pub/arch/uml/ is developing an open source UML tool that uses XMI as the native file format. 
OASIS has a good repository of information on XMI at http://www.oasis-open.org/cover/xmi.html I found quite a bit of useful information on XMI and UML tools at http://www.objectsbydesign.com/ In particular the pages http://www.objectsbydesign.com/projects/xmi_to_html.html and http://www.objectsbydesign.com/tools/umltools_byPrice.html [1] http://www.alphaworks.ibm.com/tech/xmitoolkit From fredrik@pythonware.com Wed Dec 29 17:35:54 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Wed, 29 Dec 1999 18:35:54 +0100 Subject: [XML-SIG] Developer's Day Message-ID: <005301bf5223$2b71bb30$f29b12c2@secret.pythonware.com> (I'm currently not on this list, so it took me a while to notice this post. please cc any followups to me): Walter Doerwald wrote: > > the "SXP" tradeoff (this is our upcoming sgmlop > > replacement) is "like sgmlop, but usually faster, > > fully supports utf-8 and unicode, and is written > ^^^^^^^ > Does this mean UCS-2/UTF-16 encoded unicode like > NotePad does on WinNT? yes -- thanks to the new unicode string type that is currently being added to the python core. the SXP library relies on this and the new unicode reg- exp engine (SRE). > > in pure python 1.6 (!)" > > When? afaik, the unicode and SRE sources will appear in the CVS version early next year. SXP should be available soon thereafter. as for when there will be an official 1.6 release, your guess is as good as mine ;-) From uche.ogbuji@fourthought.com Wed Dec 29 17:31:47 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Wed, 29 Dec 1999 10:31:47 -0700 Subject: [XML-SIG] XMI question In-Reply-To: Your message of "Wed, 29 Dec 1999 13:59:40 EST." <14442.23068.97593.547798@fermi.eeel.nist.gov> Message-ID: <199912291731.KAA03133@localhost.localdomain> > Paul Prescod writes: > > Currently the grove view is flat. Groves have no data modelling > > capabilities that do not directly relate to addressing. 
In my humble > > opinion someone should probably take the best ideas from groves and put > > them on top of a richer data modelling language like RDF schemas or OMG > > object definition language. There are too many ways to spell "integer" > > and "attribute" in the world. > > It would be helpful to here about the pros and cons of various object > modeling notations. The DTD notation is not sufficient when modeling > engineering data that is to be encoded using XML. This maillist has > had references to XML Schema, DTDs, adn RDF, but the focus of the > discussion hasn't made it clear which is the right choice for > representing engineering design data. This is an on-going discussion across industries and communities, and any discussion here would merely point to more cogent discussion in the worlds of the OMG, W3C, Dublin Core, ISO, etc. The candidates include as diverse entrants as RDF, UML (of which XMI is just a serialization), ISO Basic Semantic Repository, and groves. Note that there are various differences in the core domain of all these examples, and that is part of the debate. The world has discovered intelligent data and the fruit is falling in bunches. > The lead candidate for me is the XML Metadata Interchange (XMI) > format. I would quite disagree. First of all, I think that for most purposes, UxF, a David against the XMI Goliath, is a better (certainly simpler) serialization of the UML meta-model for several purposes. But even if you are merely advocating UML, I would claim that even though I'm an experienced OO developer and a strong advocate of OO over other methodologies for _application_ design, I think it is woefully inadequate for generalized data modeling. True, some of its shortcomings help present brain explosion (for instance, even RDF avoids N-ary relationships) but it often imposes unnatural constraints on data, such as those encoded in traditional OO sub-typing. The solution is to go one step lower. 
OO builds a model specialized for app development on top of such more fundamental models as conceptual graphs and RDF. For general data modeling, I suggest these latter models, since they are more general and powerful than the OO sub-set. > It provides an XML based notation for spelling "integer" and > "attribute". My concern with making this choice is that I haven't > seen any discussion of XMI on this list. I would expect XMI to be a > good candidate for use in the Python XML toolkit considering the > pedigree of XMI? Not a fraction of the pedigree of RDF, which builds on work of the W3C, IETF, ISO, Dublin Core, etc. > The OMG is developing the XMI for exchanging model > information between UML modeling tools. IBM has a tool[1] for moving > data from Rational Rose to XMI and back. There web site is a bit out > of date but they have an active mailing list. I will admit that RDF visualization tools need to progress a bit. The one at the W3C absolutely bites. It would be a neat coup for Python if we could create the first usable RDF visualization system. Any graphic layout wizards out there? > Have I missed out on something that is wrong with XMI. It looks > promising, but the Python XML experts haven't touched it as far as I > can tell. > The XMI format seems to be catching on in the UML community. > ObjectDomain has it high on their list as an import/export format and > the Argo/UML project at http://www.ics.uci.edu/pub/arch/uml/ is > developing an open source UML tool that uses XMI as the native file > format. FourThought actually uses Argo/UML and thus XMI, but we have a preliminary style-sheet to convert it to UxF, which we'll use as a primary format. Again, this is purely for app-development. For data and meta-data, it won't do. 
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From mclay@nist.gov Thu Dec 30 00:41:08 1999 From: mclay@nist.gov (Michael McLay) Date: Wed, 29 Dec 1999 19:41:08 -0500 (EST) Subject: [XML-SIG] XMI question In-Reply-To: <199912291731.KAA03133@localhost.localdomain> References: <14442.23068.97593.547798@fermi.eeel.nist.gov> <199912291731.KAA03133@localhost.localdomain> Message-ID: <14442.43556.73267.362667@fermi.eeel.nist.gov> Thanks for the reply. It was most helpful. uche.ogbuji@fourthought.com writes: > > It would be helpful to here about the pros and cons of various object > > modeling notations. The DTD notation is not sufficient when modeling > > engineering data that is to be encoded using XML. This maillist has > > had references to XML Schema, DTDs, adn RDF, but the focus of the > > discussion hasn't made it clear which is the right choice for > > representing engineering design data. > > This is an on-going discussion across industries and communities, and > any discussion here would merely point to more cogent discussion in > the worlds of the OMG, W3C, Dublin Core, ISO, etc. The candidates > include as diverse entrants as RDF, UML (of which XMI is just a > serialization), ISO Basic Semantic Repository, and groves. Note that > there are various differences in the core domain of all these > examples, and that is part of the debate. The world has discovered > intelligent data and the fruit is falling in bunches. > > > The lead candidate for me is the XML Metadata Interchange (XMI) > > format. > > I would quite disagree. First of all, I think that for most purposes, > UxF, a David against the XMI Goliath, is a better (certainly simpler) > serialization of the UML meta-model for several purposes. 
But even if > you are merely advocating UML, I would claim that even though I'm an > experienced OO developer and a strong advocate of OO over other > methodologies for _application_ design, I think it is woefully > inadequate for generalized data modeling. The perception of these technologies vary with personal experience. That is why I asked the question. I was interested in hearing a Python community discussion of the topic because I need some sense of what is important for Python. Why it is important is also of keen interest. I hadn't heard of UxF before this. I took a quick look and I agree with you on the subject of serialization. It looks like it would be much easier to extract the relevant data from the UxF representation. Do any UML tools output UxF? I suspect it should be pretty easy to add it as an output format from ObjectDomain. > True, some of its shortcomings help present brain explosion (for > instance, even RDF avoids N-ary relationships) but it often imposes > unnatural constraints on data, such as those encoded in traditional OO > sub-typing. The solution is to go one step lower. OO builds a model > specialized for app development on top of such more fundamental models > as conceptual graphs and RDF. For general data modeling, I suggest > these latter models, since they are more general and powerful than the > OO sub-set. The RDF capabilities may be well suited to document representation, but the mapping of data from a CAD or CAM tool to the RDF notation doesn't look very appealing. I agree with the comment about unnatural constraints. > > It provides an XML based notation for spelling "integer" and > > "attribute". My concern with making this choice is that I haven't > > seen any discussion of XMI on this list. I would expect XMI to be a > > good candidate for use in the Python XML toolkit considering the > > pedigree of XMI? > > Not a fraction of the pedigree of RDF, which builds on work of the > W3C, IETF, ISO, Dublin Core, etc. 
IETF and W3C are important to web publishing, but but OMG and the UML communities are important to other sectors of technology. The representation of engineering data is one example. > > The OMG is developing the XMI for exchanging model > > information between UML modeling tools. IBM has a tool[1] for moving > > data from Rational Rose to XMI and back. There web site is a bit out > > of date but they have an active mailing list. > > I will admit that RDF visualization tools need to progress a bit. The > one at the W3C absolutely bites. It would be a neat coup for Python > if we could create the first usable RDF visualization system. Any > graphic layout wizards out there? It's not the visualization tools that are troublesome. I'm concerned that the focus of the technology may not be well suited to encoding the manufacturing information that is required to manufacture a printed circuit board or printed circuit assembly. I'm still investigating the RDF documents so maybe I haven't found what I'm looking for. > > FourThought actually uses Argo/UML and thus XMI, but we have a > preliminary style-sheet to convert it to UxF, which we'll use as a > primary format. Again, this is purely for app-development. For data > and meta-data, it won't do. Cool, is the style-sheet available? What do you mean by it is purely for app-development? My primarily interest is in defining a product data file format that would be read in by an application like a CAM tool. Are you saying UxF "won't do" for this type of problem? It looks like UxF contains exactly the data I need. My application's import/export format consists of several hundred entities that are heavily intertwined. The model is completed and I have a Python program that can read in the meta-data from the current BNF-like notation. There is also an Express model of the same entities. The Express model was manually created using an Express-G tool. 
Recreating the model in Express was a painful task and post-processing tools for Express are primitive at best. My hope is to improve the reuse of the model information by dumping the current meta-data out of the Python tool into XMI or UxF. The model would then be read into a tool like Argo/UML or ObjectDomain. Once the conversion is complted future changes would be maintained using UML and the old BNF-like notation would be abandoned. Using UML tools should make it much easier to create a tool chain that transforms the model into software for manipulating the manufacturing design data. Time to head for the Y2K bunker. Assuming society doesn't melt down I'll be back on Monday. From tpassin@idsonline.com Thu Dec 30 04:00:18 1999 From: tpassin@idsonline.com (Thomas B. Passin) Date: Wed, 29 Dec 1999 23:00:18 -0500 Subject: [XML-SIG] XMI question References: <199912291731.KAA03133@localhost.localdomain> Message-ID: <001d01bf527a$65b03e00$ca2a08d1@tomshp> wrote > I would quite disagree. First of all, I think that for most purposes, UxF, a > David against the XMI Goliath, is a better (certainly simpler) serialization > of the UML meta-model for several purposes. But even if you are merely > advocating UML, I would claim that even though I'm an experienced OO developer > and a strong advocate of OO over other methodologies for _application_ design, > I think it is woefully inadequate for generalized data modeling. > Yes, there's more to data modeling than schemas, even OO schemas. > True, some of its shortcomings help present brain explosion (for instance, > even RDF avoids N-ary relationships) but it often imposes unnatural > constraints on data, such as those encoded in traditional OO sub-typing. The > solution is to go one step lower. OO builds a model specialized for app > development on top of such more fundamental models as conceptual graphs and > RDF. 
For general data modeling, I suggest these latter models, since they are > more general and powerful than the OO sub-set. > Nice to hear that someone else looks favorably on conceptual graphs. I assume you mean Sowa's conceptual graphs? For developing data models, I find I like to use an informal hybrid between high-level OML (Open Modeling Language - something like UML but cleaner and more adaptable) and conceptual graphs. Once the real shape of the model becomes clear, you can decide if you want to realize it as a relational database, an object database, or whatever. Conceptual graphs can be expressed in KIF (Knowledge Interchange Format) or in their own serialization format. On the other hand, UML, OML, and conceptual graphs are all implemented graphically as boxes (nodes), lines (arcs), and annotations. So it would seem that any language that can represent arbitrary collections of node and arcs should be able to represent CGs, KIF, etc. I've been speculating whether KIF can be represented in XML -haven't done anything but speculate so far, though. Sowa showed how to extend CGs to 2nd order logic, which is enough to allow them to represent schemas. So there might be another route to marry schemas and XML, via CGs and/or KIF. Regards, Tom Passin From dieter@handshake.de Thu Dec 30 10:44:23 1999 From: dieter@handshake.de (Dieter Maurer) Date: Thu, 30 Dec 1999 11:44:23 +0100 (CET) Subject: [XML-SIG] Future plans In-Reply-To: <386A05FB.E7EB1ABD@prescod.net> References: <199912201809.LAA06565@localhost.localdomain> <386A05FB.E7EB1ABD@prescod.net> Message-ID: <14443.12000.615017.586079@lindm.dm> Paul Prescod writes: > Dieter, is there any chance that your work could turn into a Python > parser for XPaths that could be used in place of the 4thought one? My parser is based on Scott Hassan's (mailto:hassan@cs.stanford.edu) PyBison package. This package has an unknown copying policy. 
Some analysis would be necessary to determine whether the 4Thought parser can be replaced with this one. Following suggestions in this list, my parser uses a factory to create XPath objects as the result of parsing. Replacing the factory should make it easy to adapt the parser to a different framework. Thus, in principle, it should be possible. The factory, of course, ties together parser and framework. Changes in the framework are likely to call for changes in the factory, placing a burden on the maintainer. I will take on this burden only if the pure 4Thought solution does not work in my environments (privately: Linux 2; at work: Solaris, Windows and (fading out) Macs). As I said in a previous post, I do not yet have experience with the 5Thought packages. - Dieter

From dieter@handshake.de Thu Dec 30 11:11:42 1999 From: dieter@handshake.de (Dieter Maurer) Date: Thu, 30 Dec 1999 12:11:42 +0100 (CET) Subject: [XML-SIG] Future plans In-Reply-To: <199912291506.IAA02594@localhost.localdomain> References: <386A05FB.E7EB1ABD@prescod.net> <199912291506.IAA02594@localhost.localdomain> Message-ID: <14443.14386.555248.977816@lindm.dm> uche.ogbuji@fourthought.com writes: > Paul Prescod: > > > ... It might be > > nice to have a Python "backup" as with Pickle/CPickle, xmllib/sgmlop and > > so forth. > > Multiple versions are always nice if the maintainers don't mind duplicating > the work: choice is good as any of the users of the 8 or 9 XML libraries for > Java or even XSLT processors for Java will attest. Multiple versions are only a gain when each version is, at least in some sense, better than the other versions. I expect that the variety of Java XML parsers and XSLT processors will thin out because the developers/maintainers see little sense in supporting software that does not give them a strategic advantage. > However, I hardly think that sheer portability will be any more of an obstacle > for 4XPath as it is for any other C-based technology.
Python itself uses > C/Bison and is portable to multiple platforms, including "the MAC and other > funky stuff". I don't see why 4XSLT should be any different. It uses no > Posix commands, just the basic C library. > > For platforms where Bison is hard to come by, the obvious solution is to > generate the C parser source in Unix or Windows and compile it on the other > platform. A pure Python solution has the advantage that people without a C development system (quite an investment, in money and know-how) can use it. Any C-based extension is at a disadvantage unless it is part of a distribution supported for many platforms (such as the Python core itself). When something changes, a Python module is changed once and runs on any platform. For a C-based extension, recompilation and testing are necessary on all platforms. - Dieter

From Mike.Olson@Fourthought.com Thu Dec 30 16:40:59 1999 From: Mike.Olson@Fourthought.com (Mike Olson) Date: Thu, 30 Dec 1999 09:40:59 -0700 Subject: [XML-SIG] Future plans References: <199912201809.LAA06565@localhost.localdomain> <386A05FB.E7EB1ABD@prescod.net> <14443.12000.615017.586079@lindm.dm> Message-ID: <386B8B1B.3B9E2E3A@Fourthought.com> Dieter Maurer wrote: > Paul Prescod writes: > > Dieter, is there any chance that your work could turn into a Python > > parser for XPaths that could be used in place of the 4thought one? > My parser is based on Scott Hassan's (mailto:hassan@cs.stanford.edu) > PyBison package. This package has an unknown copying policy. > > Some analysis would be necessary to determine whether the 4Thought parser > can be replaced with this one. Following suggestions in this list, > my parser uses a factory to create XPath objects as the result of > parsing. This is how 4XPath works as well. If your factory could be modified to return 4XPath parser objects, then I don't think we will have a problem.
One note, though: 4XSLT does more parsing than XPath alone, namely patterns, which are not covered in the XPath spec. To remove all traces of C from the Ft tools, we would need to expand your parser to handle patterns as well, though these are trivial if you can parse XPath. When I get a moment, I will look through your code again. It sounds like it will be possible to support multiple parsers, though. > Replacing the factory should make it easy to adapt the parser > to a different framework. Thus, in principle, it > should be possible. The factory, of course, ties together parser > and framework. Changes in the framework are likely to call > for changes in the factory, placing a burden on the maintainer. > I will take on this burden only if the pure 4Thought > solution does not work in my environments (privately: Linux 2; > at work: Solaris, Windows and (fading out) Macs). > > As I said in a previous post, I do not yet have experience with > the 5Thought packages. > Cool, another thought :) Mike > > - Dieter > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://www.python.org/mailman/listinfo/xml-sig -- Mike Olson Consultant Member FourThought LLC http://www.fourthought.com http://www.opentechnology.com 720-304-0152
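A minimal sketch of the factory arrangement discussed in the message above, assuming a toy grammar of slash-separated names rather than real XPath. The names parse_path, TupleFactory, ObjectFactory, Step and Path are hypothetical; none of them come from PyBison, 4XPath or Dieter's parser. The point is only that the parser never builds result objects itself, so swapping the factory retargets it to another framework.

```python
# Hypothetical illustration: a toy path parser that delegates all object
# construction to a factory. This is NOT the PyBison or 4XPath API.

class TupleFactory:
    """Builds plain tuples; stands in for one framework."""
    def step(self, name):
        return ("step", name)

    def path(self, absolute, steps):
        return ("path", absolute, tuple(steps))


class Step:
    """Result object used by the second, class-based framework."""
    def __init__(self, name):
        self.name = name


class Path:
    def __init__(self, absolute, steps):
        self.absolute = absolute
        self.steps = steps


class ObjectFactory:
    """Builds class instances; stands in for a second framework."""
    def step(self, name):
        return Step(name)

    def path(self, absolute, steps):
        return Path(absolute, steps)


def parse_path(text, factory):
    """Parse 'a/b/c' or '/a/b' and return whatever the factory builds."""
    absolute = text.startswith("/")
    names = [name for name in text.split("/") if name]
    if not names:
        raise ValueError("empty path: %r" % text)
    return factory.path(absolute, [factory.step(name) for name in names])


# The same parser feeds two different object models.
as_tuples = parse_path("/doc/chapter/title", TupleFactory())
as_objects = parse_path("/doc/chapter/title", ObjectFactory())
print(as_tuples)
print([step.name for step in as_objects.steps])
```

The maintenance cost Dieter mentions shows up here as well: if either framework changes what a step or path object needs, only its factory class has to follow, but someone has to keep it in sync.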
From gstein@lyra.org Thu Dec 30 17:35:45 1999 From: gstein@lyra.org (Greg Stein) Date: Thu, 30 Dec 1999 09:35:45 -0800 (PST) Subject: [XML-SIG] Future plans In-Reply-To: <386A05FB.E7EB1ABD@prescod.net> Message-ID: On Wed, 29 Dec 1999, Paul Prescod wrote: > Dieter Maurer wrote: >... > > As for the explicit memory management necessary for 4DOM: > > It is very easy to introduce cycles in Python's data structures. > > I would expect that many non-trivial Python programs > > contain such cycles and leak memory -- without dramatic > > effect. I can live with explicit memory management for DOM trees, > > even though they tend to be rather large structures. > > I would like to suggest again the idea that for simple uses we could use > a proxy and that sophisticated users could "ask for" a fast version and > get back an unproxied version that they must release. > > My concern is for newbies. They are thinking "XML documents" not > "cycles." The default should be safe but a little slow. I solve the problem in qp_xml by avoiding it altogether :-) The output elements of qp_xml do not contain a parent reference. I've found that the application always knows the parent anyhow since it had to traverse down thru the parent in the first place. And if you're passing around subtrees of an XML document, then they are treated context-free (i.e. it doesn't matter what the parent is). Just my 3 cents (inflation in the new millennium :-) Cheers, -g -- Greg Stein, http://www.lyra.org/

From paul@prescod.net Thu Dec 30 17:54:59 1999 From: paul@prescod.net (Paul Prescod) Date: Thu, 30 Dec 1999 12:54:59 -0500 Subject: [XML-SIG] Future plans References: Message-ID: <386B9C73.D2191938@prescod.net> Greg Stein wrote: > > .... > > I solve the problem in qp_xml by avoiding it altogether :-) The output > elements of qp_xml do not contain a parent reference.
I've found that > the application always knows the parent anyhow since it had to traverse > down thru the parent in the first place. In most complicated XML document types you will follow references or links to an object and the underlying system handles the traversal. You could create "smart pointers" that maintain parent knowledge, but that's pretty messy and inefficient. Paul Prescod

From Mike.Olson@Fourthought.com Thu Dec 30 18:30:39 1999 From: Mike.Olson@Fourthought.com (Mike Olson) Date: Thu, 30 Dec 1999 11:30:39 -0700 Subject: [XML-SIG] Future plans References: Message-ID: <386BA4CF.FC3DC8C9@Fourthought.com> Greg Stein wrote: > On Wed, 29 Dec 1999, Paul Prescod wrote: > > Dieter Maurer wrote: > >... > > > As for the explicit memory management necessary for 4DOM: > > > It is very easy to introduce cycles in Python's data structures. > > > I would expect that many non-trivial Python programs > > > contain such cycles and leak memory -- without dramatic > > > effect. I can live with explicit memory management for DOM trees, > > > even though they tend to be rather large structures. > > > > I would like to suggest again the idea that for simple uses we could use > > a proxy and that sophisticated users could "ask for" a fast version and > > get back an unproxied version that they must release. > > > > My concern is for newbies. They are thinking "XML documents" not > > "cycles." The default should be safe but a little slow. > > I solve the problem in qp_xml by avoiding it altogether :-) The output > elements of qp_xml do not contain a parent reference. I've found that > the application always knows the parent anyhow since it had to traverse > down thru the parent in the first place. And if you're passing around > subtrees of an XML document, then they are treated context-free (i.e. it > doesn't matter what the parent is).
> This works great for simple XML processing, but will never work for XPath processing. Even the simplest XPath expressions, such as "../" or "/ROOT", would not be possible, because nodes are not context-free in XPath; they need to know their absolute location in the tree. You could wrap the DOM elements in XPath and force the XPath processor to handle destruction of parent/child relationships, but then you are sacrificing XPath/XSLT/XLink/XQL performance in exchange for getting around a minor inconvenience (in my opinion). I think the best solution is documentation. Yes, it is unpythonic for a user to have to deal with memory management. However, if it is documented well, both in references and demos, I think even the newest of newbies will notice and comply. After all, they have to look in the docs/demos anyway for any new package they try. Most newbies will be writing small scripts to process XML, which will run and quit anyway. If they are writing a long-running server, then they will need to be aware of circular references anyway, and adding a line of code to free up a DOM tree will be the least of their worries. $.03 more in the pot Mike > > Just my 3 cents (inflation in the new millennium :-) > > Cheers, > -g > > -- > Greg Stein, http://www.lyra.org/ -- Mike Olson Consultant Member FourThought LLC http://www.fourthought.com http://www.opentechnology.com 720-304-0152
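A small sketch of the trade-off argued in this exchange, under stated assumptions: Node, WeakNode, release and freed_without_gc are hypothetical names, not the 4DOM or qp_xml API, and the behaviour shown relies on CPython's reference counting. The cyclic collector is switched off during each check to mimic an interpreter that frees objects by reference counting alone, as the posters describe. The weakref module used for the second variant arrived in later Python releases and was not available when this thread was written.

```python
import gc
import weakref


class Node:
    """Toy tree node; the strong parent link creates a reference cycle."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def release(self):
        """Break parent/child links so reference counting can free the tree."""
        for child in self.children:
            child.release()
        self.children = []
        self.parent = None


class WeakNode:
    """Same shape, but the parent is only weakly referenced (no cycle)."""
    def __init__(self, name, parent=None):
        self.name = name
        self._parent = weakref.ref(parent) if parent is not None else None
        self.children = []
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Still lets a ".."-style step find the parent while it is alive.
        return self._parent() if self._parent is not None else None


def build_strong():
    root = Node("doc")
    Node("title", root)
    return root


def build_weak():
    root = WeakNode("doc")
    WeakNode("title", root)
    return root


def freed_without_gc(build, cleanup=None):
    """Return True if the root is freed by reference counting alone."""
    gc.disable()                      # no cycle collector during the check
    try:
        root = build()
        probe = weakref.ref(root)     # observes the root without owning it
        if cleanup is not None:
            cleanup(root)
        del root
        return probe() is None
    finally:
        gc.enable()
        gc.collect()                  # tidy up anything left in a cycle


leaks = not freed_without_gc(build_strong)               # cycle keeps tree alive
released = freed_without_gc(build_strong, Node.release)  # explicit release
weak_ok = freed_without_gc(build_weak)                   # weak parent link
print(leaks, released, weak_ok)
```

The explicit release() call corresponds to the "line of code to free up a DOM tree" that Mike argues should simply be documented; the weak parent link is one way to keep parent navigation without that call, at the cost of an extra indirection on every parent lookup.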
From walter@data.franken.de Thu Dec 30 18:47:18 1999 From: walter@data.franken.de (Walter Doerwald) Date: Thu, 30 Dec 1999 20:47:18 +0200 Subject: [XML-SIG] Developer's Day In-Reply-To: <005301bf5223$2b71bb30$f29b12c2@secret.pythonware.com> Message-ID: <45919686@data.franken.de> On Wed, 29 Dec 1999 18:35:54 +0100 Fredrik Lundh wrote: > (I'm currently not on this list, so it took me > a while to notice this post. please cc any > followups to me): > > Walter Doerwald wrote: > > > the "SXP" tradeoff (this is our upcoming sgmlop > > > replacement) is "like sgmlop, but usually faster, > > > fully supports utf-8 and unicode, and is written > > ^^^^^^^ > > Does this mean UCS-2/UTF-16 encoded unicode like > > NotePad does on WinNT? > > yes -- thanks to the new unicode string type that > is currently being added to the python core. the > SXP library relies on this and the new unicode regexp > engine (SRE). I hope that Unicode is really in the python core, I want to type Python scripts in UCS-2 with japanese variable names. ;) > > > in pure python 1.6 (!)" > > > > When? > > afaik, the unicode and SRE sources will appear in > the CVS version early next year. SXP should be > available soon thereafter. > > as for when there will be an official 1.6 release, > your guess is as good as mine ;-) Servus... Walter -- Walter Dörwald · walter@data.franken.de · Kommunikationnetz Franken e.V.
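For readers unsure what "UCS-2/UTF-16 encoded unicode" looks like on disk, here is a brief sketch using present-day Python rather than the 1.6 code under discussion. It assumes the little-endian, byte-order-mark-prefixed layout; the sample string is made up for illustration.

```python
import codecs

# Assumed layout: a byte-order mark followed by little-endian 16-bit units.
text = "<name>D\u00f6rwald</name>"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# The generic "utf-16" codec reads the BOM to pick the byte order, then drops it.
decoded = raw.decode("utf-16")

# Every character in this sample takes two bytes, plus two for the BOM.
print(decoded == text, len(raw), len(text.encode("utf-8")))
```

A parser that "fully supports utf-8 and unicode" has to cope with both forms: the same text is roughly twice as long in UTF-16 for mostly-ASCII markup, and the byte order has to be detected before any tag can be recognised.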
From hassan@CS.Stanford.EDU Fri Dec 31 19:10:32 1999 From: hassan@CS.Stanford.EDU (Scott Hassan) Date: Fri, 31 Dec 1999 11:10:32 -0800 (PST) Subject: [XML-SIG] Future plans In-Reply-To: <14443.12000.615017.586079@lindm.dm> References: <199912201809.LAA06565@localhost.localdomain> <386A05FB.E7EB1ABD@prescod.net> <14443.12000.615017.586079@lindm.dm> Message-ID: <14444.65335.913858.842193@blue.dotfunk.com> Dieter Maurer writes: > Paul Prescod writes: > > Dieter, is there any chance that your work could turn into a Python > > parser for XPaths that could be used in place of the 4thought one? > My parser is based on Scott Hassan's (mailto:hassan@cs.stanford.edu) > PyBison package. This package has an unknown copying policy. > I believe you are the only one I know of who is using PyBison for something useful. Although I haven't explicitly stated this, PyBison has the same license as the Python license. Copy at will. :) Scott