From uche.ogbuji@fourthought.com Thu Aug 1 07:49:31 2002 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: 01 Aug 2002 00:49:31 -0600 Subject: [XML-SIG] XPath and XSLT extension documentation Message-ID: <1028184574.8979.950.camel@malatesta> I have added pretty extensive documentation for user-defined XSLT and XPath extension functions and elements to my Akara docs. http://uche.ogbuji.net/tech/akara/pyxml/xslt-ext-funcs/ http://uche.ogbuji.net/tech/akara/pyxml/xslt-ext-elems/ http://uche.ogbuji.net/tech/akara/4suite/xslt-xpath-ext/ Just FYI, here is the current TOC of the PyXML Akara (http://uche.ogbuji.net/tech/akara/pyxml/) Printing and re-serializing XML from DOM Basic use of the Python XPath API cDomlette and minidom Parsing SAX from a 4Suite InputSource Basic use of the Python XSLT API Writing DOM from XSLT transforms Basic SAX processing Processing encoded files What software to install Interleaving SAX and DOM How to Implement your own 4XSLT extension functions XML-RPC Basic DOM processing How to Implement your own 4XSLT extension elements Writing alternate stylesheet and source doc resolvers for XSLT Here is the current TOC of the 4Suite Akara (http://uche.ogbuji.net/tech/akara/4suite/) User and group information available in the repository 4Suite performance Configuring the repository itself Using user-defined extension elements and functions in the repository 4Suite repository: the good, the bad and the ugly Built-in parameters for XSLT processing invoked from the repository 4Suite RDF scopes Troubleshooting the repository -- Uche Ogbuji Fourthought, Inc. 
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From loewis@informatik.hu-berlin.de Thu Aug 1 11:37:29 2002
From: loewis@informatik.hu-berlin.de (Martin v. Löwis)
Date: 01 Aug 2002 12:37:29 +0200
Subject: [XML-SIG] PyXML 0.8 is released
Message-ID:

Version 0.8 of the Python/XML distribution is now available. It should be considered a beta release, and can be downloaded from the following URLs:

http://prdownloads.sourceforge.net/pyxml/PyXML-0.8.tar.gz
http://prdownloads.sourceforge.net/pyxml/PyXML-0.8.win32-py2.1.exe
http://prdownloads.sourceforge.net/pyxml/PyXML-0.8.win32-py2.2.exe
http://prdownloads.sourceforge.net/pyxml/PyXML-0.8-2.2.i386.rpm

Changes in this version, compared to 0.7.1:

* Python 1.5 is not supported anymore; Python 2.0 or higher is required.
* Expat has been updated to 1.95.4.
* pyexpat is now always built.
* pyexpat now reports skipped entities.
* pyexpat now can combine subsequent character data events into a single callback invocation; set the parser's buffer_text attribute to true to enable this feature.
* pyexpat can now report namespace prefixes. Set the parser's namespace_prefixes attribute to true to enable this feature.
* Various bugs in sgmlop have been fixed.
* Various DOM Level 3 symbolic constants have been added; DOMStringSizeErr can now be spelled as DomstringSizeErr again, and ValidationErr has been defined.
* Various DOM L1, L2 and L3 features have been added to minidom: userdata, isSupported, getInterface, wholeText, replaceWholeText, Entity, Notation
* minidom's .toxml now allows the caller to specify an encoding.
* The new module xml.dom.xmlbuilder implements the load part of the DOM L3 Load/Store spec.
* The new module xml.dom.expatbuilder allows minidom trees to be created more efficiently, by using expat directly (rather than using SAX). This is normally used via xml.dom.xmlbuilder.
* Bugs in c14n namespace processing have been fixed.
* Minor bugs in xmlproc have been fixed.
* xml.sax.expatreader now invokes resolveEntity properly.
* The sgmlop SAX driver now invokes skippedEntity.
* The xml-howto has been updated.
* Bugs in the MSIE, ADR, and NS XBEL parsers have been fixed.

The Python/XML distribution contains the basic tools required for processing XML data using the Python programming language, assembled into one easy-to-install package. The distribution includes parsers and standard interfaces such as SAX and DOM, along with various other useful modules.

The package currently contains:

* XML parsers: Pyexpat (Jack Jansen), xmlproc (Lars Marius Garshol), sgmlop (Fredrik Lundh).
* SAX interface (Lars Marius Garshol)
* minidom DOM implementation (Paul Prescod, others)
* 4DOM and 4XPath from Fourthought (Uche Ogbuji, Mike Olson)
* Schema implementations: TREX (James Tauber)
* Various utility modules and functions (various people)
* Documentation and example programs (various people)

The code is being developed bazaar-style by contributors from the Python XML Special Interest Group, so please send comments and questions to . Bug reports may be filed on SourceForge:

http://sourceforge.net/tracker/index.php?group_id=6473&atid=106473

For more information about Python and XML, see:

http://www.python.org/topics/xml/

--
Martin v. Löwis
http://www.informatik.hu-berlin.de/~loewis

From fdrake@acm.org Thu Aug 1 18:13:41 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 1 Aug 2002 13:13:41 -0400
Subject: [XML-SIG] DOM Level 3 Core comments
Message-ID: <15689.27717.141252.150241@grendel.zope.com>

I've sent some comments on the DOM Level 3 Core to the W3C:

http://lists.w3.org/Archives/Public/www-dom/2002JulSep/0049.html

(In case anyone's interested.)

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From larsga@garshol.priv.no Thu Aug 1 18:23:13 2002
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 01 Aug 2002 19:23:13 +0200
Subject: [XML-SIG] (no subject)
In-Reply-To:
References:
Message-ID:

* Martin v. Loewis
|
| I was hoping that, in particular, Lars-Marius can comment on this
| patch.

I looked at it very briefly now. The patch, and the author's comments about it, both look reasonable to me. I think we should check it in. If there are any problems I think they are likely to be minor and could be fixed later.

Separating the parser(s) from the resolver itself would have been nice, though.

--
Lars Marius Garshol, Ontopian
ISO SC34/WG3, OASIS GeoLang TC

From Martina@Oefelein.de Thu Aug 1 20:01:45 2002
From: Martina@Oefelein.de (Martina Oefelein)
Date: Thu, 1 Aug 2002 21:01:45 +0200
Subject: [XML-SIG] PyXML 0.8 build for MacPython?
In-Reply-To:
References:
Message-ID:

Has anybody a build of PyXML for MacPython 2.2?

At 12:37 Uhr +0200 01.08.2002, Martin v. Löwis wrote:
>Version 0.8 of the Python/XML distribution is now available.
It >should be considered a beta release, and can be downloaded from >the following URLs: > >http://prdownloads.sourceforge.net/pyxml/PyXML-0.8.tar.gz >http://prdownloads.sourceforge.net/pyxml/PyXML-0.8.win32-py2.1.exe >http://prdownloads.sourceforge.net/pyxml/PyXML-0.8.win32-py2.2.exe >http://prdownloads.sourceforge.net/pyxml/PyXML-0.8-2.2.i386.rpm -- ciao Martina From noreply@sourceforge.net Fri Aug 2 14:51:39 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 02 Aug 2002 06:51:39 -0700 Subject: [XML-SIG] [ pyxml-Bugs-590132 ] Compilation failed using PyDoc_STR Message-ID: Bugs item #590132, was opened at 2002-08-02 06:51 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=590132&group_id=6473 Category: pyexpat Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Compilation failed using PyDoc_STR Initial Comment: On my compiler (SGI Irix 6.5 MIPSpro Compilers: Version 7.30) the compilation failed because the compiler doesn't want to initialize static chars using the macro PyDoc_STR It works just fine removing the "secure" brackets (line 13 extensions/pyexpat.c) #define PyDoc_STR(str) str ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=590132&group_id=6473 From fdrake@acm.org Fri Aug 2 17:32:23 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 2 Aug 2002 12:32:23 -0400 Subject: [XML-SIG] Element.localName, Attr.localName Message-ID: <15690.46103.579963.126437@grendel.zope.com> Consider this XML 1.0 document: This is just XML 1.0, no namespaces! 
As I read it, the DOM for this should have 5 attribute nodes:

    nodeName/name  nodeValue/value
    -------------  -----------------------
    a:a            a
    b:b            b
    xmlns:A        http://xml.python.org/a
    xmlns:a        http://xml.python.org/a
    xmlns:b        http://xml.python.org/b

The localName, namespaceURI, and prefix for each should be None.

Unfortunately, xml.dom.minidom was broken in a late checkin, and localName gets a non-None value for all of these.

Am I understanding the DOM specification wrong, or is this really broken? I can fix it if this is indeed broken.

This is the test I used, added to the end of test/test_xmlbuilder.py:

    builder.setFeature("namespaces", 0)
    run_checks(builder,
               #((nsuri, localName), value),
               [((None, None), "a"),
                ((None, None), "b"),
                ((None, None), "http://xml.python.org/a"),
                ((None, None), "http://xml.python.org/a"),
                ((None, None), "http://xml.python.org/b"),
                ])

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From jeremy.kloth@fourthought.com Fri Aug 2 18:29:40 2002
From: jeremy.kloth@fourthought.com (Jeremy Kloth)
Date: Fri, 2 Aug 2002 11:29:40 -0600
Subject: [XML-SIG] Element.localName, Attr.localName
References: <15690.46103.579963.126437@grendel.zope.com>
Message-ID: <000e01c23a4a$2f658b10$1a01a8c0@zeus>

----- Original Message -----
From: "Fred L. Drake, Jr."
To: "XML-SIG"
Sent: Friday, August 02, 2002 10:32 AM
Subject: [XML-SIG] Element.localName, Attr.localName

>
> Consider this XML 1.0 document:
>
>      xmlns:A="http://xml.python.org/a"
>      xmlns:b="http://xml.python.org/b"
>      a:a="a" b:b="b"
>      />
>
> This is just XML 1.0, no namespaces!
>
> As I read it, the DOM for this should have 5 attribute nodes:
>
>     nodeName/name  nodeValue/value
>     -------------  -----------------------
>     a:a            a
>     b:b            b
>     xmlns:A        http://xml.python.org/a
>     xmlns:a        http://xml.python.org/a
>     xmlns:b        http://xml.python.org/b
>
>
> The localName, namespaceURI, and prefix for each should be None.

That depends on whether the NS methods were used to create them.
> Unfortunately, xml.dom.minidom was broken in a late checkin, and > localName gets a non-None value for all of these. > > Am I understanding the DOM specification wrong, or is this really > broken? I can fix it if this is indeed broken. Only if they were added through the non-NS methods. However a quick scan through minidom reveals that localName is formed on-the-fly by splitting on ':'. So I guess it is indeed broken. -- Jeremy Kloth jeremy.kloth@fourthought.com Fourthought, Inc. http://fourthought.com, http://4suite.org From matt_g_@hotmail.com Sat Aug 3 04:45:40 2002 From: matt_g_@hotmail.com (Matt G.) Date: Sat, 03 Aug 2002 03:45:40 +0000 Subject: [XML-SIG] XmlProc and the ANY element Message-ID: >From: Daniel Shane >To: xml-sig@python.org >Subject: [XML-SIG] XmlProc and the ANY element >Date: Fri, 26 Jul 2002 14:30:36 -0400 Sorry not to reply to this sooner - I've fallen a bit behind on my PyXML mail. >I am currently building an application using xmlproc and have found that I >need to know what are the next valid elements even when the content model >is >ANY. In its current state XMLProc returns the empty list and although some >may have built applications that expect this behavior I think it should >really return the list of all the valid DTD elements. Are you sure that's the correct behavior, for the 'ANY' content model? I thought it meant that *any* element was valid - whether in the DTD (internal or external subset) or not. If it is open-ended, then you obviously couldn't list all the possible elements. Matt Gruenke _________________________________________________________________ Chat with friends online, try MSN Messenger: http://messenger.msn.com From mike@skew.org Sat Aug 3 04:50:56 2002 From: mike@skew.org (Mike Brown) Date: Fri, 2 Aug 2002 21:50:56 -0600 (MDT) Subject: [XML-SIG] XmlProc and the ANY element In-Reply-To: "from Matt G. at Aug 3, 2002 03:45:40 am" Message-ID: <200208030350.g733ouh7071783@chilled.skew.org> Matt G. 
wrote:
> Are you sure that's the correct behavior, for the 'ANY' content model? I
> thought it meant that *any* element was valid - whether in the DTD (internal
> or external subset) or not.

Nope, ANY means any *declared* element.

From martin@v.loewis.de Sat Aug 3 10:31:27 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 03 Aug 2002 11:31:27 +0200
Subject: [XML-SIG] Element.localName, Attr.localName
In-Reply-To: <15690.46103.579963.126437@grendel.zope.com>
References: <15690.46103.579963.126437@grendel.zope.com>
Message-ID:

"Fred L. Drake, Jr." writes:

> Consider this XML 1.0 document:
>
>      xmlns:A="http://xml.python.org/a"
>      xmlns:b="http://xml.python.org/b"
>      a:a="a" b:b="b"
>      />
>
> This is just XML 1.0, no namespaces!

Why do you say that this document has no namespaces? It looks to me like it has!

It may be that an application is not *aware* of the namespaces, but they surely are present.

> As I read it, the DOM for this should have 5 attribute nodes:
>
>     nodeName/name  nodeValue/value
>     -------------  -----------------------
>     a:a            a
>     b:b            b
>     xmlns:A        http://xml.python.org/a
>     xmlns:a        http://xml.python.org/a
>     xmlns:b        http://xml.python.org/b
>
>
> The localName, namespaceURI, and prefix for each should be None.

That is not true. In DOM Level 2 and onwards, that should be

    nodeName  localName  namespaceURI                   prefix
    a:a       a          http://xml.python.org/a        a
    b:b       b          http://xml.python.org/b        b
    xmlns:A   A          http://www.w3.org/2000/xmlns/  xmlns
    xmlns:a   a          http://www.w3.org/2000/xmlns/  xmlns
    xmlns:b   b          http://www.w3.org/2000/xmlns/  xmlns

> builder.setFeature("namespaces", 0)

Ah, you are turning off the feature "namespaces". I don't think the Load/Store spec says precisely what that means for load - it only says what that means for store. One could, of course, guess that it means to set all those attributes to null - in which case your interpretation would be correct.

Notice, however, that the pre-LS-builders are supposed to do the equivalent of the namespaces feature being activated.
Regards,
Martin

From fredrik@pythonware.com Mon Aug 5 15:14:26 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 5 Aug 2002 16:14:26 +0200
Subject: [XML-SIG] Preparing for PyXML 0.8
References: <15687.3302.102707.808032@grendel.zope.com> <15687.19432.316123.786584@grendel.zope.com> <1028083355.1072.795.camel@malatesta>
Message-ID: <031401c23c8a$6a69fec0$05d141d5@hagrid>

uche wrote:

> > The HTMLParser module from the standard library? Or is there another
> > I'm missing?
>
> The std library. 2.1 or later only, ya know.

that should be 2.2 and later, right?

note that if you don't use do/start/end handlers, you can support earlier versions by falling back on SGMLParser:

    try:
        from HTMLParser import HTMLParser
    except ImportError:
        from sgmllib import SGMLParser
        # hack to use sgmllib's SGMLParser to emulate 2.2's HTMLParser
        class HTMLParser(SGMLParser):
            # the following only works as long as this class doesn't
            # provide any do, start, or end handlers
            def unknown_starttag(self, tag, attrs):
                self.handle_starttag(tag, attrs)
            def unknown_endtag(self, tag):
                self.handle_endtag(tag)

    # (taken from xmltoys.HTMLTreeBuilder)

From Mike.Olson@fourthought.com Wed Aug 7 00:08:58 2002
From: Mike.Olson@fourthought.com (Mike Olson)
Date: 06 Aug 2002 17:08:58 -0600
Subject: [XML-SIG] Proper way of generating a parentless Node object
In-Reply-To: <20020619114155.A16494@galadriel.alyra.org>
References: <20020618133247.P5191@galadriel.alyra.org> <1024433560.15286.14.camel@penny> <20020619114155.A16494@galadriel.alyra.org>
Message-ID: <1028675340.25589.11.camel@penny>

On Wed, 2002-06-19 at 10:41, Mark Humphrey wrote:
> On Tue, Jun 18, 2002 at 02:52:38PM -0600, Mike Olson wrote:
> >
> >
> > or similar
> >
> > Mike

Hi Mark,

Sorry for the long delay in getting back to you. Lost in the inbox.....

>
> Okay, this appears to work, so I will go with it for now.
>
> How do people normally process a document for a particular namespace?
Do they have one class or a set of classes that process that namespace only? I've set my system up so that one class processes a namespace (ns1), and when it encounters a tag for another namespace (ns2), and the schema for ns1 says that's a legal tag, then it calls into the object designed to handle ns2. > > Does this make sense, or am I missing an important point somewhere? I don't follow your question. Are you searching a document for a specific namespace? Mike > > -- > Mark "Markus" Humphrey > http://www.alyra.org/~msph/ > This email has a digital signature. If you can't verify the signature, you > can't prove it's from me. Learn about encryption on my web site. > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig -- Mike Olson Principal Consultant mike.olson@fourthought.com +1 303 583 9900 x 102 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, http://4Suite.org Boulder, CO 80301-2537, USA XML strategy, XML tools, knowledge management From jmlchristensen@yahoo.com Wed Aug 7 14:14:22 2002 From: jmlchristensen@yahoo.com (James Christensen) Date: Wed, 7 Aug 2002 06:14:22 -0700 (PDT) Subject: [XML-SIG] XMLFilterBase examples? Message-ID: <20020807131422.386.qmail@web20905.mail.yahoo.com> My SAX-life up to now has been pretty much confined to perl with an occasional experimental romp with java, but now I'm using python. What I want to do and what I really like about SAX processing is setting up multiple discrete filters that take in a stream of sax events and produce a stream of sax events. I suppose this is possible in python, but I have to admit it's not immediately obvious to me how to achieve this. I suppose part of the problem is that I'm not quite clear on the proper way to write a handler that's not just a sax event sink. Any leads, tips, example bits of code, pointers, etc would be greatly appreciated. 
A python implementation of XML::SAX::Machines would be met with thunderous acclamation. James __________________________________________________ Do You Yahoo!? Yahoo! Health - Feel better, live better http://health.yahoo.com From martin@v.loewis.de Wed Aug 7 19:18:04 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 07 Aug 2002 20:18:04 +0200 Subject: [XML-SIG] XMLFilterBase examples? In-Reply-To: <20020807131422.386.qmail@web20905.mail.yahoo.com> References: <20020807131422.386.qmail@web20905.mail.yahoo.com> Message-ID: James Christensen writes: > What I want to do and what I really like about SAX > processing is setting up multiple discrete filters > that take in a stream of sax events and produce a > stream of sax events. I suppose this is possible in > python, but I have to admit it's not immediately > obvious to me how to achieve this. I suppose part of > the problem is that I'm not quite clear on the proper > way to write a handler that's not just a sax event > sink. You should inherit from xml.sax.saxutils.XMLFilterBase. This is essentially a XMLReader which is initialized with another XMLReader (its parent). You can then use the filter in place of the original reader. The filter installs itself as a handler for everything, forwarding everything to the application's handler, and forwards everything from the application to the reader. You can override the events you are interested in. HTH, Martin From jhaefner@biology.usu.edu Wed Aug 7 21:58:32 2002 From: jhaefner@biology.usu.edu (James Haefner) Date: Wed, 07 Aug 2002 14:58:32 -0600 Subject: [XML-SIG] pyexpat problem Message-ID: <3D5189F8.7000200@biology.usu.edu> Hi I'm not a member of this list yet, but I hope you'll help anyway, or direct me to a source. I have RedHat 7.2 and need to upgrade printconf, but am having problems with python2.2 installation. My setup can not find "pyexpat". Unfortunately, I am quite ignorant of python. 
Here is what I have, all taken from recent rpms:

$ rpm -q printconf
printconf-0.3.61-4.1
$ rpm -q printconf-gui
printconf-gui-0.3.61-4.1
$ rpm -q PyXML
PyXML-0.7.1-4

The error from printconf-gui is:

# printconf-gui
Traceback (most recent call last):
  File "/usr/sbin/printconf-gui", line 7, in ?
    import printconf_gui
  File "/usr/share/printconf/util/printconf_gui.py", line 40, in ?
    from printconf_conf import *
  File "/usr/share/printconf/util/printconf_conf.py", line 81, in ?
    from xml.utils import qp_xml
  File "/usr/src/build/121260-i386/install/usr/lib/python2.2/site-packages/_xmlplus/utils/qp_xml.py", line 23, in ?
ImportError: cannot import name pyexpat

The last path printed does not exist, but

/usr/lib/python2.2/site-packages/_xmlplus/utils/qp_xml.py

does exist. Also I have installed something called

/usr/lib/python2.2/lib-dynload/pyexpat.so

also

/usr/lib/python2.2/site-packages/_xmlplus/sax/drivers/drv_pyexpat.py

It looks to me that I have not installed python correctly (other python code fails). Is it true that simply installing the pyxml rpm is insufficient? I can find no xml_setup.py file. Is there a tutorial for using python modules installed via rpm?

Any help would be greatly appreciated, and if it were sent directly to me, I would be sure not to miss it. Thanks in advance. If there is a FAQ for this error, I haven't found it.

Jim Haefner

--
James W. Haefner                 Email: jhaefner@biology.usu.edu
Dept Biology/Ecology Center      Voice: 435-797-3553
Utah State University            FAX: 435-797-1575
Logan, UT 84322-5305

From Bedrich.Kosata@vscht.cz Thu Aug 8 08:23:56 2002
From: Bedrich.Kosata@vscht.cz (Beda Kosata)
Date: Thu, 08 Aug 2002 09:23:56 +0200
Subject: [XML-SIG] PyXML breaks localization
Message-ID: <3D521C8C.6070606@vscht.cz>

Hi everybody,

I have just found that PyXML calls gettext.install() on import (via dom/MessageSource.py), which means that simply calling:

import xml.dom

puts _() into your __builtin__ namespace.
Even worse, a previously installed _() is overridden, which breaks the localization of the running app. Although this is easy to fix by calling gettext.install() again after xml.dom has been imported, it does not seem like a Good Thing(tm) to me.

I think that the localization scheme of PyXML should be changed according to http://python.org/doc/2.2.1/lib/node207.html in order not to pollute the __builtin__ namespace.

Cheers
BEDA

--
============================================================
Beda Kosata (kosatab@vscht.cz)
Department of Organic Chemistry
Institute of Chemical Technology
Prague - Czech Republic
============================================================

From martin@v.loewis.de Thu Aug 8 08:10:05 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 08 Aug 2002 09:10:05 +0200
Subject: [XML-SIG] pyexpat problem
In-Reply-To: <3D5189F8.7000200@biology.usu.edu>
References: <3D5189F8.7000200@biology.usu.edu>
Message-ID:

James Haefner writes:

> I have RedHat 7.2 and need to upgrade printconf, but am having
> problems with python2.2 installation. My setup can not find "pyexpat".
> Unfortunately, I am quite ignorant of python. Here is what I have,
> all taken from recent rpms:
>
>
> $ rpm -q printconf
> printconf-0.3.61-4.1
> $ rpm -q printconf-gui
> printconf-gui-0.3.61-4.1
> $ rpm -q PyXML
> PyXML-0.7.1-4

Can you tell whether you have an "expat" package installed also? I don't know how Redhat has chosen to package PyXML and Python 2.2, but I recommend installing all packages that remotely have "expat" in their name, in all spellings that you can think of.

> The last path printed does not exist, but
> /usr/lib/python2.2/site-packages/_xmlplus/utils/qp_xml.py
>
> does exist. Also I have installed something called
>
> /usr/lib/python2.2/lib-dynload/pyexpat.so

Ok, what happens if you do

import pyexpat

on an interactive Python prompt?

> It looks to me that I have not installed python correctly (other
> python code fails).
Is it true that simply installing the pyxml rpm > is insufficient? I can find no xml_setup.py file. Is there a > tutorial for using python modules installed via rpm? Hard to tell - you'd have to ask Redhat for that. Regards, Martin From uche.ogbuji@fourthought.com Thu Aug 8 16:03:15 2002 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Thu, 08 Aug 2002 09:03:15 -0600 Subject: [XML-SIG] XMLFilterBase examples? In-Reply-To: Message from James Christensen of "Wed, 07 Aug 2002 06:14:22 PDT." <20020807131422.386.qmail@web20905.mail.yahoo.com> Message-ID: > My SAX-life up to now has been pretty much confined to > perl with an occasional experimental romp with java, > but now I'm using python. > > What I want to do and what I really like about SAX > processing is setting up multiple discrete filters > that take in a stream of sax events and produce a > stream of sax events. I suppose this is possible in > python, but I have to admit it's not immediately > obvious to me how to achieve this. I suppose part of > the problem is that I'm not quite clear on the proper > way to write a handler that's not just a sax event > sink. > > Any leads, tips, example bits of code, pointers, etc > would be greatly appreciated. A python implementation > of XML::SAX::Machines would be met with thunderous > acclamation. I see that Martin already gave you the basics. Ask further questions as you need them, but about this XML::SAX::Machines thing: I'm not sure what it is, but if, as you suggest, it's something that would help newbies to SAX filter writing, we'd be grateful if you were willing to implement and post such a module with our help and guidance. We'd also be grateful if you post exactly how you succeed in your task when you do so that others like you can benefit. Thanks. -- Uche Ogbuji Fourthought, Inc. 
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From uche.ogbuji@fourthought.com Thu Aug 8 16:07:20 2002
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 08 Aug 2002 09:07:20 -0600
Subject: [XML-SIG] PyXML breaks localization
In-Reply-To: Message from Beda Kosata of "Thu, 08 Aug 2002 09:23:56 +0200." <3D521C8C.6070606@vscht.cz>
Message-ID:

> Hi everybody,
> I have just found that PyXML calls gettext.install() on import (via
> dom/MessageSource.py), which means that simply calling:
>
> import xml.dom
>
> puts _() to your __builtin__ namespace. Even worse is, that previously
> installed _() is overridden, thus breaking the localization of running app.
> Even if this is easy to fix by calling gettext.install() after xml.dom
> was imported, it does not seem to me as a Good Thing(tm).
>
> I think that the localization scheme of PyXML should be changed
> according to http://python.org/doc/2.2.1/lib/node207.html in order not
> to pollute the __builtin__ namespace.

I'll look into this. I've wanted to touch up the l10n code, anyway.

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From fdrake@acm.org Thu Aug 8 18:36:09 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 8 Aug 2002 13:36:09 -0400
Subject: [XML-SIG] Element.localName, Attr.localName
In-Reply-To:
References: <15690.46103.579963.126437@grendel.zope.com>
Message-ID: <15698.44041.680284.604789@grendel.zope.com>

Martin v. Loewis writes:
> >
> >      xmlns:A="http://xml.python.org/a"
> >      xmlns:b="http://xml.python.org/b"
> >      a:a="a" b:b="b"
> >      />
> >
> > This is just XML 1.0, no namespaces!
>
> Why do you say that this document has no namespaces? It looks to me
> like it has!

Because I've said this is an XML 1.0 document only; it happens to have attributes that would be namespace declarations and prefixes if namespace processing were active, but it isn't.

> It may be that an application is not *aware* of the namespaces, but
> they surely are present.

Er, no, there are no namespaces because this is only an XML 1.0 document; the namespaces recommendation does not apply.

> That is not true. In DOM Level 2 and onwards, that should be
[...list omitted...]

If namespaces are active, yes.

> > builder.setFeature("namespaces", 0)
>
> Ah, you are turning off the feature "namespaces". I don't think the

Like I said, there are no namespaces. ;-)

> Load/Store spec says precisely what that means for load - it only says
> what that means for store.
One could, of course, guess that it means
> to set all those attributes to null - in which case your
> interpretation would be correct.

Ok, so if namespaces *are* active, what should be the localName of the element in my example? Or even this document would do:

There is nothing there to give a namespace to the element; what should localName be?

Note that the createElement() vs. createElementNS() distinction isn't helpful here, either, since there's nothing in the LS spec that gives guidance on which to use in this case, or even requires that either is used in any case -- that's left as an implementation detail.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From fdrake@acm.org Thu Aug 8 19:46:59 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 8 Aug 2002 14:46:59 -0400
Subject: [XML-SIG] Proposed Expat API changes
Message-ID: <15698.48291.93711.153184@grendel.zope.com>

I've proposed some changes to Expat's C API on the expat-discuss list; these changes would allow pull-based and mixed-mode parsers to be built on top of Expat. Unfortunately, the message hasn't appeared in the online archives; this is the cost of using SF's mailing lists. ;-(

I've attached the proposal to this email, in case anyone is interested. Followups pertaining to Expat's C API should be directed to the expat-discuss list:

http://sourceforge.net/mail/?group_id=10127

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

Implementing a blocking mode in Expat
=====================================

Requests for a pull-based API for Expat have surfaced a few times over (at least) the last couple of years; there is a feature request for this on SourceForge (issue #544682):

http://sourceforge.net/tracker/index.php?func=detail&aid=544682&group_id=10127&atid=110127

An additional motivation is that we'd like to be able to share a codebase with the Mozilla project, which is currently using a substantially modified version of an older version of Expat.

Pull-based parsers have become increasingly popular as the limitations of DOM- or SAX-like APIs have become better known. The pull-based APIs provide an opportunity to build each part of an application in the way that's most appropriate, allowing a mixture of DOM- and SAX-like behaviors.

Expat could provide the basis for an efficient pull-based API if it offered an opportunity to suspend parsing temporarily, allowing parsing to resume when the application is ready for additional information from the document. A .NET-like API could easily be built on top of such a feature.

Karl Waclawek and I have been having discussions about this, and think we have a good idea of how to introduce such a feature into Expat. There are questions and issues regarding the possible API that would need to be exposed; I've summarized our ideas and analysis below in the form of two alternate API proposals. We welcome feedback and discussion, including the introduction of additional API proposals, on the expat-discuss list.

Supporting Information
----------------------

Expat 1.95.6 / 1.96 will include a new enumeration, XML_Status, specifying return values for the XML_Parse() and XML_ParseBuffer() functions.
Our recommendation is that the result of XML_Parse() and
XML_ParseBuffer() be tested for these values specifically, even when
using older versions of Expat 1.95.x -- this will be completely
equivalent in practice.  This change allows us to extend the number of
possible return values in the future; the documented API in Expat 1.95
through 1.95.4 really only defines a boolean interpretation of these
return values, but only the two specific values, now named by
XML_Status enum names, were actually used.


API Option 1
------------

This alternative introduces two new functions and three new constants.
These are only needed if an application uses the new functionality.

XML_STATUS_SUSPENDED

    New value in the XML_Status enumeration.  This is only used if
    XML_SuspendParser() has been called.

XML_ERROR_NOT_SUSPENDED
XML_ERROR_SUSPENDED

    These new error codes would be used to indicate that a call to the
    parser was made when the parser was not in the expected internal
    state, and indicate programming errors in the application.

XML_Status
XML_SuspendParser(XML_Parser parser)

    Inform the parser that parsing should be suspended when the
    currently active callback returns.  It should only be called from
    a callback.  Returns XML_STATUS_OK or XML_STATUS_ERROR.  Multiple
    calls to XML_SuspendParser() during a callback are allowed, and
    are equivalent to a single call to XML_SuspendParser().  It is an
    error to call this function while a callback function is not
    active.

XML_Status
XML_ResumeParser(XML_Parser parser)

    Resume parsing using a suspended parser.  Returns XML_STATUS_OK,
    XML_STATUS_ERROR, or XML_STATUS_SUSPENDED.  If the parser has not
    been suspended, this returns XML_STATUS_ERROR, and
    XML_GetErrorCode() returns XML_ERROR_NOT_SUSPENDED.  The parser is
    not invalidated in this case, and parsing may be continued with
    additional input using XML_Parse() or XML_ParseBuffer().

The following functions change:

XML_Status
XML_Parse(XML_Parser parser, const char *s, int len, int isFinal)

XML_Status
XML_ParseBuffer(XML_Parser parser, int len, int isFinal)

    These two existing functions will change the meaning of their
    return value slightly.  If parsing is suspended using
    XML_SuspendParser(), they will return XML_STATUS_SUSPENDED,
    otherwise the current values of XML_STATUS_OK and XML_STATUS_ERROR
    may be returned.

    If XML_STATUS_SUSPENDED is returned, the parse of the input
    document can only be resumed using XML_ResumeParser().  If either
    of these is called on a suspended parser, XML_STATUS_ERROR will be
    returned with the error code XML_ERROR_SUSPENDED returned by
    XML_GetErrorCode().  The parser is not invalidated in this case,
    and parsing may still be resumed.

void *
XML_GetBuffer(XML_Parser parser, int len)

    If the parser has been suspended, returns NULL and
    XML_GetErrorCode() returns XML_ERROR_SUSPENDED.  Parsing the input
    which has already been passed into Expat should be continued using
    XML_ResumeParser().  No changes if the parser was not suspended.


Potential Issues
----------------

The risk inherent in this API variant is that it does change the
interpretation of the return code for XML_Parse() and
XML_ParseBuffer().  This is only significant if any callback ever
calls XML_SuspendParser().  In the case of suspension,
XML_STATUS_SUSPENDED would be returned, but an existing main loop will
recognize this as a successful parse.  This would be a programming
error in the revised API, but not the old API.  If the buffer being
parsed was not the last buffer, a reasonable error would be returned
when the main loop calls XML_Parse() or XML_ParseBuffer() again, but
if the last input buffer was already passed (isFinal is true), there
would be no opportunity to report the error, possibly making it
difficult to diagnose application errors introduced by this change.

We don't know how important this change is in practice for Expat
1.95.x users; we would appreciate feedback on the expat-discuss list.


API Option 2
------------

This version of the API changes provides increased backward
compatibility, at the cost of a cruftier API to Expat.

An alternate version of the API also adds the XML_SuspendParser() and
XML_ResumeParser() functions, and the new XML_ERROR_* constants, but
not the new XML_Status value.  This variant would describe suspension
as a pseudo-error from the XML_Parse() and XML_ParseBuffer()
functions, allowing existing applications to report "errors" from the
main loop if they had not been prepared for the suspension feature,
but some callback function called XML_SuspendParser().  This would
only be expected to occur during development, but applications that
only suspend parsing occasionally may find that poorly tested code
paths expose problems late in the development cycle or even after the
application has entered production.

The alternate version uses this description for XML_Parse() and
XML_ParseBuffer():

XML_Status
XML_Parse(XML_Parser parser, const char *s, int len, int isFinal)

XML_Status
XML_ParseBuffer(XML_Parser parser, int len, int isFinal)

    If XML_STATUS_ERROR is returned, a main loop which supports the
    suspension feature needs to check whether XML_GetErrorCode(parser)
    == XML_ERROR_SUSPENDED.  If so, the parse was suspended and the
    call to continue the parse needs to be XML_ResumeParser().
    Otherwise, the error is "real".

This approach conflates error codes with the state of the parse, and
labels the normal operation of the parser as an error.
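
To make the control flow of Option 1 concrete, here is a toy model in
Python.  It is only a sketch of the main loop the proposal implies:
ToyParser, Status and pull_events are hypothetical names invented for
illustration, none of them exist in Expat or pyexpat, and a "pause"
event stands in for a callback that calls XML_SuspendParser().

```python
# Toy model only: these names mirror the proposal, not any real Expat
# or pyexpat API.
from enum import Enum


class Status(Enum):
    ERROR = 0
    OK = 1
    SUSPENDED = 2   # models the proposed XML_STATUS_SUSPENDED


class ToyParser:
    """Hypothetical stand-in for a parser that can suspend mid-buffer."""

    def __init__(self):
        self._pending = []      # events accepted but not yet delivered
        self.delivered = []     # events the application has seen
        self.suspended = False

    def parse(self, events):
        # Models XML_Parse(): calling it on a suspended parser is an error.
        if self.suspended:
            return Status.ERROR
        self._pending.extend(events)
        return self._run()

    def resume(self):
        # Models XML_ResumeParser(): only valid on a suspended parser.
        if not self.suspended:
            return Status.ERROR
        self.suspended = False
        return self._run()

    def _run(self):
        while self._pending:
            event = self._pending.pop(0)
            self.delivered.append(event)
            if event == "pause":    # the "callback" asks for suspension
                self.suspended = True
                return Status.SUSPENDED
        return Status.OK


def pull_events(chunks):
    """Main loop that handles SUSPENDED explicitly, as Option 1 requires."""
    parser = ToyParser()
    for chunk in chunks:
        status = parser.parse(chunk)
        while status is Status.SUSPENDED:
            status = parser.resume()
        if status is Status.ERROR:
            raise RuntimeError("parse failed")
    return parser.delivered
```

A main loop written for the old API would treat SUSPENDED as success
and never call resume(), which is the risk described under "Potential
Issues"; under Option 2 the same loop would see ERROR instead and have
to inspect the error code.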

From Mike.Olson@fourthought.com  Thu Aug  8 19:52:14 2002
From: Mike.Olson@fourthought.com (Mike Olson)
Date: 08 Aug 2002 12:52:14 -0600
Subject: [XML-SIG] Proposed Expat API changes
In-Reply-To: <15698.48291.93711.153184@grendel.zope.com>
References: <15698.48291.93711.153184@grendel.zope.com>
Message-ID: <1028832736.3844.2.camel@penny>

On Thu, 2002-08-08 at 12:46, Fred L. Drake, Jr. wrote:

I think option 1 is the best choice.  It will not break code unless
someone goes in and adds calls to suspend the parser.  As mentioned,
this would break with new return values to XML_Parse, etc. however, if
they are in there making changes might as well change two places.

Mike

> I've proposed some changes to Expat's C API on the expat-discuss list;
> these changes would allow pull-based and mixed-mode parsers to be
> built on top of Expat.
>
> [...]

-- 
Mike Olson                                Principal Consultant
mike.olson@fourthought.com                +1 303 583 9900 x 102
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St,                      http://4Suite.org
Boulder, CO 80301-2537, USA
XML strategy, XML tools, knowledge management

From Martina@Oefelein.de  Thu Aug  8 21:01:54 2002
From: Martina@Oefelein.de (Martina Oefelein)
Date: Thu, 8 Aug 2002 22:01:54 +0200
Subject: [XML-SIG] Element.localName, Attr.localName
In-Reply-To: <15698.44041.680284.604789@grendel.zope.com>
References: <15690.46103.579963.126437@grendel.zope.com> <15698.44041.680284.604789@grendel.zope.com>
Message-ID:

At 13:36 Uhr -0400 08.08.2002, Fred L. Drake, Jr. wrote:
>Martin v. Loewis writes:
> >
> > > xmlns:A="http://xml.python.org/a"
> > > xmlns:b="http://xml.python.org/b"
> > > a:a="a" b:b="b"
> > > />
> > >
> > > This is just XML 1.0, no namespaces!
> >
> > Why do you say that this document has no namespaces? It looks to me
> > like it has!
> >Because I've said this is only an XML 1.0 document only; it happens to >attributes that would be namespace declarations and prefixes if >namespace processing were active, but it isn't. > > > It may be that an application is not *aware* of the namespaces, but > > they surely are present. > >Er, no, there are no namespaces because this is only an XML 1.0 >document; the namespaces recommendation does not apply. Even XML 1.0 reserves colons for namespaces. See this note in section 2.3: Note: The colon character within XML names is reserved for experimentation with name spaces. Its meaning is expected to be standardized at some future point, at which point those documents using the colon for experimental purposes may need to be updated. (There is no guarantee that any name-space mechanism adopted for XML will in fact use the colon as a name-space delimiter.) In practice, this means that authors should not use the colon in XML names except as part of name-space experiments, but that XML processors should accept the colon as a name character. ciao Martina From martin@v.loewis.de Thu Aug 8 21:29:44 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 08 Aug 2002 22:29:44 +0200 Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: <15698.44041.680284.604789@grendel.zope.com> References: <15690.46103.579963.126437@grendel.zope.com> <15698.44041.680284.604789@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Because I've said this is only an XML 1.0 document only; it happens to > attributes that would be namespace declarations and prefixes if > namespace processing were active, but it isn't. What else could it be? There is only one version of XML. > Er, no, there are no namespaces because this is only an XML 1.0 > document; the namespaces recommendation does not apply. The document does conform to the namespaces recommendation, though. > > That is not true. In DOM Level 2 and onwards, that should be > [...list omitted...] 
> > If namespaces are active, yes. What do you mean by "active"? Namespaces cannot be active or passive. A document can either conform to the namespaces recommendation or not conform to it, see http://www.w3.org/TR/1999/REC-xml-names-19990114/ This document does conform. Furthermore, an application can be aware or not aware of the namespaces - but this is independent of the document. > Like I said, there are no namespaces. ;-) I hear you say that, but I cannot understand what you mean. The namespaces are there, even if you are unaware that they are. > Ok, so if namespaces *are* active, what should be the localName of the > element in my example? Or even this document would do: > > > > There is nothing there to give a namespace to the element; what > should localName be? That is easy to answer: There is no default namespace (5.2), "If the URI reference in a default namespace declaration is empty, then unprefixed elements in the scope of the declaration are not considered to be in any namespace." So the localName is u"doc". > Note that the createElement() vs. createElementNS() distinction > isn't helpful here, either, since there's nothing in the LS spec > that gives guidance on which to use in this case, or even requires > that either is used in any case -- that's left as an implementation > detail. And rightly so - the LS spec does not even claim that you can implement it merely by using the DOM API. Instead, the real question is how the loader should operate when the "namespaces" feature is off. I don't think the spec currently says; most likely, the authors of the spec assume that the loader fills out the namespace attributes, anyway. They are probably wrong in this position - the loader then has no way to process a document meaningfully that is not namespace conforming. 
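
For anyone who wants to check the localName answer against a running
DOM, here is a minimal sketch using the standard library's
xml.dom.minidom, which does namespace processing when parsing.  The
helper name and the sample documents are assumptions made up for
illustration; this says nothing about what the LS spec should require.

```python
from xml.dom import minidom


def local_name_and_namespace(xml_text):
    """Parse a document and report (localName, namespaceURI) of its root."""
    root = minidom.parseString(xml_text).documentElement
    return root.localName, root.namespaceURI


# An unprefixed element with no default namespace declaration in scope:
# it has a local name but is not in any namespace.
print(local_name_and_namespace("<doc/>"))

# A prefixed element whose prefix is bound by an xmlns:a declaration.
print(local_name_and_namespace('<a:doc xmlns:a="http://xml.python.org/a"/>'))
```

The first call illustrates the point from section 5.2 quoted above:
the element gets a local name even though no namespace applies to it.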
Regards,
Martin

From jedp@ilm.com  Fri Aug  9 19:37:29 2002
From: jedp@ilm.com (Jed Parsons)
Date: Fri, 9 Aug 2002 11:37:29 -0700 (PDT)
Subject: [XML-SIG] preserving doctype declaration with xml.dom?
Message-ID: <200208091837.LAA92083@ocean.lucasdigital.com>

I'm losing information when going from text to dom and
back to text again.  I'm using PyXML-0.7.1 and 4Suite-0.12.0a2.

If I have a document like this (the css thing is for an editor):

    ... blah ...

And I do this:

    import StringIO
    import xml.dom.ext
    import xml.dom.ext.reader.PyExpat

    # convert xml file to dom
    reader = xml.dom.ext.reader.PyExpat.Reader()
    input = open("file.xml")
    dom = reader.fromString(input.read())

    # ... add some attributes to some elements here, only
    # touching things in the dom.documentElement ...

    # convert back to text and save
    string = StringIO.StringIO()
    xml.dom.ext.Print(dom, string)
    # and write string.getvalue() to a file...

When I look at my new file, the doctype declaration just says

Am I doing something wrong?

Thanks in advance for any help,
Jed

-- 
Jed Parsons                 Industrial Light + Magic       (415) 448-2974

grep(do{for(ord){(!$_&&print"$s\n")||(($O+=(($_-1)%6+1)and
grep(vec($s,$O++,1)=1,1..int(($_-6*6-1)/6))))}},(split(//,
"++,++2-27,280,481=1-7.1++2,800+++2,8310/1+4131+1++2,80\0. What!?")));

From martin@v.loewis.de  Sat Aug 10 00:35:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Aug 2002 01:35:55 +0200
Subject: [XML-SIG] preserving doctype declaration with xml.dom?
In-Reply-To: <200208091837.LAA92083@ocean.lucasdigital.com>
References: <200208091837.LAA92083@ocean.lucasdigital.com>
Message-ID:

Jed Parsons writes:

> I'm losing information when going from text to dom and
> back to text again.

That is not surprising; none of the DOM implementations preserves this
information. If you need the functionality, you are encouraged to
research this issue, and propose fixes; please expect this to be very
difficult.
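
One possible workaround, sketched here under assumptions and not taken
from the thread: read the doctype name and identifiers with pyexpat's
StartDoctypeDeclHandler before building the DOM, then write the
declaration back by hand when serializing.  The helper names
read_doctype and format_doctype and the sample document are made up
for illustration, and the sketch does not capture an internal subset.

```python
import xml.parsers.expat


def read_doctype(xml_bytes):
    """Return the doctype name and identifiers seen by Expat, or {}."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once by Expat when it reaches the <!DOCTYPE ...> start.
        found.update(name=name, system_id=system_id, public_id=public_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)
    return found


def format_doctype(info):
    """Rebuild a declaration string from what read_doctype() captured."""
    if not info:
        return ""
    if info["public_id"]:
        return '<!DOCTYPE %s PUBLIC "%s" "%s">' % (
            info["name"], info["public_id"], info["system_id"])
    if info["system_id"]:
        return '<!DOCTYPE %s SYSTEM "%s">' % (info["name"], info["system_id"])
    return "<!DOCTYPE %s>" % info["name"]


sample = b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc/>'
print(format_doctype(read_doctype(sample)))
```

The rebuilt string would then be written to the output file ahead of
the serialized document element.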
Regards, Martin From jedp@ilm.com Sat Aug 10 00:43:43 2002 From: jedp@ilm.com (Jed Parsons) Date: Fri, 9 Aug 2002 16:43:43 -0700 (PDT) Subject: [XML-SIG] preserving doctype declaration with xml.dom? In-Reply-To: References: <200208091837.LAA92083@ocean.lucasdigital.com> Message-ID: <200208092343.QAA72537@ocean.lucasdigital.com> I see. I shall pursue workarounds. Thanks, j Martin v. Loewis writes: > Jed Parsons writes: > > > I'm losing information when going from text to dom and > > back to text again. > > That is not surprising; none of the DOM implementations preserves this > information. If you need the functionality, you are encouraged to > research this issue, and propose fixes; please expect this to be very > difficult. > > Regards, > Martin -- Jed Parsons Industrial Light + Magic (415) 448-2974 grep(do{for(ord){(!$_&&print"$s\n")||(($O+=(($_-1)%6+1)and grep(vec($s,$O++,1)=1,1..int(($_-6*6-1)/6))))}},(split(//, "++,++2-27,280,481=1-7.1++2,800+++2,8310/1+4131+1++2,80\0. What!?"))); From uche.ogbuji@fourthought.com Sun Aug 11 07:26:06 2002 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sun, 11 Aug 2002 00:26:06 -0600 Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: Message from "Fred L. Drake, Jr." of "Thu, 08 Aug 2002 13:36:09 EDT." <15698.44041.680284.604789@grendel.zope.com> Message-ID: > > Martin v. Loewis writes: > > > > > xmlns:A="http://xml.python.org/a" > > > xmlns:b="http://xml.python.org/b" > > > a:a="a" b:b="b" > > > /> > > > > > > This is just XML 1.0, no namespaces! > > > > Why do you say that this document has no namespaces? It looks to me > > like it has! > > Because I've said this is only an XML 1.0 document only; it happens to > attributes that would be namespace declarations and prefixes if > namespace processing were active, but it isn't. Umm. Not so fast. This document is not XML 1.0 well-formed because attribute names starting with "xml" are reserved. It only becomes WF through the auspices of XMLNS. 
So either it's a namespaces document, or it's not an XML document at all :-)

-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From Alexandre.Fayolle@logilab.fr  Mon Aug 12 14:32:34 2002
From: Alexandre.Fayolle@logilab.fr (Alexandre)
Date: Mon, 12 Aug 2002 15:32:34 +0200
Subject: [XML-SIG] Error 404 and xml.dom.ext.reader
In-Reply-To: <87vg714nvh.fsf@marant.org>
References: <87vg714nvh.fsf@marant.org>
Message-ID: <20020812133234.GB17140@orion.logilab.fr>

Hello, I'm the current Debian maintainer for python-xml and
python-4suite, and Jérôme has forwarded me your mail.

> From: "J. Imlay"
> To: jerome@debian.org
> cc: jimlay@u.washington.edu
> Date: Sat, 27 Jul 2002 00:35:40 -0700 (PDT)
> Subject: python2.1-xml but with xml.dom.ext.reader.PyExpat?
>
> Hello, I know this isn't your department but I can't figure out who this
> developer for this actually is. It looks like it's 4suite but I don't
> think it is because I thought PyExpat was done by the PyExpat people who
> are not 4Suite. If you could forward this to the appropriate party, (and
> keep me in the cc if you will) I'd appreciate it.

Actually, it's the PyXML code you are using (4DOM, to which xml.dom.ext
belongs, was donated by the 4Suite team to the PyXML project). I'm
cc'ing the PyXML mailing list for further discussion.
>
> from xml.dom.ext.reader import PyExpat
> reader = PyExpat.Reader()
> doc = reader.fromUri(uri)
>
> If the uri contains a # sign (as URIs with references to an anchor tag
> do), the # sign should be ignored, no? Instead if
> uri="http://purl.org/file#" and you ask for the file, the webserver
> (depending on how smart it is, apache figures it out, but not all web
> servers do) will return a 404. And the url handler does not realize it's
> a 404 and proceeds to choke on the non-xml output. So 2 things.
>
> 1. It should (I think, you of course can disagree if you think I am
> ignorant) pick off the # before making the GET request.
>
> 2. If there is a http error returned in the GET request it should return
> that rather than trying to parse the 404 page as XML and dying with a
> line 1 column 54 error. (the error baffled more than 1 Programmer beyond
> solvability, it took some haxoring to figure out it was the # at the end
> of the URL that was bombing it)

This is certainly a bug, but after having given a look at the code in
PyXML, I'd say that it is most likely a bug in the urllib module from
the python standard library, which doesn't throw an exception when an
HTTP error is encountered.

>>> from urllib import urlopen
>>> urlopen('http://purl.org/file#').read()
'\n\n404 Not Found\n\nNot Found\nThe requested URL /file was not found on this server.\n\n'
>>> urlopen('http://purl.org/file').read()
'\n\n404 Not Found\n\nNot Found\nThe requested URL /file was not found on this server.\n\n'

Now, this has been fixed in urllib2:

>>> from urllib2 import urlopen
>>> urlopen('http://purl.org/file').read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  <...>
  File "/usr/lib/python2.1/urllib2.py", line 425, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found

Since support for Python1.5 has been dropped from PyXML, perhaps using
urllib2 instead of urllib should be considered. I don't know if this
module is available in Python2.0, though.

Any opinion?

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).

From Mike.Olson@fourthought.com  Mon Aug 12 15:29:19 2002
From: Mike.Olson@fourthought.com (Mike Olson)
Date: 12 Aug 2002 08:29:19 -0600
Subject: [XML-SIG] Error 404 and xml.dom.ext.reader
In-Reply-To: <20020812133234.GB17140@orion.logilab.fr>
References: <87vg714nvh.fsf@marant.org> <20020812133234.GB17140@orion.logilab.fr>
Message-ID: <1029162560.6760.28.camel@penny>

On Mon, 2002-08-12 at 07:32, Alexandre wrote:

I think it is a good idea.

Mike

> Hello, I'm the current Debian maintainer for python-xml and
> python-4suite, and Jérôme has forwarded me your mail.
>
> > From: "J. Imlay"
> > To: jerome@debian.org
> > cc: jimlay@u.washington.edu
> > Date: Sat, 27 Jul 2002 00:35:40 -0700 (PDT)
> > Subject: python2.1-xml but with xml.dom.ext.reader.PyExpat?
> >
> > Hello, I know this isn't your department but I can't figure out who this
> > developer for this actually is. It looks like it's 4suite but I don't
> > think it is because I thought PyExpat was done by the PyExpat people who
> > are not 4Suite. If you could forward this to the appropriate party, (and
> > keep me in the cc if you will) I'd appreciate it.
>
> [...]

-- 
Mike Olson                                Principal Consultant
mike.olson@fourthought.com                +1 303 583 9900 x 102
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St,                      http://4Suite.org
Boulder, CO 80301-2537, USA
XML strategy, XML tools, knowledge management

From martin@v.loewis.de  Mon Aug 12 16:50:06 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Aug 2002 17:50:06 +0200
Subject: [XML-SIG] Error 404 and xml.dom.ext.reader
In-Reply-To: <1029162560.6760.28.camel@penny>
References: <87vg714nvh.fsf@marant.org> <20020812133234.GB17140@orion.logilab.fr> <1029162560.6760.28.camel@penny>
Message-ID:

Mike Olson writes:

> I think it is a good idea.

I agree. Alexandre, would you like to implement that change?
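
For reference, a minimal sketch of the two behaviours asked for in the
original report, written against a current Python where urllib2 has
become urllib.request.  The function names are hypothetical and this
is not the change that went into PyXML: the fragment is dropped before
the request is made, and an HTTP error surfaces as an exception
instead of an error page being handed to the parser.

```python
from urllib.error import HTTPError
from urllib.parse import urldefrag
from urllib.request import urlopen


def resource_url(uri):
    """Drop any '#fragment' part: it is never sent to the server."""
    url, _fragment = urldefrag(uri)
    return url


def fetch_document(uri):
    """Fetch the bytes behind uri, failing loudly on an HTTP error."""
    try:
        with urlopen(resource_url(uri)) as response:
            return response.read()
    except HTTPError as err:
        # Re-raise with the URI included so the caller sees what failed,
        # rather than an XML syntax error from parsing the error page.
        raise IOError("cannot load %s: HTTP %d %s"
                      % (uri, err.code, err.reason)) from err
```

With this, fetch_document("http://purl.org/file#") requests
http://purl.org/file, and a 404 stops the load before any XML parsing
is attempted.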
Regards, Martin From noreply@sourceforge.net Mon Aug 12 20:25:23 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Mon, 12 Aug 2002 12:25:23 -0700 Subject: [XML-SIG] [ pyxml-Bugs-594207 ] undefined symbol: error in pyexpat Message-ID: Bugs item #594207, was opened at 2002-08-12 19:25 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=594207&group_id=6473 Category: pyexpat Group: None Status: Open Resolution: None Priority: 5 Submitted By: Ray Leyva (rayleyva) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: undefined symbol: error in pyexpat Initial Comment: >>> from xml.xslt.Processor import Processor Traceback (most recent call last): File "", line 1, in ? File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/Processor.py", line 24, in ? from xml.xslt import StylesheetReader, ReleaseNode File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/StylesheetReader.py", line 286, in ? from xml.parsers import expat File "/usr/lib/python2.2/site-packages/_xmlplus/parsers/expat.py", line 4, in ? from pyexpat import * ImportError: /usr/lib/python2.2/site-packages/_xmlplus/parsers/pyexpat.so: undefined symbol: PyUnicodeUCS2_DecodeUTF8 Thanks, Ray ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=594207&group_id=6473 From Alexandre.Fayolle@logilab.fr Tue Aug 13 08:38:51 2002 From: Alexandre.Fayolle@logilab.fr (Alexandre) Date: Tue, 13 Aug 2002 09:38:51 +0200 Subject: [XML-SIG] Error 404 and xml.dom.ext.reader In-Reply-To: References: <87vg714nvh.fsf@marant.org> <20020812133234.GB17140@orion.logilab.fr> <1029162560.6760.28.camel@penny> Message-ID: <20020813073851.GC18060@orion.logilab.fr> On Mon, Aug 12, 2002 at 05:50:06PM +0200, Martin v. Loewis wrote: > Mike Olson writes: > > > I think it is a good idea. > > I agree. Alexandre, would you like to implement that change? I'll give it a try today. 
Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).

From Alexandre.Fayolle@logilab.fr Tue Aug 13 10:25:24 2002
From: Alexandre.Fayolle@logilab.fr (Alexandre)
Date: Tue, 13 Aug 2002 11:25:24 +0200
Subject: [XML-SIG] Error 404 and xml.dom.ext.reader
In-Reply-To: <20020813073851.GC18060@orion.logilab.fr>
References: <87vg714nvh.fsf@marant.org> <20020812133234.GB17140@orion.logilab.fr> <1029162560.6760.28.camel@penny> <20020813073851.GC18060@orion.logilab.fr>
Message-ID: <20020813092524.GG18060@orion.logilab.fr>

On Tue, Aug 13, 2002 at 09:38:51AM +0200, Alexandre wrote:
> On Mon, Aug 12, 2002 at 05:50:06PM +0200, Martin v. Loewis wrote:
> > Mike Olson writes:
> >
> > > I think it is a good idea.
> >
> > I agree. Alexandre, would you like to implement that change?
>
> I'll give it a try today.

OK, I've got something up and running, which runs nicely with
regrtest.py. I'll be committing it very soon.

The changes are mainly along the lines of changing urllib.urlopen to
urllib2.urlopen, and urllib.basejoin to urlparse.urljoin.

However, I noticed that whereas urllib.urlopen was happy with
filesystem paths, urllib2.urlopen is not (e.g.
urllib2.urlopen('/etc/passwd') raises an AssertionError). I had to work
a bit on xml/parsers/xmlproc/xmlapp.py so that the regression tests
would pass. My concern is that there may be other places with similar
issues which are not triggered by regrtest.py.

It would be a great help if you could install the new version and tell
me if it causes problems with your application.

Thanks.

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).
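
[Editorial sketch] The filesystem-path problem described in the message above (a URL opener that rejects bare paths such as '/etc/passwd') can be sidestepped by turning paths into file: URLs before opening them. What follows is only an illustration under stated assumptions, not the change that was committed to PyXML: it is written for a current Python, where urllib2 lives on as urllib.request, and the names as_url and open_source are hypothetical helpers invented here.

```python
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen


def as_url(source):
    """Return source unchanged if it looks like a URL, else a file: URL.

    Hypothetical helper for illustration; not part of PyXML or 4Suite.
    """
    scheme = urlsplit(source).scheme
    # A one-letter "scheme" is most likely a Windows drive letter (C:\...),
    # so only longer schemes are treated as real URL schemes.
    if len(scheme) > 1:
        return source
    # as_uri() requires an absolute path, hence absolute() first.
    return Path(source).absolute().as_uri()


def open_source(source):
    """Open either a URL or a plain filesystem path for reading bytes."""
    return urlopen(as_url(source))
```

With this in place, open_source('/etc/passwd') and open_source('http://purl.org/file') both go through the same opener, and an HTTP 404 still surfaces as an exception instead of an error page being parsed as XML.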
From dkuhlman@cutter.rexx.com Wed Aug 14 01:36:50 2002 From: dkuhlman@cutter.rexx.com (Dave Kuhlman) Date: Tue, 13 Aug 2002 17:36:50 -0700 Subject: [XML-SIG] ANN: Enhancements to generateDS.py Message-ID: <20020813173650.A71626@cutter.rexx.com> I've enhanced generateDS.py. What it does -- generateDS.py takes a XML Schema definition of an XML document type and generates Python source code that defines classes that can be used to represent the elements in that document type. It also generates a parser that will load an XML document of that type into instances of the generated Python classes. What I've added -- The ability to generate a separate file containing subclasses of the element generation classes. This will make it more convenient for user/developers to add customizing methods to the classes, to add customizing behaviors in different tasks in different files, etc. I've also written a document that compares the use of XSLT and generateDS.py for the purpose of performing transformations on XML documents. You can find it at: http://www.rexx.com/~dkuhlman/#generateDS Comments and suggestions are welcome. - Dave -- Dave Kuhlman dkuhlman@rexx.com http://www.rexx.com/~dkuhlman From mertz@gnosis.cx Wed Aug 14 05:01:29 2002 From: mertz@gnosis.cx (David Mertz, Ph.D.) Date: Wed, 14 Aug 2002 00:01:29 -0400 Subject: [XML-SIG] Listing Gnosis_Utils with PyXML pages Message-ID: Ooops.... I think I mangled some headers, let me try again (sorry): -------- Forwarded message -------- I must confess that I have not followed developments in PyXML as closely as I might have. I suppose there is something notable about that lacuna, given that I write about both XML and Python. But my strategy tends to be to follow a particular project closely for little stints while I am writing about them, but not between such coverage (at least not in detail). 
Anyway, preamble done, I am a bit stricken to look through some PyXML pages, and not see my Gnosis_Utils package even mentioned--for example at: http://pyxml.sourceforge.net/topics/software.html Actually, I really think that my gnosis.xml.pickle subpackage is usually a better means of serializing Python objects than are WDDX or XML-RPC; ideally I'd love to see it be part of PyXML. But that is probably a "money where my mouth is" kind of issue, and I haven't contributed to PyXML itself (even by discussing ongoing development issues). Well, I also think gnosis.xml.objectify is an easier way to just grab some data in an XML file and work with it than are the standard (heavy) APIs. Same comment there as before. Still, I'd like to ask the relevant website maintainer to add a little link for the Gnosis stuff. Btw. here's part of the current blurb on Gnosis Utils: BACKGROUND: Gnosis Utilites contains a number of Python libraries, most (but not all) related to working with XML. These include: gnosis.xml.pickle (XML pickling of Python objects) gnosis.xml.objectify (Any XML to "native" Python objects) gnosis.xml.validity (Enforce validity constraints) gnosis.xml.indexer (XPATH indexing of XML documents) gnosis.indexer (Full-text indexing/searching) [...].convert.txt2html (Convert ASCII source files to HTML) gnosis.util.dtd2sql (DTD -> SQL 'CREATE TABLE' statements) gnosis.util.sql2dtd (SQL query -> DTD for query results) gnosis.util.xml2sql (XML -> SQL 'INSERT INTO' statements) gnosis.util.combinators (Combinatorial higher-order functions) gnosis.util.introspect (Introspect Python objects) ...and so much more! :-) Details on the latest changes are at: http://gnosis.cx/download/Gnosis_XML_Util.ANNOUNCE The current release is always available as: http://gnosis.cx/download/Gnosis_Utils-current.tar.gz Yours, David... -- mertz@ | The specter of free information is haunting the `Net! 
All the
gnosis | powers of IP- and crypto-tyranny have entered into an unholy
.cx | alliance...ideas have nothing to lose but their chains. Unite
| against "intellectual property" and anti-privacy regimes!
-------------------------------------------------------------------------

From uche.ogbuji@fourthought.com Wed Aug 14 23:37:38 2002
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 14 Aug 2002 16:37:38 -0600
Subject: [XML-SIG] Error 404 and xml.dom.ext.reader
In-Reply-To: Message from Mike Olson of "12 Aug 2002 08:29:19 MDT." <1029162560.6760.28.camel@penny>
Message-ID:

Re: Alexandre's suggestion of moving to urllib2

> I think it is a good idea.

So do I. I actually have a bug or two to tackle in Uri for Evan tonight.
I'll give this a whirl as well (in 4Suite).

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From uche.ogbuji@fourthought.com Wed Aug 14 23:42:04 2002
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 14 Aug 2002 16:42:04 -0600
Subject: [XML-SIG] Error 404 and xml.dom.ext.reader
In-Reply-To: Message from Alexandre of "Tue, 13 Aug 2002 11:25:24 +0200." <20020813092524.GG18060@orion.logilab.fr>
Message-ID:

> On Tue, Aug 13, 2002 at 09:38:51AM +0200, Alexandre wrote:
> > On Mon, Aug 12, 2002 at 05:50:06PM +0200, Martin v. Loewis wrote:
> > > Mike Olson writes:
> > >
> > > > I think it is a good idea.
> > >
> > > I agree. Alexandre, would you like to implement that change?
> >
> > I'll give it a try today.
>
> OK, I've got something up and running, which runs nicely with
> regrtest.py. I'll be committing it very soon.
>
> The changes are mainly on the line of changing urllib.urlopen to
> urllib2.urlopen, and urllib.basejoin to urlparse.urljoin.

Hmm. Here be dragons.

There are subtle but important differences between the behavior of
urllib.basejoin and urlparse.urljoin, especially with file URLs in Windows.
I don't remember the full details offhand but I think Tom Passin and Mike
Olson had detailed messages on this in the past.

I would suggest keeping urllib around just for basejoin, since it seems we've
reached, after a lot of tinkering, a balance where the fewest number of
platform users scream :-)

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From noreply@sourceforge.net Thu Aug 15 04:31:50 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 14 Aug 2002 20:31:50 -0700
Subject: [XML-SIG] [ pyxml-Bugs-595376 ] Cannot Use FromXmlStream Twice on Linux
Message-ID:

Bugs item #595376, was opened at 2002-08-14 23:31
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=595376&group_id=6473

Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Keyton Weissinger (keytonw)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cannot Use FromXmlStream Twice on Linux

Initial Comment:
If you have two XML files (valid, etc), and attempt to load them into two different DOMs using FromXmlStream()
then the first works but the second does not ON LINUX (not on Windows). So the following: from xml.dom.ext.reader.Sax2 import FromXmlStream myFirstDom = FromXmlStream("/usr/files/myFirst.xml") mySecondDom = FromXmlStream("/usr/files/mySecond.xml") Results in an error every time, regardless of which file is loaded first. The first succeeds and the second does not. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=595376&group_id=6473 From noreply@sourceforge.net Thu Aug 15 21:54:58 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 15 Aug 2002 13:54:58 -0700 Subject: [XML-SIG] [ pyxml-Bugs-595729 ] 4DOM/xmlproc catalog not working Message-ID: Bugs item #595729, was opened at 2002-08-16 06:54 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=595729&group_id=6473 Category: xmlproc Group: None Status: Open Resolution: None Priority: 5 Submitted By: Alastair Rankine (alastair) Assigned to: Lars Marius Garshol (larsga) Summary: 4DOM/xmlproc catalog not working Initial Comment: (PyXML 0.8, Python 2.2.1, on Windows) The problem is that I'm trying to get catalog support, in order to validate a document whose DTD is specified by a public ID. As per the docs I am using the xml.dom.ext.reader.Sax2 module: doc = FromXmlFile(myFile, validate=1, catName="catalog") Now as I understand it from reading the source, what this does is create a Sax2-compliant validating parser according to the parser factory, which by default is xmlproc with Sax2 wrapper (ie drv_xmlproc). So it then looks to see that the catName parameter is set, and creates an SAX_catalog resolver to pass into the SAX2 parser. Now this looks fine, except that the SAX_catalog is not called by the underlying xmlproc parser when resolving entities during parsing (eg DTD public IDs). 
The problem seems to be in drv_xmlproc.prepareParser - we really need to convert the existing SAX entity resolver into an xmlproc resolver. There is a FIXME here - but we actually need to fix xmlproc.catalog by either merging SAX_catalog with xmlproc_catalog, or better still by allowing the construction of an xmlproc_catalog from a SAX-compliant entity parser. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=595729&group_id=6473 From Alexandre.Fayolle@logilab.fr Fri Aug 16 09:17:51 2002 From: Alexandre.Fayolle@logilab.fr (Alexandre) Date: Fri, 16 Aug 2002 10:17:51 +0200 Subject: [XML-SIG] Error 404 and xml.dom.ext.reader In-Reply-To: References: <20020813092524.GG18060@orion.logilab.fr> Message-ID: <20020816081751.GC4795@orion.logilab.fr> On Wed, Aug 14, 2002 at 04:42:04PM -0600, Uche Ogbuji wrote: > > On Tue, Aug 13, 2002 at 09:38:51AM +0200, Alexandre wrote: > > > On Mon, Aug 12, 2002 at 05:50:06PM +0200, Martin v. Loewis wrote: > > > > Mike Olson writes: > > > > > > > > > I think it is a good idea. > > > > > > > > I agree. Alexandre, would you like to implement that change? > > > > > > I'll give it a try today. > > > > OK, I've got something up and running, which runs nicely with > > regrtest.py. I'll be committing it very soon. > > > > The changes are mainly on the line of changing urllib.urlopen to > > urllib2.urlopen, and urllib.basejoin to urlparse.urljoin. > > Hmm. Here be dragons. ;o) > There are subtle but important differences between the behavior of > urllib.basejoin and urlparse.urljoin, especially with file URLs in Windows. > I don't remember the full details off-head but I think Tom Passim and Mike > Olson had detailed messages on this in the past. If anyone has an approximative date for this one, I'd be glad to go and have a look in the archives. 
> I would suggest keeping urllib around just for basejoin, since it seems we've
> reached, after a lot of tinkering, a balance where the fewest number of
> platform users scream :-)

Sounds fair. I'll do the changes today and commit them.

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).

From Alexandre.Fayolle@logilab.fr Fri Aug 16 15:32:10 2002
From: Alexandre.Fayolle@logilab.fr (Alexandre)
Date: Fri, 16 Aug 2002 16:32:10 +0200
Subject: [XML-SIG] Expat and the Python Profiler
Message-ID: <20020816143210.GH4795@orion.logilab.fr>

Hello,

We're having problems profiling python programs using Expat.

We traced the problem to the call_with_frame function in pyexpat.c. This
function creates a new frame and assigns it to the thread's frame. The
profiler gets confused because this frame is neither the parent nor
grandparent frame of the current frame, and we get the following traceback:

Fixmlreader.IncrementalParser.parse(self, source)lus/sax/expatreader.py", line
  File "/home/syt/lib/python2.2/site-packages/_xmlplus/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/home/syt/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/home/syt/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 283, in start_element
    def start_element(self, name, attrs):
  File "/usr/lib/python2.2/profile.py", line 214, in trace_dispatch_i
    if self.dispatch[event](self, frame,t):
  File "/usr/lib/python2.2/profile.py", line 260, in trace_dispatch_call
    assert rframe.f_back is frame.f_back, ("Bad call", rfn,
AssertionError: ('Bad call', ('/home/syt/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py', 196,'feed'), , ,, )

The Frame created by pyexpat is unknown to the profiler when the
callback function is executed.
This looks like a recent change in pyexpat (to avoid dumping core when an exception is raised in a callback maybe?), but not being able to profile code is really annoying. So the first question is: is this a feature of expat, or a bug in the profiler? Do you have any suggestion? Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). From noreply@sourceforge.net Fri Aug 16 16:45:32 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Fri, 16 Aug 2002 08:45:32 -0700 Subject: [XML-SIG] [ pyxml-Bugs-596104 ] xml.xslt.Processor does not work Message-ID: Bugs item #596104, was opened at 2002-08-16 17:45 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=596104&group_id=6473 Category: 4Suite Group: None Status: Open Resolution: None Priority: 5 Submitted By: Thomas Schroeder (redsofa) Assigned to: Nobody/Anonymous (nobody) Summary: xml.xslt.Processor does not work Initial Comment: Hi When I try to import the Processor I receive the following error: >>> from xml.xslt.Processor import Processor Traceback (most recent call last): File "", line 1, in ? File "/usr/local/lib/python2.2/site-packages/_xmlplus/xslt/Processor.py", line 24, in ? from xml.xslt import StylesheetReader, ReleaseNode File "/usr/local/lib/python2.2/site-packages/_xmlplus/xslt/StylesheetReader.py", line 53, in ? from xml.xslt.Stylesheet import StylesheetElement File "/usr/local/lib/python2.2/site-packages/_xmlplus/xslt/Stylesheet.py", line 22, in ? 
from xml.xslt import XsltElement, XsltException, InternalException, Error ImportError: cannot import name InternalException I use: Python 2.2.1 (#1, Aug 16 2002, 15:46:07) [GCC 2.95.3 20010315 (SuSE)] on linux2 PyXML-0.8 4Suite-0.11.1 Tom ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=596104&group_id=6473 From martin@v.loewis.de Fri Aug 16 20:24:56 2002 From: martin@v.loewis.de (Martin v. Loewis) Date: 16 Aug 2002 21:24:56 +0200 Subject: [XML-SIG] Expat and the Python Profiler In-Reply-To: <20020816143210.GH4795@orion.logilab.fr> References: <20020816143210.GH4795@orion.logilab.fr> Message-ID: Alexandre writes: > The Frame created by pyexpat is unknown to the profiler when the > callback function is executed. > > This looks like a recent change in pyexpat (to avoid dumping core when an > exception is raised in a callback maybe?), but not being able to profile > code is really annoying. My question: What version? There was a relatively recent (read: January 2001) change to pyexpat.c to add that frame, and a even more recent change (read: August 2002) to fix the problem with tracing. > So the first question is: is this a feature of expat, or a bug in the > profiler? Do you have any suggestion? It's a feature in expat to add this frame, so that a traceback in the callback won't show feed() as the topmost function. It's a bug in pyexpat.c (as shipped in PyXML 0.8) that this broke tracing; it's a fix in 2.72 of Python CVS that corrects this bug. If you need an urgent fix for your code base, feel free to backport this fix to whereever you need it; I'm leaving for two weeks, so I don't have the time to do this myself. 
HTH, Martin From larsga@garshol.priv.no Sat Aug 17 16:53:25 2002 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 17 Aug 2002 17:53:25 +0200 Subject: [XML-SIG] Proposal: command-line interface to parser In-Reply-To: References: Message-ID: * Matt G. | | * Print the status, warning, and error messages to stderr. I | think this is best done by having the parser throw an | exception object (with all the relevant information about | the error or warning), which the application catches. I'm | a bit lost on the benefit of registering an error handler. In most languages you can't return to where the exception was thrown and continue computing, which means that if you want to be able to continue parsing after errors/warnings this approach is much easier than using exceptions. Not that that is very relevant to command-line tools... | * Make validation an option, rather than a separate command Sounds reasonable to me. | * Supply an option to use an SGML catalog file (support | exists for both, right?), though I suppose you could | try to parse a catalog as an SGML catalog file, when it | fails validation as an XML Catalog file. xvcmd.py and xpcmd.py support both. | Is anyone unconvinced that another application is warranted or of | the value of including such an application in PyXML? Actually, I think it's a good idea. Use the existing parsers and write it, then submit it to PyXML, and see if it gets in. Another thing that might be interesting is a graphical XML parser front-end. I wrote the beginnings of one, but it needs more work to become really useful. -- Lars Marius Garshol, Ontopian ISO SC34/WG3, OASIS GeoLang TC From franks@mcs.anl.gov Sat Aug 17 20:38:01 2002 From: franks@mcs.anl.gov (Frank Siebenlist) Date: Sat, 17 Aug 2002 12:38:01 -0700 Subject: [XML-SIG] xml-signature/encryption? Message-ID: <5.1.1.2.2.20020817123542.029c3620@127.0.0.1> I haven't been able to find a python implementation of xml-signature/encryption. Any pointers would be appreciated. 
Thanks, Frank. From mike@skew.org Sat Aug 17 21:35:29 2002 From: mike@skew.org (Mike Brown) Date: Sat, 17 Aug 2002 14:35:29 -0600 (MDT) Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: "from Uche Ogbuji at Aug 11, 2002 00:26:06 am" Message-ID: <200208172035.g7HKZTh1053077@chilled.skew.org> Uche Ogbuji wrote: > > > > Martin v. Loewis writes: > > > > > > > xmlns:A="http://xml.python.org/a" > > > > xmlns:b="http://xml.python.org/b" > > > > a:a="a" b:b="b" > > > > /> > > > > > > > > This is just XML 1.0, no namespaces! > > > > > > Why do you say that this document has no namespaces? It looks to me > > > like it has! > > > > Because I've said this is only an XML 1.0 document only; it happens to > > attributes that would be namespace declarations and prefixes if > > namespace processing were active, but it isn't. > > Umm. Not so fast. This document is not XML 1.0 well-formed because attribute > names starting with "xml" are reserved. Bzzt. Names starting with "xml" are reserved, yes, but this is a semantic reservation ("don't use it unless it's for the purpose we say it's for), not a well-formedness constraint. The only constraint of the sort you are talking about is on a processing instruction target. That is, "" is definitely not allowed, but "" and "" are allowed. - Mike ____________________________________________________________________________ mike j. brown | xml/xslt: http://skew.org/xml/ denver/boulder, colorado, usa | resume: http://skew.org/~mike/resume/ From tpassin@comcast.net Sun Aug 18 18:58:39 2002 From: tpassin@comcast.net (Thomas B. Passin) Date: Sun, 18 Aug 2002 13:58:39 -0400 Subject: [XML-SIG] Error 404 and xml.dom.ext.reader References: <20020813092524.GG18060@orion.logilab.fr> <20020816081751.GC4795@orion.logilab.fr> Message-ID: <005d01c246e0$e1e82020$fe193044@tbp1> [Alexandre > > > The changes are mainly on the line of changing urllib.urlopen to > > > urllib2.urlopen, and urllib.basejoin to urlparse.urljoin. > > > > Hmm. 
Here be dragons. > > ;o) > > > There are subtle but important differences between the behavior of > > urllib.basejoin and urlparse.urljoin, especially with file URLs in Windows. > > I don't remember the full details off-head but I think Tom Passim and Mike > > Olson had detailed messages on this in the past. > > If anyone has an approximative date for this one, I'd be glad to go and > have a look in the archives. > The whale blows from time to time, as for example - 3-7-2001 1-18-2002 7-15-2002 Cheers, Tom P From uche.ogbuji@fourthought.com Mon Aug 19 08:30:38 2002 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Mon, 19 Aug 2002 01:30:38 -0600 Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: Message from Mike Brown of "Sat, 17 Aug 2002 14:35:29 MDT." <200208172035.g7HKZTh1053077@chilled.skew.org> Message-ID: > Uche Ogbuji wrote: > > > > > > Martin v. Loewis writes: > > > > > > > > > xmlns:A="http://xml.python.org/a" > > > > > xmlns:b="http://xml.python.org/b" > > > > > a:a="a" b:b="b" > > > > > /> > > > > > > > > > > This is just XML 1.0, no namespaces! > > > > > > > > Why do you say that this document has no namespaces? It looks to me > > > > like it has! > > > > > > Because I've said this is only an XML 1.0 document only; it happens to > > > attributes that would be namespace declarations and prefixes if > > > namespace processing were active, but it isn't. > > > > Umm. Not so fast. This document is not XML 1.0 well-formed because attribute > > names starting with "xml" are reserved. > > Bzzt. Names starting with "xml" are reserved, yes, but this is a semantic > reservation ("don't use it unless it's for the purpose we say it's for), not a > well-formedness constraint. OK, you are right on the technicality (I would have jabbed you with the same ice pick ;-) ). But I don't think it changes the practical discussion. 
The semantic restriction is fully in place because minidom is namespace-aware,
which means that it is meaningless to say that Fred's document is not an
XML namespace document. I think this is what Martin has been saying in his
cryptic way :-).

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One Boston: http://www.xmlconference.com/
Basic XML and RDF techniques for knowledge management, Part 7 - http://www-106.ibm.com/developerworks/xml/library/x-think12.html
Keeping pace with James Clark - http://www-106.ibm.com/developerworks/xml/library/x-jclark.html
Python and XML development using 4Suite, Part 3: 4RDF - http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/8A1EA5A2CF4621C386256BBB006F4CEC

From Alexandre.Fayolle@logilab.fr Mon Aug 19 09:20:59 2002
From: Alexandre.Fayolle@logilab.fr (Alexandre)
Date: Mon, 19 Aug 2002 10:20:59 +0200
Subject: [XML-SIG] Expat and the Python Profiler
In-Reply-To:
References: <20020816143210.GH4795@orion.logilab.fr>
Message-ID: <20020819082059.GJ4795@orion.logilab.fr>

On Fri, Aug 16, 2002 at 09:24:56PM +0200, Martin v. Loewis wrote:
> Alexandre writes:
>
> > The Frame created by pyexpat is unknown to the profiler when the
> > callback function is executed.
> >
> > This looks like a recent change in pyexpat (to avoid dumping core when an
> > exception is raised in a callback maybe?), but not being able to profile
> > code is really annoying.
>
> My question: What version? There was a relatively recent (read:
> January 2001) change to pyexpat.c to add that frame, and a even more
> recent change (read: August 2002) to fix the problem with tracing.

As of PyXML 0.8 or current CVS, depending on the machine I tested on.
Sorry for not providing the information.

>
> > So the first question is: is this a feature of expat, or a bug in the
> > profiler? Do you have any suggestion?
>
> It's a feature in expat to add this frame, so that a traceback in the
> callback won't show feed() as the topmost function.
>
> It's a bug in pyexpat.c (as shipped in PyXML 0.8) that this broke
> tracing; it's a fix in 2.72 of Python CVS that corrects this bug.

OK.

> If you need an urgent fix for your code base, feel free to backport
> this fix to wherever you need it; I'm leaving for two weeks, so I
> don't have the time to do this myself.

See you soon then.

I've extracted the patch to apply it on pyxml. The problem I have is that
it doesn't seem to be compatible with python < 2.2 (it uses the PyTrace_CALL
constant which is not defined in python 2.1).

Is making the patch available only to users of python2.2 an acceptable
solution? I'll try to do that and I'll post it on the SF patch manager
for review.

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).

From noreply@sourceforge.net Mon Aug 19 09:48:51 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 19 Aug 2002 01:48:51 -0700
Subject: [XML-SIG] [ pyxml-Patches-597052 ] pyexpat and profiler fix (Py2.2 only)
Message-ID:

Patches item #597052, was opened at 2002-08-19 10:48
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=597052&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Nobody/Anonymous (nobody)
Summary: pyexpat and profiler fix (Py2.2 only)

Initial Comment:
See http://mail.python.org/pipermail/xml-sig/2002-August/008244.html
and following thread for description.

This is mainly a backport of Martin von Löwis' patch on python's expat,
with some conditional preprocessor directives because the patch only
seems to work on python >=2.2.
Alexandre ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=306473&aid=597052&group_id=6473 From Alexandre.Fayolle@logilab.fr Mon Aug 19 09:50:31 2002 From: Alexandre.Fayolle@logilab.fr (Alexandre) Date: Mon, 19 Aug 2002 10:50:31 +0200 Subject: [XML-SIG] Expat and the Python Profiler In-Reply-To: <20020819082059.GJ4795@orion.logilab.fr> References: <20020816143210.GH4795@orion.logilab.fr> <20020819082059.GJ4795@orion.logilab.fr> Message-ID: <20020819085031.GL4795@orion.logilab.fr> On Mon, Aug 19, 2002 at 10:20:59AM +0200, Alexandre wrote: > I've extracted the patch to apply it on pyxml. The problem I have is that > it doesn't seem to be compatible with python < 2.2 (it uses the PyTrace_CALL > constant which is not defined in python 2.1). > > Is making the patch available only to users of python2.2 an acceptable > solution? I'll try to do that and I'll post it on the SF patch manager > for review. Done. The patch is available as #597052 at (warning! Long URL) http://sourceforge.net/tracker/index.php?func=detail&aid=597052&group_id=6473&atid=306473 Thanks for your feedback. Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). From BudP.Bruegger Mon Aug 19 18:28:19 2002 From: BudP.Bruegger (BudP.Bruegger) Date: Mon, 19 Aug 2002 19:28:19 +0200 Subject: [XML-SIG] ANN: SLiP and SLIDE - a quick XML shorthand syntax and tool for editing Message-ID: <20020819192819.630dce5b.bud@sistema.it> hello, (A little late,) I have noted the announcement of SLiP and the followup discussion on XML shorthand on this list. Have you guys followed up on the topic and are working on a joint specification/implementation? I would be interested to join in. In the following I describe some ideas. 
In 1999, I did some work on an XML shorthand syntax in the context of a
successor of Ian Clatworthy's SDF (see
http://www.cpan.org/modules/by-authors/id/IANC/sdf-2.000.readme). Some
of the ideas may be quite applicable to the discussion here. Here is a
modernized/simplified version (that takes your postings into account):

[see also examples below]

I think of the XML document in terms of DOM nodes where nodes are either
elements, attributes, text, or comments. (Maybe also processing
instructions, entities, etc. should be added).

Generally, each node starts on a line of its own and is often contained
in a single line. (Note that PYX uses a similar approach, see for
example http://www.xml.com/pub/a/2000/03/15/feature/index.html). This
makes parsing easy, makes the structure more explicit, and possibly
allows ad-hoc analysis using tools such as grep and similar. Most
importantly, however, this approach also makes it much easier to mix
text nodes with element nodes such as in the following example: "this is
a boldword".

The hierarchy of nodes is defined by (pythonic) indentation.

An element has the format

  *

An attribute has the format

  @"" (WS=white space without newline)

A comment has the format

  #"" (WS=white space without newline)

Note that this does not seem to conflict with the use of '#' in URLs.

A text node has the format

  ""

or on multiple lines:

  """
  ...
  """

details: Note that occurrences of '"' or '"""' after the opening '"' or
'"""' should be escaped with '\'. Similarly, the special characters "*",
"@", "#", and "%" as first characters of a multi-line text need to be
quoted with "\". Also, it is on purpose that the multi-line text starts
only on the line below the '"""'--I like the indentation better,
particularly for pre-formatted things. Note also that instead of (single
or triple) double quote characters ('"'), single quote characters ("'")
could be used equivalently. Also, the indentation is stripped off in the
equivalent XML document.
This prevents multi-line (possibly pre-formatted) text from breaking the visual structure of the document. For rapid authoring, a text node is not really a strict text node but should automatically quote the common character entities. For example "1 > 0" really represents "1 &gt; 0". Also, in single-line text, if the line breaks need to be expressed precisely, using '\n' to represent line breaks is highly beneficial (see example 2 below).

If entity substitution (quoting) is not desired, and for many other uses, variations of the plain text node can be used. They have the following format: %"" or in the case of multiple lines: %""" ...

While the approach really offers unlimited possibilities, I have thought of the following possible text types:

* normal: i.e., equivalent to stating no %
* raw: no quotation or different substitution. This also allows embedding XML into the shorthand document
* structuredText or st: This makes it easy to write lists, emphasized words, etc.
* stNG: the same, in a different version
* sh: the resulting text returned by some shell command
* py: the resulting text returned by some python expression
* incl: inclusion of the content of an external file (could be done with sh)
* img: translates to an html image element but smartly computes size, changes format, creates thumbnail, or similar.
* table: some smart way of making tables (see for example the several approaches in SDF that were quite easy and successful--I always used a fixed format one)

Obviously, there are unlimited possibilities and the shorthand package could come with a small predefined library and a simple mechanism to add one's own. (I implemented a web template system many years ago in perl that used the same approach very successfully). While I haven't thought this through in detail, the text processing code could either expect a certain formatting convention (as in ordered lists or tables) and/or use attributes that are associated with the text node.
(Note that this is an extension of the DOM model where text nodes cannot have attributes). Some useful shortcuts: While I proposed to start each DOM node on a separate line, here is a possible exception that makes it possible to be terser (but makes it impossible to analyse the document using easy grepping). In the case where an element has no attributes and contains a single, single-line text node, the text can be added to the same line as the node: *"" Similarly, in the case of a multi-line text following an attribute less element, one could write: *""" ... """ The implementation seems rather straight forward. In particularly, the type of node can be detected by a parser looking simply at the beginning of each line. One source of ambiguity that has to be solved is how to differentiate empty elements from non-empty elements without content. For example, how should the shorthand differ for and ? One possible solution would be to use "*/" instead of "*" to prefix empty elements. While converting xml to shorthand is trivial, some more challenges are to be expected when going from xml back to shorthand. There is an ambiguity of what types of text to use and how to chose. In case it is possible to come up with simple rules for the kinds of text to chose (always normal except for UL, OL, and TABLE elements), it is easy. Normal text may optionally be wrapped differently. Since I propose an open framework, the mapping is not necessarily always that well defined. Maybe each text module needs to define some rules for when it can be applied??? One strength of the approach seems that--in the case of document centric xml--it can precisely preserve line breaks and white space that, for example, make a difference in how web browsers render (x)html. Anyhow, I hope this is of interest and I would be happy to discuss more or participate in an implementation. 
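The point above, that a parser can detect the node type simply by looking at the beginning of each line, can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `classify_line` and the returned labels are hypothetical, the prefixes are the ones proposed in this message (`*`, `@`, `#`, `%`, and the quote characters), and lines inside a multi-line text block would need extra state that is not shown here.

```python
# Hypothetical sketch: classify one line of the proposed shorthand by its
# first non-blank character. Names and labels are illustrative only.
PREFIXES = {
    "*": "element",
    "@": "attribute",
    "#": "comment",
    "%": "typed-text",
    '"': "text",
    "'": "text",
}


def classify_line(line):
    """Return (indent, kind) for a single shorthand line."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    if not stripped.strip():
        # Nothing but whitespace on this line.
        return indent, "blank"
    return indent, PREFIXES.get(stripped[0], "unknown")


if __name__ == "__main__":
    sample = [
        "*address",
        '  @type "home"',
        '  *street "123 Sesame Street"',
        '  # "a comment"',
    ]
    for line in sample:
        print(classify_line(line))
```

The indent value is returned alongside the kind because the proposal uses indentation to express the node hierarchy; a real parser would compare it with the indent of the enclosing element.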
kind regards --bud ----------- example 1 ------------------- *root # "this is an example of my xml shorthand ideas" *address @type "home" *street "123 Sesame Street" *city "Wonderland" *state "CA" *zipCode "90012" *comment """ Please leave packages with Grouch in garbage can next door.""" ----------- example 2 ------------------- *root @attr1 "cool" @attr2 'moose' *budNS:someElement "Some words are " *strong "bold" " and some are not.\n" #""" note that this is a single line consisting of two text nodes that surround a _strong_ element""" *someOtherElement %structuredText''' It would be rather more cumbersome to write * nested unordered lists * such as this * and others * and odered lists or tables without the use of _structured text_.''' *someNestedElement ''' note that the flexible use of single or double quote characters makes quoting of " and ' easier even when they are trippled as in """ or \'''.''' *foo "this is equivalent to*" *foo "this format here" *note """ this is the same (format) equivalence example on multiple lines""" *note """ with two lines of text here""" /----------------------------------------------------------------- | Bud P. Bruegger, Ph.D. | Sistema (www.sistema.it) | Via U. Bassi, 54 | 58100 Grosseto, Italy | +39-0564-411682 (voice and fax) \----------------------------------------------------------------- From mike@skew.org Mon Aug 19 19:29:44 2002 From: mike@skew.org (Mike Brown) Date: Mon, 19 Aug 2002 12:29:44 -0600 (MDT) Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: "from Uche Ogbuji at Aug 19, 2002 01:30:38 am" Message-ID: <200208191829.g7JITi5c069807@chilled.skew.org> > > > > Why do you say that this document has no namespaces? > > > > > > Because I've said this is only an XML 1.0 document only; > > > > Umm. Not so fast. 
This document is not XML 1.0 well-formed > > > > Names starting with "xml" are reserved, yes, but this is a semantic > > reservation ("don't use it unless it's for the purpose we say it's for), > > not a well-formedness constraint. > > OK, you are right on the technicality (I would have jabbed you with the same > ice pick ;-) ). But I don't think it changes the practical discussion. The > semantic restriction is fully in place because minidom is namespace-aware, > which means that it is meaningless to say that that Fred's document is not an > XML namespace document. I think this is what Martin has been sayig in his > cryptic way :-). I would say that a more correct rephrasing of Martin's point is that the choice of whether to use namespace-aware processing helps determine the document's status of being "a namespace document"; this status is not intrinsic to the document just because the document meets the conformance requirements of the Namespaces in XML rec. The first quote above (sorry, I gave up on attributions) implies that such an assumption was being made (conforms to rec == intrinsically namespace-aware == requires namespace-aware processing). I personally feel that under no circumstances does an XML document have any status with respect to namespaces; being conformant to the namespaces rec just makes namespace-aware processing a possibility, not a requirement ... not only because of the way the specs are layered, but also because of the fact that the document has no way of unambiguously saying "I must be processed in a [non-]namespace-aware manner". Therefore, it is up to the architect of the processing system to ensure that documents are processed in the manner that is appropriate for their system's documents. This means that if one needs to be able to disable namespace-aware processing, then they'd better not be using minidom. Minidom is under no obligation to support non-namespace-aware processing. 
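To make the remark about minidom being namespace-aware concrete, here is a small sketch using the `xml.dom.minidom` module from the standard library of a current Python 3. That is an assumption: the thread is about the 2002 PyXML minidom, whose behaviour may differ in detail. The sample document and the `urn:example` namespace URI are made up for illustration.

```python
from xml.dom import minidom

# Made-up sample document; the namespace URI is purely illustrative.
SOURCE = '<x:root xmlns:x="urn:example"><x:child/></x:root>'


def describe(element):
    """Collect the name-related DOM attributes of an element."""
    return {
        "tagName": element.tagName,
        "prefix": element.prefix,
        "localName": element.localName,
        "namespaceURI": element.namespaceURI,
    }


doc = minidom.parseString(SOURCE)
info = describe(doc.documentElement)

if __name__ == "__main__":
    for key, value in info.items():
        print(key, "=", value)
```

An application that does not care about namespaces can read only `tagName`; the other three attributes are what namespace-aware code would look at.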
[so we are in agreement] :) - Mike ____________________________________________________________________________ mike j. brown | xml/xslt: http://skew.org/xml/ denver/boulder, colorado, usa | resume: http://skew.org/~mike/resume/ From fdrake@acm.org Mon Aug 19 19:40:37 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 19 Aug 2002 14:40:37 -0400 Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: <200208191829.g7JITi5c069807@chilled.skew.org> References: <200208191829.g7JITi5c069807@chilled.skew.org> Message-ID: <15713.15269.98947.380831@grendel.zope.com> Mike Brown writes: > I personally feel that under no circumstances does an XML document > have any status with respect to namespaces; being conformant to the > namespaces rec just makes namespace-aware processing a possibility, > not a requirement ... not only because of the way the specs are > layered, but also because of the fact that the document has no way > of unambiguously saying "I must be processed in a > [non-]namespace-aware manner". I keep meaning to get back to this conversation (since I started it), but hadn't come up with a good way to say what you just said. I agree with this. Thanks for describing this so clearly! > Therefore, it is up to the architect of the processing system to > ensure that documents are processed in the manner that is > appropriate for their system's documents. This means that if one > needs to be able to disable namespace-aware processing, then they'd > better not be using minidom. Minidom is under no obligation to > support non-namespace-aware processing. [so we are in agreement] :) Ah, but here I disagree. minidom should support namespace-unaware processing, primarily because it is *the* DOM that is shipped as part of the Python standard library, and most simple applications of XML are namespace unaware (which is more reasonable than expecting them to become namespace aware). I consider this a substantial requirement. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From mmoales@fluent.com Mon Aug 19 22:50:13 2002 From: mmoales@fluent.com (Mark Moales) Date: Mon, 19 Aug 2002 17:50:13 -0400 Subject: [XML-SIG] Memory leak in xmlrpclib.py on Windows? Message-ID: <3D616815.5ACBBA5@fluent.com> This is a multi-part message in MIME format. --------------C58C3B95A32E7CBF7C4E2DD6 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Hi, If this isn't the correct mailing list to send this type of question to, please forgive me... I'm running a very simple XML-RPC client/server on Windows 2000 using Python 2.2.1. Using perfmon, I see about a 4K increase in the size of my client and server processes after about a half dozen or so calls. It doesn't seem to matter which parser I use (expat or SlowParser). I get similar results. I also ran my test on a Linux box without expat and did NOT see the leak. Any ideas? I seem to remember seeing something about a leak in the expat parser, but, like I said, I see similar results using the SlowParser. I've attached my scripts if anyone would like to try it. Thanks! 
Mark Moales

--------------C58C3B95A32E7CBF7C4E2DD6
Content-Type: text/plain; charset=us-ascii; name="SimpleServer.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="SimpleServer.py"

from SimpleXMLRPCServer import *

class SimpleServer:
    def processData(self):
        print 'SimpleServer invoked'
        return 1

if __name__ == '__main__':
    server = SimpleXMLRPCServer(('localhost', 80))
    server.register_instance(SimpleServer())
    server.serve_forever()

--------------C58C3B95A32E7CBF7C4E2DD6
Content-Type: text/plain; charset=us-ascii; name="SimpleClient.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="SimpleClient.py"

from xmlrpclib import ServerProxy
import gc
import time

def runit():
    sp = ServerProxy('http://localhost')

    for i in range(600):
        result = sp.processData()
        time.sleep(0.5)

if __name__ == '__main__':
    runit()

--------------C58C3B95A32E7CBF7C4E2DD6
Content-Type: text/x-vcard; charset=us-ascii; name="mmoales.vcf"
Content-Transfer-Encoding: 7bit
Content-Description: Card for Mark Moales
Content-Disposition: attachment; filename="mmoales.vcf"

begin:vcard
n:Moales;Mark
tel;work:603-643-2600 x758
x-mozilla-html:FALSE
url:www.fluent.com
org:Fluent, Inc.;Software Development
version:2.1
email;internet:mmoales@fluent.com
adr;quoted-printable:;;10 Cavendish Ct.=0D=0A;Lebanon;NH;03766;USA
fn:Mark Moales
end:vcard

--------------C58C3B95A32E7CBF7C4E2DD6--

From noreply@sourceforge.net Tue Aug 20 18:50:17 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 20 Aug 2002 10:50:17 -0700
Subject: [XML-SIG] [ pyxml-Bugs-597861 ] FromXmlFile needs try: finally:
Message-ID:

Bugs item #597861, was opened at 2002-08-21 03:50
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=597861&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alastair Rankine (alastair)
Assigned to: Nobody/Anonymous (nobody)
Summary: FromXmlFile needs try: finally:

Initial Comment:
So that the file is closed in case an exception is thrown from within the parser.

Inside xml.dom.ext.reader.Sax2.py:

change:

    fp = open(fileName, 'r')
    rv = FromXmlStream(fp, ownerDocument, validate, keepAllWs, catName,
                       saxHandlerClass, parser)
    fp.close()
    return rv

to:

    fp = open(fileName, 'r')
    try:
        rv = FromXmlStream(fp, ownerDocument, validate, keepAllWs, catName,
                           saxHandlerClass, parser)
    finally:
        fp.close()
    return rv

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=597861&group_id=6473

From noreply@sourceforge.net Tue Aug 20 21:04:50 2002
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 20 Aug 2002 13:04:50 -0700
Subject: [XML-SIG] [ pyxml-Bugs-597923 ] xml.dom.minidom clone() throws NameError
Message-ID:

Bugs item #597923, was opened at 2002-08-20 20:04
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=597923&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: James Kew (jkew)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.dom.minidom clone() throws NameError

Initial Comment:
clone() on xml.dom.minidom.Attr fails badly in _cloneNode:

PythonWin 2.2.1 (#34, Apr 15 2002, 09:51:39) [MSC 32 bit (Intel)] on win32.
Portions Copyright 1994-2001 Mark Hammond (mhammond@skippinet.com.au) - see 'Help/About PythonWin' for further copyright information.
>>> import xml.dom.minidom
>>> doc = xml.dom.minidom.Document()
>>> att = doc.createAttribute("attr")
>>> att2 = att.cloneNode(deep=0)
Traceback (most recent call last):
  File "", line 1, in ?
  File "C:\Python22\Lib\site-packages\_xmlplus\dom\minidom.py", line 186, in cloneNode
    clone = _clone_node(self, deep, self.ownerDocument or self)
  File "C:\Python22\Lib\site-packages\_xmlplus\dom\minidom.py", line 1248, in _clone_node
    elif node.nodeType == PROCESSING_INSTRUCTION_NODE:
NameError: global name 'PROCESSING_INSTRUCTION_NODE' is not defined

I was originally clone()ing Attr nodes in an attempt to copy attributes from one element to another by iterating the attributes() NamedNodeList.

I'm using the PyXML-0.8.win32-py2.2.exe release.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=597923&group_id=6473

From BudP.Bruegger Wed Aug 21 09:04:18 2002
From: BudP.Bruegger (BudP.Bruegger)
Date: Wed, 21 Aug 2002 10:04:18 +0200
Subject: [XML-SIG] SAX: distinguising empty and non-empty elements?
Message-ID: <20020821100418.5433eddb.bud@sistema.it>

[Hope this isn't an FAQ, searching on the web I came up w/o a conclusive answer]

Is there a possibility to distinguish between non-empty elements w/o content and empty elements in (Python) Sax?

I found the following text in Sun's sax tutorial (http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/sax/2a_echo.html):

The single-tag empty element you defined () is treated exactly the same as a two-tag empty element (). It is, for all intents and purposes, identical. (It's just easier to type and consumes less space.)

Does this apply only to the example app that echoes an xml file using a sax parser or is it general? Any differences between Java and Python?

Many thanks for your help
--bud

/-----------------------------------------------------------------
| Bud P. Bruegger, Ph.D.
| Sistema (www.sistema.it)
| Via U.
Bassi, 54 | 58100 Grosseto, Italy | +39-0564-411682 (voice and fax) \----------------------------------------------------------------- From Alexandre.Fayolle@logilab.fr Wed Aug 21 10:56:46 2002 From: Alexandre.Fayolle@logilab.fr (Alexandre) Date: Wed, 21 Aug 2002 11:56:46 +0200 Subject: [XML-SIG] Error 404 and xml.dom.ext.reader In-Reply-To: References: <20020813092524.GG18060@orion.logilab.fr> Message-ID: <20020821095646.GE17830@orion.logilab.fr> On Wed, Aug 14, 2002 at 04:42:04PM -0600, Uche Ogbuji wrote: > There are subtle but important differences between the behavior of > urllib.basejoin and urlparse.urljoin, especially with file URLs in Windows. > I don't remember the full details off-head but I think Tom Passim and Mike > Olson had detailed messages on this in the past. > > I would suggest keeping urllib around just for basejoin, since it seems we've > reached, after a lot of tinkering, a balance where the fewest number of > platform users scream :-) OK, I've just commited the changes. I haven't changed existing calls to urlparse.urljoin to urllib.basejoin, only reverted my fixes. Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). From larsga@garshol.priv.no Wed Aug 21 13:49:19 2002 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 21 Aug 2002 14:49:19 +0200 Subject: [XML-SIG] SAX: distinguising empty and non-empty elements? In-Reply-To: <20020821100418.5433eddb.bud@sistema.it> References: <20020821100418.5433eddb.bud@sistema.it> Message-ID: * Bud P. Bruegger | | Is there a possibility to distinguish between non-empty elements w/o | content and empty elements in (Python) Sax? No. The distinction is considered to be a purely lexical distinction with no more importance than the difference between E; and e;. | The single-tag empty element you defined () is | treated exactly the same as a two-tag empty element | (). 
It is, for all intents and purposes, | identical. (It's just easier to type and consumes less space.) | | | Does this apply only to the example app that echoes an xml file | using a sax parser or is it general? It is general. Notice that neither the infoset nor the DOM maintains this distinction. | Any differences between Java and Python? No. -- Lars Marius Garshol, Ontopian ISO SC34/WG3, OASIS GeoLang TC From mmoales@fluent.com Thu Aug 22 15:45:32 2002 From: mmoales@fluent.com (Mark Moales) Date: Thu, 22 Aug 2002 10:45:32 -0400 Subject: [XML-SIG] Re: Memory leak in xmlrpclib.py on Windows? References: <3D616815.5ACBBA5@fluent.com> Message-ID: <3D64F90C.2CB57C84@fluent.com> This is a multi-part message in MIME format. --------------397CDC7289865CA945976578 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit The problem appears to be in httplib.HTTPConnection. I replaced xmlrpclib.Transport with my own socket-based Transport and the memory leak goes away. So, I took the HTTPConnection sample out of the Python doc and stuck a loop around it. Sure enough, I see a 4K increase every 10 seconds or so over the life of the process. Thanks, Mark Mark Moales wrote: > > Hi, > > If this isn't the correct mailing list to send this type of question to, > please forgive me... > > I'm running a very simple XML-RPC client/server on Windows 2000 using > Python 2.2.1. Using perfmon, I see about a 4K increase in the size of > my client and server processes after about a half dozen or so calls. It > doesn't seem to matter which parser I use (expat or SlowParser). I get > similar results. I also ran my test on a Linux box without expat and > did NOT see the leak. Any ideas? I seem to remember seeing something > about a leak in the expat parser, but, like I said, I see similar > results using the SlowParser. I've attached my scripts if anyone would > like to try it. > > Thanks! 
> > Mark Moales > > ------------------------------------------------------------------------ > from SimpleXMLRPCServer import * > > class SimpleServer: > def processData(self): > print 'SimpleServer invoked' > return 1 > > if __name__ == '__main__': > server = SimpleXMLRPCServer(('localhost', 80)) > server.register_instance(SimpleServer()) > server.serve_forever() > > ------------------------------------------------------------------------ > from xmlrpclib import ServerProxy > import gc > import time > > def runit(): > sp = ServerProxy('http://localhost') > > for i in range(600): > result = sp.processData() > time.sleep(0.5) > > if __name__ == '__main__': > runit() --------------397CDC7289865CA945976578 Content-Type: text/x-vcard; charset=us-ascii; name="mmoales.vcf" Content-Transfer-Encoding: 7bit Content-Description: Card for Mark Moales Content-Disposition: attachment; filename="mmoales.vcf" begin:vcard n:Moales;Mark tel;work:603-643-2600 x758 x-mozilla-html:FALSE url:www.fluent.com org:Fluent, Inc.;Software Development version:2.1 email;internet:mmoales@fluent.com adr;quoted-printable:;;10 Cavendish Ct.=0D=0A;Lebanon;NH;03766;USA fn:Mark Moales end:vcard --------------397CDC7289865CA945976578-- From uche.ogbuji@fourthought.com Thu Aug 22 19:28:49 2002 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Thu, 22 Aug 2002 12:28:49 -0600 Subject: [XML-SIG] Element.localName, Attr.localName In-Reply-To: Message from "Fred L. Drake, Jr." of "Mon, 19 Aug 2002 14:40:37 EDT." <15713.15269.98947.380831@grendel.zope.com> Message-ID: > > Therefore, it is up to the architect of the processing system to > > ensure that documents are processed in the manner that is > > appropriate for their system's documents. This means that if one > > needs to be able to disable namespace-aware processing, then they'd > > better not be using minidom. Minidom is under no obligation to > > support non-namespace-aware processing. [so we are in agreement] :) > > Ah, but here I disagree. 
minidom should support namespace-unaware
> processing, primarily because it is *the* DOM that is shipped as part
> of the Python standard library, and most simple applications of XML
> are namespace unaware (which is more reasonable than expecting them to
> become namespace aware). I consider this a substantial requirement.

I think this is specious. Under what circumstances is it useful to have relevant prefix and namespaceURI attributes to be None when the user has used namespace declarations? Do you have an actual use case or scenario where this is useful? I certainly can't think of any.

It would be one thing if DOM Level 2 "destructively" supported namespaces, but it doesn't. All namespace attributes and operations are mere additions, and the user can ignore them if he chooses to.

In (Bud P.Bruegger's message of "Mon, 19 Aug 2002 19:28:19 +0200")
References: <20020819192819.630dce5b.bud@sistema.it>
Message-ID:

Bud P.Bruegger writes:

[Your message was formatted very strangely...]

> Anyhow, I hope this is of interest and I would be happy to discuss
> more or participate in an implementation.

I've the feeling you are starting to invent some SGML features like shortrefs, usemaps, tag omission, nettag, etc. ;)

--
ke@suse.de (work) / keichwa@gmx.net (home): |
http://www.suse.de/~ke/ | ,__o
Free Translation Project: | _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/ | (*)/'(*)

From dieter@handshake.de Sun Aug 25 20:28:52 2002
From: dieter@handshake.de (Dieter Maurer)
Date: Sun, 25 Aug 2002 21:28:52 +0200
Subject: [XML-SIG] CGI Problem
In-Reply-To:
References:
Message-ID: <15721.12276.537858.176079@gargle.gargle.HOWL>

Michael Hall writes:
> ...
> I have a Python CGI script that converts XML to HTML using XSLT (PyXML).
> The script produces beautiful output to standard out if called from the
> command line, but will not work if called via http. Other simpler
> Python CGI scripts in the same directory that simply print "Hello World"
> work fine.
> ...
> "/usr/lib/python2.2/site-packages/_xmlplus/xslt/StylesheetReader.py", line 355, in initParser > self.parser.ExternalEntityRefHandler = self.handler.entityRef AttributeError: StylesheetReader instance has no attribute > 'entityRef' Looks like a bug in "xml2html.py" script. Apparently the registered handler (maybe the "Document" handler, but I am not sure) seems not to provide for external entitity resolution. > Where should I be looking for the problem? .xml file? .xsl file? .py CGI > script? server environment? It's not the XML nor the XSL file nor the CGI script. Some registered handler lacks support for External Entity resolution. Dieter From olc@ninti.com Mon Aug 26 13:25:40 2002 From: olc@ninti.com (Michael Hall) Date: Mon, 26 Aug 2002 21:55:40 +0930 (CST) Subject: [XML-SIG] CGI Problem solved (sort of) Message-ID: I have solved the problem described below simply by changing this line in my CGI script: #!/usr/bin/python2 to this: #!/usr/bin/python I don't know why, by python2 won't do the transformation whereas good old python 1.5.2 will. Maybe it is an issue with where or how I installed PyXML? Anyway, forward ever backward never :-) Mick ORIGINAL MESSAGE BELOW: I have a Python CGI script that converts XML to HTML using XSLT (PyXML). The script produces beautiful output to standard out if called from the command line, but will not work if called via http. Other simpler Python CGI scripts in the same directory that simply print "Hello World" work fine. This is the output I get from the xml2html.py script: Traceback (most recent call last): File "/home/olc/cgi-bin/xml2html.py", line 21, in ? 
xsltproc.appendStylesheetUri(stylesheet) File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/Processor.py", line 95, in appendStylesheetUri sty = self._styReader.fromUri(styleSheetUri, baseUri) File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/StylesheetReader.py", line 298, in fromUri ownerDoc, stripElements) File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/minisupport.py", line 58, in fromUri return self.fromStream(stream, baseUri, ownerDoc, stripElements) File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/StylesheetReader.py", line 305, in fromStream self.initParser() File "/usr/lib/python2.2/site-packages/_xmlplus/xslt/StylesheetReader.py", line 355, in initParser self.parser.ExternalEntityRefHandler = self.handler.entityRef AttributeError: StylesheetReader instance has no attribute 'entityRef' Where should I be looking for the problem? .xml file? .xsl file? .py CGI script? server environment? Is there something obvious I don't know about going on here? I've been following Jones & Drake's 'Python & XML' to the letter. TIA Mick _______________________________________________ XML-SIG maillist - XML-SIG@python.org http://mail.python.org/mailman/listinfo/xml-sig From Craig.Dillabaugh@CCRS.NRCan.gc.ca Mon Aug 26 19:22:57 2002 From: Craig.Dillabaugh@CCRS.NRCan.gc.ca (Dillabaugh, Craig) Date: Mon, 26 Aug 2002 14:22:57 -0400 Subject: [XML-SIG] Problem Installing PyXML Message-ID: <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> Hello, I am writing this message because I've run into some problems installing PyXML, and the README indicated I should report my problems here. My installation is on SUN Ultra running SUN OS 5.7. 
I am not root on this machine so I've installed Python in a non-standard directory: /genesis1/cdillaba/Python2.2.1 I've tried to put PyXML in /genesis1/cdillaba/Python2.2.1/PyXML-0.8 I am installing from the tar file, my build command is: /genesis1/cdillaba/Python2.2.1/python setup.py build The script runs successfully for some time but chokes at the following point: ... running build_ext Traceback (most recent call last) File "setup.py", line 222, in ? scripts = ['scripts/xmlproc_parse', 'scripts/xmlproc_val'] File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/core.py", line 138, in setup dist.run_commands() File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/dist.py", line 893, in run_commands self.run_command(cmd) File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/dist.py", line 913, in run_command cmd_obj.run() File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/command/build.py", line 107, in run self.run_command(cmd_name) File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/cmd.py", line 330, in run_command self.distribution.run_command(command) File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/dist.py", line 913, in run_command cmd_obj.run() File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/command/build_ext.py", line 231, in run customize_compiler(self.compiler) File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/sysconfig.py", line 126, in customize_compiler (cc, opt, ccshared, ldshared, so_ext) = \ File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/sysconfig.py", line 408, in get_config_vars func() File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/sysconfig.py", line 313, in _init_posix raise DistutilsPlatformError(my_msg) distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /genesis1/cdillaba/Python-2.2.1/lib/python2.2/config/Makefile (No such file or directory) Thanks for any help, Craig Dillabaugh From noreply@sourceforge.net Tue Aug 27 12:52:29 2002 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: 
Tue, 27 Aug 2002 04:52:29 -0700 Subject: [XML-SIG] [ pyxml-Bugs-600745 ] cant import xml.parsers.expat Message-ID: Bugs item #600745, was opened at 2002-08-27 04:52 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=600745&group_id=6473 Category: pyexpat Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: cant import xml.parsers.expat Initial Comment: Hi, when I have PyXML installed then I cant import xml.parsers.expat and thus I cant compile libglade-2.0.0 with libglade-convert and then gdm-2.4.0.9 wont install ;) I removed pyxml to install these packages then installed pyxml again, but I guess thats not a good solution. Hope you know what the bug is and how to fix it. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=600745&group_id=6473 From Alexandre.Fayolle@logilab.fr Tue Aug 27 17:04:58 2002 From: Alexandre.Fayolle@logilab.fr (Alexandre) Date: Tue, 27 Aug 2002 18:04:58 +0200 Subject: [XML-SIG] Re: [4suite] (no subject) In-Reply-To: <5.1.1.6.0.20020827170657.02dba600@mail.comedia.se> References: <5.1.1.6.0.20020827170657.02dba600@mail.comedia.se> Message-ID: <20020827160458.GA27058@orion.logilab.fr> On Tue, Aug 27, 2002 at 05:30:15PM +0200, Tommy Sundström wrote: > Newbie question. Hi, this is not the right mailing list. You should ask questions about pyxml on the xml-sig mailing list (cc'ed to this answer) > Running this code: > > --- > import xml.dom.ext.reader.Sax2 > from xml.dom.ext import PrettyPrint > > str = ''' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> > >

Text

''' > > doc = xml.dom.ext.reader.Sax2.FromXml(str) > PrettyPrint(doc.documentElement) > --- > > gives this result: > --- >

Text

> --- This is perfectly normal > My question: where does the "shape='rect'" comes from (It's not added > unless the DOCTYPE element is there.) It comes from the DTD. Download it from the url in the doctype and see for yourself that the element has a shape attribute with a default value of 'rect'. > Can it do any harm? Is there any way of surpressing it? Not that I know of, but the gurus on xml-sig may. Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). From Sylvain =?iso-8859-1?Q?Th=E9nault?= Tue Aug 27 17:15:01 2002 From: Sylvain =?iso-8859-1?Q?Th=E9nault?= (Sylvain =?iso-8859-1?Q?Th=E9nault?=) Date: Tue, 27 Aug 2002 18:15:01 +0200 Subject: [XML-SIG] Re: [4suite] (no subject) In-Reply-To: <20020827160458.GA27058@orion.logilab.fr> References: <5.1.1.6.0.20020827170657.02dba600@mail.comedia.se> <20020827160458.GA27058@orion.logilab.fr> Message-ID: <20020827161501.GA27178@orion.logilab.fr> On Tuesday 27 August à 18:04, Alexandre wrote: > On Tue, Aug 27, 2002 at 05:30:15PM +0200, Tommy Sundström wrote: > > Newbie question. > > Hi, this is not the right mailing list. You should ask questions about > pyxml on the xml-sig mailing list (cc'ed to this answer) > > > Running this code: > > > > --- > > import xml.dom.ext.reader.Sax2 > > from xml.dom.ext import PrettyPrint > > > > str = ''' > "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> > > > >

Text

''' > > > > doc = xml.dom.ext.reader.Sax2.FromXml(str) > > PrettyPrint(doc.documentElement) > > --- > > > > gives this result: > > --- > >

Text

> > --- > > This is perfectly normal > > > My question: where does the "shape='rect'" come from (It's not added > > unless the DOCTYPE element is there.) > > It comes from the DTD. Download it from the url in the doctype and see > for yourself that the element has a shape attribute with a default > value of 'rect'. > > > Can it do any harm? Is there any way of suppressing it? The parser is responsible for entity substitution, so you can suppress it, but you have to pass a suitably configured parser instance to FromXml:

from xml.sax import make_parser
from xml.sax.handler import feature_external_ges, feature_external_pes

parser = make_parser()
# do not load external general or parameter entities (including the DTD)
parser.setFeature(feature_external_ges, 0)
parser.setFeature(feature_external_pes, 0)
doc = xml.dom.ext.reader.Sax2.FromXml(str, parser=parser)

See http://www.python.org/doc/current/lib/module-xml.sax.handler.html for a description of the different features. Note that those features aren't recognized by all parsers (pyexpat recognizes them, but xmlproc does not) -- Sylvain Thénault LOGILAB http://www.logilab.org From dieter@handshake.de Tue Aug 27 21:56:02 2002 From: dieter@handshake.de (Dieter Maurer) Date: Tue, 27 Aug 2002 22:56:02 +0200 Subject: [XML-SIG] Problem Installing PyXML In-Reply-To: <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> References: <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> Message-ID: <15723.59234.912291.314977@gargle.gargle.HOWL> Dillabaugh, Craig writes: > ... > 313, in _init_posix > raise DistutilsPlatformError(my_msg) > distutils.errors.DistutilsPlatformError: invalid Python installation: unable > to open /genesis1/cdillaba/Python-2.2.1/lib/python2.2/config/Makefile (No > such file or directory) Looks like you did not install a development version of Python 2.2.1.
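(A quick way to check this diagnosis is to ask the interpreter where it expects its Makefile and whether that file is there. The sketch below is only an illustration and not part of the original exchange: it assumes a current Python with the standard-library sysconfig module, not the distutils shipped with Python 2.2.1, and the helper name makefile_status is made up for the example.)

```python
import os
import sysconfig


def makefile_status():
    """Return (path, exists) for the Makefile this interpreter expects.

    makefile_status is a hypothetical helper name used for illustration.
    """
    path = sysconfig.get_makefile_filename()
    return path, os.path.exists(path)


if __name__ == "__main__":
    path, exists = makefile_status()
    # A missing file usually means the development files are not installed,
    # or the interpreter is being run from an uninstalled build area.
    print(path, "found" if exists else "MISSING")
```

(If the file is reported missing, building C extensions with this interpreter is likely to fail in much the same way as in the traceback quoted above.)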
Dieter From landauer@got.net Tue Aug 27 23:10:10 2002 From: landauer@got.net (landauer@got.net) Date: Tue, 27 Aug 2002 15:10:10 -0700 Subject: [XML-SIG] Re: Problem Installing PyXML Message-ID: <1030486210.3d6bf8c29261f@webmail.got.net> In this message: http://mail.python.org/pipermail/xml-sig/2002-August/008272.html Craig Dillabaugh reported a problem installing PyXML. I have run into the exact same problem (on Solaris "SunOS 5.8", for what it's worth). The only response to Craig's message so far, http://mail.python.org/pipermail/xml-sig/2002-August/008276.html was not very helpful. In contrast to what Dieter suggests, the README file for PyXML says this: The only requirements for installing the package are Python 2.0 or later, and a C compiler. This release has been tested with Python 2.x. To compile everything, simply perform the following steps. 1) Run "python setup.py build" to copy *.py files and compile the C extensions. 2) To install everything in the site-packages directory as an xml/ package, run "python setup.py install". Does anyone know what is broken here? Thanks, -- Doug L. -=-=-=- food. shelter. clothing. net. Got.net - The Internet Connection, Inc From Matt Gushee Tue Aug 27 23:30:41 2002 From: Matt Gushee (Matt Gushee) Date: Tue, 27 Aug 2002 16:30:41 -0600 Subject: [XML-SIG] Problem Installing PyXML In-Reply-To: <1030486210.3d6bf8c29261f@webmail.got.net> <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> References: <1030486210.3d6bf8c29261f@webmail.got.net> <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> Message-ID: <20020827223040.GB1809@swordfish.havenrock.com> On Tue, Aug 27, 2002 at 03:10:10PM -0700, landauer@got.net wrote: > Craig Dillabaugh reported a problem installing PyXML. > > I have run into the exact same problem (on Solaris "SunOS 5.8", > for what it's worth). > > The only response to Craig's message so far, > http://mail.python.org/pipermail/xml-sig/2002-August/008276.html > was not very helpful. 
In contrast to what Dieter suggests, the > README file for PyXML says this: > > The only requirements for installing the package are Python 2.0 or > later, and a C compiler. This release has been tested with Python 2.x. This may be incorrect. You do need to have the Python Makefile installed to compile C extensions with DistUtils; I don't know about Solaris, but on Linux at least you have to build Python from source or install a development package to have the Makefile. I think most PyXML users up to now have been pretty serious Python developers, who have all that stuff anyway; that would explain its being overlooked in the documentation. > Does anyone know what is broken here? Actually, I responded to Craig yesterday, but accidentally sent my message to him rather than the list. Perhaps he'd like to forward it to the list, since I think it might be helpful for people new to DistUtils. It could be that he didn't have the Makefile installed, but one thing that caught my eye was a number of lines in the stack trace like this: File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/sysconfig.py", line That refers to an existing source file on his system. Note that 'Lib' is capitalized, which of course is unusual for a UNIX system. Contrast that with: > distutils.errors.DistutilsPlatformError: invalid Python installation: unable > to open /genesis1/cdillaba/Python-2.2.1/lib/python2.2/config/Makefile (No > such file or directory) DistUtils expects to find the Makefile under .../lib/... . It appears that Craig has a directory called 'Lib' instead of 'lib' in his Python installation. Don't know if that's a bug in the Python package he installed, or if he manually created that directory, but either way the case mismatch would probably account for the error. -- Matt Gushee Englewood, Colorado, USA mgushee@havenrock.com http://www.havenrock.com/ From fdrake@acm.org Wed Aug 28 00:23:34 2002 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Tue, 27 Aug 2002 19:23:34 -0400 Subject: [XML-SIG] Problem Installing PyXML In-Reply-To: <20020827223040.GB1809@swordfish.havenrock.com> References: <1030486210.3d6bf8c29261f@webmail.got.net> <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> <20020827223040.GB1809@swordfish.havenrock.com> Message-ID: <15724.2550.803900.706414@grendel.zope.com> Matt Gushee writes: > This may be incorrect. You do need to have the Python Makefile installed > to compile C extensions with DistUtils; I don't know about Solaris, but > on Linux at least you have to build Python from source or install a > development package to have the Makefile. I think most PyXML users up to The fact that some distributers package the development support separately from the main Python interpreter is quite disconcerting. I rarely use the default Python installed from a vendor's packages. > It could be that he didn't have the Makefile installed, but one thing > that caught my eye was a number of lines in the stack trace like this: > > File "/genesis1/cdillaba/Python-2.2.1/Lib/distutils/sysconfig.py", line > > That refers to an existing source file on his system. Note that 'Lib' is > capitalized, which of course is unusual for a UNIX system. Contrast that It is normal for a Python being run from a build area; this looks like a Python 2.2.1 unpacked from the .tgz source archive. The distutils included in Python 2.2.1 did not support building C extensions using a Python which had not been installed. This has been fixed on the Python trunk (2.3a0); I'm not sure if this is working for the 2.2.x maintenance branch. Sorry I've not had time to pay attention to this earlier. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From Matt Gushee Wed Aug 28 00:24:19 2002 From: Matt Gushee (Matt Gushee) Date: Tue, 27 Aug 2002 17:24:19 -0600 Subject: [XML-SIG] Problem Installing PyXML In-Reply-To: <15724.2550.803900.706414@grendel.zope.com> References: <1030486210.3d6bf8c29261f@webmail.got.net> <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> <20020827223040.GB1809@swordfish.havenrock.com> <15724.2550.803900.706414@grendel.zope.com> Message-ID: <20020827232419.GC1809@swordfish.havenrock.com> On Tue, Aug 27, 2002 at 07:23:34PM -0400, Fred L. Drake, Jr. wrote: > > Matt Gushee writes: > > This may be incorrect. You do need to have the Python Makefile installed > > to compile C extensions with DistUtils; I don't know about Solaris, but > > on Linux at least you have to build Python from source or install a > > development package to have the Makefile. I think most PyXML users up to > > The fact that some distributers package the development support > separately from the main Python interpreter is quite disconcerting. I "Some distributors" means most Linux distributions, I believe--certainly RedHat and Debian, and probably all other RPM-based distros as well. You might have to get used to it ;-) -- Matt Gushee Englewood, Colorado, USA mgushee@havenrock.com http://www.havenrock.com/ From landauer@got.net Wed Aug 28 00:35:45 2002 From: landauer@got.net (landauer@got.net) Date: Tue, 27 Aug 2002 16:35:45 -0700 Subject: [XML-SIG] Problem Installing PyXML In-Reply-To: <15724.2550.803900.706414@grendel.zope.com> References: <1030486210.3d6bf8c29261f@webmail.got.net> <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> <20020827223040.GB1809@swordfish.havenrock.com> <15724.2550.803900.706414@grendel.zope.com> Message-ID: <1030491345.3d6c0cd15559f@webmail.got.net> Quoting "Fred L. Drake, Jr." : > > That refers to an existing source file on his system. Note that 'Lib' is > > capitalized, which of course is unusual for a UNIX system. 
> > It is normal for a Python being run from a build area; this looks like > a Python 2.2.1 unpacked from the .tgz source archive. The distutils > included in Python 2.2.1 did not support building C extensions using a > Python which had not been installed. Indeed, that appears to be what my problem is. I built Python months ago, made a link in my bin directory, and forgot that I had never actually installed it. (I'm on a big, old-fashioned time-sharing Solaris system here, where I can't write into /usr/local without filling out some online forms.) Thanks for clearing it up. Maybe if the timing is such that it will be a while before 2.3 comes out, the PyXML README could be amended to say "The only requirements for installing the package are Python 2.0 or later, and a C compiler. Note that the Python must actually be an INSTALLed python, rather than one that is being used directly from Python's build area. This release has been tested with Python 2.x" perhaps with a more precise rendering of "2.x" at the end there... Thanks, -- Doug -=-=-=- food. shelter. clothing. net. Got.net - The Internet Connection, Inc From rsalz@datapower.com Wed Aug 28 03:13:58 2002 From: rsalz@datapower.com (Rich Salz) Date: Tue, 27 Aug 2002 22:13:58 -0400 (EDT) Subject: [XML-SIG] xml-signature/encryption? In-Reply-To: <5.1.1.2.2.20020817123542.029c3620@127.0.0.1> Message-ID: > I haven't been able to find a python implementation of > xml-signature/encryption. Zolera had one (for sale) but went out of business. :( Your best bet is to roll your own. PyXML has XML canonicalization, and I'd use m2crypto to get a python API for openssl. Once you have those pieces, it's not much code to do basic signature generation. Another option is to use SWIG yourself and look at xmlsec which uses the Gnome XML libraries. See http://www.aleksey.com/xmlsec/ /r$ From fdrake@acm.org Wed Aug 28 05:29:05 2002 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 28 Aug 2002 00:29:05 -0400 Subject: [XML-SIG] Problem Installing PyXML In-Reply-To: <20020827232419.GC1809@swordfish.havenrock.com> References: <1030486210.3d6bf8c29261f@webmail.got.net> <7CDD7B94357FD5119E800002A537C46E1C1AB6@s5-ccr-r1.ccrs.nrcan.gc.ca> <20020827223040.GB1809@swordfish.havenrock.com> <15724.2550.803900.706414@grendel.zope.com> <1030491345.3d6c0cd15559f@webmail.got.net> <20020827232419.GC1809@swordfish.havenrock.com> Message-ID: <15724.20881.354610.226790@grendel.zope.com> Matt Gushee writes: > "Some distributors" means most Linux distributions, I believe--certainly > RedHat and Debian, and probably all other RPM-based distros as well. You > might have to get used to it ;-) There lies the beauty of open source: I can expect someone who knows about the installation on those platforms to contribute the required information. When it comes to the python/python2 distinction, and the separation of the development support from the main package, I find it very difficult to understand how and why the lines get drawn where they do, and really can't imagine how it's served anyone well. Do I have to "get used to it"? Well, that's what's installed on my Linux machines, but I don't use that installation, because it just doesn't suit my purposes. Because I don't use it, I've little idea what needs to be done to support it. Doug said: > Thanks for clearing it up. Maybe if the timing is such that it will > be a while before 2.3 comes out, the PyXML README could be amended to > say > > "The only requirements for installing the package are Python > 2.0 or later, and a C compiler. Note that the Python must > actually be an INSTALLed python, rather than one that is being > used directly from Python's build area. This release has been > tested with Python 2.x" That sounds very reasonable, until there's a better way. > perhaps with a more precise rendering of "2.x" at the end there... Perhaps... but I *do* test changes with 2.0.1, 2.1.3, 2.2.1, and 2.3a0. 
So I'm not entirely sure what I should put there unless I also test with 2.0, 2.1, 2.1.1, 2.1.2, and 2.2. (Alternately, convince Guido to make it easier to install multiple Python patch releases side-by-side, and I'll be glad to test with *all* of them.) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From t.c.jones@att.net Wed Aug 28 05:30:27 2002 From: t.c.jones@att.net (t.c.jones@att.net) Date: Wed, 28 Aug 2002 04:30:27 +0000 Subject: [XML-SIG] xml-signature/encryption? Message-ID: <20020828043030.BIBF1817.mtiwmhc22.worldnet.att.net@webmail.worldnet.att.net> I wouldn't mind helping to put together xml-sig. ..tom > > I haven't been able to find a python implementation of > > xml-signature/encryption. > > Zolera had one (for sale) but went out of business. :( > > Your best bet is to roll your own. PyXML has XML canonicalization, and > I'd use m2crypto to get a python API for openssl. Once you have those > pieces, it's not much code to do basic signature generation. > > Another option is to use SWIG yourself and look at xmlsec which uses the > Gnome XML libraries. See http://www.aleksey.com/xmlsec/ > /r$ From patnotz@yahoo.com Fri Aug 30 13:42:34 2002 From: patnotz@yahoo.com (Pat Notz) Date: Fri, 30 Aug 2002 05:42:34 -0700 (PDT) Subject: [XML-SIG] minidom with xmlproc (pure-Python DOM) Message-ID: <20020830124234.39248.qmail@web13307.mail.yahoo.com> Hi, In the project I'm working on, we're trying to make the product work with a "standard" Python 2.2.1 installation. So, we're bundling any external Python modules with the product. To make installation easy, we'd like all external Python modules to be pure-Python. I'm currently using minidom for XML parsing and would like to include the xmlproc parser to be certain that a parser is always available (not always so with the standard Python, we've learned the hard way).
Does anyone have an example of using the xmlproc parser with minidom? Thanks, pat
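(For readers of the archive: minidom accepts an explicit SAX reader through its parser argument, which is the hook a question like this one depends on. The sketch below is only an illustration under stated assumptions: it runs on a current Python and uses whatever reader xml.sax.make_parser() returns by default, not xmlproc. With PyXML installed, one would presumably obtain the xmlproc reader from make_parser() instead; the driver name is not given in this thread, so it is not spelled out here. The helper name parse_with_reader is made up for the example.)

```python
import xml.sax
from xml.dom import minidom


def parse_with_reader(text, reader=None):
    """Parse text into a minidom Document using an explicit SAX reader.

    parse_with_reader is a hypothetical helper; reader may be any SAX2
    XMLReader. When none is given, the default reader is used.
    """
    if reader is None:
        reader = xml.sax.make_parser()
    # Passing parser= makes minidom build the tree from SAX events
    # delivered by the given reader instead of its built-in builder.
    return minidom.parseString(text, parser=reader)


doc = parse_with_reader("<root><item>Text</item></root>")
print(doc.documentElement.tagName)
```

(Swapping in a different reader should only require changing the object passed as reader, provided that reader implements the SAX2 XMLReader interface.)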