From clemmerl at gmail.com Sat May 2 18:14:35 2009
From: clemmerl at gmail.com (Lee Clemmer)
Date: Sat, 2 May 2009 12:14:35 -0400
Subject: [XML-SIG] Python and XSLT 2.0
Message-ID: <88fb06540905020914k2b488810r292c393194a4799@mail.gmail.com>

Hi all,

I'm currently in the process of teaching myself Python for personal
projects and at the same time XSLT for work. At work we run a Saxon
XSLT 2.0 compliant transformer on a J2EE server that does the
transformations for us. I'm looking to do transformations outside of
work on a non-Java platform using Python. Unfortunately, after a couple
of hours of Googling there doesn't seem to be an XSLT 2.0 transformer
available for Python, only for Java and .NET environments... is this
right? Is anything in the works?

Thanks in advance,
- Lee

From stefan_ml at behnel.de Sat May 2 20:05:13 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 02 May 2009 20:05:13 +0200
Subject: [XML-SIG] Python and XSLT 2.0
In-Reply-To: <88fb06540905020914k2b488810r292c393194a4799@mail.gmail.com>
References: <88fb06540905020914k2b488810r292c393194a4799@mail.gmail.com>
Message-ID: <49FC8B59.9090703@behnel.de>

Lee Clemmer wrote:
> I'm currently in the process of teaching myself Python for personal projects
> and at the same time XSLT for work. At work we run a Saxon XSLT 2.0
> compliant transformer on a J2EE server that does the transformations for us.
> I'm looking to do transformations outside of work on a non-Java platform
> using Python. Unfortunately, after a couple of hours of Googling there
> doesn't seem to be an XSLT 2.0 transformer available for Python, only for
> Java and .NET environments... is this right? Is anything in the works?

My guess is that the advantages of XSLT 2.0 over
XSLT1+EXSLT+Schema+Python simply aren't big enough to spend money on
it. You will notice that XSLT 2.0 implementations are still rare in
general, not only for Python.
But did you check the XQuery homepage? Some of the implementations seem
to have at least some kind of Python support.

http://www.w3.org/XML/Query/#implementations

Not sure if any of them implement XSLT, though.

Note that there are also Python implementations for .NET (IronPython)
and Java (Jython). You might also get lucky with running Saxon through
GCJ and wrapping it with Cython (something I always wanted to try, but
never found a reason to spend my time on). Or maybe take a look at JCC
to use Saxon with it.

http://lucene.apache.org/pylucene/jcc/index.html

Stefan

From d_spring at yahoo.com Wed May 13 18:09:27 2009
From: d_spring at yahoo.com (Dennis Spring)
Date: Wed, 13 May 2009 09:09:27 -0700 (PDT)
Subject: [XML-SIG] pyXML errors on install
Message-ID: <189451.67846.qm@web110706.mail.gq1.yahoo.com>

Hi;

I downloaded pyXML and read the 'readMe' doc. I followed the simple
instructions to build and install pyXML, but many, many error messages
were output to the monitor. Any ideas?

I am not tied to the notion of using pyXML; however, I need to be able
to read and process xml strings, not xml files. And all references I
can find for minidom and sax show examples of how to read a file, not a
string. My coding attempts confirm that those objects must have a file
to read. I know I could write the received string to a file, and then
read it. However, I believe that we would have time issues caused by
that extra step.

A fellow developer suggested using pyXML, but as I mentioned above, I
got 'lotsa' errors when I followed the installation instructions.

Any help would be appreciated.

Thank you.

E.
Dennis Spring

From stefan_ml at behnel.de Thu May 14 07:46:20 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 14 May 2009 07:46:20 +0200
Subject: [XML-SIG] pyXML errors on install
In-Reply-To: <189451.67846.qm@web110706.mail.gq1.yahoo.com>
References: <189451.67846.qm@web110706.mail.gq1.yahoo.com>
Message-ID: <4A0BB02C.3000903@behnel.de>

Dennis Spring wrote:
> I downloaded pyXML and read the 'readMe' doc.

Then you missed the fact that PyXML is unmaintained and outdated.

> I am not tied to the notion of using pyXML, however I need to be able to read and process
> xml strings, not xml files.

Use the xml.etree.ElementTree module (in the stdlib since 2.5,
available for Py2.2+). It has a "fromstring()" function that does the
job. It's also a lot easier to use than anything in the xml.dom or
xml.sax package.

Stefan

From phihag at phihag.de Sun May 24 14:40:13 2009
From: phihag at phihag.de (Philipp Hagemeister)
Date: Sun, 24 May 2009 14:40:13 +0200
Subject: [XML-SIG] minidom: Genius or just plain bad?
Message-ID: <4A19402D.5020001@phihag.de>

I was puzzled when I tripped over the following:

>>> NS = 'http://phihag.de/2009/test/python/ns'
>>> s = ''
>>> import xml.dom.minidom
>>> doc = xml.dom.minidom.parseString(s)
>>> doc.documentElement.getAttributeNS(NS, 'a')
'' # wtf?
>>> doc.documentElement.getAttribute('a')
u'val'

Looking in the implementation, it seems that minidom is essentially a
DOM Level 1 implementation, with very limited support for namespaces.

Wouldn't it be nice to have a full-fledged XML implementation in the
Python stdlib? Probably not (yet) including validation, XSLT and
similar auxiliary technologies, but come on, XML namespaces and DOM 3
L/S should be supported.

I noticed that important minidom features such as
http://bugs.python.org/issue1621421 are not going anywhere. Is this
because of performance considerations or lack of manpower?

Also, it seems strange that minidom.py is full of comments referencing
outdated 2002 working drafts.
I'm intrigued by the idea of overriding __setattr__ to do crazy stuff
(including invalidating a document-wide cache that probably stays valid
in >99% of the cases, although a local check for attribute name = id
would improve performance here) instead of using properties, and then
avoiding actually using it "for performance" reasons.

Additionally, the comment "nodeValue and value are set elsewhere" in
Attr.__init__ neatly conveys the intention of allowing extremely fast
creation of value-less attributes.

Similarly, the opening comment of expatbuilder.py is an excellent
example of the little-known Alternative Zen of Python:

Ugly is better than beautiful.
Implicit is better than explicit.
Performance is better than anything.
Code needs comments explaining and defending it.
Constants are great, especially when depending on their value.¹
Code first, then think about the interface.²
Or don't think about the interface at all.
Fixing bugs in dependencies is bad.
Unless you fix by changing your code.
But do not allow others to do that.
Modularization is good.
As long as you access internals of other modules.
Import from many modules.
Whose names all sound the same.
If self.childnodes (:return True else return False)
That's how I spell pain.

¹ minidom.prefix
² grep "not sure this is meaningful"

Regards,

Philipp

From Tim.Arnold at sas.com Fri May 29 16:14:10 2009
From: Tim.Arnold at sas.com (Tim Arnold)
Date: Fri, 29 May 2009 10:14:10 -0400
Subject: [XML-SIG] docbook 5, lxml and rng
Message-ID:

Hi, this is a newbie question I'm sure. I'm trying to validate an
example straight out of the docbook 5 documentation (example given on
the 'inlineequation' page). As it stands, the file doesn't pass as
valid.
The code:
=======================================
from lxml import etree
import os
# RNGDIR = 'path to docbook.rng'
# XMLDIR = 'path to the xml file'
relaxng_doc = etree.parse(os.path.join(RNGDIR,'docbook.rng'))
relaxng = etree.RelaxNG(relaxng_doc)

doc = etree.parse(os.path.join(XMLDIR,'myfile.xml'))
print relaxng.validate(doc)
=======================================

The xml file:
=======================================
Example inlineequation Einstein's theory of relativity includes one of the most widely recognized formulas in the world: e=mc^2
=======================================

If I remove the inlineequation subtree, it is valid. Can someone help
me understand what I'm missing?

python 2.5.1
lxml-2.1.2-py2.5-freebsd-6.3

thanks,
--Tim Arnold

From martin at v.loewis.de Sun May 31 00:30:29 2009
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 31 May 2009 00:30:29 +0200
Subject: [XML-SIG] minidom: Genius or just plain bad?
In-Reply-To: <4A19402D.5020001@phihag.de>
References: <4A19402D.5020001@phihag.de>
Message-ID: <4A21B385.3090209@v.loewis.de>

Philipp Hagemeister wrote:
> I was puzzled when I tripped over the following:
>
>>>> NS = 'http://phihag.de/2009/test/python/ns'
>>>> s = ''
>>>> import xml.dom.minidom
>>>> doc = xml.dom.minidom.parseString(s)
>>>> doc.documentElement.getAttributeNS(NS, 'a')
> '' # wtf?

Why do you think this is incorrect? The root element
has no attribute named 'a' in the NS namespace.

Regards,
Martin

From phihag at phihag.de Sun May 31 00:43:44 2009
From: phihag at phihag.de (Philipp Hagemeister)
Date: Sun, 31 May 2009 00:43:44 +0200
Subject: [XML-SIG] minidom: Genius or just plain bad?
In-Reply-To: <4A21B385.3090209@v.loewis.de>
References: <4A19402D.5020001@phihag.de> <4A21B385.3090209@v.loewis.de>
Message-ID: <4A21B6A0.2040305@phihag.de>

Martin v. Löwis wrote:
>>>>> NS = 'http://phihag.de/2009/test/python/ns'
>>>>> s = ''
>>>>> import xml.dom.minidom
>>>>> doc = xml.dom.minidom.parseString(s)
>>>>> doc.documentElement.getAttributeNS(NS, 'a')
>
> Why do you think this is incorrect? The root element
> has no attribute named 'a' in the NS namespace.

Oops, my bad. You are perfectly right, and this part of my argument is
moot. http://www.rpbourret.com/xml/NamespaceMyths.htm#myth4 refutes my
misconception in depth.

minidom's code is still yucky though.

Cheers,

Philipp

From stefan_ml at behnel.de Sun May 31 08:00:15 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 31 May 2009 08:00:15 +0200
Subject: [XML-SIG] minidom: Genius or just plain bad?
In-Reply-To: <4A19402D.5020001@phihag.de>
References: <4A19402D.5020001@phihag.de>
Message-ID: <4A221CEF.5000202@behnel.de>

Philipp Hagemeister wrote:
> Wouldn't it be nice to have a full-fledged XML implementation in the Python
> stdlib? Probably not (yet) including validation, XSLT and similar
> auxiliary technologies, but come on, XML namespaces and DOM 3 L/S should
> be supported.

This has been rejected on python-dev lately, given that such an
implementation would almost certainly introduce a major dependency
overhead if it's not written in plain Python. There's also the
historical problem that the stdlib XML support is there and quite a bit
of existing code depends on it. Replacing that with a new
implementation would break all that. Extending it is a, well, rather
large project, as would be any kind of major performance improvement.

It's not too hard to install lxml these days, though. The fact that it
*doesn't* use the DOM3 API is actually a major strength.

http://codespeak.net/lxml/

Stefan

From stefan_ml at behnel.de Sun May 31 08:04:47 2009
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 31 May 2009 08:04:47 +0200
Subject: [XML-SIG] docbook 5, lxml and rng
In-Reply-To:
References:
Message-ID: <4A221DFF.1080006@behnel.de>

Hi,

Tim Arnold wrote:
> Hi, this is a newbie question I'm sure. I'm trying to validate an
> example straight out of the docbook 5 documentation (example given
> on the 'inlineequation' page). As it stands, the file doesn't pass
> as valid.
>
> The code:
> =======================================
> from lxml import etree
> import os
> # RNGDIR = 'path to docbook.rng'
> # XMLDIR = 'path to the xml file'
> relaxng_doc = etree.parse(os.path.join(RNGDIR,'docbook.rng'))
> relaxng = etree.RelaxNG(relaxng_doc)
>
> doc = etree.parse(os.path.join(XMLDIR,'myfile.xml'))
> print relaxng.validate(doc)

What does the validator tell you about why it's not considered valid?
Note that there's a property "error_log" which returns a sequence of
messages that were collected during validation.

http://codespeak.net/lxml/validation.html#relaxng

Stefan
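
For reference, here is a minimal sketch of reading "error_log" after a
failed RelaxNG validation. The toy schema, the two sample documents and
the report() helper are made up for illustration and merely stand in
for docbook.rng and myfile.xml; only etree.parse(), etree.RelaxNG(),
validate() and the error_log property come from lxml.

```python
from io import BytesIO
from lxml import etree

# Hypothetical toy schema (not DocBook): an <a> element that must
# contain exactly one empty <b/> child.
RNG = b"""\
<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="b"><empty/></element>
</element>
"""

relaxng = etree.RelaxNG(etree.parse(BytesIO(RNG)))

# Hypothetical sample documents: one that matches the schema, one that does not.
good_doc = etree.parse(BytesIO(b"<a><b/></a>"))
bad_doc = etree.parse(BytesIO(b"<a><c/></a>"))


def report(doc):
    """Validate doc; return (is_valid, list of 'line N: message' strings)."""
    is_valid = relaxng.validate(doc)
    # error_log holds the messages collected during the last validation run.
    messages = ["line %d: %s" % (entry.line, entry.message)
                for entry in relaxng.error_log]
    return is_valid, messages


for name, doc in (("good", good_doc), ("bad", bad_doc)):
    is_valid, messages = report(doc)
    print("%s: valid=%s" % (name, is_valid))
    for message in messages:
        print("  " + message)
```

In the DocBook case, the same loop over relaxng.error_log right after
relaxng.validate(doc) should point at the element the schema rejects,
which is probably the quickest way to see what the inlineequation
subtree is missing.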