From sap at ifp-koeln.de Mon Jan 1 15:31:03 2007
From: sap at ifp-koeln.de (customs)
Date: Mon, 1 Jan 2007 15:31:03 +0100
Subject: [XML-SIG] If people are suspicious or afraid of getting flamed or put down, they won't join the community.
Message-ID: <45991B27.7070601@squarepegs.biz>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20070101/a4f55351/attachment.htm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: checkout.gif
Type: image/gif
Size: 17363 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20070101/a4f55351/attachment.gif

From David.Fox at nuance.com Mon Jan 1 21:09:50 2007
From: David.Fox at nuance.com (Fox, David)
Date: Mon, 1 Jan 2007 15:09:50 -0500
Subject: [XML-SIG] Which python DOM to use - need EntityReference
Message-ID:

I've written a Python module which uses xml.dom from PyXML 0.8.4 to generate XML documents according to a particular schema (VoiceXML SRGS).

Now I've seen a comment on the SourceForge tracker for pyxml.sourceforge.net saying that PyXML is no longer maintained (http://sourceforge.net/tracker/index.php?func=detail&aid=1562266&group_id=6473&atid=106473).

1. Is this true?

2. Is there a recommended DOM replacement?

I've looked at xml.dom.minidom from the standard python library, but it doesn't support EntityReference, which means that I can't escape apostrophes as &apos; (which is allowed but not required by XML, but is required by the VoiceXML SRGS standard). If I include an apostrophe in a text node, it doesn't get escaped, whereas if I include &apos; it gets turned into "&amp;apos;".

I've skimmed the 4Suite docs, but it isn't obvious to me whether its Domlette implementation would solve this problem.

3. Is there any prospect of supporting EntityReference in the standard python library?
(I'd be happy to help, but I'm afraid I don't really have much XML expertise outside of the particulars of VoiceXML SRGS and whatever else I've learned from the http://www.w3schools.com/ site).

4. Alternatively, is there some other workaround for dealing with '?

Thanks in advance,
David Fox

From dkuhlman at rexx.com Mon Jan 1 23:27:24 2007
From: dkuhlman at rexx.com (Dave Kuhlman)
Date: Mon, 1 Jan 2007 14:27:24 -0800
Subject: [XML-SIG] Which python DOM to use - need EntityReference
In-Reply-To:
References:
Message-ID: <20070101222724.GA48029@cutter.rexx.com>

On Mon, Jan 01, 2007 at 03:09:50PM -0500, Fox, David wrote:
> I've written a Python module which uses xml.dom from PyXML 0.8.4 to generate
> XML documents according to a particular schema (VoiceXML SRGS).
>
> Now I've seen a comment on SourceForge that pyxml.sourceforge.net that PyXML
> is no longer maintained
> (http://sourceforge.net/tracker/index.php?func=detail&aid=1562266&group_id=6473&atid=106473).
>
> 1. Is this true?
>
> 2. Is there a recommended DOM replacement?
>
> I've looked at xml.dom.minidom from the standard python library, but it doesn't
> support EntityReference, which means that I can't escape apostrophes as
> &apos; (which is allowed but not required by XML, but is required by the
> VoiceXML SRGS standard). If I include an apostrophe in a text node, it
> doesn't get escaped, whereas if I include &apos; it gets turned into
> "&amp;apos;"

I've read good things about ElementTree. I use it and like it.

It is aware of and does process entities, although I do not know them well enough to say whether they're handled correctly. I believe that it un-escapes &apos; on the way in (parsing), but does not escape apostrophes when writing them out.

Also, if I set the text of a node to text containing an entity reference, ElementTree seems to have the behavior that you do *not* want: specifically, it escapes the ampersand.
Here is a small test -- r is the root element in the document d:

    In [19]: r.text = "aaa&apos;bbb<ccc"
    In [20]: [...]
    <...>aaa&amp;apos;bbb&lt;ccc</...>
    In [21]:

There is also lxml, which implements the same API as ElementTree, but requires installation of libxml.

But, for what it's worth, you can find out about them here:

http://effbot.org/zone/element-index.htm
http://codespeak.net/lxml/

Dave

--
Dave Kuhlman
http://www.rexx.com/~dkuhlman

From fredrik at pythonware.com Wed Jan 3 20:44:06 2007
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 03 Jan 2007 20:44:06 +0100
Subject: [XML-SIG] Which python DOM to use - need EntityReference
In-Reply-To:
References:
Message-ID:

Fox, David wrote:

> I've looked at xml.dom.minidom from the standard python library, but it doesn't
> support EntityReference, which means that I can't escape apostrophes as
> &apos; (which is allowed but not required by XML, but is required by the
> VoiceXML SRGS standard).

what makes you think that the use of &apos; is *required* by VoiceXML ?

From David.Fox at nuance.com Thu Jan 4 03:25:08 2007
From: David.Fox at nuance.com (Fox, David)
Date: Wed, 3 Jan 2007 21:25:08 -0500
Subject: [XML-SIG] Which python DOM to use - need EntityReference
Message-ID:

On further checking, it may be that this is only required by my company's recognizer, and not by the VXML SRGS standard. For my purposes, it doesn't really matter which one is imposing the requirement.

-----Original Message-----
From: xml-sig-bounces+david.fox=nuance.com at python.org [mailto:xml-sig-bounces+david.fox=nuance.com at python.org] On Behalf Of Fredrik Lundh
Sent: Wednesday, January 03, 2007 2:44 PM
To: xml-sig at python.org
Subject: Re: [XML-SIG] Which python DOM to use - need EntityReference

Fox, David wrote:

> I've looked at xml.dom.minidom from the standard python library, but it doesn't
> support EntityReference, which means that I can't escape apostrophes as
> &apos; (which is allowed but not required by XML, but is required by the
> VoiceXML SRGS standard).
what makes you think that the use of &apos; is *required* by VoiceXML ?

_______________________________________________
XML-SIG maillist - XML-SIG at python.org
http://mail.python.org/mailman/listinfo/xml-sig

From fredrik at pythonware.com Thu Jan 4 14:27:47 2007
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 4 Jan 2007 14:27:47 +0100
Subject: [XML-SIG] Which python DOM to use - need EntityReference
References:
Message-ID:

"Fox, David" wrote:

> On further checking, it may be that this is only required by my company's
> recognizer, and not by the VXML SRGS standard. For my purposes, it doesn't
> really matter which one is imposing the requirement.

perhaps not, but XML developers are, in general, somewhat reluctant to add features to their libraries to work around bugs in 3rd-party tools that claim to implement XML, but don't...

your best bet is probably to ask your company's VXML developers to fix the bug, and fix up your XML output with a simple regexp in the meantime.

From and-xml at doxdesk.com Thu Jan 4 21:03:30 2007
From: and-xml at doxdesk.com (Andrew Clover)
Date: Thu, 04 Jan 2007 21:03:30 +0100
Subject: [XML-SIG] Which python DOM to use - need EntityReference
In-Reply-To:
References:
Message-ID: <459D5D92.5030500@doxdesk.com>

David Fox wrote:

> 1. Is this true?

Yes, but the lack of maintainer is a recent development. The lack of recent development in general is not a recent development. :-)

> 2. Is there a recommended DOM replacement?

Not officially. If you need more DOM than minidom there is pxdom and 4Suite.

> I've looked at xml.dom.minidom from the standard python library, but it doesn't
> support EntityReference, which means that I can't escape apostrophes as
> &apos;

pxdom will allow you to do this, although it sounds like a bit of a dodgy idea to do with the predefined entities. Would be better to fix the incoming parser to correctly support XML if possible - &apos; and ' in content or attribute value are absolutely equivalent to a conforming XML processor.
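
A minimal sketch of the "fix up your XML output" idea discussed above, written for current Python and the standard library only. The helper name escape_text_apostrophes and the sample element are made up for illustration; the regular expression only touches character data between tags, so it does not handle attribute values, comments, CDATA sections or processing instructions. The last lines check the equivalence claim: a conforming parser reads &apos; and ' as the same character.

```python
import re
from xml.dom import minidom


def escape_text_apostrophes(serialized):
    """Replace ' with &apos; in character data between tags only.

    Hypothetical helper: assumes the input was produced by a serializer
    (such as minidom) that never emits a bare '<' inside text or
    attribute values.
    """
    return re.sub(
        r"(?<=>)[^<]+(?=<)",                      # text between '>' and '<'
        lambda m: m.group(0).replace("'", "&apos;"),
        serialized,
    )


doc = minidom.parseString("<item>don't</item>")
plain = doc.documentElement.toxml()               # minidom leaves ' unescaped
fixed = escape_text_apostrophes(plain)

# Re-parse the fixed-up output: the text node is unchanged.
reparsed = minidom.parseString(fixed)
same_text = (reparsed.documentElement.firstChild.data
             == doc.documentElement.firstChild.data)

print(plain)
print(fixed)
print(same_text)
```

Because the result is still well-formed XML with identical content, this kind of post-processing is safe for any conforming consumer; it only matters to a consumer that insists on the escaped form.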
> 3. Is there any prospect of supporting EntityReference in the standard
> python library?

Not a hope. Supporting EntityReference properly and correctly is an absolutely shocking palaver and would require very large changes to minidom to implement - changes that would have a big impact on its performance.

Having been through the pain of implementing EntityReference fully in pxdom, I'd say this is one of the very worst aspects of the DOM (and XML in general).

> 4. Alternatively, is there some other workaround for dealing with '?

mySerialisedXML.replace("'", '&apos;')?

Nasty, but if you can assume there aren't going to be any apostrophes in the document other than in content/attribute values...

--
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/

From uche at ogbuji.net Fri Jan 5 21:45:35 2007
From: uche at ogbuji.net (Uche Ogbuji)
Date: Fri, 05 Jan 2007 13:45:35 -0700
Subject: [XML-SIG] ANN: Amara XML Toolkit 1.2.0.1
Message-ID: <459EB8EF.20706@ogbuji.net>

Amara 1.2 was released to Cheeseshop on New Year's Eve, but in all the festivities and travel I forgot the announcements. Just as well because I'd forgotten to include a minor fix, and so here is 1.2.0.1.
http://uche.ogbuji.net/tech/4suite/amara
http://cheeseshop.python.org/pypi/Amara/
ftp://ftp.4suite.org/pub/Amara/

Changes since Amara 1.1.9:

* Add omit_nodetype_rule bindery rule
* Add force_nsdecls parameter to bindery node xml() method
* 4Suite 1.0 compatibility (requires 4Suite 1.0.2, in fact)
* Add support for attribute patterns to pushdom/pushbind
* Add experimental (not very reliable) xml_xslt() method to bindery object to apply transforms
* Improvements to trimxml command line
* Turn off DTD validation by default
* Add support for DTD validation & custom binding classes to convenience APIs
* Add custom binding demo
* Updates to the manual
* Bug fixes

Amara XML Toolkit is a collection of Python tools for XML processing-- not just tools that happen to be written in Python, but tools built from the ground up to use Python's conventions and take advantage of the many strengths of the language.

Amara builds on 4Suite [http://4Suite.org], but whereas 4Suite focuses more on literal implementation of XML standards in Python, Amara focuses on Pythonic idiom. It provides tools you can trust to conform with XML standards without losing the familiar Python feel.

The components of Amara are:

* Bindery: data binding tool (a very Pythonic XML API)
* Scimitar: implementation of the ISO Schematron schema language for XML; converts Schematron files to Python scripts
* domtools: set of tools to augment Python DOMs
* saxtools: set of tools to make SAX easier to use in Python
* Flextyper: user-defined datatypes in Python for XML processing

There's a lot in Amara, but here are highlights:

Amara Bindery: XML as easy as py
--------------------------------

Bindery reads an XML document and returns a data structure of Python objects corresponding to the vocabulary used in the XML document, for maximum clarity.
Bindery turns the document

<monty>
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>

into a set of objects so that you could write

binding.monty.python.spam

in order to get the value "eggs" (as a Python Unicode object) or

binding.monty.python[1]

in order to get the element object with the contents "But I was looking for argument".

There are other such tools for Python, and what makes Bindery unique is that it's driven by a very declarative rules-based system for binding XML to the Python data. You can register rules, triggered by XPattern expressions, for specialized binding behavior. It includes XPath support and is very efficient, using SAX to generate bindings. See the user documentation, manual.html, for more details.

Scimitar: exceptional schema language for an exceptional programming language
-----------------------------------------------------------------------------

Scimitar is an implementation of ISO Schematron that compiles a Schematron schema into a Python validator script.

You typically use scimitar in two phases. Say you have a schematron schema schema1.stron and you want to validate multiple XML files against it, instance1.xml, instance2.xml, instance3.xml.

First you run schema1.stron through the scimitar compiler script, scimitar.py:

scimitar.py schema1.stron

A file, schema1-stron.py, is generated in the current working directory. If you'd prefer a different location or file name, use the "-o" option. The generated file is a validator script in Python. It checks the schematron rules specified in schema1.stron.

Run this validator on each XML file you wish to validate:

python schema1-stron.py instance1.xml

The validation report is generated on standard output by default, or you can use the "-o" option to redirect it to a file.

The validation report is an XML external parsed entity, a format much like a well-formed XML document, but with some restrictions relaxed.
Amara DOM Tools: giving DOM a more Pythonic face
------------------------------------------------

Amara DOM Tools features pushdom, similar to xml.dom.pulldom, but easier to use, and a function to return an XPath location for any DOM node.

Amara SAX Tools: SAX without the brain explosion
------------------------------------------------

Tenorsax (amara.saxtools.tenorsax) is a framework for "linearizing" SAX logic so it flows a bit more naturally, needing much less state machine wizardry.

License
-------

Amara is open source, provided under the 4Suite variant of the Apache license. See the file COPYING for details.

Installation
------------

Amara requires Python 2.4 or more recent and 4Suite-XML 1.0 or more recent. The easiest way to install it is:

easy_install amara

If this does not work you are probably not set up for easy_install and I suggest you follow the simple instructions at

http://peak.telecommunity.com/DevCenter/EasyInstall

easy_install will automatically take care of installing dependencies for you. If you prefer not to use easy_install, grab a 4Suite-XML package more recent than 1.0 and install that, then install the Amara package using the usual:

python setup.py install

Or a Windows installer, or other method.

--
Uche Ogbuji
Work: The Kadomo Group, Inc.
http://uche.ogbuji.net http://kadomo.com http://copia.ogbuji.net
Lead dev at http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

From pbrenne1 at nd.edu Wed Jan 10 17:07:07 2007
From: pbrenne1 at nd.edu (Paul R Brenner)
Date: Wed, 10 Jan 2007 11:07:07 -0500
Subject: [XML-SIG] Where is the xml.dom.ext package in current python distributions? .ext documentation?
Message-ID: <1168445227.45a50f2b7e5b1@webmail.nd.edu>

Hello,

I have been using the xml.dom.ext PrettyPrint function to output my dom objects into readable xml text files. However, on a few newer machines in our pool there are more current Python installations (2.4 or later) in which the .ext submodule is totally missing.
More surprising is that I went to the Python documentation page and found no mention of the .ext submodule of xml for any version of Python. I searched the web and googled this mailing list with no luck.

Could someone enlighten me regarding the state of the .ext submodule of xml.dom and whether there is a replacement for the PrettyPrint function?

Thanks,
Paul

--
Paul R Brenner, P.E.
Computer Science and Engineering
The University of Notre Dame

"Computers are incredibly fast, accurate, and stupid; humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination." -Albert Einstein

From fredrik at pythonware.com Thu Jan 11 10:20:51 2007
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 11 Jan 2007 10:20:51 +0100
Subject: [XML-SIG] Where is the xml.dom.ext package in current pythondistributions? .ext documentation?
References: <1168445227.45a50f2b7e5b1@webmail.nd.edu>
Message-ID:

Paul R Brenner wrote:

> I have been using the xml.dom.ext PrettyPrint function to output my dom objects
> into readable xml text files. However on a few newer machines in our pool
> there are more current (Python 2.4 or >) for which the .ext submodule is
> totally missing. More surprising is that I went to the Python documentation
> page and found no mention of the .ext submodule of xml for any version of
> Python.

http://pyxml.sourceforge.net/

From ahatzis at gmx.net Thu Jan 11 12:10:03 2007
From: ahatzis at gmx.net (Anastasios Hatzis)
Date: Thu, 11 Jan 2007 12:10:03 +0100
Subject: [XML-SIG] Where is the xml.dom.ext package in current pythondistributions? .ext documentation?
In-Reply-To:
References: <1168445227.45a50f2b7e5b1@webmail.nd.edu>
Message-ID: <45A61B0B.5030908@gmx.net>

Fredrik Lundh wrote:
> Paul R Brenner wrote:
>
>> I have been using the xml.dom.ext PrettyPrint function to output my dom objects
>> into readable xml text files.
>> However on a few newer machines in our pool
>> there are more current (Python 2.4 or >) for which the .ext submodule is
>> totally missing. More surprising is that I went to the Python documentation
>> page and found no mention of the .ext submodule of xml for any version of
>> Python.
>
> http://pyxml.sourceforge.net/
>

I was so sure that PyXML was already part of the Python distribution. Unfortunately I can't remember where I got this from. A few days ago an OpenSwarm user brought my attention to the same issue Paul mentioned, so I realized it needs to be installed separately.

Has PyXML ever been part of the official Python distribution or were there plans once to do so?

Probably I'm just getting old and muddle-headed.

Anastasios

From pbrenne1 at nd.edu Thu Jan 11 15:17:59 2007
From: pbrenne1 at nd.edu (Paul R Brenner)
Date: Thu, 11 Jan 2007 09:17:59 -0500
Subject: [XML-SIG] Where is the xml.dom.ext package in current python distributions?
Message-ID: <1168525079.45a64717ce9be@webmail.nd.edu>

Hello Anastasios and Fredrik,

I have searched through pyxml.sourceforge.net (that's how I found this listserve). I found many examples on using xml.dom.ext and the PrettyPrint function. However my question follows Anastasios's comments. Where is xml.dom.ext in the current Python distributions? Is PyXML not the default XML package in the normal Python distribution now? When I visit http://docs.python.org/modindex.html there is no reference to xml.dom.ext at the xml or xml.dom module level.

I am using Python in a scientific grid application generating xml dom objects that contain file structures for distributed storage. Everything was going well until we added machines from another campus to our computer pool, and although they are using a current Python 2.4.x there is no xml.dom.ext. I thought maybe since .ext is not referenced on the Python page it is an 'optional' piece of PyXML.
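
For readers with the same problem, a minimal sketch of one possible stand-in for PrettyPrint, assuming the standard library's xml.dom.minidom is acceptable: its toprettyxml() method indents a whole document without needing PyXML. The element and attribute names below are invented for illustration, and the output layout is not guaranteed to match what xml.dom.ext.PrettyPrint produced.

```python
from xml.dom import minidom

# Build a tiny document describing a (made-up) stored file.
doc = minidom.Document()
root = doc.createElement("files")
doc.appendChild(root)

entry = doc.createElement("file")
entry.setAttribute("name", "data.bin")
root.appendChild(entry)

# toprettyxml() returns the serialized document with one node per line,
# indented by the given string at each nesting level.
pretty = doc.toprettyxml(indent="  ")
print(pretty)
```

One caveat: toprettyxml() adds whitespace around text nodes as well, so it is best suited to output meant for human reading rather than for documents where whitespace in content is significant.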
Fredrik Lundh wrote:
> Paul R Brenner wrote:
>
>> I have been using the xml.dom.ext PrettyPrint function to output my dom objects
>> into readable xml text files. However on a few newer machines in our pool
>> there are more current (Python 2.4 or >) for which the .ext submodule is
>> totally missing. More surprising is that I went to the Python documentation
>> page and found no mention of the .ext submodule of xml for any version of
>> Python.
>
> http://pyxml.sourceforge.net/
>

I was so sure that PyXML was already part of the Python distribution. Unfortunately I can't remember where I got this from. A few days ago an OpenSwarm user brought my attention to the same issue Paul mentioned, so I realized it needs to be installed separately.

Has PyXML ever been part of the official Python distribution or were there plans once to do so?

Probably I'm just getting old and muddle-headed.

Anastasios

--
Paul R Brenner, P.E.
Computer Science and Engineering
The University of Notre Dame

"Computers are incredibly fast, accurate, and stupid; humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination." -Albert Einstein

From dieter at handshake.de Fri Jan 12 19:20:06 2007
From: dieter at handshake.de (Dieter Maurer)
Date: Fri, 12 Jan 2007 19:20:06 +0100
Subject: [XML-SIG] Where is the xml.dom.ext package in current python distributions?
In-Reply-To: <1168525079.45a64717ce9be@webmail.nd.edu>
References: <1168525079.45a64717ce9be@webmail.nd.edu>
Message-ID: <17831.53590.646422.252915@gargle.gargle.HOWL>

Paul R Brenner wrote at 2007-1-11 09:17 -0500:
>I have searched through pyxml.sourceforge.net (that's how I found this
>listserve). I found many examples on using xml.dom.ext and the PrettyPrint
>function. However my question follows Anastasios's comments. Where is
>xml.dom.ext in the current Python distributions. Is PyXML not the default XML
>package in the normal Python distribution now.
The Python developers are quite careful to keep the standard library small -- as larger software increases the maintenance burden.

The Python runtime comes with some core XML support (sufficient to parse and process XML files). The xml-sig (XML Special Interest Group) has implemented additional XML processing packages: that is "PyXML". But, it never made it as a whole into the Python runtime library. And now "PyXML" is without maintainer....

--
Dieter

From pbrenne1 at nd.edu Fri Jan 12 22:54:35 2007
From: pbrenne1 at nd.edu (Paul R Brenner)
Date: Fri, 12 Jan 2007 16:54:35 -0500
Subject: [XML-SIG] Where is the xml.dom.ext package in current python distributions?
In-Reply-To: <17831.53590.646422.252915@gargle.gargle.HOWL>
References: <1168525079.45a64717ce9be@webmail.nd.edu> <17831.53590.646422.252915@gargle.gargle.HOWL>
Message-ID: <1168638875.45a8039b8ef99@webmail.nd.edu>

Thanks for the reply Dieter. Working with heterogeneous distributed systems proves to me time and again that there are more software/hardware variations than I could imagine. The machines without xml.dom.ext are machines on which I can compute (Condor) but have only local temporary storage, so I'll need to find a way to pass the module along with my distributed job.

Regards,
Paul

Quoting Dieter Maurer :

> Paul R Brenner wrote at 2007-1-11 09:17 -0500:
> >I have searched through pyxml.sourceforge.net (that's how I found this
> >listserve). I found many examples on using xml.dom.ext and the PrettyPrint
> >function. However my question follows Anastasios's comments. Where is
> >xml.dom.ext in the current Python distributions. Is PyXML not the default
> XML
> >package in the normal Python distribution now.
>
> The Python developers are quite careful to keep the standard library
> small -- as larger software increases the maintenance burden.
>
> The Python runtime comes with some core XML support (sufficient to parse
> and process XML files).
> The xml-sig (XML Special Interest Group) has
> implemented additional XML processing packages: that is "PyXML".
> But, it never made it as a whole into the Python runtime library.
> And now "PyXML" is without maintainer....
>
>
>
> --
> Dieter
>

From noreply at sourceforge.net Fri Jan 19 01:49:40 2007
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu, 18 Jan 2007 16:49:40 -0800
Subject: [XML-SIG] [ pyxml-Bugs-1639086 ] 1) XmlProc is not "pure" python 2) Problems with name spaces
Message-ID:

Bugs item #1639086, was opened at 2007-01-18 19:49
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1639086&group_id=6473

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: xmlproc
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Yoav Zibin (zyoav)
Assigned to: Lars Marius Garshol (larsga)
Summary: 1) XmlProc is not "pure" python 2) Problems with name spaces

Initial Comment:
Hi,

First, thanks for XmlProc; we really needed a pure python parser that will work on mobile phones. However, it uses cPickle, which is not a standard Python extension. (You should just mention it in the documentation.)

The second issue is a problem with namespaces when using Sax2DOM. The problem has two parts:

1) In drv_xmlproc.py, in the method

    def handle_start_tag(self,name,attrs):

you have this code:

    if not self.rep_ns_attrs:
        del attrs[a]

That deletes the xmlns attributes. However, if I write the following code:

    import xml.sax
    import xml.dom.minidom
    yoav_parser = xml.sax.make_parser(['xml.sax.drivers2.drv_xmlproc'])
    print xml.dom.minidom.parse("example.xml", yoav_parser).toxml()

then the resulting xml does not have the xmlns attributes. Therefore you should not delete those attributes.
2) The second problem is in the SAX interface for giving the qualified names: you only give the namespace for elements. However attributes can have a namespace as well:

    ... type="tns:ArrayOfTheater"/>

And you do not supply a function to find the qualified names of attributes.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1639086&group_id=6473
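
For comparison with the report above, a hedged sketch of how attribute names are exposed in SAX2 namespace mode by the standard library's default (expat-based) driver, not by xmlproc. The sample document, the urn:example:tns namespace URI and the AttrCollector class are made up for illustration. With feature_namespaces enabled, startElementNS receives an attributes object keyed by (namespace URI, local name) pairs, and getQNameByName() returns the qualified name the driver recorded for each attribute.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# Hypothetical sample: one namespaced attribute and one plain attribute.
SAMPLE = (
    '<root xmlns:tns="urn:example:tns">'
    '<item tns:kind="array" name="theaters"/>'
    '</root>'
)


class AttrCollector(ContentHandler):
    """Record (key, qname, value) for every attribute seen."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        # In namespace mode each attribute key is a (uri, localname) pair;
        # uri is None for attributes that are not in any namespace.
        for key in attrs.getNames():
            self.seen.append((key, attrs.getQNameByName(key), attrs.getValue(key)))


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
handler = AttrCollector()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(SAMPLE.encode("utf-8")))

for key, qname, value in handler.seen:
    print(key, qname, value)
```

Whether a given driver fills in the prefix part of the qualified name is driver-dependent, so code that needs to be portable is probably better off relying on the (URI, local name) keys.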