From rsalz@zolera.com Wed Oct 3 15:21:13 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 03 Oct 2001 10:21:13 -0400
Subject: [XML-SIG] Moving exceptions under Exception in yappsrt.py
Message-ID: <3BBB1ED9.EB1BDB78@zolera.com>

I sent mail to the YAPPS author a couple of days ago but haven't yet got
a reply. Any objection to moving the SyntaxError classes in yappsrt.py and
pyxpath.py so that they inherit from Exception?
        /r$
--
Zolera Systems, Your Key to Online Integrity
Securing Web services: XML, SOAP, Dig-sig, Encryption
http://www.zolera.com

From martin@loewis.home.cs.tu-berlin.de Wed Oct 3 17:47:37 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 3 Oct 2001 18:47:37 +0200
Subject: [XML-SIG] Moving exceptions under Exception in yappsrt.py
In-Reply-To: <3BBB1ED9.EB1BDB78@zolera.com> (message from Rich Salz on Wed, 03 Oct 2001 10:21:13 -0400)
References: <3BBB1ED9.EB1BDB78@zolera.com>
Message-ID: <200110031647.f93Glbq04571@mira.informatik.hu-berlin.de>

> I sent mail to the YAPPS author a couple of days ago but haven't yet got
> a reply. Any objection to moving the SyntaxError classes in yappsrt.py and
> pyxpath.py so that they inherit from Exception?

No, go ahead.

Please note that I have just committed the 0.11.1 changes. In the
process, I noticed two things:

- the XPathParser is now pure Python (even though a generated
  file), so it is debatable whether pyxpath should be maintained
  (although I'm in favour of that).
- your changes to include CDATA_SECTION in a couple of places
  apparently have not been integrated into 4Suite. I'll try to
  maintain them after each merge, but there is always the potential
  that they'll break unless they get synchronized with 4Suite.

Regards,
Martin

From dmoor@technology.serco.com Thu Oct 4 12:08:19 2001
From: dmoor@technology.serco.com (David Moor)
Date: Thu, 4 Oct 2001 12:08:19 +0100
Subject: [XML-SIG] PyXML Question
Message-ID: <195F58F118C9D311B622009027DC812F77DC1B@mail1.technology.serco.com>

Hi

I have just downloaded PyXML-0.6.6.win32-py2.1.exe and tried to install it
because I have some example code which contains:

> from xml.dom.html_builder import HtmlBuilder
> from xml.dom.walker import Walker
> from xml.dom.writer import HtmlWriter

I thought I would then be able to run the example script. Following the
install my Python21\Lib directory has not changed, although I now have a
Python21\_xmlplus and a Python21\xmldoc directory. Is this correct?

The script will still not run. I am using WinNT 4 and Python 2.1;
do I need to copy the _xmlplus directory contents into the Lib\xml
directory?

This auto installer seems to have lulled me into a false sense of security,
any help would be greatly appreciated.

Dave Moor

This message, including attachments, is intended only for the use by the
person(s) to whom it is addressed. It may contain information which is
privileged and confidential. Copying or use by anybody else is not
authorised. If you are not the intended recipient, please contact the
sender as soon as possible. The views expressed in this communication may
not necessarily be the views held by Serco Integrated Transport.

From larsga@garshol.priv.no Thu Oct 4 15:06:47 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 04 Oct 2001 16:06:47 +0200
Subject: [XML-SIG] drv_jython
Message-ID:

I finally got round to making a SAX 2.0 driver for the Java SAX 2.0
parsers, for use in Jython. It is not yet complete, but I did use it
successfully last night to convert a 1.2 MB XML file into a topic map.

Should I check it into the main branch, or should I put it on some
other branch?

Also, I guess we should use different lists of default parsers in
Jython and CPython.

  Jython:  drv_jython, drv_xmlproc
  CPython: expatreader, drv_xmlproc

This change is more risky, in that it will make people suddenly start
using drv_jython, before it has been properly tested. Comments?

--Lars M.

From noreply@sourceforge.net Thu Oct 4 17:35:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 04 Oct 2001 09:35:02 -0700
Subject: [XML-SIG] [ pyxml-Bugs-467937 ] 4DOM Events broken on setAttribute
Message-ID:

Bugs item #467937, was opened at 2001-10-04 09:35
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=467937&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Alexandre Fayolle (afayolle)
Summary: 4DOM Events broken on setAttribute

Initial Comment:
Using setAttribute/setAttributeNS on 4DOM elements will not
cause a mutation event to be fired if the attribute already
exists.

Alexandre Fayolle

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=467937&group_id=6473

From martin@loewis.home.cs.tu-berlin.de Thu Oct 4 20:33:24 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 4 Oct 2001 21:33:24 +0200
Subject: [XML-SIG] PyXML Question
In-Reply-To: <195F58F118C9D311B622009027DC812F77DC1B@mail1.technology.serco.com> (message from David Moor on Thu, 4 Oct 2001 12:08:19 +0100)
References: <195F58F118C9D311B622009027DC812F77DC1B@mail1.technology.serco.com>
Message-ID: <200110041933.f94JXOD02080@mira.informatik.hu-berlin.de>

> I have just downloaded PyXML-0.6.6.win32-py2.1.exe and tried to install it
> because I have some example code which contains:
>
> > from xml.dom.html_builder import HtmlBuilder
> > from xml.dom.walker import Walker
> > from xml.dom.writer import HtmlWriter

Unfortunately, that won't help you: these interfaces disappeared with
PyXML 0.5.

> I thought I would then be able to run the example script. Following the
> install my Python21\Lib directory has not changed, although I now have a
> Python21\_xmlplus and a Python21\xmldoc directory. Is this correct?

It is.

> The script will still not run. I am using WinNT 4 and Python 2.1;
> do I need to copy the _xmlplus directory contents into the Lib\xml
> directory?

No, you probably need to port the script to PyXML 0.6. Alternatively,
you could try to install PyXML 0.5, although this is no longer
supported.

Regards,
Martin

From martin@loewis.home.cs.tu-berlin.de Thu Oct 4 20:31:37 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 4 Oct 2001 21:31:37 +0200
Subject: [XML-SIG] drv_jython
In-Reply-To: (message from Lars Marius Garshol on 04 Oct 2001 16:06:47 +0200)
References:
Message-ID: <200110041931.f94JVbQ02079@mira.informatik.hu-berlin.de>

> I finally got round to making a SAX 2.0 driver for the Java SAX 2.0
> parsers, for use in Jython. It is not yet complete, but I did use it
> successfully last night to convert a 1.2 MB XML file into a topic map.
>
> Should I check it into the main branch, or should I put it on some
> other branch?

Go ahead and check it into the mainline. PyXML 0.7.0 will have further
changes that will go into the wild with it for the first time, and we
can always issue 0.7.1 if we get complaints.

> Also, I guess we should use different lists of default parsers in
> Jython and CPython.
>
>   Jython:  drv_jython, drv_xmlproc
>   CPython: expatreader, drv_xmlproc
>
> This change is more risky, in that it will make people suddenly start
> using drv_jython, before it has been properly tested. Comments?

Well, PyXML will already look for the python.xml.sax.parser property
to select a parser. If using the Java SAX code causes trouble, we
already know a work-around.

I'm not sure drv_jython is a good name, though. Shouldn't it rather
indicate the specific Java API you are using, like jaxml (or what
its name is)?
Or perhaps even the specific parser that you use?

Regards,
Martin

From larsga@garshol.priv.no Thu Oct 4 22:07:05 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 04 Oct 2001 23:07:05 +0200
Subject: [XML-SIG] drv_jython
In-Reply-To: <200110041931.f94JVbQ02079@mira.informatik.hu-berlin.de>
References: <200110041931.f94JVbQ02079@mira.informatik.hu-berlin.de>
Message-ID:

* Martin v. Loewis
|
| Go ahead and check it into the mainline.

Me do.

| Well, PyXML will already look for the python.xml.sax.parser property
| to select a parser. If using the Java SAX code causes trouble, we
| already know a work-around.

True enough. I'll check it in, then.

| I'm not sure drv_jython is a good name, though. Shouldn't it rather
| indicate the specific Java API you are using, like jaxml (or what
| its name is)? Or perhaps even the specific parser that you use?

The specific Java API is SAX. The driver uses JAXP to create the
parser (but not for anything else), and if JAXP doesn't find a parser
it falls back to using SAX.

We could always call it drv_javasax, I guess. That is perhaps a better
name.

--Lars M.

From laurent.tardif@csse.monash.edu.au Fri Oct 5 02:37:11 2001
From: laurent.tardif@csse.monash.edu.au (Laurent Tardif)
Date: Fri, 05 Oct 2001 11:37:11 +1000
Subject: [XML-SIG] hi,
Message-ID: <3BBD0EC7.EFE52BB6@csse.monash.edu.au>

with some friends we have started to design an SVG authoring tool in python.

And, of course, we use the xml library.

I have some questions :
- what's the current activity on xml ?
can we hope for some improvement of the DOM classes, some bug fixes, and so
on? we will be pleased to fix some of them.
What is the way to contribute ?
For the moment we have found some fields in the data structure which are
not up to date.
- is there a plan to do some XSLT processor engine ?
-how can we extend the documentation ?
for a python beginner, it's quite impossible to find how to :
    - load a xml document
    - write a Xml document
    - set the validating property on the parser
    - ....
    - and all the XML api is not documented, finding
the class in xml.dom.ext is very funny ;-)

A compliment :
quite surprised by the speed of the parser, very nice.

--
Laurent Tardif
Monash University
mailBox 36 - Building 26
School of Computer Science and Software Engineering
Clayton Victoria 3168
Australia
Phone : xxx55779
www.inrialpes.fr/opera

From Alexandre.Fayolle@logilab.fr Fri Oct 5 08:39:52 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Fri, 5 Oct 2001 09:39:52 +0200 (CEST)
Subject: [XML-SIG] hi,
In-Reply-To: <3BBD0EC7.EFE52BB6@csse.monash.edu.au>
Message-ID:

On Fri, 5 Oct 2001, Laurent Tardif wrote:

> with some friends we have started to design an SVG authoring tool in python.
>
> And, of course, we use the xml library.
>
> I have some questions :
> - what's the current activity on xml ?

It's quite high I think. Most people here are actively using the tools in
PyXML, so this means that you'll find a high level of support on this
list.

> can we hope for some improvement of the DOM classes, some bug fixes, and so
> on?
> we will be pleased to fix some of them.
> What is the way to contribute ?
> For the moment we have found some fields in the data structure which are
> not up to date.

Now this is strange. I thought I had completely debugged 4DOM ;o)
Which DOM implementation are you using? Please report bugs on the list
(or even better on the bugtracker of the sourceforge project
(http://pyxml.sf.net/)), and submit patches to the patch manager, or on
the list.

> - is there a plan to do some XSLT processor engine ?

Please check 4Suite.
http://www.4suite.org/, which provides a full blown
XSLT engine. There's also another project whose name I cannot remember
which provides python bindings to the C++ Xalan XSLT engine
(http://xml.apache.org/), and python bindings to the Sablotron XSLT
engine.

> -how can we extend the documentation ?
> for a python beginner, it's quite impossible to find how to :
>     - load a xml document
>     - write a Xml document
>     - set the validating property on the parser
>     - ....
>     - and all the XML api is not documented, finding
> the class in xml.dom.ext is very funny ;-)

The documentation is still being written. If you check the SIG's page, and
follow the documentation link, you'll eventually reach this page
http://py-howto.sourceforge.net/xml-howto/DOM.html

Does it answer your questions?

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).

From hannu@tm.ee Fri Oct 5 08:51:50 2001
From: hannu@tm.ee (Hannu Krosing)
Date: Fri, 05 Oct 2001 09:51:50 +0200
Subject: [XML-SIG] hi,
References:
Message-ID: <3BBD6696.9F630239@tm.ee>

Alexandre Fayolle wrote:
>
> > - is there a plan to do some XSLT processor engine ?
>
> Please check 4Suite. http://www.4suite.org/, which provides a full blown
> XSLT engine. There's also another project whose name I cannot remember
> which provides python bindings to the C++ Xalan XSLT engine
> (http://xml.apache.org/), and python bindings to the Sablotron XSLT
> engine.

And one for libxslt too

http://www.rexx.com/~dkuhlman/

just for note :)

--------
Hannu

From martin@loewis.home.cs.tu-berlin.de Fri Oct 5 08:54:24 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v.
Loewis)
Date: Fri, 5 Oct 2001 09:54:24 +0200
Subject: [XML-SIG] hi,
In-Reply-To: <3BBD0EC7.EFE52BB6@csse.monash.edu.au> (message from Laurent Tardif on Fri, 05 Oct 2001 11:37:11 +1000)
References: <3BBD0EC7.EFE52BB6@csse.monash.edu.au>
Message-ID: <200110050754.f957sOS01073@mira.informatik.hu-berlin.de>

> And, of course, we use the xml library.
>
> I have some questions :
> - what's the current activity on xml ?

I'm still planning to release PyXML 0.7.0 some time in the future.

> can we hope for some improvement of the DOM classes, some bug fixes,
> and so on? we will be pleased to fix some of them.

No, unless you give more detail what kind of improvements you plan,
and what kind of bugs you want to see fixed. Except for what is on SF,
I'm not aware of any potential improvements or any desirable bug
fixes.

> What is the way to contribute ?

If you want to contribute patches or report bugs, please use
sf.net/projects/pyxml. For general discussions, this is the right
place.

> For the moment we have found some fields in the data structure which are
> not up to date.

Can you give details?

> - is there a plan to do some XSLT processor engine ?

Yes, PyXML 0.7 will ship with 4XSLT. Please note that you can get
4XSLT today from www.4suite.org (as part of the 4Suite package).

> -how can we extend the documentation ?

Submit patches.

> for a python beginner, it's quite impossible to find how to :
>     - load a xml document
>     - write a Xml document

Did you read the tutorial? It explains both aspects. Of course,
contributions of documentation are greatly welcome. Please submit them
to SF.

> quite surprised by the speed of the parser, very nice.

Thanks. I assume you are using Expat here, so the glory actually goes
to James Clark and the current maintainers of Expat.

Regards,
Martin

From dmoor@technology.serco.com Fri Oct 5 09:56:09 2001
From: dmoor@technology.serco.com (David Moor)
Date: Fri, 5 Oct 2001 09:56:09 +0100
Subject: [XML-SIG] PyXML Question
Message-ID: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>

> -----Original Message-----
> From: Martin v. Loewis
> Subject: Re: [XML-SIG] PyXML Question
>
> > I have just downloaded PyXML-0.6.6.win32-py2.1.exe and tried to install it
> > because I have some example code which contains:
> >
> > > from xml.dom.html_builder import HtmlBuilder
> > > from xml.dom.walker import Walker
> > > from xml.dom.writer import HtmlWriter
>
> > The script will still not run. I am using WinNT 4 and Python 2.1;
> > do I need to copy the _xmlplus directory contents into the Lib\xml
> > directory?
>
> No, you probably need to port the script to PyXML 0.6. Alternatively,
> you could try to install PyXML 0.5, although this is no longer
> supported.
>
> Regards,
> Martin

Thanks for the help Martin,

I am trying to write a bot to access information from a web site which
requires me to log in. Since I started using Python only a couple of months
ago and have not used PyXML before, I was hoping to use this sample script
as a learning tool, since it was designed to download the author's online
bank statements.

Because this functionality isn't supported any more I think I would be best
learning the new method, which brings me to my next question.

Are there any 'Introduction to PyXML' documents, describing the different
parts and giving examples? I have looked in the xml-howto.txt in /xmldocs,
the section I think I need is 4.5 Processing HTML, which contains 'Intro to
HTML builder' :)

TIA

Dave Moor

This message, including attachments, is intended only for the use by the
person(s) to whom it is addressed. It may contain information which is
privileged and confidential. Copying or use by anybody else is not
authorised.
If you are not the intended recipient, please contact the
sender as soon as possible. The views expressed in this communication may
not necessarily be the views held by Serco Integrated Transport.

From Alexandre.Fayolle@logilab.fr Fri Oct 5 10:32:18 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Fri, 5 Oct 2001 11:32:18 +0200 (CEST)
Subject: [XML-SIG] PyXML Question
In-Reply-To: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>
Message-ID:

On Fri, 5 Oct 2001, David Moor wrote:

> Are there any 'Introduction to PyXML' documents, describing the different
> parts and giving examples? I have looked in the xml-howto.txt in /xmldocs,
> the section I think I need is 4.5 Processing HTML, which contains 'Intro to
> HTML builder' :)

The first thing you may want to note is that it is generally difficult to
map html to xml, and even harder to extract information from the resulting
xml. The reason for this is that html is too often used for presentation,
meaning that you get tons of nested tables in a typical html document,
quite often with badly nested elements, or misquoted attributes.

That said, let's get into solving your problem: the official way of
creating a DOM tree is by using a reader class, such as the
xml.dom.ext.reader.Sax2.Reader class. If what you want is to process html,
you'll want to use xml.dom.ext.reader.HtmlLib.Reader.

The first thing you want to do is build a new reader:

from xml.dom.ext.reader.HtmlLib import Reader
r = Reader()

Then you can use the reader to parse the document. A reader has 3 methods
to achieve this: fromString, fromUri and fromStream (which does the real
work for the other 2). fromString takes a string representation of the
document, fromUri takes a URL or URI string pointing to the document, and
fromStream takes a file-like object. All three methods return a Document.

doc = r.fromUri('http://www.logilab.org/')

This was the easy part. Now you still have to figure out where the
information you need is.
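
For readers without PyXML installed, the three entry points described
above can be approximated with the standard library. This is only a sketch
under assumptions: it swaps the PyXML HtmlLib reader for xml.dom.minidom
(which accepts only well-formed XML, not tag-soup HTML), and the helper
names from_stream, from_string and from_uri are hypothetical, chosen to
mirror fromStream, fromString and fromUri.

```python
import io
import urllib.request
from xml.dom import minidom


def from_stream(stream):
    """Parse a file-like object and return a DOM Document."""
    return minidom.parse(stream)


def from_string(text):
    """Wrap the string in a byte stream and delegate to from_stream."""
    return from_stream(io.BytesIO(text.encode("utf-8")))


def from_uri(uri):
    """Open the URI and delegate to from_stream (needs network access)."""
    with urllib.request.urlopen(uri) as stream:
        return from_stream(stream)


# Stand-in document; a real page would have to be well-formed XHTML.
doc = from_string("<html><body><p>hello</p></body></html>")
print(doc.documentElement.tagName)
```

As in the description above, the stream variant does the real work and the
other two only prepare a stream for it.
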
There is no generic method for locating that information; it all depends
on the document you're processing. I suggest you take a good look at the
DOM Traversal API from the W3C site, and at XPath, both of which can be
nice tools to perform such a task.

Cheers,

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).

From akuchlin@mems-exchange.org Fri Oct 5 14:18:21 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 05 Oct 2001 09:18:21 -0400
Subject: [XML-SIG] Dropping xml.marshal
Message-ID:

The code in the xml.marshal package is out of date, and I've never
heard of anyone using it. Therefore, I suggest it be deleted.
Any objections?

--amk

From fdrake@acm.org Fri Oct 5 14:50:21 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 Oct 2001 09:50:21 -0400
Subject: [XML-SIG] Dropping xml.marshal
In-Reply-To:
References:
Message-ID: <15293.47773.97621.370545@grendel.zope.com>

Andrew Kuchling writes:
 > The code in the xml.marshal package is out of date, and I've never
 > heard of anyone using it. Therefore, I suggest it be deleted.
 > Any objections?

Given the recent discussion on the XML-RPC list, and the availability
of xmlrpclib, I'd say that certainly xml.marshal.xmlrpc can go. I
don't have any opinion on the others, but see no reason to keep them
if they aren't being used.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From noreply@sourceforge.net Fri Oct 5 15:42:40 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 05 Oct 2001 07:42:40 -0700
Subject: [XML-SIG] [ pyxml-Bugs-468299 ] cloneNode does not change ownerElement
Message-ID:

Bugs item #468299, was opened at 2001-10-05 07:42
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=468299&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 6
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Alexandre Fayolle (afayolle)
Summary: cloneNode does not change ownerElement

Initial Comment:
It's a long time since I found a bug in 4DOM!

>>> from xml.dom.ext.reader.Sax2 import Reader
>>> d = Reader().fromString("")
>>> clone = d.documentElement.cloneNode(1)
>>> print d.documentElement
>>> print clone
>>> print clone.attributes[0].ownerElement

This causes Events on attributes not to be properly
propagated if they occur on a cloned branch.

I'll patch this one.

Alexandre

----------------------------------------------------------------------

You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=468299&group_id=6473

From martin@loewis.home.cs.tu-berlin.de Fri Oct 5 19:06:30 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 5 Oct 2001 20:06:30 +0200
Subject: [XML-SIG] Dropping xml.marshal
In-Reply-To: (message from Andrew Kuchling on Fri, 05 Oct 2001 09:18:21 -0400)
References:
Message-ID: <200110051806.f95I6Ue01130@mira.informatik.hu-berlin.de>

> The code in the xml.marshal package is out of date, and I've never
> heard of anyone using it. Therefore, I suggest it be deleted.
> Any objections?

Yes. There have been user contributions to the wddx code, so
apparently some users do care at least about wddx. I'm not so sure that
the other two marshallers have any value, but at least the generic one
needs to stay to support wddx.
So if you want to remove xml-rpc, go ahead.

Regards,
Martin

From martin@loewis.home.cs.tu-berlin.de Fri Oct 5 19:02:19 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 5 Oct 2001 20:02:19 +0200
Subject: [XML-SIG] PyXML Question
In-Reply-To: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com> (message from David Moor on Fri, 5 Oct 2001 09:56:09 +0100)
References: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>
Message-ID: <200110051802.f95I2Ji01127@mira.informatik.hu-berlin.de>

> Are there any 'Introduction to PyXML' documents, describing the
> different parts and giving examples? I have looked in the
> xml-howto.txt in /xmldocs, the section I think I need is 4.5
> Processing HTML, which contains 'Intro to HTML builder' :)

The XML HOWTO is the right starting point. However, that section still
needs to be written/updated/replaced.

You should use a xml.dom.ext.reader.Reader instance, and one of its
from{Stream,Uri,String} methods. Then, the normal DOM operations can be
used on the tree. To write back the result, you should use
xml.dom.ext.XHtmlPrettyPrint.

Note that processing HTML with XML libraries is always risky, as HTML
documents are not XML documents (unless they comply with XHTML);
often, they don't even comply with the HTML DTD. In these cases,
processors can easily get confused.

Regards,
Martin

From pyxml@xhaus.com Sat Oct 6 13:19:56 2001
From: pyxml@xhaus.com (Alan Kennedy)
Date: Sat, 06 Oct 2001 13:19:56 +0100
Subject: [XML-SIG] PyXML Question
References: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com> <200110051802.f95I2Ji01127@mira.informatik.hu-berlin.de>
Message-ID: <3BBEF6EC.25C51FCF@xhaus.com>

"Martin v. Loewis" wrote:

> Note that processing HTML with XML libraries is always risky, as HTML
> documents are not XML documents (unless they comply with XHTML);
> often, they don't even comply with the HTML DTD. In these cases,
> processors can easily get confused.
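
The risk described in the quoted paragraph is easy to reproduce. What
follows is a minimal sketch, assuming the standard library's
xml.dom.minidom as a stand-in for the PyXML readers; is_well_formed is a
hypothetical helper, not part of any library.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, else False."""
    try:
        minidom.parseString(markup)
    except ExpatError:
        return False
    return True


xhtml = "<p>line one<br/>line two</p>"  # every element is closed
html = "<p>line one<br>line two</p>"    # unclosed <br>, legal in HTML 4

print(is_well_formed(xhtml))
print(is_well_formed(html))
```

The second fragment is ordinary HTML, yet an XML parser rejects it because
the br element is never closed.
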

Although I haven't used the Python version, Dave Raggett's excellent Tidy
program will clean up malformed HTML and turn it into XHTML, which should
then be parsable by XML processors.

Marc-Andre Lemburg has provided a python interface to HTML tidy, which is
now a part of the Egenix Experimental Package. You can find it here:-

http://www.lemburg.com/files/python/index.html

My memory of my use of HTML tidy is that it copes very well with most of
the common problems you would encounter processing malformed HTML as XML.
For example, I think it will wrap the content of