From xml-sig@teleo.net Thu Mar 1 02:52:55 2001
From: xml-sig@teleo.net (Patrick Phalen)
Date: Wed, 28 Feb 2001 18:52:55 -0800
Subject: [XML-SIG] DTD design: include categorization, or use RDF?
In-Reply-To: <0102281124320Y.04301@quadra.teleo.net>
References: <0102281124320Y.04301@quadra.teleo.net>
Message-ID: <0102281852551G.04301@quadra.teleo.net>

On Wednesday 28 February 2001 11:24, Patrick Phalen wrote:
> There's now an Open Source TM engine in Python:
> http://ontopia.net/software/tmproc/

oops ... make that

http://www.ontopia.net/software/tmproc/

From loewis@informatik.hu-berlin.de Thu Mar 1 14:22:34 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 1 Mar 2001 15:22:34 +0100 (MET)
Subject: [XML-SIG] Re: Version number question on PyXML 0.6.4
In-Reply-To: <01022521251706.28858@fermi.eeel.nist.gov> (message from Michael McLay on Sun, 25 Feb 2001 21:25:17 -0500)
References: <200102260841.JAA09898@pandora> <01022521251706.28858@fermi.eeel.nist.gov>
Message-ID: <200103011422.PAA04177@pandora.informatik.hu-berlin.de>

> I'm beginning to think someone from the Enlightenment window manager project
> has been given control of the version numbering for PyXML.

I don't know much about Enlightenment, so I can't tell whether this is
applause or criticism - I assume it's the latter...

> Version numbers are arbitrary, but some people will mistakenly read
> the low number on PyXML as an indication of unstable and immature
> software. Based on the improved level of integration of this latest
> release the version number should have at least been bumped to a
> 0.7.0 release number.

For 0.7, I hope to provide XPath support.

> What needs to be added/finished before the number can be bumped to
> 1.0?

If the major components have no well-known and problematic
deficiencies left, I'll call it 1.0. A well-known deficiency are the
Unicode problems, for example.

Regards,
Martin

From stefan.marsiske@sysdata.siemens.hu Thu Mar 1 14:31:19 2001
From: stefan.marsiske@sysdata.siemens.hu (Marsiske Stefan - 3244)
Date: Thu, 1 Mar 2001 15:31:19 +0100
Subject: [XML-SIG] Re: Version number question on PyXML 0.6.4
In-Reply-To: <200103011422.PAA04177@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Thu, Mar 01, 2001 at 03:22:34PM +0100
References: <200102260841.JAA09898@pandora> <01022521251706.28858@fermi.eeel.nist.gov> <200103011422.PAA04177@pandora.informatik.hu-berlin.de>
Message-ID: <20010301153119.C12848@sysdata.siemens.hu>

hi,

On Thu, Mar 01, 2001 at 03:22:34PM +0100, Martin von Loewis wrote:
> > I'm beginning to think someone from the Enlightenment window manager project
> > has been given control of the version numbering for PyXML.
>
> I don't know much about Enlightenment, so I can't tell whether this is
> applause or criticism - I assume it's the latter...

i feel offended here, since i'm involved with E. and i agree totally with
the versioning. since atm we're at our 4th rewrite of the whole app. so the
low version numbering is ok. though maybe for each rewrite we could also
choose a new name, and start over from 0.1. e-0.16.5 could be considered a
major release. but we scrapped that, and started over. once again. always
improving. :)

---end quoted text---

--
Stefan [http://web.interware.hu/stef] UPDATED:001031
quote: "happy(y2k++)"
gpg-key: http://web.interware.hu/stef/gpg.txt

From uche.ogbuji@fourthought.com Thu Mar 1 17:37:38 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 01 Mar 2001 10:37:38 -0700
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
References:
Message-ID: <3A9E88E2.D0E836B5@fourthought.com>

Alexandre Fayolle wrote:
>
> Our dear friend Uche is quoted on
> http://www.xml.com/pub/a/2001/02/14/deviant.html about the
> element.
>
> The article is worth reading, I think.

Actually, I've gone beyond that.
With Clark Evans and other concerned
parties, I've set up a petition against the xsl:script nonsense and
language bindings. Please see

http://uche.ogbuji.net:8000/etc/no-xsl-script.xhtml

I think Python XML users should be worried about the W3C's continual
efforts to enshrine particular languages as first-class XML-processing
environments. It wouldn't be so bad if things such as xsl:script were
not so bloody unnecessary.

--
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python

From mclay@nist.gov Thu Mar 1 05:44:05 2001
From: mclay@nist.gov (Michael McLay)
Date: Thu, 1 Mar 2001 00:44:05 -0500
Subject: [XML-SIG] Re: Version number question on PyXML 0.6.4
In-Reply-To: <20010301153119.C12848@sysdata.siemens.hu>
References: <200102260841.JAA09898@pandora> <200103011422.PAA04177@pandora.informatik.hu-berlin.de> <20010301153119.C12848@sysdata.siemens.hu>
Message-ID: <0103010044050Q.28858@fermi.eeel.nist.gov>

On Thursday 01 March 2001 09:31, Marsiske Stefan - 3244 wrote:
> hi,
>
> On Thu, Mar 01, 2001 at 03:22:34PM +0100, Martin von Loewis wrote:
> > > I'm beginning to think someone from the Enlightenment window manager
> > > project has been given control of the version numbering for PyXML.
> >
> > I don't know much about Enlightenment, so I can't tell whether this is
> > applause or criticism - I assume it's the latter...
>
> i feel offended here, since i'm involved with E. and i agree totally with
> the versioning. since atm we're at our 4th rewrite of the whole app. so the
> low version numbering is ok. though maybe for each rewrite we could also
> choose a new name, and start over from 0.1. e-0.16.5 could be considered a
> major release. but we scrapped that, and started over. once again. always
> improving. :)

No offense was intended.
I used E as an example of a project that has been
very conservative with version numbering increments. Python has been
conservative as well. They finally bumped Python up to 2.0 for marketing
purposes. If anything it should be taken as a compliment. There is nothing
wrong with being conservative about moving to a 1.0 release. I was just
looking for some indication of when 1.0 might happen.

The low version number does have a down side. Many people won't touch code
below a 1.0 or 1.2 release. This may be dumb logic on their part, but it is
reality.

From Alexandre.Fayolle@logilab.fr Thu Mar 1 17:53:02 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Thu, 1 Mar 2001 18:53:02 +0100 (CET)
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
In-Reply-To: <3A9E88E2.D0E836B5@fourthought.com>
Message-ID:

On Thu, 1 Mar 2001, Uche Ogbuji wrote:
>
> Actually, I've gone beyond that. With Clark Evans and other concerned
> parties, I've set up a petition against the xsl:script nonsense and
> language bindings. Please see
>
> http://uche.ogbuji.net:8000/etc/no-xsl-script.xhtml

The text of the petition says:

"7. With [...] recent changes to the DOM specification, it appears that
the W3C strongly favors Java and Javascript over other equally qualified
languages."

Could you please detail this? I'm interested in learning how the DOM can
be language biased.

Alexandre Fayolle
--
http://www.logilab.com
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).

From uche.ogbuji@fourthought.com Thu Mar 1 18:27:49 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 01 Mar 2001 11:27:49 -0700
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
References:
Message-ID: <3A9E94A5.12873CBC@fourthought.com>

Alexandre Fayolle wrote:
>
> On Thu, 1 Mar 2001, Uche Ogbuji wrote:
>
> >
> > Actually, I've gone beyond that.
> > With Clark Evans and other concerned
> > parties, I've set up a petition against the xsl:script nonsense and
> > language bindings. Please see
> >
> > http://uche.ogbuji.net:8000/etc/no-xsl-script.xhtml
>
> The text of the petition says:
>
> "7. With [...] recent changes to the DOM specification, it appears that
> the W3C strongly favors Java and Javascript over other equally qualified
> languages."
>
> Could you please detail this? I'm interested in learning how the DOM can
> be language biased.

Ah. I'm on the spot. Note that the petition is the synthesis of the
entire "gang of eight" that put it together.

But I think the DOM clause is a mistake which I missed on earlier
editing. It probably refers to the inclusion of the Java and ECMA
bindings in level 2, which isn't all that recent, and is not, I think,
as bad an instance of language bias as the XSLT 1.1 language binding
section.

--
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python

From guido@digicool.com Thu Mar 1 18:38:17 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 01 Mar 2001 13:38:17 -0500
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
In-Reply-To: Your message of "Thu, 01 Mar 2001 10:37:38 MST." <3A9E88E2.D0E836B5@fourthought.com>
References: <3A9E88E2.D0E836B5@fourthought.com>
Message-ID: <200103011838.NAA17049@cj20424-a.reston1.va.home.com>

> Alexandre Fayolle wrote:
> >
> > Our dear friend Uche is quoted on
> > http://www.xml.com/pub/a/2001/02/14/deviant.html about the
> > element.
> >
> > The article is worth reading, I think.

Uche:
> Actually, I've gone beyond that. With Clark Evans and other concerned
> parties, I've set up a petition against the xsl:script nonsense and
> language bindings.
> Please see
>
> http://uche.ogbuji.net:8000/etc/no-xsl-script.xhtml
>
> I think Python XML users should be worried about the W3C's continual
> efforts to enshrine particular languages as first-class XML-processing
> environments. It wouldn't be so bad if things such as xsl:script were
> not so bloody unnecessary.

What does our friend Dan Connolly think of all this? He's our secret
ally in the W3C, I believe! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From Alexandre.Fayolle@logilab.fr Thu Mar 1 18:50:30 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Thu, 1 Mar 2001 19:50:30 +0100 (CET)
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
In-Reply-To: <3A9E94A5.12873CBC@fourthought.com>
Message-ID:

On Thu, 1 Mar 2001, Uche Ogbuji wrote:

> Alexandre Fayolle wrote:
>
> > The text of the petition says:
> >
> > "7. With [...] recent changes to the DOM specification, it appears that
> > the W3C strongly favors Java and Javascript over other equally qualified
> > languages."
> >
> > Could you please detail this? I'm interested in learning how the DOM can
> > be language biased.
>
> Ah. I'm on the spot. Note that the petition is the synthesis of the
> entire "gang of eight" that put it together.

I was not accusing you or anything. Just being curious. I remember hearing
you pestering about some stuff in numbering handling (or date handling,
I'm not sure) in XSLT, which was Java biased (and this does not appear in
the petition, as far as I can tell), but could not see what was the thing
with DOM.

Now, as for the bindings, I have to admit that it is one part of the spec
that I have never looked at (I've just checked it 30 seconds ago to see
what it looks like, and I really do not see the point in putting this in
the spec. It brings nothing new, and the IDL is all you need.)

Alexandre Fayolle
--
http://www.logilab.com
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).

From stefan.marsiske@sysdata.siemens.hu Thu Mar 1 17:58:10 2001
From: stefan.marsiske@sysdata.siemens.hu (Marsiske Stefan - 3244)
Date: Thu, 1 Mar 2001 18:58:10 +0100
Subject: [XML-SIG] Re: Version number question on PyXML 0.6.4
In-Reply-To: <0103010044050Q.28858@fermi.eeel.nist.gov>; from mclay@nist.gov on Thu, Mar 01, 2001 at 12:44:05AM -0500
References: <200102260841.JAA09898@pandora> <200103011422.PAA04177@pandora.informatik.hu-berlin.de> <20010301153119.C12848@sysdata.siemens.hu> <0103010044050Q.28858@fermi.eeel.nist.gov>
Message-ID: <20010301185810.F12848@sysdata.siemens.hu>

On Thu, Mar 01, 2001 at 12:44:05AM -0500, Michael McLay wrote:
> On Thursday 01 March 2001 09:31, Marsiske Stefan - 3244 wrote:
> > hi,
> >
> > On Thu, Mar 01, 2001 at 03:22:34PM +0100, Martin von Loewis wrote:
> > > > I'm beginning to think someone from the Enlightenment window manager
> > > > project has been given control of the version numbering for PyXML.
> > >
> > > I don't know much about Enlightenment, so I can't tell whether this is
> > > applause or criticism - I assume it's the latter...
> >
> > i feel offended here, since i'm involved with E. and i agree totally with
> > the versioning. since atm we're at our 4th rewrite of the whole app. so the
> > low version numbering is ok. though maybe for each rewrite we could also
> > choose a new name, and start over from 0.1. e-0.16.5 could be considered a
> > major release. but we scrapped that, and started over. once again. always
> > improving. :)
>
> No offense was intended. I used E as an example of a project that has been
> very conservative with version numbering increments. Python has been
> conservative as well. They finally bumped Python up to 2.0 for marketing
> purposes. If anything it should be taken as a compliment. There is nothing
> wrong with being conservative about moving to a 1.0 release. I was just
> looking for some indication of when 1.0 might happen.
>
> The low version number does have a down side.
> Many people won't touch code
> below a 1.0 or 1.2 release. This may be dumb logic on their part, but it is
> reality.

ok, i'll admit, i wasn't really offended, somebody just needed to defend
E... :)

i agree with you on low (sub 1.0) version numbers, but not in the case of
E. E has quite a big userbase, a long time ago when raster was working for
redhat, it was the default windowmanager for gnome. and most people are
aware that there will never be a 1.0 version of E. although i for example
fear anything that has a version number ending in .0 that's a bad sign.
remember linux-2.2.0? or redhat [567].0? eeek, never, even with a 100 foot
pole...

>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig
>
---end quoted text---

--
Stefan [http://web.interware.hu/stef] UPDATED:001031
quote: "happy(y2k++)"
gpg-key: http://web.interware.hu/stef/gpg.txt

From akuchlin@mems-exchange.org Thu Mar 1 19:07:51 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 1 Mar 2001 14:07:51 -0500
Subject: [XML-SIG] Re: Version number question on PyXML 0.6.4
In-Reply-To: <200103011422.PAA04177@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Thu, Mar 01, 2001 at 03:22:34PM +0100
References: <200102260841.JAA09898@pandora> <01022521251706.28858@fermi.eeel.nist.gov> <200103011422.PAA04177@pandora.informatik.hu-berlin.de>
Message-ID: <20010301140751.B9504@ute.cnri.reston.va.us>

On Thu, Mar 01, 2001 at 03:22:34PM +0100, Martin von Loewis wrote:
>If the major components have no well-known and problematic
>deficiencies left, I'll call it 1.0. A well-known deficiency are the
>Unicode problems, for example.

What problem is that? Is it that if a parser outputs regular strings,
you don't know what encoding they're in?

Regarding version numbers: the PyXML code base is certainly
full-featured enough that it could certainly be called 1.0.
We'd want
to work on bringing the docs back up to date, though; I'm planning to
revise the XML HOWTO next week (after I get the QEL release out).

--amk

From uche.ogbuji@fourthought.com Thu Mar 1 19:12:12 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 01 Mar 2001 12:12:12 -0700
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
References:
Message-ID: <3A9E9F0C.63C14E1C@fourthought.com>

Alexandre Fayolle wrote:
>
> On Thu, 1 Mar 2001, Uche Ogbuji wrote:
>
> > Alexandre Fayolle wrote:
> >
> > > The text of the petition says:
> > >
> > > "7. With [...] recent changes to the DOM specification, it appears that
> > > the W3C strongly favors Java and Javascript over other equally qualified
> > > languages."
> > >
> > > Could you please detail this? I'm interested in learning how the DOM can
> > > be language biased.
> >
> > Ah. I'm on the spot. Note that the petition is the synthesis of the
> > entire "gang of eight" that put it together.
>
> I was not accusing you or anything. Just being curious. I remember hearing
> you pestering about some stuff in numbering handling (or date handling,
> I'm not sure) in XSLT, which was Java biased (and this does not appear in
> the petition, as far as I can tell), but could not see what was the thing
> with DOM.
>
> Now, as for the bindings, I have to admit that it is one part of the spec
> that I have never looked at (I've just checked it 30 seconds ago to see
> what it looks like, and I really do not see the point in putting this in
> the spec. It brings nothing new, and the IDL is all you need.)

All true. You've brought to light that the DOM clause was a mistake.
Oh well.

--
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python

From cce@clarkevans.com Thu Mar 1 21:01:31 2001
From: cce@clarkevans.com (Clark C. Evans)
Date: Thu, 1 Mar 2001 16:01:31 -0500 (EST)
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
In-Reply-To:
Message-ID:

On Thu, 1 Mar 2001, Alexandre Fayolle wrote:
> On Thu, 1 Mar 2001, Uche Ogbuji wrote:
> > Actually, I've gone beyond that. With Clark Evans and other concerned
> > parties, I've set up a petition against the xsl:script nonsense and
> > language bindings. Please see
> >
> > http://uche.ogbuji.net:8000/etc/no-xsl-script.xhtml
>
> The text of the petition says:
>
> "7. With [...] recent changes to the DOM specification, it appears that
> the W3C strongly favors Java and Javascript over other equally qualified
> languages."
>
> Could you please detail this? I'm interested in learning how the DOM can
> be language biased.

I authored this clause and in the back of my head while I was writing
was a message by Mike Champion (perhaps a private one) about the new
working draft having Java specific stuff. I never followed up or
verified the reference. So, when your post was brought to my attention
I freaked out, went scurrying about looking for this Java reference and
didn't find it. Thus, I labeled it as a "bug" and posted to the
xsl-list my apologies for the error. (IMHO, it is better to admit to a
possible error before you are accused of it on a public list even if it
turns out not to be an error). So, since I was the author of this
paragraph, I labeled it as a bug in the draft which was probably a
politic thing to do anyway.

However, just for your edification, Robin Berjon posted the following
to xml-dev regarding the Java litter in the DOM WG recent draft:

> In fact, if you look at the WD for DOM3-Core
> (http://www.w3.org/TR/2001/WD-DOM-Level-3-Core-20010126/core.html) you'll
> see that Java is not at all relegated to an appendix. Section 1.2 is
> *entirely* about Java.
> I'm certain that the intentions behind that section
> are good, and I am aware that it is only a WD but that section has nothing
> to do there and I nevertheless find its presence alarming. Either it ought
> to describe bindings (and in this case, implementation because it's what it
> does) for all languages susceptible of supporting a DOM interface, or it
> should be language independent. A DOMImplementationFactory is probably a
> good idea, describing that interface as part of the DOM is certainly enough.

So, I hope this will help. In any case, Uche was not responsible
for this goof... it was my bad.

Clark

P.S. I look forward to using 4SuiteServer! Sorry this had to be
my first post...

From martin@loewis.home.cs.tu-berlin.de Thu Mar 1 21:46:32 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 1 Mar 2001 22:46:32 +0100
Subject: [XML-SIG] Re: Version number question on PyXML 0.6.4
In-Reply-To: <20010301140751.B9504@ute.cnri.reston.va.us> (message from Andrew Kuchling on Thu, 1 Mar 2001 14:07:51 -0500)
References: <200102260841.JAA09898@pandora> <01022521251706.28858@fermi.eeel.nist.gov> <200103011422.PAA04177@pandora.informatik.hu-berlin.de> <20010301140751.B9504@ute.cnri.reston.va.us>
Message-ID: <200103012146.f21LkWu01187@mira.informatik.hu-berlin.de>

> On Thu, Mar 01, 2001 at 03:22:34PM +0100, Martin von Loewis wrote:
> >If the major components have no well-known and problematic
> >deficiencies left, I'll call it 1.0. A well-known deficiency are the
> >Unicode problems, for example.
>
> What problem is that? Is it that if a parser outputs regular strings,
> you don't know what encoding they're in?

Mainly that, yes. Plus, you cannot tell what kind of string you'll
get, except by trying.

> Regarding version numbers: the PyXML code base is certainly
> full-featured enough that it could certainly be called 1.0.
> We'd want
> to work on bringing the docs back up to date, though; I'm planning to
> revise the XML HOWTO next week (after I get the QEL release out).

Actually, the reference is much more outdated than the howto.
Everything in the reference is probably outdated; everything not
documented in the Python library documentation is probably
undocumented (with the exception of aspects of 4DOM).

Regards,
Martin

From martin@loewis.home.cs.tu-berlin.de Thu Mar 1 21:43:34 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 1 Mar 2001 22:43:34 +0100
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
In-Reply-To: (message from Alexandre Fayolle on Thu, 1 Mar 2001 19:50:30 +0100 (CET))
References:
Message-ID: <200103012143.f21LhYh01164@mira.informatik.hu-berlin.de>

> I was not accusing you or anything. Just being curious. I remember hearing
> you pestering about some stuff in numbering handling (or date handling,
> I'm not sure) in XSLT, which was Java biased (and this does not appear in
> the petition, as far as I can tell), but could not see what was the thing
> with DOM.

For XSLT numbers, the spec indeed defines exactly the same floating
point semantics as used in Java. That is not a bad thing in itself, as
the Java meaning is a variant of IEEE 754 (i.e. selecting specific
options where the spec leaves options).

On the DOM, I notice a number of Java-isms, all of them minor:
- naming conventions. OMG style would be has_feature and
  get_dom_implementation, W3C style is hasFeature and
  getDOMImplementation.
- nesting. IMO, enums should be in module scope; W3C puts them in
  interface scope - presumably since Java does not allow package-level
  constants.

> Now, as for the bindings, I have to admit that it is one part of the spec
> that I have never looked at (I've just checked it 30 seconds ago to see
> what it looks like, and I really do not see the point in putting this in
> the spec. It brings nothing new, and the IDL is all you need.)

You do need it, as it does not follow the CORBA language
mappings. E.g. everything is in a module "dom", but that ends up as
package org.w3c.dom in Java, and xml.dom in Python. Likewise, the
"readonly attribute nodeType" maps to getNodeType() in Java, whereas
the IDL mapping would produce a method named nodeType().

You probably could have done all that by spelling out the mapping
rules instead of providing the mapping result; in Java, it is easier
just to write down the interface definitions.

Regards,
Martin

From frank@quantiva.com Thu Mar 1 22:02:28 2001
From: frank@quantiva.com (Frank Stolze)
Date: Thu, 01 Mar 2001 17:02:28 -0500
Subject: [XML-SIG] [OT - JOB AD] Python / XML / Distributed Systems Developer
Message-ID: <5.0.2.1.0.20010301163601.00a29af0@pop3.norton.antivirus>

Sorry for the off-topic post. We are a well-funded, stealth mode startup in the
network service management field. We are building a novel, distributed system
and service that involves Internet protocol implementations (HTTP, SMTP, POP3,
DNS, ping, etc.), statistical analysis, some AI, database storage and reporting,
as well as issues such as routing, load balancing, fail-over, firewalls, etc.

Almost all of the implementation is being done in Python. We also use XML,
XML-RPC, HTTP tunneling and a few new things. We are looking for two
enthusiastic Python & networking gurus to join a small team of hands-on
people to help us implement a great vision!

The "official" job description is below.


Regards,
Frank

===============
Company Profile:

We're onto something really big in network service management. We have
VC backing and a clear view of the future. Our stealth situation hides our
long history in this area and lets us focus on the goal.
Join the team of
innovative, visionary, enthusiastic, and passionate people that will shape
our future. Share the opportunity to work with industry leading experts on
the design and creation of a next generation system.

Quantiva is looking for people who can quickly grasp new concepts,
develop new, original solutions to existing problems, and in general can
"hit the ground running."

Quantiva is located in Princeton, New Jersey.

Please send resumes to techjobs@quantiva.com.


Job Description:

The Network Software Engineer will contribute to the design and development
of Quantiva's network service management software. This position will involve
active participation in the design, architecture, and implementation of the
product.


Requirements:
· Strong software development skills in a Unix environment.
· Strong network and distributed systems programming experience. Good
  knowledge of network protocols such as TCP/IP, HTTP, DNS, SMTP, POP3 and
  concepts such as firewalls, routing required.
· 4+ years of hands-on experience with scripting languages such as
  Perl, Python or Tcl. Python proficiency is required.
· Applications programming experience in object-oriented languages
  such as C++ or Java.
· DBMS experience including database schemas, SQL, and database
  programming.
· Solid Unix experience (Solaris preferred) in a commercial environment.
· XML experience is a plus.
· BS/MS/PhD in CS, EE, CE or related field.

From frank@quantiva.com Thu Mar 1 22:12:45 2001
From: frank@quantiva.com (Frank Stolze)
Date: Thu, 1 Mar 2001 17:12:45 -0500 (EST)
Subject: [XML-SIG] [OT - JOB AD][REPOST] Python / XML / Distributed Systems Developer
Message-ID:

This time without the HTML nonsense...

Sorry for the off-topic post. We are a well-funded, stealth mode startup in the
network service management field. We are building a novel, distributed system
and service that involves Internet protocol implementations (HTTP, SMTP, POP3,
DNS, ping, etc.), statistical analysis, some AI, database storage and
reporting, as well as issues such as routing, load balancing, fail-over,
firewalls, etc.

Almost all of the implementation is being done in Python. We also use XML,
XML-RPC, HTTP tunneling and a few new things. We are looking for two
enthusiastic Python & networking gurus to join a small team of hands-on
people to help us implement a great vision!

The "official" job description is below.


Regards,
Frank

===============
Company Profile:

We're onto something really big in network service management. We have
VC backing and a clear view of the future. Our stealth situation hides our
long history in this area and lets us focus on the goal. Join the team of
innovative, visionary, enthusiastic, and passionate people that will shape
our future. Share the opportunity to work with industry leading experts on
the design and creation of a next generation system.

Quantiva is looking for people who can quickly grasp new concepts,
develop new, original solutions to existing problems, and in general can
"hit the ground running."

Quantiva is located in Princeton, New Jersey.

Please send resumes to techjobs@quantiva.com.


Job Description:

The Network Software Engineer will contribute to the design and development
of Quantiva's network service management software.
This position will involve
active participation in the design, architecture, and implementation of the
product.


Requirements:
· Strong software development skills in a Unix environment.
· Strong network and distributed systems programming experience. Good
  knowledge of network protocols such as TCP/IP, HTTP, DNS, SMTP, POP3 and
  concepts such as firewalls, routing required.
· 4+ years of hands-on experience with scripting languages such as Perl,
  Python or Tcl. Python proficiency is required.
· Applications programming experience in object-oriented languages such as
  C++ or Java.
· DBMS experience including database schemas, SQL, and database programming.
· Solid Unix experience (Solaris preferred) in a commercial environment.
· XML experience is a plus.
· BS/MS/PhD in CS, EE, CE or related field.

From smith@xml-doc.org Fri Mar 2 04:37:18 2001
From: smith@xml-doc.org (Michael Smith)
Date: 01 Mar 2001 20:37:18 -0800
Subject: [XML-SIG] Maintaining catalogs
In-Reply-To: Andrew Kuchling's message of "Tue, 27 Feb 2001 10:11:19 -0500"
References:
Message-ID:

Andrew Kuchling writes:

> For a project, I'd like to install a DTD on the system and
> automatically add its public identifier to the catalog. Is there a
> standard place to put SGML/XML catalogs on Unix systems?
> /usr/(local)?/lib/sgml? /etc/sgml/?

I'm following up this a little late, so maybe somebody already
pointed you to the SGML/XML part of the proposed Linux Standard
Base (LSB) spec:

http://www.linuxbase.org/spec/gLSB/gLSB/lsbsgml.html

or for specifics on directory structure:

http://www.linuxbase.org/spec/gLSB/gLSB/sgmlr001.html

It's a proposed standard, so current distributions aren't yet
necessarily consistent with it of course.

From akuchlin@mems-exchange.org Fri Mar 2 06:33:25 2001
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Fri, 2 Mar 2001 01:33:25 -0500
Subject: [XML-SIG] ANN: quotation-tools 0.0.1
Message-ID: <200103020633.BAA02006@mira.erols.com>

I've made a first release of quotation-tools, which contains a Python
package for parsing QEL 2.0, and provides additional tools using the
'qel package'. It can be downloaded from the QEL software page at
http://www.amk.ca/qel/software.html.

This is a first release, and the only two tools implemented at this
point are qtformat, for formatting QEL, and qtgrep, for searching
through QEL files. In future releases I want to add more tools,
provide convertors to QEL from other formats, and eventually produce a
GUI editor, but that's some way off.

--amk

From uche.ogbuji@fourthought.com Fri Mar 2 07:56:38 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Fri, 02 Mar 2001 00:56:38 -0700
Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and
In-Reply-To: Message from Guido van Rossum of "Thu, 01 Mar 2001 13:38:17 EST." <200103011838.NAA17049@cj20424-a.reston1.va.home.com>
Message-ID: <200103020756.AAA12971@localhost.localdomain>

> > I think Python XML users should be worried about the W3C's continual
> > efforts to enshrine particular languages as first-class XML-processing
> > environments. It wouldn't be so bad if things such as xsl:script were
> > not so bloody unnecessary.
>
> What does our friend Dan Connolly think of all this? He's our secret
> ally in the W3C, I believe! :-)

Actually, based on my correspondence, I think we have several allies in
the W3C. Dan Brickley is another. Henry Thompson, Schemas WG chair, made
a very early Python binding for his prototype XSV schemas
implementation, and Philippe Le Hégaret, current DOM WG chair, and I have
chatted about making the Python/DOM binding an official annex.

However, I've often noticed that W3C staffers tend to avoid public
jousting with member company reps over matters that might be considered
political.

In good news, though, it looks as if over a hundred people have signed our petition in under 24 hours. That should ring a bell for the W3C. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From scott snyder Sat Mar 3 01:39:59 2001 From: scott snyder (scott snyder) Date: Fri, 02 Mar 2001 19:39:59 CST Subject: [XML-SIG] 0.6.4 problem with reading DOM tree from XML with validation Message-ID: <200103030140.TAA01207@d0sgibnl1.fnal.gov> hi - Reading a DOM tree from XML with validation seems to have broken between 0.6.2 and 0.6.4. For example, if i run the following program: ------------------------------------------------------------- from xml.dom.ext.reader.Sax2 import FromXmlFile f = open ('test.xml', 'w') f.write (""" """) f.close() doc = FromXmlFile ('test.xml', None, 1) print doc ------------------------------------------------------------- with 0.6.4, it runs without error, even though the DTD referred to does not exist. $ python read.py 0.6.2, on the other hand, does give me the error i expect: [sss@karma xmltest]$ python read.py Traceback (innermost last): File "read.py", line 9, in ? doc = FromXmlFile ('test.xml', None, 1) ... (traceback trimmed) ... 
File "xml/dom/ext/reader/Sax2.py", line 240, in fatalError raise exception xml.sax._exceptions.SAXParseException: Unknown:2:50: Couldn't open resource 'NONEXISTENT.dtd' The immediate problem is fixed by this change: *** xml/dom/ext/reader/Sax2.py-orig Tue Feb 20 00:47:40 2001 --- xml/dom/ext/reader/Sax2.py Fri Mar 2 18:29:21 2001 *************** *** 274,279 **** --- 274,281 ---- def __init__(self, validate=0, keepAllWs=0, catName=None, saxHandlerClass=XmlDomGenerator, parser=None): self.parser = parser or (validate and sax2exts.XMLValParserFactory.make_parser()) or sax2exts.XMLParserFactory.make_parser() + if validate: + self.parser.setFeature (saxlib.feature_validation, 1) if catName: #set up the catalog, if there is one from xml.parsers.xmlproc import catalog However, with this change, i run into another bug: $ python read.py Traceback (innermost last): File "read.py", line 9, in ? doc = FromXmlFile ('test.xml', None, 1) File "xml/dom/ext/reader/Sax2.py", line 330, in FromXmlFile saxHandlerClass, parser) File "xml/dom/ext/reader/Sax2.py", line 315, in FromXmlStream return reader.fromStream(stream, ownerDocument) File "xml/dom/ext/reader/Sax2.py", line 301, in fromStream self.parser.parse(s) File "xml/sax/drivers2/drv_xmlproc.py", line 90, in parse parser.read_from(source.getByteStream(), bufsize) TypeError: too many arguments; expected 2, got 3 Pooh. The interfaces for the validating and non-validating parsers are not compatible. Patched thusly: *** xml/parsers/xmlproc/xmlval.py-orig Fri Mar 2 18:26:47 2001 --- xml/parsers/xmlproc/xmlval.py Fri Mar 2 18:26:53 2001 *************** *** 98,105 **** def parseEnd(self): self.parser.parseEnd() ! def read_from(self,file): ! self.parser.read_from(file) def flush(self): self.parser.flush() --- 98,105 ---- def parseEnd(self): self.parser.parseEnd() ! def read_from(self,file,bufsize=16384): ! 
self.parser.read_from(file,bufsize) def flush(self): self.parser.flush() With these changes, the example above works (i.e., gives an error). However, the following program then fails: ---------------------------------------------------------------------- from xml.dom.ext.reader.Sax2 import FromXmlFile f = open ('test2.xml', 'w') f.write (""" """) f.close() f = open ('test2.dtd', 'w') f.write ("\n") f.close () doc = FromXmlFile ('test2.xml', None, 1) print doc ---------------------------------------------------------------------- $ python read2.py Traceback (innermost last): File "read2.py", line 15, in ? doc = FromXmlFile ('test2.xml', None, 1) File "xml/dom/ext/reader/Sax2.py", line 330, in FromXmlFile saxHandlerClass, parser) File "xml/dom/ext/reader/Sax2.py", line 315, in FromXmlStream return reader.fromStream(stream, ownerDocument) File "xml/dom/ext/reader/Sax2.py", line 301, in fromStream self.parser.parse(s) File "xml/sax/drivers2/drv_xmlproc.py", line 90, in parse parser.read_from(source.getByteStream(), bufsize) File "xml/parsers/xmlproc/xmlval.py", line 102, in read_from self.parser.read_from(file,bufsize) File "xml/parsers/xmlproc/xmlutils.py", line 137, in read_from self.feed(buf) File "xml/parsers/xmlproc/xmlutils.py", line 185, in feed self.do_parse() File "xml/parsers/xmlproc/xmlproc.py", line 115, in do_parse self.parse_data() File "xml/parsers/xmlproc/xmlproc.py", line 377, in parse_data self.app.handle_data(self.data,start,end) File "xml/parsers/xmlproc/xmlval.py", line 213, in handle_data self.realapp.handle_ignorable_data(data,start,end) File "xml/sax/drivers2/drv_xmlproc.py", line 355, in handle_ignorable_data self._cont_handler.ignorableWhitespace(data, start, end) # FIXME? 
TypeError: too many arguments; expected 2, got 4 This patch seems to fix this: *** xml/dom/ext/reader/Sax2.py-orig Tue Feb 20 00:47:40 2001 --- xml/dom/ext/reader/Sax2.py Fri Mar 2 18:59:31 2001 *************** *** 199,205 **** self._nodeStack[-1].appendChild(new_element) return ! def ignorableWhitespace(self, chars): """ If 'keepAllWs' permits, add ignorable white-space as a text node. A Document node cannot contain text nodes directly. --- 199,205 ---- self._nodeStack[-1].appendChild(new_element) return ! def ignorableWhitespace(self, chars, start, length): """ If 'keepAllWs' permits, add ignorable white-space as a text node. A Document node cannot contain text nodes directly. *************** *** 207,213 **** for it in the DOM and it must be discarded. """ if self._keepAllWs and self._nodeStack[-1].nodeType != Node.DOCUMENT_NODE: ! self._currText = self._currText + chars return def characters(self, chars): --- 207,213 ---- for it in the DOM and it must be discarded. """ if self._keepAllWs and self._nodeStack[-1].nodeType != Node.DOCUMENT_NODE: ! self._currText = self._currText + chars[start:start+length] return def characters(self, chars): From scott snyder Sat Mar 3 02:01:02 2001 From: scott snyder (scott snyder) Date: Fri, 02 Mar 2001 20:01:02 CST Subject: [XML-SIG] 0.6.4: problems with sax exceptions Message-ID: <200103030201.UAA01583@d0sgibnl1.fnal.gov> hi - I've been having some problems with sax exceptions in 0.6.4, while trying to build DOM trees from XML. Consider this program. It creates an invalid xml file and reads it. The resulting exception is caught and printed. 
--------------------------------------------------------------------- from xml.dom.ext.reader.Sax2 import FromXmlFile from xml.sax import saxlib f = open ('test3.xml', 'w') f.write (""" <""") f.close() try: doc = FromXmlFile ('test3.xml') except saxlib.SAXException, e: print e --------------------------------------------------------------------- However, when i run this, i get [sss@karma xmltest]$ python read3.py Traceback (innermost last): File "read3.py", line 13, in ? print e File "xml/sax/_exceptions.py", line 83, in __str__ sysid = self.getSystemId() File "xml/sax/_exceptions.py", line 79, in getSystemId return self._locator.getSystemId() File "xml/sax/drivers2/drv_xmlproc.py", line 161, in getSystemId return self._parser.get_current_sysid() # FIXME? AttributeError: 'None' object has no attribute 'get_current_sysid' It looks like the objects that get followed to get this information get deleted during the stack unwind. Here's an attempt at a fix: *** xml/sax/_exceptions.py-orig Fri Mar 2 19:43:46 2001 --- xml/sax/_exceptions.py Fri Mar 2 19:43:59 2001 *************** *** 61,74 **** SAXException.__init__(self, msg, exception) self._locator = locator def getColumnNumber(self): """The column number of the end of the text where the exception occurred.""" ! return self._locator.getColumnNumber() def getLineNumber(self): "The line number of the end of the text where the exception occurred." ! return self._locator.getLineNumber() def getPublicId(self): "Get the public identifier of the entity where the exception occurred." --- 61,82 ---- SAXException.__init__(self, msg, exception) self._locator = locator + # We need to cache this stuff at construction time. + # If this exception is thrown, the objects through which we must + # traverse to get this information may be deleted by the time + # it gets caught. 
+ self._systemId = self._locator.getSystemId() + self._colnum = self._locator.getColumnNumber() + self._linenum = self._locator.getLineNumber() + def getColumnNumber(self): """The column number of the end of the text where the exception occurred.""" ! return self._colnum def getLineNumber(self): "The line number of the end of the text where the exception occurred." ! return self._linenum def getPublicId(self): "Get the public identifier of the entity where the exception occurred." *************** *** 76,82 **** def getSystemId(self): "Get the system identifier of the entity where the exception occurred." ! return self._locator.getSystemId() def __str__(self): "Create a string representation of the exception." --- 84,90 ---- def getSystemId(self): "Get the system identifier of the entity where the exception occurred." ! return self._systemId def __str__(self): "Create a string representation of the exception." With this change, the program prints this: $ python read3.py test3.xml:3:1: Premature document end, no root element However, if i switch to using a validating XML parser, then i lose the file name in the exception (this assumes the patches in my last note to make the validating parser actually work are applied). --------------------------------------------------------------------- from xml.dom.ext.reader.Sax2 import FromXmlFile from xml.sax import saxlib f = open ('test3.xml', 'w') f.write (""" <""") f.close() f = open ('test4.dtd', 'w') f.write ("\n") f.close () try: doc = FromXmlFile ('test3.xml', None, 1) except saxlib.SAXException, e: print e --------------------------------------------------------------------- $ python read4.py Unknown:3:1: Premature document end, no root element The following patch seems to fix the problem. 
*** xml/parsers/xmlproc/xmlval.py-orig2 Fri Mar 2 19:55:03 2001 --- xml/parsers/xmlproc/xmlval.py Fri Mar 2 19:55:33 2001 *************** *** 26,31 **** --- 26,32 ---- self.app=Application() self.dtd=CompleteDTD(self.parser) self.val=ValidatingApp(self.dtd,self.parser) + self.current_sysID = "Unknown" self.reset() def parse_resource(self,sysid): *************** *** 99,104 **** --- 100,106 ---- self.parser.parseEnd() def read_from(self,file,bufsize=16384): + self.parser.current_sysID = self.current_sysID self.parser.read_from(file,bufsize) def flush(self): Now, when i run the program, i get $ python read4.py test3.xml:3:1: Premature document end, no root element From scott snyder Sat Mar 3 02:29:41 2001 From: scott snyder (scott snyder) Date: Fri, 02 Mar 2001 20:29:41 CST Subject: [XML-SIG] 0.6.4: another problem with building DOM using validating parser Message-ID: <200103030229.UAA02146@d0sgibnl1.fnal.gov> hi - Here's another problem with building DOM trees from XML with the validating parser with 0.6.4. -------------------------------------------------------------------- from xml.dom.ext.reader.Sax2 import FromXmlFile f = open ('test5.xml', 'w') f.write (""" ]> """) f.close() doc = FromXmlFile ('test5.xml', None, 1) print doc -------------------------------------------------------------------- When i run this: $ python read5.py Traceback (innermost last): File "read5.py", line 14, in ? 
doc = FromXmlFile ('test5.xml', None, 1) File "xml/dom/ext/reader/Sax2.py", line 330, in FromXmlFile saxHandlerClass, parser) File "xml/dom/ext/reader/Sax2.py", line 315, in FromXmlStream return reader.fromStream(stream, ownerDocument) File "xml/dom/ext/reader/Sax2.py", line 301, in fromStream self.parser.parse(s) File "xml/sax/drivers2/drv_xmlproc.py", line 90, in parse parser.read_from(source.getByteStream(), bufsize) File "xml/parsers/xmlproc/xmlval.py", line 104, in read_from self.parser.read_from(file,bufsize) File "xml/parsers/xmlproc/xmlutils.py", line 137, in read_from self.feed(buf) File "xml/parsers/xmlproc/xmlutils.py", line 185, in feed self.do_parse() File "xml/parsers/xmlproc/xmlproc.py", line 104, in do_parse self.parse_doctype() File "xml/parsers/xmlproc/xmlproc.py", line 482, in parse_doctype self.parse_internal_dtd() File "xml/parsers/xmlproc/xmlproc.py", line 532, in parse_internal_dtd self.handle_internal_dtd(line,lb,self.get_region()[:-last_part_size]) File "xml/parsers/xmlproc/xmlproc.py", line 544, in handle_internal_dtd p.feed(int_dtd) File "xml/parsers/xmlproc/xmlutils.py", line 185, in feed self.do_parse() File "xml/parsers/xmlproc/dtdparser.py", line 251, in do_parse self.parse_entity() File "xml/parsers/xmlproc/dtdparser.py", line 341, in parse_entity self.dtd_consumer.new_external_entity(ent_name,pub_id,sys_id,ndata) File "xml/parsers/xmlproc/xmldtd.py", line 151, in new_external_entity self.dtd_listener.new_external_entity(ent_name,pubid,sysid,ndata) File "xml/sax/drivers2/drv_xmlproc.py", line 239, in new_external_entity ndata) TypeError: too many arguments; expected 4, got 5 This seems to work around the problem, though i think it's probably not the correct fix. *** xml/dom/ext/reader/Sax2.py-orig2 Fri Mar 2 20:10:52 2001 --- xml/dom/ext/reader/Sax2.py Fri Mar 2 20:23:31 2001 *************** *** 255,262 **** self._ownerDoc.getDocumentType().getNotations().setNamedItem(new_notation) return ! 
def unparsedEntityDecl (self, publicId, systemId, notationName): ! new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, notationName) self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation) return --- 255,264 ---- self._ownerDoc.getDocumentType().getNotations().setNamedItem(new_notation) return ! def unparsedEntityDecl (self, name, publicId, systemId, ndata): ! if not self._ownerDoc: ! return ! new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, name) self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation) return From martin@loewis.home.cs.tu-berlin.de Sat Mar 3 08:10:01 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 3 Mar 2001 09:10:01 +0100 Subject: [XML-SIG] 0.6.4: problems with sax exceptions In-Reply-To: <200103030201.UAA01583@d0sgibnl1.fnal.gov> (message from scott snyder on Fri, 02 Mar 2001 20:01:02 CST) References: <200103030201.UAA01583@d0sgibnl1.fnal.gov> Message-ID: <200103030810.f238A1h01334@mira.informatik.hu-berlin.de> > It looks like the objects that get followed to get this information > get deleted during the stack unwind. > > Here's an attempt at a fix: Thanks, committed as-is. > *** xml/parsers/xmlproc/xmlval.py-orig2 Fri Mar 2 19:55:03 2001 > --- xml/parsers/xmlproc/xmlval.py Fri Mar 2 19:55:33 2001 > *************** > *** 26,31 **** > --- 26,32 ---- > self.app=Application() > self.dtd=CompleteDTD(self.parser) > self.val=ValidatingApp(self.dtd,self.parser) > + self.current_sysID = "Unknown" > self.reset() > > def parse_resource(self,sysid): > *************** > *** 99,104 **** > --- 100,106 ---- > self.parser.parseEnd() > > def read_from(self,file,bufsize=16384): > + self.parser.current_sysID = self.current_sysID > self.parser.read_from(file,bufsize) > > def flush(self): That did not seem right. Instead, I've used set_sysid throughout. 
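The exception fix accepted earlier in this reply works by copying the locator's values when the exception is constructed, rather than asking the locator when the exception is printed. A minimal sketch of that idea in present-day Python follows. FakeLocator and CachingParseError are hypothetical names used only for illustration; they are not PyXML classes, and the teardown is simulated.

```python
class FakeLocator:
    """Hypothetical stand-in for a SAX locator backed by a live parser."""

    def __init__(self, system_id, line, column):
        self._parser = {"sysid": system_id, "line": line, "column": column}

    def getSystemId(self):
        return self._parser["sysid"]

    def getLineNumber(self):
        return self._parser["line"]

    def getColumnNumber(self):
        return self._parser["column"]

    def close(self):
        # Simulates the parser being torn down while the stack unwinds;
        # after this call every getter above would fail.
        self._parser = None


class CachingParseError(Exception):
    """Copies position information at construction time."""

    def __init__(self, msg, locator):
        super().__init__(msg)
        self._msg = msg
        # Cache now: the locator may be unusable by the time we are caught.
        self._system_id = locator.getSystemId()
        self._line = locator.getLineNumber()
        self._column = locator.getColumnNumber()

    def __str__(self):
        return "%s:%d:%d: %s" % (
            self._system_id, self._line, self._column, self._msg)


def describe_failure():
    """Raise, tear the locator down, then format the caught exception."""
    locator = FakeLocator("test3.xml", 3, 1)
    try:
        try:
            raise CachingParseError(
                "Premature document end, no root element", locator)
        finally:
            locator.close()
    except CachingParseError as exc:
        return str(exc)
```

Because the values are copied in the constructor, formatting the exception no longer depends on how long the parser objects stay alive.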
Regards, Martin From martin@loewis.home.cs.tu-berlin.de Sat Mar 3 08:08:27 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 3 Mar 2001 09:08:27 +0100 Subject: [XML-SIG] 0.6.4 problem with reading DOM tree from XML with validation In-Reply-To: <200103030140.TAA01207@d0sgibnl1.fnal.gov> (message from scott snyder on Fri, 02 Mar 2001 19:39:59 CST) References: <200103030140.TAA01207@d0sgibnl1.fnal.gov> Message-ID: <200103030808.f2388Rh01332@mira.informatik.hu-berlin.de> Hi Scott, Thanks for your comments and patches, they are quite helpful. > *** xml/dom/ext/reader/Sax2.py-orig Tue Feb 20 00:47:40 2001 > --- xml/dom/ext/reader/Sax2.py Fri Mar 2 18:29:21 2001 > *************** > *** 274,279 **** > --- 274,281 ---- > def __init__(self, validate=0, keepAllWs=0, catName=None, > saxHandlerClass=XmlDomGenerator, parser=None): > self.parser = parser or (validate and sax2exts.XMLValParserFactory.make_parser()) or sax2exts.XMLParserFactory.make_parser() > + if validate: > + self.parser.setFeature (saxlib.feature_validation, 1) > if catName: > #set up the catalog, if there is one > from xml.parsers.xmlproc import catalog I think the bug is actually in the XMLValParserFactory, which should return a validating parser (with validation turned on). > *** xml/parsers/xmlproc/xmlval.py-orig Fri Mar 2 18:26:47 2001 > --- xml/parsers/xmlproc/xmlval.py Fri Mar 2 18:26:53 2001 > *************** > *** 98,105 **** > def parseEnd(self): > self.parser.parseEnd() > > ! def read_from(self,file): > ! self.parser.read_from(file) > > def flush(self): > self.parser.flush() > --- 98,105 ---- > def parseEnd(self): > self.parser.parseEnd() > > ! def read_from(self,file,bufsize=16384): > ! self.parser.read_from(file,bufsize) > > def flush(self): > self.parser.flush() I've committed this as-is.
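The read_from change quoted above gives the validating wrapper the same optional buffer-size argument as the parser it wraps, so a driver can call either one the same way. A small sketch of that delegation pattern in present-day Python is below. InnerParser and ValidatingWrapper are hypothetical stand-ins, not the xmlproc classes, and the chunked read loop is an assumption about what such a method typically does.

```python
import io


class InnerParser:
    """Hypothetical stand-in for the wrapped, non-validating parser."""

    def __init__(self):
        self.chunks = []

    def feed(self, data):
        # A real parser would tokenize here; the sketch only records input.
        self.chunks.append(data)

    def read_from(self, fileobj, bufsize=16384):
        # Read the stream in pieces of at most bufsize and feed each piece.
        while True:
            buf = fileobj.read(bufsize)
            if not buf:
                break
            self.feed(buf)


class ValidatingWrapper:
    """Hypothetical wrapper that forwards to an inner parser."""

    def __init__(self):
        self.parser = InnerParser()

    def read_from(self, fileobj, bufsize=16384):
        # Same signature as the wrapped parser, so callers that pass a
        # buffer size work with either object.
        self.parser.read_from(fileobj, bufsize)


def chunk_sizes(text, bufsize):
    """Feed text through the wrapper and report the size of each chunk."""
    wrapper = ValidatingWrapper()
    wrapper.read_from(io.StringIO(text), bufsize)
    return [len(chunk) for chunk in wrapper.parser.chunks]
```

Keeping the default value identical in both signatures means a call without a buffer size behaves the same on the wrapper and on the inner parser.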
More later, Martin From larsga@garshol.priv.no Sun Mar 4 12:32:14 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 04 Mar 2001 13:32:14 +0100 Subject: [XML-SIG] Re: [4suite] Article about XSLT 1.1 and In-Reply-To: References: Message-ID: * Alexandre Fayolle | | I'm interested in learning how the DOM can be language biased. The DOM is not biased towards any particular language, but it does have a strong bias towards a particular family of languages: mainstream statically typed object-oriented languages. This bias is, of course, more or less inherited from IDL. The further away you are from that core family of languages the more painful you'll find implementing and using the DOM, since its design will follow a philosophy increasingly distant from that of your language. In Python, a mainstream object-oriented language, the pain is not too great, even though it can be felt. In Common Lisp, an object-oriented language, it would be felt more strongly. In Haskell, a functional programming language, the DOM is better ignored. Ditto for Prolog, Forth and many other languages. To put it another way the question is not how the DOM can be biased towards a particular language, but more how it could possibly avoid such a bias. --Lars M. From martin@loewis.home.cs.tu-berlin.de Sun Mar 4 22:15:36 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 4 Mar 2001 23:15:36 +0100 Subject: [XML-SIG] 0.6.4 problem with reading DOM tree from XML with validation In-Reply-To: <200103030140.TAA01207@d0sgibnl1.fnal.gov> (message from scott snyder on Fri, 02 Mar 2001 19:39:59 CST) References: <200103030140.TAA01207@d0sgibnl1.fnal.gov> Message-ID: <200103042215.f24MFa902951@mira.informatik.hu-berlin.de> > File "xml/sax/drivers2/drv_xmlproc.py", line 355, in handle_ignorable_data > self._cont_handler.ignorableWhitespace(data, start, end) # FIXME? > TypeError: too many arguments; expected 2, got 4 > > > This patch seems to fix this: Thanks for the report. 
The patch is incorrect: The official SAX2 interface (in xml.sax.handler) is that ignorableWhitespace gets a single data argument, so the bug was actually in drv_xmlproc. I've installed an appropriate fix. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Sun Mar 4 22:26:42 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 4 Mar 2001 23:26:42 +0100 Subject: [XML-SIG] 0.6.4: another problem with building DOM using validating parser In-Reply-To: <200103030229.UAA02146@d0sgibnl1.fnal.gov> (message from scott snyder on Fri, 02 Mar 2001 20:29:41 CST) References: <200103030229.UAA02146@d0sgibnl1.fnal.gov> Message-ID: <200103042226.f24MQgP03017@mira.informatik.hu-berlin.de> > from xml.dom.ext.reader.Sax2 import FromXmlFile > > f = open ('test5.xml', 'w') > f.write (""" > > > ]> > > > """) > f.close() > > doc = FromXmlFile ('test5.xml', None, 1) > > print doc [...] > ! def unparsedEntityDecl (self, publicId, systemId, notationName): > ! new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, notationName) > self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation) > return I'm glad that others are as confused about the matter as I am. What you have in your document is not an unparsed entity, but an external one - the unparsed ones have an NDATA notation name. xmlproc detected that properly (by setting ndata to ""), but drv_xmlproc expected None as the ndata. So I changed it to invoke externalEntityDecl in that case, which is not handled by Sax2. As you found, *if* this was ever invoked, _ownerDoc will be None (since the document element has not been seen yet). Instead of ignoring the unparsed entity, it would be better to put it into _orphanedNodes; I've changed it thus. In the process, I found that things are put into _orphanedNodes which are later not processed - I've fixed that too. I still think that the unparsedEntityDecl callback is completely broken.
What are getFactory and getEntities? Also, if there is a feature for creating entities, it is surely part of a 4DOM extension - probably on the document type. However, that apparently is not capable of distinguishing between external and unparsed entities; not sure whether it should. In any case, I've applied the following patch. I'd appreciate it if somebody at Fourthought could take a look. Regards, Martin Index: xml/dom/ext/reader/Sax2.py =================================================================== RCS file: /cvsroot/pyxml/xml/xml/dom/ext/reader/Sax2.py,v retrieving revision 1.7 diff -u -r1.7 Sax2.py --- xml/dom/ext/reader/Sax2.py 2001/02/20 01:00:03 1.7 +++ xml/dom/ext/reader/Sax2.py 2001/03/04 22:05:59 @@ -8,7 +8,7 @@ Components for reading XML files from a SAX2 producer. WWW: http://4suite.com/4DOM e-mail: support@4suite.com -Copyright (c) 2000 Fourthought Inc, USA. All Rights Reserved. +Copyright (c) 2000, 2001 Fourthought Inc, USA. All Rights Reserved. See http://4suite.com/COPYRIGHT for license and copyright information """ @@ -148,6 +148,10 @@ self._ownerDoc.appendChild(comment) elif o_node[0] == 'doctype': before_doctype = 0 + elif o_node[0] == 'unparsedentitydecl': + apply(self.unparsedEntityDecl, o_node[1:]) + else: + raise "Unknown orphaned node:"+o_node[0] self._rootNode = self._ownerDoc self._nodeStack.append(self._rootNode) return @@ -222,7 +226,7 @@ def startDTD(self, doctype, publicID, systemID): if not self._rootNode: self._dt = implementation.createDocumentType(doctype, publicID, systemID) - self._orphanedNodes.append(('doctype')) + self._orphanedNodes.append(('doctype',)) else: raise 'Illegal DocType declaration' return @@ -255,9 +259,12 @@ self._ownerDoc.getDocumentType().getNotations().setNamedItem(new_notation) return - def unparsedEntityDecl (self, publicId, systemId, notationName): - new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, notationName) -
self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation) + def unparsedEntityDecl (self, name, publicId, systemId, ndata): + if self._ownerDoc: + new_notation = self._ownerDoc.getFactory().createEntity(self._ownerDoc, publicId, systemId, name) + self._ownerDoc.getDocumentType().getEntities().setNamedItem(new_notation) + else: + self._orphanedNodes.append(('unparsedentitydecl', name, publicId, systemId, ndata)) return #Overridden ErrorHandler methods From larsga@garshol.priv.no Mon Mar 5 09:44:34 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 05 Mar 2001 10:44:34 +0100 Subject: [XML-SIG] 0.6.4: another problem with building DOM using validating parser In-Reply-To: <200103042226.f24MQgP03017@mira.informatik.hu-berlin.de> References: <200103030229.UAA02146@d0sgibnl1.fnal.gov> <200103042226.f24MQgP03017@mira.informatik.hu-berlin.de> Message-ID: * Martin v. Loewis | | I'm glad that others are as confused about the matter as I am. What | you have in your document is not an unparsed entity, but an external | one - the unparsed ones have an NDATA notation name. xmlproc detected | that properly (by setting ndata to ""), but drv_xmlproc expected None | as the ndata. So I changed it to invoke externalEntityDecl in that | case, which is not handled by Sax2. Whoops. Please note that xmlproc should report None rather than "". This is one of the fixes either waiting in my CVS tree or lost in my disk crash. So the driver was correct, and xmlproc incorrect. --Lars M. From crawford@goingware.com Mon Mar 5 08:52:19 2001 From: crawford@goingware.com (Michael D.
Crawford) Date: Mon, 05 Mar 2001 05:22:19 -0330 Subject: [XML-SIG] Web App Testing article at LinuxQuality Message-ID: <3AA353C3.935C1510@goingware.com> Tonight I posted: Use Validators and Load Generators to Test Your Web Applications http://linuxquality.sunsite.dk/articles/webapptesting/ The article generally promotes the idea that one should use validators to ensure that the pages produced by a web application conform to W3C standards. I also talk about stress testing with load generators and bring up the idea of combining the two to check for document corruption from a server under heavy stress. I mention PyXML and the Python XML Sig in the section "XML Validators": http://linuxquality.sunsite.dk/articles/webapptesting/validators.html#xml where I suggest that if a web application generates XHTML rather than HTML, one can make use of one of the many available XML software packages for validating and processing one's documents. I don't say a lot specifically about PyXML (although I do say I've used it and it's good) - is there anything I should add? Do you have any comments on any part of the page? The Linux Quality Database at http://linuxquality.sunsite.dk/ has the dual purpose of promoting better quality in Free and Open Source Software programs by publishing articles like this one, and the eventual development of an easy-to-use but powerful bug database to ease widespread public quality assurance of the Linux kernel. Any articles you might like to submit yourself are appreciated. Also appreciated are helpful hands to contribute to the database project. Regards, Mike Crawford -- Michael D. Crawford GoingWare Inc. - Expert Software Development and Consulting http://www.goingware.com crawford@goingware.com Tilting at Windmills for a Better Tomorrow. 
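As a rough illustration of the point in the message above that XHTML output can be checked with ordinary XML tooling, here is a minimal sketch using only the standard xml.sax module of present-day Python. It is not taken from the article or from PyXML; the function name is hypothetical, and the stock parser checks well-formedness only, so it does not validate against a DTD.

```python
import xml.sax
from xml.sax.handler import ContentHandler


def well_formedness_error(document):
    """Return None if the bytes parse as XML, else 'line:column: message'.

    This checks well-formedness only; it is not DTD validation.
    """
    try:
        # ContentHandler() ignores every event; only parse errors matter.
        xml.sax.parseString(document, ContentHandler())
    except xml.sax.SAXParseException as exc:
        return "%s:%s: %s" % (
            exc.getLineNumber(), exc.getColumnNumber(), exc.getMessage())
    return None


GOOD_PAGE = b"<html><body><p>ok</p></body></html>"
BAD_PAGE = b"<html><body><p>unclosed paragraph</body></html>"
```

Running a check like this over pages captured while a load generator is active is one way to look for the document corruption under stress that the article mentions.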
From Alexandre.Fayolle@logilab.fr Mon Mar 5 10:57:40 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Mon, 5 Mar 2001 11:57:40 +0100 (CET) Subject: [XML-SIG] [ANN] PyPaSax Message-ID: I'm releasing today a utility we use at Logilab for documenting the code of Narval. We call it pypasax. It uses the parser module to extract information about classes and methods and generates an XML tree from this. We are working on XSLT to generate XMI files so that we can import the data into some UML tool such as ArgoUML. More information can be found at http://www.logilab.org/pypasax/ Alexandre Fayolle -- http://www.logilab.com Narval is the first software agent available as free software (GPL). LOGILAB, Paris (France). From nyenyec@mailbox.hu Mon Mar 5 14:29:54 2001 From: nyenyec@mailbox.hu (Nyenyec) Date: 5 Mar 2001 14:29:54 -0000 Subject: [XML-SIG] Missing DOCTYPE when pretty printing Message-ID: <20010305142954.16107.qmail@netfinity2.mailbox.hu> Hi, I am trying to pretty-print an XML file using the XML package v0.6.2. My problem is with the doctype node. I'm writing a web.xml (Java Servlet config) file and it has a DOCTYPE like this: The problematic code in xml.dom.ext.Printer.py is in the PrintVisitor class: def visitDocumentType(self, node): if node.systemId != '': ################### WHY????? self.__emptyReturn = 0 self.stream.write("') return It seems that if the DOCTYPE node does not have a SYSTEM id, it will not be printed at all? Is this deliberate or is this a bug? What is the simplest workaround? Is there another (simpler) way to pretty print XML files? Thanks, Nyenyec
From akuchlin@mems-exchange.org Mon Mar 5 14:04:28 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 5 Mar 2001 09:04:28 -0500 Subject: [XML-SIG] Missing DOCTYPE when pretty printing In-Reply-To: <20010305142954.16107.qmail@netfinity2.mailbox.hu>; from nyenyec@mailbox.hu on Mon, Mar 05, 2001 at 02:29:54PM -0000 References: <20010305142954.16107.qmail@netfinity2.mailbox.hu> Message-ID: <20010305090428.A27565@newcnri.cnri.reston.va.us> On Mon, Mar 05, 2001 at 02:29:54PM -0000, Nyenyec wrote: >It seems that if the DOCTYPE node does not have a SYSTEM id, it will not >be printed at all? Is this deliberate or is this a bug? Likely to be deliberate, because I don't think you can have a DOCTYPE without a system ID; the two choices are: Is this actually causing a problem for you? Is pubId present while systemId is "" or None? If so, what parser are you using, and what does your file's declaration look like? --amk From xml-sig@thewrittenword.com Mon Mar 5 16:51:42 2001 From: xml-sig@thewrittenword.com (xml-sig@thewrittenword.com) Date: Mon, 5 Mar 2001 10:51:42 -0600 Subject: [XML-SIG] Patch to 0.6.4 to add --with-libexpat and --ldflags Message-ID: <20010305105142.A23879@postal.il.thewrittenword.com> Patch to add command-line arguments --with-libexpat=PATH to specify the location of the libexpat include/lib files and --ldflags=STR to add arbitrary linker flags when building the resulting object file (on some systems, you need to set the runtime path to the libexpat shared library).
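The option handling described above can be sketched in present-day Python as a small function that separates the two custom options from everything else, so that the remaining arguments can be handed on unchanged. The function name and the example paths are hypothetical, and this is an illustration of the idea rather than the patch itself.

```python
def extract_build_options(argv):
    """Split --with-libexpat=PATH and --ldflags=STR out of argv.

    Returns (libexpat_prefix, ldflags, remaining_args).
    """
    libexpat = None
    ldflags = []
    remaining = []
    for arg in argv:
        if arg.startswith("--with-libexpat="):
            # Everything after the first '=' is the install prefix.
            libexpat = arg.split("=", 1)[1]
        elif arg.startswith("--ldflags="):
            # The value is a whitespace-separated list of linker flags.
            ldflags = arg.split("=", 1)[1].split()
        else:
            remaining.append(arg)
    return libexpat, ldflags, remaining


EXAMPLE_ARGV = [
    "setup.py", "build",
    "--with-libexpat=/opt/expat",
    "--ldflags=-R/opt/expat/lib -Wl,-rpath=/opt/expat/lib",
]
```

Splitting on the first '=' only is a deliberate variation in this sketch: it keeps a linker flag that itself contains '=' intact instead of cutting it short.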
-- albert chin (china@thewrittenword.com) -- snip snip --- setup.py.orig Thu Mar 1 11:45:51 2001 +++ setup.py Thu Mar 1 12:01:45 2001 @@ -35,6 +35,19 @@ def xml(s): return "_xmlplus"+s +# special command-line arguments +LIBEXPAT = None +LDFLAGS = [] + +args = sys.argv[:] +for arg in args: + if string.find(arg, '--with-libexpat=') == 0: + LIBEXPAT = string.split(arg, '=')[1] + sys.argv.remove(arg) + elif string.find(arg, '--ldflags=') == 0: + LDFLAGS = string.split(string.split(arg, '=')[1]) + sys.argv.remove(arg) + def should_build_pyexpat(): try: import pyexpat @@ -56,6 +69,9 @@ return 0 def get_expat_prefix(): + if LIBEXPAT: + return LIBEXPAT + for p in ("/usr", "/usr/local"): incs = os.path.join(p, "include") libs = os.path.join(p, "lib") @@ -100,6 +116,7 @@ include_dirs=include_dirs, library_dirs=library_dirs, libraries=libraries, + extra_link_args=LDFLAGS, sources=sources )) From martin@loewis.home.cs.tu-berlin.de Mon Mar 5 22:47:45 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 5 Mar 2001 23:47:45 +0100 Subject: [XML-SIG] Missing DOCTYPE when pretty printing In-Reply-To: <20010305142954.16107.qmail@netfinity2.mailbox.hu> (nyenyec@mailbox.hu) References: <20010305142954.16107.qmail@netfinity2.mailbox.hu> Message-ID: <200103052247.f25Mljc00873@mira.informatik.hu-berlin.de> > It seems that if the DOCTYPE node does not have a SYSTEM id, it will not > be printed at all? Is this deliberate or is this a bug? That's a bug. > What is the simplest workaround? Update to 0.6.4, which has this bug fixed. > Is there another (simpler) way to pretty print XML files? Depends on what you've got. If it is a DOM tree, then the answer is probably "no". You could traverse it yourself, but it won't be simpler. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Mon Mar 5 22:56:22 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Mon, 5 Mar 2001 23:56:22 +0100 Subject: [XML-SIG] Missing DOCTYPE when pretty printing In-Reply-To: <20010305090428.A27565@newcnri.cnri.reston.va.us> (message from Andrew Kuchling on Mon, 5 Mar 2001 09:04:28 -0500) References: <20010305142954.16107.qmail@netfinity2.mailbox.hu> <20010305090428.A27565@newcnri.cnri.reston.va.us> Message-ID: <200103052256.f25MuMw00895@mira.informatik.hu-berlin.de> > Likely to be deliberate, because I don't think you can have a DOCTYPE > without a system ID Why is that? If the doctype only consists of an internal subset, then there is nothing wrong with not having a system id, e.g. as in ]> > > I think the name of the root element is required. The syntax of doctypedecl is [28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' (markupdecl | PEReference | S)* ']' S?)? '>' where ExternalId is [75] ExternalID ::= 'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral So you can't have a public ID without a system ID; you certainly can have neither. A related question: Is it well-formed to have neither system id nor internal subset, i.e. If well-formed, can that ever appear in a valid document? Regards, Martin From jeremy.kloth@fourthought.com Tue Mar 6 00:01:23 2001 From: jeremy.kloth@fourthought.com (Jeremy Kloth) Date: Mon, 05 Mar 2001 17:01:23 -0700 Subject: [XML-SIG] Missing DOCTYPE when pretty printing References: <20010305142954.16107.qmail@netfinity2.mailbox.hu> <20010305090428.A27565@newcnri.cnri.reston.va.us> <200103052256.f25MuMw00895@mira.informatik.hu-berlin.de> Message-ID: <3AA428D3.CDEA60D9@fourthought.com> "Martin v. Loewis" wrote: > > > Likely to be deliberate, because I don't think you can have a DOCTYPE > > without a system ID > > Why is that? If the doctype only consists of an internal subset, then > there is nothing wrong with not having a system id, e.g. as in > > > > ]> > > > > > > > I think the name of the root element is required. The syntax of > doctypedecl is > > [28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? > ('[' (markupdecl | PEReference | S)* ']' S?)?
'>' > > where ExternalId is > > [75] ExternalID ::= 'SYSTEM' S SystemLiteral | > 'PUBLIC' S PubidLiteral S SystemLiteral > > So you can't have a public ID without a system ID; you certainly can > have neither. > > A related question: Is it well-formed to have neither system id nor > internal subset, i.e. > > > > If well-formed, can that ever appear in a valid document? According to the doctypedecl, the answer would be yes. The ExternalID is optional and so is the internal subset. Both are followed by a question mark. -- Jeremy Kloth Consultant jeremy.kloth@fourthought.com (303)583-9900 x 105 Fourthought, Inc. http://www.fourthought.com Software-engineering, knowledge-management, XML, CORBA, Linux, Python From martin@loewis.home.cs.tu-berlin.de Tue Mar 6 06:48:17 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 6 Mar 2001 07:48:17 +0100 Subject: [XML-SIG] Missing DOCTYPE when pretty printing In-Reply-To: <3AA428D3.CDEA60D9@fourthought.com> (message from Jeremy Kloth on Mon, 05 Mar 2001 17:01:23 -0700) References: <20010305142954.16107.qmail@netfinity2.mailbox.hu> <20010305090428.A27565@newcnri.cnri.reston.va.us> <200103052256.f25MuMw00895@mira.informatik.hu-berlin.de> <3AA428D3.CDEA60D9@fourthought.com> Message-ID: <200103060648.f266mHD00831@mira.informatik.hu-berlin.de> > According to the doctypedecl, the answer would be yes. The ExternalID > is optional and so is the internal subset. Both are followed by a > question mark. So it would be well-formed, yes. However, it would seem that this gives a document type with no element definitions. In turn, any document using that doctype will be invalid - even the root element is undefined. Regards, Martin From mda@discerning.com Tue Mar 6 22:47:49 2001 From: mda@discerning.com (Mark D. Anderson) Date: Tue, 6 Mar 2001 14:47:49 -0800 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc.
Message-ID: <027601c0a68f$7abee980$9200a8c0@mdaxke> this morning i decided to try out python for an xml hack, rather than my tried-and-true perl. well, that was this morning, and now it is the afternoon.... i do have tmproc working now (my first goal), but it was heavy going there because of the lack of road signs for the new person (new to python and its xml tools, but experienced with xml and other programming languages). so this is me just letting off steam (i know there is equal or greater chaos in the current state of perl xml tools, but i already know that chaos). in particular, what is the relationship between: - the saxlib available from http://www.garshol.priv.no/download/software/saxlib/ - the xml core package that comes with python 2.x - the _xmlplus package that comes with the pyxml package from the xml-sig at sourceforge i can't find any explanation accessible from various top-level pages: http://pyxml.sourceforge.net/topics/ http://www.python.org/sigs/xml-sig/ http://www.python.org/sigs/xml-sig/status.html http://www.python.org/doc/howto/xml/ . nor do any of the three packages above seem to have any obvious mention of the other two. nor can i find an "xml and python faq", though surely this issue is an example of such a faq. another would be: "will old python programs written against sax1 work with the latest pyxml?" i did find a long, confusing, and inconclusive email thread several months ago on python-dev http://mail.python.org/pipermail/python-dev/2000-September/009369.html i've also looked at the ugly hack in xml/__init__.py for loading _xmlplus, though i still don't know what the difference is between the packages. btw, http://mail.python.org/mailman/listinfo/xml-sig has dead links to http://mail.python.org/sigs/xml-sig/status.html and http://mail.python.org/sigs/xml-sig/links.html btw also, is it expected that the pyxml win32 installer for 2.0 not work with the python 2.1 beta? when i ran the installer, it didn't even find the 2.1 installation.
if binary packages are obsoleted by dot revisions in the core, it is going to be painful for everyone. btw again, another faq should be how urllib deals with win32 drive letters. it barfs on things like "c:/tmp/myfile.xml" which is inconvenient but understandable, because there is no such thing as a "c" scheme. using the "|" convention works: "c|/tmp/myfile.xml". it works with "file:c:/tmp/myfile.xml" and "file:c|/tmp/myfile.xml". the strings file:///c|/tmp/myfile.xml and file://c|/tmp/myfile.xml fail but file:/c|/tmp/myfile.xml works. AFAIK this all differs slightly from java and from rfc1738. -mda From martin@loewis.home.cs.tu-berlin.de Wed Mar 7 07:16:39 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 7 Mar 2001 08:16:39 +0100 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. In-Reply-To: <027601c0a68f$7abee980$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> Message-ID: <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> > i do have tmproc working now (my first goal), but it was heavy going > there because of the lack of road signs for the new person (new to > python and its xml tools, but experienced with xml and other > programming languages). Sorry for the confusion. Please notice that you are a "rare case"; most people complaining about bad documentation are familiar with Python but new to XML, so they need to understand terms like "parser", "event-driven", "tree-based", etc. > in particular, what is the relationship between: > - the saxlib available from http://www.garshol.priv.no/download/software/saxlib/ > - the xml core package that comes with python 2.x > - the _xmlplus package that comes with the pyxml package from the xml-sig at sourceforge As you can see from the "last release" date on the saxlib page, this package is quite outdated. It has been incorporated in PyXML in the past, and is known today as "Python SAX version 1". 
Today, the preferred SAX API is SAX2, which is included in Python 2 and PyXML (PyXML continues to provide the SAX1 interfaces). In addition to the API spec, there are a number of SAX drivers in each package. The saxlib has the SAX1 drivers, Python 2 only has an Expat SAX2 driver, and PyXML has SAX1 and SAX2 drivers (in the latter category, only Expat and xmlproc). PyXML is meant as a strict superset of the Python 2 XML offerings; in all aspects that are present in Python 2, PyXML should behave identically (as far as possible and reasonable). > i can't find any explanation accessible from various top-level pages: > http://pyxml.sourceforge.net/topics/ > http://www.python.org/sigs/xml-sig/ > http://www.python.org/sigs/xml-sig/status.html > http://www.python.org/doc/howto/xml/ . > nor do any of the three packages above seem to have any obvious > mention of the other two. In the README of PyXML itself, you'll notice that saxlib 1.0 is included. The relationship with Python 2 should be documented better; thanks for pointing that out. > nor can i find an "xml and python faq", though surely this issue is > an example of such a faq. So far, people have been using the tutorial, and API documentation. I couldn't say that any specific question is asked frequently - this is the first time that your question has come up on this list. > another would be: "will old python programs written against sax1 > work with the latest pyxml?" Yes; people find out by trying. There is at least one minor incompatibility: In Python 2, SAX drivers may produce Unicode strings, which old applications may not expect. > i've also looked at the ugly hack in xml/__init__.py for loading > _xmlplus, though i still don't know what the difference is between > the packages. That hack is needed to provide the "strict superset" relationship between PyXML and Python 2. It allows you to think of PyXML in terms of "from xml.sax import ...", instead of "from _xmlplus.sax import ...".
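The effect of that hack can be sketched roughly as follows. This is only an illustration of the idea, not the actual contents of xml/__init__.py: the module names `corepkg` and `_corepkgplus` are hypothetical stand-ins for `xml` and `_xmlplus`, `install_override` is an invented helper, and the version threshold is made up for the example.

```python
import sys
import types


def install_override(core_name, plus_name, minimum_version):
    """Make imports of core_name resolve to plus_name if it is new enough.

    Hypothetical helper: core_name stands in for "xml" and plus_name
    for "_xmlplus". Returns True if the override was installed.
    """
    plus = sys.modules.get(plus_name)
    if plus is None:
        return False  # add-on package is not available
    version = getattr(plus, "version_info", None)
    if version is None or version < minimum_version:
        return False  # add-on is too old, keep the core package
    # Later "import core_name" statements now receive the add-on module.
    sys.modules[core_name] = plus
    return True


# Demonstration with two in-memory modules instead of real packages.
core = types.ModuleType("corepkg")
core.origin = "core"
plus = types.ModuleType("_corepkgplus")
plus.origin = "plus"
plus.version_info = (0, 6, 4)
sys.modules["corepkg"] = core
sys.modules["_corepkgplus"] = plus

installed = install_override("corepkg", "_corepkgplus", (0, 6, 1))
import corepkg  # resolved through sys.modules, so this is the add-on

print(installed, corepkg.origin)
```

Because the substitution happens in `sys.modules`, user code keeps importing the core name and never needs to mention the add-on package.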
If PyXML is installed on top of Python 1.5.2, it will call its package directory "xml". > btw also, is it expected that the pyxml win32 installer for 2.0 not > work with the python 2.1 beta? Yes, binary modules will need recompilation - the extension modules contain references to "python20.dll", and hell breaks loose if you load conflicting python.dll into the same process (and try to access them from the same interpreter). > when i ran the installer, it didn't even find the 2.1 installation. That is intentional, yes. To use PyXML with Python 2.1b1, you'll need to compile it yourself from sources; that requires a VC++ installation. > if binary packages are obsoleted by dot revisions in the core, it is > going to be painful for everyone. Unfortunately, that is a specific form of "DLL hell"; there is not much that can be done about it except guaranteeing that conflicting things are not used together - the installer refusing to install the package anywhere else is one aspect of that. > btw again, another faq should be how urllib deals with win32 drive letters. > > it barfs on things like "c:/tmp/myfile.xml" which is inconvenient > but understandable, because there is no such thing Likely, there should be, yes - but there appears to be no expert that can say for sure what the "right way" is. In any case, you'll need to pass URLs to urllib, and as system identifiers to XML libraries. On Unix, passing file names should "work" in most cases; on Windows, things are a bit more complicated. If you can give a consistent story of how things *should* work, I'll start a FAQ list (since your message is the third instance of this question during this year - which makes it frequent :-). Out of curiosity: how do you interpret RFC 1738 with regard to drive letters? I.e. what is the URL referring to C:\autoexec.bat?
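One possible reading, offered only as a sketch: leave the host part empty and treat the drive letter as the first path segment, which gives a three-slash URL. The helper name `windows_path_to_file_url` is hypothetical, and the code assumes a present-day Python 3 with `urllib.parse`, not the urllib discussed in this thread.

```python
from urllib.parse import quote


def windows_path_to_file_url(path):
    """Hypothetical helper: map an absolute drive-letter path to a file: URL.

    Assumes the empty-host convention (file:///) with the drive letter
    as the first path segment; this is one reading of RFC 1738, not the
    only one.
    """
    drive, rest = path[:2], path[2:]
    if len(drive) != 2 or not drive[0].isalpha() or drive[1] != ":":
        raise ValueError("expected a path such as C:\\dir\\file")
    # Backslashes become the URL hierarchy separator.
    rest = rest.replace("\\", "/")
    # quote() leaves "/" alone and percent-encodes unsafe characters.
    return "file:///" + drive + quote(rest)


print(windows_path_to_file_url(r"C:\autoexec.bat"))
```

The string manipulation is done by hand so the result does not depend on the platform the code runs on.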
Regards, Martin From nobody@usw-sf-web2.sourceforge.net Wed Mar 7 17:15:55 2001 From: nobody@usw-sf-web2.sourceforge.net (nobody) Date: Wed, 07 Mar 2001 09:15:55 -0800 Subject: [XML-SIG] [ pyxml-Patches-406732 ] --with-libexpat and --ldflags options Message-ID: Patches #406732, was updated on 2001-03-07 09:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=406732&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: The Written Word (china) Assigned to: Nobody/Anonymous Summary: --with-libexpat and --ldflags options Initial Comment: Patch to add command-line arguments --with-libexpat=PATH to specify location of libexpat include/lib files and --ldflags=STR to add arbitrary linker flags to build the resulting object file (on some systems, need to set runtime path to the libexpat shared library). ftp://ftp.thewrittenword.com/outgoing/pub/PyXML-0.6.4.patch ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=406732&group_id=6473 From mda@discerning.com Wed Mar 7 17:53:01 2001 From: mda@discerning.com (Mark D. Anderson) Date: Wed, 7 Mar 2001 09:53:01 -0800 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> Message-ID: <001101c0a72f$761f2cf0$9200a8c0@mdaxke> > Sorry for the confusion. Please notice that you are a "rare case"; i've heard that before :). > As you can see from the "last release" date on the saxlib page, this > package is quite outdated. It has been incorporated in PyXML in the > past, and is known today as "Python SAX version 1". it'd be nice if lars updated his page to note this. though it is old, there are still quite a few links pointing to his page, not to pyxml or python 2. > In addition to the API spec, there is a number of SAX drivers in each > package. 
The saxlib has the SAX1 drivers, Python 2 only has a Expat > SAX2 driver, and PyXML has SAX1 and SAX2 drivers (in the latter > category, only Expat and xmlproc). another useful faq somewhere would be about expat. this is actually a PITA for the perl world too right now -- apache links in expat (optional, but not if dav is linked in), and then XML::Parser pulls in another expat, and probably both are different from the latest one, things start crashing. they all used to be statically linked but can now use a separate package direct from the sourceforge expat project. (I'm one of the unfortunate few who actually understands all this, and i'm the first to admit i haven't done my part in writing it up). so for python, suppose you wanted to upgrade to the latest sourceforge expat. is that possible? is the expat dll relied upon by any core python modules? does pyexpat make any changes relative to the SF expat distribution? what happens if you want to use mod_py and an apache with expat linked in? or worse yet, suppose you wanted to link mod_py and mod_perl and mod_dav into apache? > PyXML is meant as a strict superset of the Python 2 XML offerings; in > all aspects that are present in Python 2, PyXML should behave > identical (as far as possible and reasonable). is this situation going to remain indefinitely? does this mean that any other "foo" sig who produces something part of python core is going to have to do a similarly ugly python/_fooplus ? > In the README of PyXML itself, you'll notice that saxlib 1.0 is > included. true, though generally people like to know what is in a package before they download it... > > if binary packages are obsoleted by dot revisions in the core, it is > > going to be painful for everyone. 
> > Unfortunately, that is a specific form of "DLL hell"; there is not > much that can be done about it except guaranteeing that conflicting > things are not used together - the installer refusing to install the > package anywhere else is one aspect of that. well, 2.1 doesn't *have* to call its dll python21.dll after all, why do we all have msvc42.dll on our windows boxes? that is just one choice about how to force incompatibility. obviously someone chose to make all the 2.1 betas and alphas share a dll name. what win32 perl is currently doing is a perl56.dll. that would seem similar to what python is doing today, except that: 1. perl changes its second version digit far less often. that one dll works with all but the earliest activestate 600 series, spanning well over a year. if python is going to be doing a dot rev every 3 months, things will be painful. 2. python distutils is not very close yet to the power and convenience of perl's ppm or "perl -MCPAN -e shell" so upgrading binary packages over the net is harder. 3. perl has a more sophisticated import search facility than python's, which attempts to pick the highest version of a module which is applicable, for lib directories structured a certain way, making it possible to have a single lib directory shared among multiple perls. > > btw again, another faq should be how urllib deals with win32 drive letters. i'll start a new thread on this. -mda From mda@discerning.com Wed Mar 7 18:25:56 2001 From: mda@discerning.com (Mark D. Anderson) Date: Wed, 7 Mar 2001 10:25:56 -0800 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> Message-ID: <003b01c0a735$b2b78210$9200a8c0@mdaxke> (was: "saxlib, xml, _xmlplus, etc.") Martin v. Loewis says: > Likely, there should be, yes - but there appears to be no expert that > can say for sure what the "right way" is. true enough. a lot has happened since rfc1738. 
>In any case, you'll need to > pass URLs to urllib, and as system identifiers to XML libraries. On > Unix, passing file names should "work" in most cases; on Windows, > things are a bit more complicated. and unfortunately often the effort to make the unix case "work" makes the windows case work less often. i've had the same difficulty with various java tools. they check for a leading slash or a "^\w:" match to determine whether the string which is passed in is a uri or a host path. > If you can give a consistent story of how things *should* work, I'll > start a FAQ list (since your message is the third instance of this > question during this year - which makes it frequent :-). Out of > curiosity: how do you interpret RFC 1738 with regard to drive letters? > I.e. what is the URL referring to C:\autoexec.bat? it really is a morass. here are some notes which mostly just serve to clarify how awful it is.... rfc 1738 states: A file URL takes the form: file://<host>/<path> where <host> is the fully qualified domain name of the system on which the <path> is accessible, and <path> is a hierarchical directory path of the form <directory>/<directory>/.../<name>. [...] As a special case, <host> can be the string "localhost" or the empty string; this is interpreted as `the machine from which the URL is being interpreted'. So this would mean that if localhost is implied, all file urls should have (at least) three slashes. Assuming that the rfc means that the "/" is purely syntactic, what you should expect to work is: file:////etc/passwd (4 slashes, because of the leading "/") file:///c:\autoexec.bat file:///\\drv\autoexec.bat file://///drv/autoexec.bat (5 slashes, since forward slashes work on win32 too) but: - there is sometimes the convention (not rfc that i know of) of allowing "|" for ":" - there is sometimes the convention (not rfc that i know of) of allowing file: without the 3 slashes - most software gives unhelpful errors if someone attempts to specify a host in the file url - relative urls (i.e.
without a scheme; see rfcs 1808 and 2396) complicate matters; in particular they indicate that absolute urls are signaled with a leading slash, suggesting "/c:/autoexec.bat", which rarely works in any software. - existing software usually treats the url "/" before the path to be part of the path, using 3 slashes, not 4 and in particular most url libraries return the leading slash in their path() function, and CGI variables like SCRIPT_PATH usually do too. It seems clear though that the original intent was for the path to not include the url syntactic separator, and in fact the rfc for NFS urls (rfc2224) makes this explicit personally, what i'd suggest is: 1. RFC-compliant urls must be handled. 2. Any code which attempts to accept a string which may be either a url or a local path, should be as flexible on win32 as unix. That is, if the code accepts "/etc/passwd", it should also accept "c:/autoexec.bat", even though "c:" might be mistaken as a url scheme. there is zero chance of a single-letter url scheme being standardized, and anyway it actually isn't ambiguous because win32 paths are never of the form "c://", so the double slash can distinguish things. 3. When not introducing conflicts with current standards or other platforms, software should match the defacto behavior of internet explorer when parsing file: urls. 4. URL libraries must at least document what they choose to return as the path for the strings file:, http://localhost, file:/, http://localhost/ Today, python urllib is not doing any of these, rejecting file:///c:\autoexec.bat and c:/autoexec.bat -mda From martin@loewis.home.cs.tu-berlin.de Wed Mar 7 21:47:58 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 7 Mar 2001 22:47:58 +0100 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. 
In-Reply-To: <001101c0a72f$761f2cf0$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <001101c0a72f$761f2cf0$9200a8c0@mdaxke> Message-ID: <200103072147.f27Llwm01178@mira.informatik.hu-berlin.de> > another useful faq somewhere would be about expat. this is actually > a PITA for the perl world too right now -- apache links in expat > (optional, but not if dav is linked in), and then XML::Parser pulls > in another expat, and probably both are different from the latest > one, things start crashing. they all used to be statically linked > but can now use a separate package direct from the sourceforge expat > project. (I'm one of the unfortunate few who actually understands > all this, and i'm the first to admit i haven't done my part in > writing it up). In Python, on Unix, multiple and different copies of expat are a problem only if you have one statically linked into Python; PyXML will then refuse to install. If multiple extension modules link expat, those are opened with RTLD_LOCAL, so they won't interfere with each other. Problems will occur once people decide that building expat as a shared library is a good idea; at a minimum, you need different sonames for them. Don't know what the status on Windows is - expat *is* typically a DLL there, so that is a bit more tricky; the expat maintainers better start to put a version number into the DLL name. > so for python, suppose you wanted to upgrade to the latest > sourceforge expat. is that possible? is the expat dll relied upon > by any core python modules? The pyexpat module (pyexpat.pyd) shipped with BeOpen Python 2.0 relies on the expat DLLs (multiple!). The pyexpat.pyd shipped with the PyXML binary distribution has expat linked statically, so it won't care about any expat DLLs. If PyXML is installed, the pyexpat shipped with Python won't be used anymore (unless you explicitly request it - it is not overwritten). 
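To see which expat a given interpreter actually ended up with, something like the following can help. It is a sketch that assumes a Python whose `pyexpat` module exposes `EXPAT_VERSION` and `version_info`; the interpreters discussed in this thread may not provide both.

```python
import pyexpat

# EXPAT_VERSION is a string naming the expat library in use;
# version_info is the same information as a tuple of integers.
expat_name = pyexpat.EXPAT_VERSION
expat_tuple = pyexpat.version_info

# A pyexpat built as a separate extension module has a __file__;
# one compiled into the interpreter itself may not.
location = getattr(pyexpat, "__file__", "(built into the interpreter)")

print("expat library:", expat_name)
print("version tuple:", expat_tuple)
print("pyexpat loaded from:", location)
```

Printing the module location shows whether the copy shipped with Python or a separately installed one was picked up.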
> does pyexpat make any changes relative to the SF expat distribution? Not sure what the question means. pyexpat.c can now use features of multiple expat versions (although you can't distinguish 1.1, 1.2, and 1.95.1 programmatically - in the expat CVS, there is a version #define now). The expat version being used is used unmodified, except perhaps for the build procedure: in PyXML, we have expat 1.2 incorporated, so we build it ourselves. There are actually a few changes compared to expat 1.2, e.g. to not use C++-style comments; the stock version will work fine as well. > what happens if you want to use mod_py and an apache with expat > linked in? or worse yet, suppose you wanted to link mod_py and > mod_perl and mod_dav into apache? On Unix, nothing bad will happen. On Windows, it would be best to use the PyXML build process: link it statically. Or, if building from sources, use the same expat version to build all of them. > > PyXML is meant as a strict superset of the Python 2 XML offerings; in > > all aspects that are present in Python 2, PyXML should behave > > identical (as far as possible and reasonable). > > is this situation going to remain indefinitely? I'm not planning for eternity. For the foreseeable future, yes. > does this mean that any other "foo" sig who produces something part > of python core is going to have to do a similarly ugly > python/_fooplus ? No. Normally, you cannot replace a module from the standard library. For the XML package, there was a special exception. It is only ugly when you look at it; normally, you don't need to be concerned with it. > > Unfortunately, that is a specific form of "DLL hell"; there is not > > much that can be done about it except guaranteeing that conflicting > > things are not used together - the installer refusing to install the > > package anywhere else is one aspect of that. > > well, 2.1 doesn't *have* to call its dll python21.dll > after all, why do we all have msvc42.dll on our windows boxes?
Because Microsoft has frozen the API of msvc42.dll. Actually, this library is only needed for old applications; new applications link with msvcp60.dll (or msvcp60d.dll or msvc60u.dll or ...). You must rename the library if you change the API - even if it is a change to a function "that nobody uses". Such a change happened for 2.1 - the function that creates frame objects takes two additional parameters (lists of cell objects). > obviously someone chose to make all the 2.1 betas and alphas share a > dll name. Yes, that might cause binary incompatibilities - if you have built programs against the betas, you need to rebuild once the final release happens. > 1. perl changes its second version digit far less often. That is simply not true. Over a period of eight years, there were 6 Python releases (I think); only three of them resulting in Win32 DLLs. python15.dll lasted 2.8 years or so (since January 1998). > if python is going to be doing a dot rev every 3 months, things will > be painful. It won't. > 2. python distutils is not very close yet to the power and > convenience of perl's ppm or "perl -MCPAN -e shell" so upgrading > binary packages over the net is harder. Distutils is rather the equivalent of makefile.pl, not of CPAN; I agree that upgrading is harder. > 3. perl has a more sophisticated import search facility than python's, > which attempts to pick the highest version of a module which is > applicable, for lib directories structured a certain way, making it > possible to have a single lib directory shared among multiple perls. Won't this break terribly if the ABI has changed (or even if just a different set of options was used to build the previous release)? I always tell perl installations not to look for old versions of the modules; every time I use CPAN, it essentially reinstalls my entire system (as far as perl modules go). It might be better, but it isn't perfect, either...
Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Mar 7 22:15:17 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 7 Mar 2001 23:15:17 +0100 Subject: [XML-SIG] file urls in urllib In-Reply-To: <003b01c0a735$b2b78210$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <003b01c0a735$b2b78210$9200a8c0@mdaxke> Message-ID: <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> > rfc 1738 states: > > A file URL takes the form: > file://<host>/<path> > where <host> is the fully qualified domain name of the system on > which the <path> is accessible, and <path> is a hierarchical > directory path of the form <directory>/<directory>/.../<name>. > [...] > As a special case, <host> can be the string "localhost" or the empty > string; this is interpreted as `the machine from which the URL is > being interpreted'. > > > So this would mean that if localhost is implied, all file urls should have (at least) three slashes. > Assuming that the rfc means that the "/" is purely syntactic, what you should expect to work is: > file:////etc/passwd (4 slashes, because of the leading "/") > file:///c:\autoexec.bat > file:///\\drv\autoexec.bat > file://///drv/autoexec.bat (5 slashes, since forward slashes work on win32 too) That clearly is not the intention of the RFC. It "essentially" says that <path> is a slash-separated list of directories, forming a hierarchy; ie. the intention is that it does not start with a slash. So /etc/passwd clearly is file:///etc/passwd It then gives the example of a VMS file name DISK$USER:[MY.NOTES]NOTE123456.TXT, saying that it might become (*) file://vms.host.edu/disk$user/my/notes/note12345.txt. So the intention clearly is that hierarchy is presented using /. Apparently, translation between a file name and a <path> is meant to be executed in a system-dependent manner, but many systems failed to define a procedure for doing so.
Considering that one needs to distinguish the drv case, the logical form would be file://C:/autoexec.bat Regards, Martin (*) The 'might' probably refers to the fact that the URL introduces vms.host.edu, which was not mentioned before. From mclay@nist.gov Wed Mar 7 22:44:11 2001 From: mclay@nist.gov (Michael McLay) Date: Wed, 7 Mar 2001 17:44:11 -0500 Subject: [XML-SIG] file urls in urllib In-Reply-To: <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> Message-ID: <0103071744111X.28858@fermi.eeel.nist.gov> On Wednesday 07 March 2001 17:15, Martin v. Loewis wrote: > > rfc 1738 states: > > translation between a file name and a <path> is meant to be executed > in a system-dependent manner, but many systems failed to define a > procedure for doing so. Considering that one needs to distinguish the > drv case, the logical form would be > > file://C:/autoexec.bat This mapping skips a slash for the hostname. I'm using a commercial tool, XML Authority, that is written in Java. It maps the local file: C:/windows/command.com to: file:///C:/windows/command.com This looks consistent with the example mapping of a VMS logical drive in RFC 1738: For example, a VMS file DISK$USER:[MY.NOTES]NOTE123456.TXT might become <URL:file://vms.host.edu/disk$user/my/notes/note12345.txt> From mda@discerning.com Thu Mar 8 00:31:31 2001 From: mda@discerning.com (Mark D. Anderson) Date: Wed, 7 Mar 2001 16:31:31 -0800 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <001101c0a72f$761f2cf0$9200a8c0@mdaxke> <200103072147.f27Llwm01178@mira.informatik.hu-berlin.de> Message-ID: <00f501c0a767$1fd75650$9200a8c0@mdaxke> > You must rename the library if you change the API - even if it is a > change to a function "that nobody uses".
Such a change happened for > 2.1 - the function that creates frame objects takes two additional > parameters (lists of cell objects). true enough. i didn't know that changing function signatures was "allowed" in dot revs of python (versus only adding functions, possibly with "Ext" or "2" at the end...). > > 1. perl changes its second version digit far less often. > > That is simply not true. Over a period of eight years, there where 6 > Python releases (I think); only three of them resulting in Win32 DLLs. > python15.dll lasted 2.8 years or so (since January 1998). i meant perl changing the version digit in the dll in such a way as to invalidate existing binary modules. my perl56.dll has worked with binary modules built with other perls, and i have upgraded my perl56.dll repeatedly with different activestate releases. in retrospect, i'm not actually sure there is much different here between python and perl in binary compatibility. it is just that python is bringing out 2.1 shortly after 2.0, while the perl world has been effectively frozen for a year or so while the powers that be contemplate perl6. > > 3. perl has a more sophisticated import search facility than python's, > > which attempts to pick the highest version of a module which is > > applicable, for lib directories structured a certain way, making it > > possible to have a single lib directory shared among multiple perls. > > Won't this break terribly if the ABI has changed (or even if just a > different set of options was used to build the previous release)? I > always tell perl installations not to look for old versions of the > modules; everytime I use CPAN, it essentially reinstalls my entire > system (as far as perl modules go). It might be better, but it isn't > perfect, either... on unix, perl embeds the perl version in the site_perl hierarchy, so that multiple perl installations can share that same hierarchy. 
then the import search path is initialized appropriately in any perl using that site_perl so it only "sees" the site_perl branches that match its version. windows perl doesn't do that, and it should. historically activestate has horked up the install directory structure in various ways deviating from the unix one; it seems to have gotten more similar over the years. -mda From tpassin@home.com Thu Mar 8 00:48:53 2001 From: tpassin@home.com (Thomas B. Passin) Date: Wed, 7 Mar 2001 19:48:53 -0500 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <003b01c0a735$b2b78210$9200a8c0@mdaxke> Message-ID: <008a01c0a769$8d6cdb20$7cac1218@reston1.va.home.com> Mark D. Anderson writes about file: urls. Mark, here is a copy of a message I posted last month on this tricky subject. I've been hoping to get some agreement on the usage so we can start building it in. I'm glad you brought it up. Cheers, Tom P ================================================== This file: business is trickier than it seems, because the RFC is ambiguous for file: urls. A pipe character isn't in the rfc at all even though it's used by some of the browsers. I strongly suggest that when a local file is intended, that one should use the file: scheme. That way, the application doesn't have to guess and it won't try a spurious url if the file isn't found. The way it's done in this example is just asking for continuous trouble, as I guess we're seeing now. I think we should come to an agreement with the maintainer of the urllib about the allowed forms for file: schemes. It's mainly on Windows (and, perhaps, Macs) that there would be a problem. My preferred forms are these, for a file at d:\temp\python\thefile.xml - 1) file:///d:/temp/python/thefile.xml 2) file:///d:\temp\python\thefile.xml Both of these comply fully with the rfc. 
2) is an "opaque" form - no further parsing would be done by the url processor, it would just pass it to the os. 1) is what you get according to the rfc when you want the url processor to be able to parse out the path parts. The processor is supposed to know to replace slashes by backslashes if appropriate for the os. Either 1) or 2) would also work for files on a network file system, if you put the host name in there - file://host/temp/python/thefile.xml 1) would be more portable, and is my preference. The processor should be able to handle both, however. For backwards compatibility, form 3) should also be accepted, I suppose: 3) file:d:\temp\python\thefile.xml This could be negotiated, though. Let's agree on this and get it working right! From mda@discerning.com Thu Mar 8 00:47:13 2001 From: mda@discerning.com (Mark D. Anderson) Date: Wed, 7 Mar 2001 16:47:13 -0800 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> Message-ID: <010901c0a769$50e0d350$9200a8c0@mdaxke> > > file://C:/autoexec.bat > > This mapping skips a slash for the hostname. I'm using a commercial tool, XML > Authority, that is written in Java. It maps the local file: > C:/windows/command.com > to: > file:///C:/windows/command.com > This looks consistent with the example mapping of a VMS logical drive in RFC > 1738: >... exactly. if file:///C:/autoexec.bat is correct then file:////etc/passwd should be, regardless of current practice. as i mentioned, this is clarified in the direction of the slash being a syntactic separator only in the nfs url rfc2224. however, i do realize current practice is for the 3-slash version for unix-style paths. -mda From tpassin@home.com Thu Mar 8 01:37:46 2001 From: tpassin@home.com (Thomas B. 
Passin)
Date: Wed, 7 Mar 2001 20:37:46 -0500
Subject: [XML-SIG] file urls in urllib
References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke>
Message-ID: <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com>

Mark D. Anderson wrote -

> > > file://C:/autoexec.bat
> >
> > This mapping skips a slash for the hostname. I'm using a commercial tool, XML
> > Authority, that is written in Java. It maps the local file:
> > C:/windows/command.com
> > to:
> > file:///C:/windows/command.com
> > This looks consistent with the example mapping of a VMS logical drive in RFC
> > 1738:
> >...
>
> exactly. if file:///C:/autoexec.bat is correct then file:////etc/passwd should be, regardless
> of current practice. as i mentioned, this is clarified in the direction of the slash being
> a syntactic separator only in the nfs url rfc2224.
>
> however, i do realize current practice is for the 3-slash version for unix-style paths.
>

The triple slash really comes from an abbreviation. The basic form is

scheme://host/path-on-host

For the file: scheme, the host is supposed to be localhost (for your own machine), or the name of a network host if you want to refer to a file on a network file system. You are allowed to replace "localhost" by an empty string, so you have either

file://localhost/path-on-local-machine

or

file:///path-on-local-machine

So far, so good. The problem comes in when you ask what is the path for Windows? You could use an opaque path, wherein the entire path is not to be parsed by the url handler.
This should give you file:urls like this, which is completely compatible with both the old and the new rfc: 1) file:///c:\temp\file_url.txt Or you could use the parsable form, which uses forward slashes, which is also compatible with the rfcs: 2) file:///c:/temp/file_url.txt The rfcs don't allow a form with no slashes after the scheme's colon. But it's common enough that it might be worthwhile to support it anyway. Double or quadruple slashes should be disallowed. To see this, just imagine that you restore the "localhost" host name, or some other network host name. It just doesn't work unless you have three slashes. My recommendation is to allow both 1) and 2), and also possibly (needs more discussion) to allow the form file:c:\temp\file_url.txt Cheers, Tom P From mda@discerning.com Wed Mar 7 12:22:33 2001 From: mda@discerning.com (Mark D. Anderson) Date: Wed, 7 Mar 2001 04:22:33 -0800 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> Message-ID: <012301c0a701$4aa61c60$9200a8c0@mdaxke> i'm definitely getting academic here, because i think the appropriate handling for windows file: urls is fairly clear, and they are not handled properly by urllib, while the handling by urllib of unix-style paths, while not what i consider "right thing", is what everyone else does. but.... suppose we agree that file:///c:/autoexec.bat should work (this is the case of a collapsed localhost). then the processing model is that if a url starts with file:/// then remove that prefix, and consider the remainder (because /c:/autoexec.bat is not a proper local file). ok, now do that to file:///etc/passwd and you get etc/passwd. 
so that means a parser has to look at c:/autoexec.bat and etc/passwd and conclude that because the first segment looks like a drive letter, it is ok, while etc/passwd needs a leading slash. if the host slash separator were treated as purely a separator, then this heuristic would not be necessary.

i think it is fair to say that rfc1738 is ambiguous since they only give a vms example. but nfs urls are defined clearly to match my "cleaner" notion of purely lexical url processing, as per http://www.faqs.org/rfcs/rfc2396.html :

   Note that the initial "/" that introduces the of an NFS URL must not be passed to the server for multi-component lookup since the pathname is to be evaluated relative to the public filehandle directory. For example, if the public filehandle is associated with the server's directory "/a/b/c" then the URL:

      nfs://server/d/e/f

   will be evaluated with a multi-component lookup of the path "d/e/f" relative to the server's directory "/a/b/c" while the URL:

      nfs://server//a/b/c/d/e/f

   will locate the same file with an absolute multi-component lookup of the path "/a/b/c/d/e/f" relative to the server's filesystem root. Notice that a double slash is required at the beginning of the path.

but wait, it gets worse. we'd like certain functions to "just work" and handle either a url or a local host path -- this is certainly what we'd like when we specify an xml source on a command line. if so, then we'd also like to sometimes specify *relative urls* in some of those same circumstances. and guess what? relative urls have no leading scheme and therefore are lexically indistinguishable from some local host paths.

so in that case, if a processor sees etc/passwd, it should *not* add a leading slash, since it is relative to either current working directory or the current url base, whichever you like, and should instead look at /usr/etc/passwd or whatever.
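
The relative case described above can be illustrated with the standard library. This is only a sketch, assuming a modern Python 3 where urljoin lives in urllib.parse (in the Python 2.x current in this thread it was in the urlparse module). The base URL file:///usr/index.xml and the helper name resolve are made up for the illustration and are not taken from any message in the thread.

```python
from urllib.parse import urljoin


def resolve(base, reference):
    """Resolve a possibly relative reference against a base URL.

    Hypothetical helper for illustration; it only wraps urljoin.
    """
    return urljoin(base, reference)


# Assumed base URL, chosen so the result matches the /usr/etc/passwd example.
BASE = "file:///usr/index.xml"

# A reference without a leading slash stays relative to the base directory.
print(resolve(BASE, "etc/passwd"))    # file:///usr/etc/passwd

# A reference with a leading slash replaces the whole path of the base.
print(resolve(BASE, "/etc/passwd"))   # file:///etc/passwd
```

Note that this only covers the case where a base URL is known; it does not settle whether a bare string on a command line should be read as a file name or as a relative URL.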
so if we'd like to follow the non-rfc convention that a file:foobar url is allowed, without the net_loc part of the url, then we should say that file:etc/passwd is a relative url while file:/etc/passwd is absolute. regardless, i think the policy should be independent of the operating system of the server. that is, the url file:///c:/autoexec.bat should look for the file c:/autoexec.bat on unix systems as well. It should be a purely lexical operation. this is incidentally one of the annoying features of the rfc for imap urls, where in their infinite wisdom they did not designate a standardized hierarchy separator, nor even a url parameter to indicate one -- it is entirely up to the server to interpret. this means no url processing library can do anything with an imap url by itself. -mda From martin@loewis.home.cs.tu-berlin.de Thu Mar 8 07:04:42 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 8 Mar 2001 08:04:42 +0100 Subject: [XML-SIG] file urls in urllib In-Reply-To: <010901c0a769$50e0d350$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> Message-ID: <200103080704.f2874g501415@mira.informatik.hu-berlin.de> > > > file://C:/autoexec.bat > > > > This mapping skips a slash for the hostname. Oops, yes, it should be file:///C:/autoexec.bat > exactly. if file:///C:/autoexec.bat is correct then > file:////etc/passwd should be No. The is build as a sequence of directories, with a slash between each directory, and a slash between the host and the first hierarchy component (C: in the windows case). On Unix, the first hierarchy component is etc, so it should use only three slashes. > as i mentioned, this is clarified in the direction of the slash > being a syntactic separator only in the nfs url rfc2224. 
Looking at rfc2224, I can find no such clarification. It mentions that the first slash is a syntactic separator in the nfs url; how does that effect the file url? Regards, Martin From martin@loewis.home.cs.tu-berlin.de Thu Mar 8 06:55:31 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 8 Mar 2001 07:55:31 +0100 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. In-Reply-To: <00f501c0a767$1fd75650$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <001101c0a72f$761f2cf0$9200a8c0@mdaxke> <200103072147.f27Llwm01178@mira.informatik.hu-berlin.de> <00f501c0a767$1fd75650$9200a8c0@mdaxke> Message-ID: <200103080655.f286tVd01338@mira.informatik.hu-berlin.de> > true enough. i didn't know that changing function signatures was > "allowed" in dot revs of python (versus only adding functions, > possibly with "Ext" or "2" at the end...). It depends on what a "dot rev" is. Python 2.0.1 would be a pure bugfix release; 2.1 isn't. > i meant perl changing the version digit in the dll in such a way as > to invalidate existing binary modules. my perl56.dll has worked with > binary modules built with other perls, and i have upgraded my > perl56.dll repeatedly with different activestate releases. How does that work? Are these other perls also using perl56.dll? If they had been using, say, perl55.dll, are the binary modules not linked with perl55.dll? If they are, how does perl manage to use perl56.dll and perl55.dll simultaneously? > in retrospect, i'm not actually sure there is much different here > between python and perl in binary compatibility. it is just that > python is bringing out 2.1 shortly after 2.0, while the perl world > has been effectively frozen for a year or so while the powers that > be contemplate perl6. Indeed. 
Python 2.0 was a major change over Python 1, and a number of things needed to be fixed/extended/improved/completed, which caused 2.1 being released only five months after 2.0. > > > 3. perl has a more sophisticated import search facility than python's, > > > which attempts to pick the highest version of a module which is > > > applicable, for lib directories structured a certain way, making it > > > possible to have a single lib directory shared among multiple perls. ... > on unix, perl embeds the perl version in the site_perl hierarchy, so > that multiple perl installations can share that same hierarchy. then > the import search path is initialized appropriately in any perl > using that site_perl so it only "sees" the site_perl branches that > match its version. So you are saying that different perl versions share the same toplevel lib directory, but do not share any library files? Why is that a good thing? In Python, if you want to share packages between Python installations, you can put them in /lib/site-python (instead of /lib/python/site-packages). That, of course, requires that the package actually works with all the Python versions installed. Distutils cannot know for sure, so it installs packages into site-packages by default. > windows perl doesn't do that, and it should. historically > activestate has horked up the install directory structure in various > ways deviating from the unix one; it seems to have gotten more > similar over the years. That is the case with Python also. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Thu Mar 8 07:12:59 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Thu, 8 Mar 2001 08:12:59 +0100 Subject: [XML-SIG] file urls in urllib In-Reply-To: <008a01c0a769$8d6cdb20$7cac1218@reston1.va.home.com> (tpassin@home.com) References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <008a01c0a769$8d6cdb20$7cac1218@reston1.va.home.com> Message-ID: <200103080712.f287Cx401471@mira.informatik.hu-berlin.de> > I think we should come to an agreement with the maintainer of the > urllib about the allowed forms for file: schemes. It's mainly on > Windows (and, perhaps, Macs) that there would be a problem. My > preferred forms are these, for a file > at d:\temp\python\thefile.xml - > > 1) file:///d:/temp/python/thefile.xml > > 2) file:///d:\temp\python\thefile.xml While there appears certainly to be a need to change something, it is not clear to me how we should come to an agreement. It seems that there is already agreement on the fact that file URLs have a system-specific syntax, so we can easily do NT/Win/DOS independently from Mac, and that independently from Unix. It also seems that for Unix, it "works" most of the time; focus should probably be on Windows. Now, since file: works in a system dependent manner, I'd look to the operating system manufacturer for guidance. Does MS have any documentation on how file: URLs are supposed to work? Does their software behave in a consistent way in that matter? If so, I'd say it is safest to copy what MS does. I can see the point of your proposal, and I agree it is in the spirit of the RFC. I'd avoid implementing it until it can be established that MS software works in the same way. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Thu Mar 8 07:43:35 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Thu, 8 Mar 2001 08:43:35 +0100 Subject: [XML-SIG] file urls in urllib In-Reply-To: <012301c0a701$4aa61c60$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> <012301c0a701$4aa61c60$9200a8c0@mdaxke> Message-ID: <200103080743.f287hZ001784@mira.informatik.hu-berlin.de> > suppose we agree that file:///c:/autoexec.bat should work (this is > the case of a collapsed localhost). then the processing model is > that if a url starts with file:/// then remove that prefix, and > consider the remainder (because /c:/autoexec.bat is not a proper > local file). Perhaps. Processing of file: URLs happens in a system-dependent manner, so it could use one procedure on one system and another procedure on another. > ok, now do that to file:///etc/passwd and you get etc/passwd. Sure. And that denotes the file /etc/passwd, on Unix. > so that means a parser has to look at c:/autoexec.bat and etc/passwd > and conclude that because the first segment looks like a drive > letter, it is ok, while etc/passwd needs a leading slash. A different parser is used on Windows and Unix, so file:///etc/passwd could mean different things on Windows and Unix. On Windows, it might be ill-formed: for an absolute path, you need a drive letter (or else you need to learn the current drive based on some magic processing context); or it could mean \\etc\passwd (i.e. etc being the topmost hierarchy level, if you allow file: URLs to denote UNC names). On Unix, it clearly means /etc/passwd. > i think it is fair to say that rfc1738 is ambiguous since they only > give an mvs example. 
but nfs urls are defined clearly to match my > "cleaner" notion of purely lexical url processing, Yes, but that is for the nfs: scheme; it does not tell anything about the file: scheme. > as per http://www.faqs.org/rfcs/rfc2396.html : > Note that the initial "/" that introduces the of an NFS > URL must not be passed to the server for multi-component lookup since > the pathname is to be evaluated relative to the public filehandle > directory. For example, if the public filehandle is associated with > the server's directory "/a/b/c" then the URL: > nfs://server/d/e/f > will be evaluated with a multi-component lookup of the path > "d/e/f" relative to the server's directory That means something non-obvious: WebNFS (RFC 2054) has the notion of a "public filehandle", which is a all-null file handle in NFSv2, and a zero-length file handle in NFSv3; the directory associated with the public filehandle is a matter of server configuration. So a "relative path" starts at the directory associated with the public filehandle; an "absolute path" starts with the directory associated with / on the server. That does not readily translate to the file: scheme. > we'd like certain functions to "just work" and handle either a url > or a local host path -- this is certainly what we'd like when we > specify an xml source on a command line. Well, Guido argues that file names and URLs should not be mixed in XML processing; that there should be separate APIs for putting in file names and URLs. That is currently not the case, but it probably should be. Then it is the application's matter to decide whether a string they have is a file name or an URL. > so in that case, if a processor sees etc/passwd, it should *not* add > a leading slash, since it is relative to either current working > directory or the current url base, whichever you like It should be clear from the context whether a relative thing is a relative file name or a relative URL; e.g. 
when it is passed by the user, it is normally a relative file name, if it is an entity definition, it is a relative URL. > It should be a purely lexical operation. That is clearly not the intention of the RFC; the conversion in the VMS example shows that knowledge about the local file system is required to process a file: URL. Regards, Martin From larsga@garshol.priv.no Thu Mar 8 09:02:46 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 08 Mar 2001 10:02:46 +0100 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. In-Reply-To: <001101c0a72f$761f2cf0$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <001101c0a72f$761f2cf0$9200a8c0@mdaxke> Message-ID: * Mark D. Anderson | | it'd be nice if lars updated his page to note this. though it is | old, there are still quite a few links pointing to his page, not to | pyxml or python 2. I will update my page. I've had it on the todo list for a long time, but will finally do it now. Thanks for pushing me. --Lars M. From nobody@usw-sf-web3.sourceforge.net Thu Mar 8 13:18:05 2001 From: nobody@usw-sf-web3.sourceforge.net (nobody) Date: Thu, 08 Mar 2001 05:18:05 -0800 Subject: [XML-SIG] [ pyxml-Bugs-407007 ] Insane amount of memory lost in FromXml Message-ID: Bugs #407007, was updated on 2001-03-08 05:18 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407007&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: Luke Kenneth Casson Leighton Assigned to: Nobody/Anonymous Summary: Insane amount of memory lost in FromXml Initial Comment: calling FromXml uses an INSANE amount of memory. the larger the document, the more memory is lost. here is a demonstration that uses Cyclops.py (found from searches on python.org for memory usage). 
#!/usr/bin/env python
""" """
resdata = """ """

from xml.dom.ext.reader import Sax2
from Cyclops import CycleFinder

def test():
    z = CycleFinder()
    d = Sax2.FromXml(resdata, validate=0, keepAllWs=1)
    z.register(d)
    del d
    z.find_cycles()
    z.show_stats()
    z.show_cycles()
    z.show_cycleobjs()
    z.show_sccs()
    z.show_arcs()
    print "dead root set objects:"
    for rc, cyclic, x in z.get_rootset():
        if rc == 0:
            z.show_obj(x)
    z.find_cycles(1)
    z.show_stats()

if __name__ == '__main__':
    test()

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407007&group_id=6473

From rnd@onego.ru Thu Mar 8 14:37:51 2001
From: rnd@onego.ru (Roman Suzi)
Date: Thu, 8 Mar 2001 17:37:51 +0300 (MSK)
Subject: [XML-SIG] Bug in re found ( expand hangs)
Message-ID:

Hello!

It seems I have hit a bug in sre, which could be of interest to you.

Python 2.0 (#1, Oct 16 2000, 18:10:03)
[GCC 2.95.2 19991024 (release)] on linux2

import re
a = "abcdefghijklmnop"
m = re.match("(.)"*15, a)
print m.expand(r"\1")
print m.expand(r"\10")

... this takes too much time (probably forever)

- as I have not submitted any bugs yet, I am not sure if I did it correctly on http://sourceforge.net/tracker/? ... (Could anyone check if my bug report succeeded? I was submitting it from Lynx and probably missed something).

- I have not found anything like the example above in the known bugs.

If "\10" is not supported, then the correct behaviour is to return it as is (or return something!), not just hang.

Sincerely yours, Roman Suzi
--
_/ Russia _/ Karelia _/ Petrozavodsk _/ rnd@onego.ru _/
_/ Thursday, March 08, 2001 _/ Powered by Linux RedHat 6.2 _/
_/ "Gun Control: Keep muzzle pointed at target." _/

From mda@discerning.com Thu Mar 8 18:14:27 2001
From: mda@discerning.com (Mark D.
Anderson)
Date: Thu, 8 Mar 2001 10:14:27 -0800
Subject: [XML-SIG] file urls in urllib
References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> <012301c0a701$4aa61c60$9200a8c0@mdaxke> <200103080743.f287hZ001784@mira.informatik.hu-berlin.de>
Message-ID: <01b201c0a7fb$af0ba620$9200a8c0@mdaxke>

> A different parser is used on Windows and Unix, so file:///etc/passwd
> could mean different things on Windows and Unix. On Windows, it might
> be ill-formed: for an absolute path, you need a drive letter (or else
> you need to learn the current drive based on some magic processing
> context); or it could mean \\etc\passwd (i.e. etc being the topmost
> hierarchy level, if you allow file: URLs to denote UNC names). On
> Unix, it clearly means /etc/passwd.

it is certainly the case that interpretation of the path portion is server-specific.

what is bothering me is that assembly of a url from its scheme,net_loc,path components (or parsing a url into those components) would seemingly have to know about the server OS, just to know what to do with host-path separator slash, which is sometimes significant and sometimes not.

but maybe it is all ok....

on the client, suppose i am given a server-specific host path (c:\autoexec.bat or /etc/passwd) and want to make a url. so i follow the rule

C1. if the path starts with /, prepend file://
C2. else prepend file:///

on the server, suppose i am given a file: url. So i follow these rules:

S1. if there are exactly 0 or 1 slashes after file:, remove file: and take the rest to be the path, possibly relative
S2. else if there are exactly 2 slashes after file:, error
S3. else if there are 3 or more slashes after file:, remove file:/// and consider the remainder:
    a. if the remainder starts with a system-specific file system root (such as / or c: or c| or \\), use the string as the absolute path
    b. else prepend "/" and use that string as the absolute path

would that work?

note that this treats backward slashes like any other character.

server rule S1 is to allow the convenience of just prepending "file:" in front of anything, although clients obeying the client rules above would never do that. it also introduces a (non-rfc) convention for sending a relative file url.

-mda

From mda@discerning.com Thu Mar 8 19:43:37 2001
From: mda@discerning.com (Mark D. Anderson)
Date: Thu, 8 Mar 2001 11:43:37 -0800
Subject: [XML-SIG] saxlib, xml, _xmlplus, etc.
References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <001101c0a72f$761f2cf0$9200a8c0@mdaxke> <200103072147.f27Llwm01178@mira.informatik.hu-berlin.de> <00f501c0a767$1fd75650$9200a8c0@mdaxke> <200103080655.f286tVd01338@mira.informatik.hu-berlin.de>
Message-ID: <01d401c0a808$238fb840$9200a8c0@mdaxke>

> How does that work? Are these other perls also using perl56.dll? If
> they had been using, say, perl55.dll, are the binary modules not
> linked with perl55.dll? If they are, how does perl manage to use
> perl56.dll and perl55.dll simultaneously?

it just works because perl hasn't changed for a while. they are all using perl56. in retrospect, i do think perl and python are not that much different here. i can't compare how much functionality perl chose to insert in the 5.6 patch series as compared to python 2.0 patch series.

> So you are saying that different perl versions share the same toplevel
> lib directory, but do not share any library files? Why is that a good
> thing?

perl has a lib/site_perl hierarchy and a lib hierarchy. both are searched, and both may be updated using cpan.
nominally, the lib hierarchy is for modules that come bundled with the perl distro, and site/lib is for ones that are add-ons, although this isn't entirely true for reasons probably having to do with activestate. (yes, site_perl is located inside lib, but pretend that isn't true).

in both cases, all modules embed the perl version in their path, and for binary modules (but not pure perl modules), the OS name is also embedded. here is an excerpt from a perl-5.6 installation on linux.

./lib/5.6.0/CGI.pm
./lib/5.6.0/CPAN.pm
./lib/5.6.0/i686-linux/Data/Dumper.pm
./lib/5.6.0/i686-linux/auto/Data/Dumper/Dumper.so
./lib/site_perl/5.6.0/URI/file/Base.pm
./lib/site_perl/5.6.0/URI/file/Unix.pm
./lib/site_perl/5.6.0/i686-linux/SQL/Statement.pm
./lib/site_perl/5.6.0/i686-linux/Storable.pm
./lib/site_perl/5.6.0/i686-linux/auto/Storable/Storable.so
./lib/site_perl/5.6.0/i686-linux/auto/SQL/Statement/Statement.so

the separation of "lib" from "lib/site_perl" lets you separately upgrade (or rollback) your perl distribution from whatever site-specific addons you have.

the embedding of the OS name allows you to overlay multiple operating systems in your site_perl area (say, hpux, linux, and freebsd) for a single perl version, which may be convenient either for a single developer or multiple developers sharing a mounted install.

by default, perl of some version X will attempt to load the module in the version directory which is most recent but not more recent than X. So a 5.6.0 perl would attempt to load a 5.005 module if there was none in the 5.6.0 tree, but a 5.005 perl would ignore all 5.6.0 modules. In some cases, a binary module (or even a pure-perl module) might not be compatible with a newer perl. In that case, this algorithm would cause a runtime failure. The solution is to either make a change in the configuration for the perl, or to simply install a more recent module for that perl.

this was a change from pre-5.6, and remains poorly documented.
the best discussion is probably "Coexistence with earlier versions of perl5" in the INSTALL file. it also has its limitations; for example a person could build the same version of perl ("5.6.0") in multiple ways (say with threads and sfio) and probably get into trouble if some modules were shared. note that at the perl source level, perl has other versioning facilities. perl allows any program to require a minimum perl core version ("require 5.005"). you can also require a minimum version from any module being imported (assuming that module exports a version, for example "use XML::Simple qw(1.04)". the request for a particular minimum version of a module has no affect on the search, but will abort if the search doesn't find one desired. with my limited python experience i haven't yet seen analogous abilities to declare an assumed python core version, or require a minimum version from another module. but i realize the purpose of this mailing list is not to educate me in python :). -mda From martin@loewis.home.cs.tu-berlin.de Thu Mar 8 21:14:36 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 8 Mar 2001 22:14:36 +0100 Subject: [XML-SIG] saxlib, xml, _xmlplus, etc. In-Reply-To: <01d401c0a808$238fb840$9200a8c0@mdaxke> References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <001101c0a72f$761f2cf0$9200a8c0@mdaxke> <200103072147.f27Llwm01178@mira.informatik.hu-berlin.de> <00f501c0a767$1fd75650$9200a8c0@mdaxke> <200103080655.f286tVd01338@mira.informatik.hu-berlin.de> <01d401c0a808$238fb840$9200a8c0@mdaxke> Message-ID: <200103082114.f28LEaJ01267@mira.informatik.hu-berlin.de> Hi Mark, Thanks for your elaboration of perl versioning mechanics. I agree that the Python workings appear to be quite similar. > with my limited python experience i haven't yet seen analogous > abilities to declare an assumed python core version, or require a > minimum version from another module. 
Sure there is

  import sys
  assert sys.version_info > (2,0) # requires Python 2.0 or better

In fact, this is how the _xmlplus hack works. xml/__init__ has

  _MINIMUM_XMLPLUS_VERSION = (0, 6, 1)
  ...
  v = _xmlplus.version_info
  if v >= _MINIMUM_XMLPLUS_VERSION:
      import sys
      sys.modules[__name__] = _xmlplus

This only installs "_xmlplus" as "xml" if _xmlplus is recent enough. If you study that code, you'll notice that it also deals with the case of old PyXML versions, which did not provide version_info.

> but i realize the purpose of this mailing list is not to educate me
> in python :).

It's ok, we are back to your original question.

Regards, Martin

From martin@loewis.home.cs.tu-berlin.de Thu Mar 8 21:05:36 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 8 Mar 2001 22:05:36 +0100
Subject: [XML-SIG] file urls in urllib
In-Reply-To: <01b201c0a7fb$af0ba620$9200a8c0@mdaxke>
References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> <012301c0a701$4aa61c60$9200a8c0@mdaxke> <200103080743.f287hZ001784@mira.informatik.hu-berlin.de> <01b201c0a7fb$af0ba620$9200a8c0@mdaxke>
Message-ID: <200103082105.f28L5ai01262@mira.informatik.hu-berlin.de>

> what is bothering me is that assembly of a url from its
> scheme,net_loc,path components (or parsing a url into those
> components) would seemingly have to know about the server OS, just
> to know what to do with host-path separator slash, which is
> sometimes significant and sometimes not.

It is, in general, not possible to interpret the file: URL on another but the local system. In fact, I cannot think of a single system where it *is* possible.

> on the client, suppose i am given a server-specific host path
> (c:\autoexec.bat or /etc/passwd)

Not sure what the client and the server is here.
> and want to make a url. so i follow the rule > C1. if the path starts with /, prepend file:// > C2. else prepend file:/// No. On DOS, build a list of components, starting with drive, directory, ... On Unix, build a list of components, starting with directory, directory, ... Then join the components with slashes. Put your machine name in front of it if you want, or else leave it blank. If you meant to take the local filename literally, it would not work if the file name uses characters that are reserved in URLs. > on the server, suppose i am given a file: url. So i follow these rules: > S1. if there are exactly 0 or 1 slashes after file:, remove file: and take the rest to be the path, possibly relative > S2. else if there are exactly 2 slashes after file:, error > S3. else if there are 3 or more slashes after file:, remove file:/// and consider the remainder: > a. if the remainder starts with a system-specific file system root (such as / or c: or c| or \\), use the string as the > absolute path > b. else prepend "/" and use that string as the absolute path > > would that work? No. For Windows NT and Unix, it would probably work. On the Mac, it probably wouldn't - you'll have to replace slashes in the path with colons. On VMS, using the example from the RFC, it probably would fail as well. Regards, Martin From tpassin@home.com Fri Mar 9 04:34:28 2001 From: tpassin@home.com (Thomas B. Passin) Date: Thu, 8 Mar 2001 23:34:28 -0500 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <008a01c0a769$8d6cdb20$7cac1218@reston1.va.home.com> <200103080712.f287Cx401471@mira.informatik.hu-berlin.de> Message-ID: <000801c0a852$3b3cd140$7cac1218@reston1.va.home.com> Martin v. Loewis" writes, > I can see the point of your proposal, and I agree it is in the spirit > of the RFC. 
I'd avoid implementing it until it can be established that > MS software works in the same way. > I just tested the following combinations usng IE5.5 on Win98: OK (i.e., it works): file:///D:\temp\xxx.html D:\temp\xxx.html D:/temp/xxx.html file:/D:\temp\xxx.html file:D:/temp/xxx.html file:///D:/temp/xxx.html file:///D|/temp/xxx.html file:///D|\temp\xxx.html file://localhost/D:/temp/xxx.html file://localhost/D:\temp\xxx.html Not OK: D|\temp\xxx.html On NS4.08, OK: file:///D:\temp\xxx.html D:\temp\xxx.html D:/temp/xxx.html file:/D:\temp\xxx.html file:///D:/temp/xxx.html file:///D|/temp/xxx.html file:///D|\temp\xxx.html file://localhost/D:/temp/xxx.html file://localhost/D:\temp\xxx.html Not OK: file:D:/temp/xxx.html (doesn't work) for D|\temp\xxx.html , NS thought it was a real url and tried to do a DNS lookup on it.(Huh???) Pretty amazing, eh? Looks like they are following the maxim, write strict, accept loose. Does anyone think we should go to these extremes? Shaking-his-head-in-wonder-ly, Tom P From tpassin@home.com Fri Mar 9 04:40:47 2001 From: tpassin@home.com (Thomas B. Passin) Date: Thu, 8 Mar 2001 23:40:47 -0500 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> <012301c0a701$4aa61c60$9200a8c0@mdaxke> <200103080743.f287hZ001784@mira.informatik.hu-berlin.de> <01b201c0a7fb$af0ba620$9200a8c0@mdaxke> Message-ID: <001501c0a853$1c8b0a40$7cac1218@reston1.va.home.com> Mark D. Anderson wrote - > on the server, suppose i am given a file: url. So i follow these rules: > S1. if there are exactly 0 or 1 slashes after file:, remove file: and take the rest to be the > path, possibly relative If it is a relative path, it can't have the "file:' part, since that was already established by the base url. 
Conversely, if the url starts with "file:", it must be absolute, as best as I can see. Cheers, Tom P From tpassin@home.com Fri Mar 9 04:48:12 2001 From: tpassin@home.com (Thomas B. Passin) Date: Thu, 8 Mar 2001 23:48:12 -0500 Subject: [XML-SIG] file urls in urllib References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> <012301c0a701$4aa61c60$9200a8c0@mdaxke> <200103080743.f287hZ001784@mira.informatik.hu-berlin.de> <01b201c0a7fb$af0ba620$9200a8c0@mdaxke> Message-ID: <001b01c0a854$2625bae0$7cac1218@reston1.va.home.com> Mark D. Anderson wrote - > > what is bothering me is that assembly of a url from its scheme,net_loc,path components > (or parsing a url into those components) would seemingly have to know about the > server OS, > just to know what to do with host-path separator slash, which is sometimes significant and sometimes not. > If you are getting a file by http, you ALWAYS use forward slashes, no volume name, and the "http:" scheme. No server-os ambiguity here. The only time this issue would arise is when you want to load files on your own machine or on a networked file system connected to your machine. In this case, you presumably know the right form. The real issue, I think, is for the handler to know when it encounters an opaque file: path, so that it can send it as is to the OS. Otherwise, if the url follows the rfc for transparent file: urls, use forward slashes and the volume designator (c:/ on Windows, for example). The handler is supposed to be able to parse this and translate it for the OS it is running on. The other issue is to decide how lenient we want to be in allowing variant forms. Any one know about what works and doesn't on a Mac? And will this change with OS-X? 
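A rough sketch of producing the transparent form from a local path, assuming modern Python 3 (pathlib and urllib.parse, neither of which existed when this thread was written). file_url is a hypothetical helper, not part of urllib, and it only covers drive-letter Windows paths and absolute Unix paths; Mac, VMS and UNC names are not handled:

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import quote


def file_url(path):
    """Build a file:/// URL from an absolute pure path (hypothetical helper)."""
    # Split into components: drive or root first, then directories, then the file name.
    parts = [part.strip("\\/") for part in path.parts]
    # Escape characters that are unsafe in URLs; the drive colon is kept literal.
    escaped = [quote(part, safe=":") for part in parts if part]
    # Always join with forward slashes, whatever the local separator is.
    return "file:///" + "/".join(escaped)


windows_url = file_url(PureWindowsPath(r"D:\temp\xxx.html"))
unix_url = file_url(PurePosixPath("/etc/passwd"))
spaced_url = file_url(PurePosixPath("/tmp/a b.html"))

print(windows_url)
print(unix_url)
print(spaced_url)
```

The pure path classes are used so that the Windows case can be tried on any platform; nothing here touches the file system.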
Cheers, Tom P From martin@loewis.home.cs.tu-berlin.de Fri Mar 9 07:28:18 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Fri, 9 Mar 2001 08:28:18 +0100 Subject: [XML-SIG] file urls in urllib In-Reply-To: <001501c0a853$1c8b0a40$7cac1218@reston1.va.home.com> (tpassin@home.com) References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <200103072215.f27MFHN01380@mira.informatik.hu-berlin.de> <0103071744111X.28858@fermi.eeel.nist.gov> <010901c0a769$50e0d350$9200a8c0@mdaxke> <000a01c0a770$62c187c0$7cac1218@reston1.va.home.com> <012301c0a701$4aa61c60$9200a8c0@mdaxke> <200103080743.f287hZ001784@mira.informatik.hu-berlin.de> <01b201c0a7fb$af0ba620$9200a8c0@mdaxke> <001501c0a853$1c8b0a40$7cac1218@reston1.va.home.com> Message-ID: <200103090728.f297SIq01298@mira.informatik.hu-berlin.de> > If it is a relative path, it can't have the "file:' part, since that was > already established by the base url. Conversely, if the url starts with > "file:", it must be absolute, as best as I can see. That is my understanding as well. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Fri Mar 9 07:27:44 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Fri, 9 Mar 2001 08:27:44 +0100 Subject: [XML-SIG] file urls in urllib In-Reply-To: <000801c0a852$3b3cd140$7cac1218@reston1.va.home.com> (tpassin@home.com) References: <027601c0a68f$7abee980$9200a8c0@mdaxke> <200103070716.f277Gds01961@mira.informatik.hu-berlin.de> <003b01c0a735$b2b78210$9200a8c0@mdaxke> <008a01c0a769$8d6cdb20$7cac1218@reston1.va.home.com> <200103080712.f287Cx401471@mira.informatik.hu-berlin.de> <000801c0a852$3b3cd140$7cac1218@reston1.va.home.com> Message-ID: <200103090727.f297Rif01296@mira.informatik.hu-berlin.de> > I just tested the following combinations usng IE5.5 on Win98: > > OK (i.e., it works): > file:///D:\temp\xxx.html > D:\temp\xxx.html > D:/temp/xxx.html > file:/D:\temp\xxx.html > file:D:/temp/xxx.html > file:///D:/temp/xxx.html > file:///D|/temp/xxx.html > file:///D|\temp\xxx.html > file://localhost/D:/temp/xxx.html > file://localhost/D:\temp\xxx.html Thanks for these investigations. That seems to confirm that atleast file:///D:/temp/xxx.html is accepted as a URL, so I think urllib should accept it as well. As for the others, I noticed one aspect that seems to have escaped (pun intended) in the discussion so far: According to RFC 1738, both | and \ are *unsafe*. That means they MUST be escaped in an URL (also the rfc only writes "must"); in turn, the proper form of some of the others would be file:///D%7C/temp/xxx.html file:///D%7C%5Ctemp%5Cxxx.html > Pretty amazing, eh? Looks like they are following the maxim, write > strict, accept loose. I'd like urllib to follow that as well; the strict case probably being the one with the forward slashes (as the required escaping for the REVERSE SOLIDUS and the VERTICAL LINE looks ugly). Please note that urllib.quote quotes the COLON, although this is not required by the RFC: only if the colon was reserved by the scheme, it would need to be quoted. As for accepting: We should atleast accept what is clearly conforming to the RFC, i.e. 
the forms starting with file:///; we should probably accept that not everything that should be quoted is. We also need backwards compatibility, so the forms using the vertical line should be accepted. Regards, Martin From nobody@sourceforge.net Fri Mar 9 12:15:03 2001 From: nobody@sourceforge.net (nobody) Date: Fri, 09 Mar 2001 04:15:03 -0800 Subject: [XML-SIG] [ pyxml-Bugs-407288 ] tabs inside attribute values removed Message-ID: Bugs #407288, was updated on 2001-03-09 04:15 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407288&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: Luke Kenneth Casson Leighton Assigned to: Nobody/Anonymous Summary: tabs inside attribute values removed Initial Comment: i am having to pre-process all text, substituting for "\t" as a work-around for this problem. if this is not performed, then all tabs inside attribute's values, e.g. , are turned into spaces. i am storing python code in an attribute value, so i _must_ have my tabs!!! :) :) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407288&group_id=6473 From Eugene.Leitl@lrz.uni-muenchen.de Fri Mar 9 17:05:15 2001 From: Eugene.Leitl@lrz.uni-muenchen.de (Eugene Leitl) Date: Fri, 9 Mar 2001 18:05:15 +0100 (MET) Subject: [XML-SIG] dumping an XML parser skeleton from DTD input Message-ID: Excuse me if I'm on crack, but is it possible to dump a DOM (i.e. an object tree representation of the XML document) XML parser skeleton (preferably in Python, but C++ and Java would be also welcome), using a DTD as input? If it is possible, has it been done? With a free tool? 
TIA, -- Eugene From jeremy.kloth@fourthought.com Fri Mar 9 17:16:00 2001 From: jeremy.kloth@fourthought.com (Jeremy Kloth) Date: Fri, 09 Mar 2001 10:16:00 -0700 Subject: [XML-SIG] Re: tabs inside attribute values removed References: Message-ID: <3AA90FD0.B5777324@fourthought.com> > i am having to pre-process all text, substituting > for "\t" as a work-around for this problem. > > if this is not performed, then all tabs inside > attribute's values, e.g. > , are turned into > spaces. Using PyXML 0.6.4, I didn't see this behavior. from xml.dom.ext.reader import Sax2 doc = Sax2.FromXml('') attr = doc.documentElement.attributes.item(0) print repr(attr.value) 'a\011tab' -- Jeremy Kloth Consultant jeremy.kloth@fourthought.com (303)583-9900 x 105 Fourthought, Inc. http://www.fourthought.com Software-engineering, knowledge-management, XML, CORBA, Linux, Python From martin@loewis.home.cs.tu-berlin.de Fri Mar 9 21:27:47 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Fri, 9 Mar 2001 22:27:47 +0100 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input In-Reply-To: (message from Eugene Leitl on Fri, 9 Mar 2001 18:05:15 +0100 (MET)) References: Message-ID: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> > Excuse me if I'm on crack, but is it possible to dump a DOM (i.e. an > object tree representation of the XML document) XML parser skeleton > (preferably in Python, but C++ and Java would be also welcome), using a > DTD as input? Hard to say, I don't even understand the question. What is a "DOM XML parser skeleton"? And how would you like to "dump" it? If you are asking whether you can convert a DOM tree into an XML document - certainly, you don't even need a DTD as input. 
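A minimal sketch of that conversion, assuming Python's standard-library xml.dom.minidom rather than the PyXML reader used elsewhere in this thread. The element and attribute names are made up for illustration:

```python
from xml.dom import minidom

# Build a small DOM tree by hand; no DTD is involved at any point.
doc = minidom.Document()
root = doc.createElement("reaction")      # hypothetical element name
doc.appendChild(root)

mol = doc.createElement("molecule")       # hypothetical element name
mol.setAttribute("name", "A")
mol.appendChild(doc.createTextNode("precursor"))
root.appendChild(mol)

# Serialize the tree back into XML text.
xml_text = root.toxml()
print(xml_text)

# Parsing that text again gives an equivalent tree.
reparsed = minidom.parseString(xml_text)
print(reparsed.documentElement.tagName)
```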
Regards, Martin From Eugene.Leitl@lrz.uni-muenchen.de Fri Mar 9 22:19:11 2001 From: Eugene.Leitl@lrz.uni-muenchen.de (Eugene.Leitl@lrz.uni-muenchen.de) Date: Fri, 09 Mar 2001 23:19:11 +0100 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> Message-ID: <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> "Martin v. Loewis" wrote: > Hard to say, I don't even understand the question. What is a "DOM XML > parser skeleton"? And how would you like to "dump" it? If you are It is a program that parses XML files in a certain fashion, by creating a tree of objects (so it has to be an OO language it dumps) representing the structure of the XML file. It is a skeleton because it just does that, as lacking true understanding of my further intentions it has no clue as what I'm going to do with the data created from the parsing of the document, so it has to leave the action field blank, to be filled out by me. (Assuming (foolishly) that I know what I'm doing). It is dumped because I'm asking for a program that will dump a program (see above), when supplied with a DTD of the XML it is supposed to be able to parse. As I said, correct me if my glass pipe has burned out. I've been only checking out the whole XML thingy for the last couple of days. > asking whether you can convert a DOM tree into an XML document - > certainly, you don't even need a DTD as input. No, I'm asking for a program that will dump a (skeleton of a, to be filled in at earliest convenience) parser program, when supplied with the DTD of the XML document. From gtn@ebt.com Sat Mar 10 00:23:05 2001 From: gtn@ebt.com (Gavin Thomas Nicol) Date: Fri, 9 Mar 2001 19:23:05 -0500 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input In-Reply-To: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> Message-ID: > Hard to say, I don't even understand the question. What is a "DOM XML > parser skeleton"? I'm not aware of anyone that has code... 
it probably exists somewhere though. Should be pretty trivial though. Take the DTD, compile it into a state machine, and then split the state machine back out in code. From martin@loewis.home.cs.tu-berlin.de Sat Mar 10 07:00:41 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 10 Mar 2001 08:00:41 +0100 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input In-Reply-To: <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> (Eugene.Leitl@lrz.uni-muenchen.de) References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> Message-ID: <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> > > Hard to say, I don't even understand the question. What is a "DOM XML > > parser skeleton"? And how would you like to "dump" it? If you are > > It is a program that parses XML files in a certain fashion, by creating > a tree of objects (so it has to be an OO language it dumps) representing > the structure of the XML file. I get the feeling of being dumb here, since I still cannot understand what you are asking for. Let me interpret it word-by-word. You want to program that parses XML files: Well, there are plenty of XML parsers, I can recommend PyXML. It shall create a tree of objects ... I recommend to use a parser that creates a DOM tree: That is a tree of objects. ... representing the structure of the XML file. That I cannot understand: Do you want the content of the XML file being represented by the tree of objects (i.e. the tag names of the elements, their attributes and attribute values, and strings for the text fragments in the elements)? That is what the DOM does. If this is not what you want, what is it about the "structure of the XML file" that you want to be represented. E.g. given what is the tree of objects that you want to get. 
> It is a skeleton because it just does that, as lacking true
> understanding of my further intentions it has no clue as what I'm
> going to do with the data created from the parsing of the document,
> so it has to leave the action field blank, to be filled out by me.

The DOM tree is good for that - it has no understanding of your plans
to process the document.

> It is dumped because I'm asking for a program that will dump a program
> (see above), when supplied with a DTD of the XML it is supposed to be
> able to parse.

So you want to generate a program? Given a DTD? How about this program?

    print "from xml.dom.ext.reader import Sax2"
    print "import sys"
    print "doc = Sax2.FromXmlFile(sys.argv[1])"

When being executed, it will always generate the same program:

    from xml.dom.ext.reader import Sax2
    import sys
    doc = Sax2.FromXmlFile(sys.argv[1])

This is a program that can read an XML document and build a tree of
objects. The tree of objects is stored in a variable named doc. You
can give a DTD to the first program, but it is ignored as it is not
needed.

> No, I'm asking for a program that will dump a (skeleton of a, to be
> filled in at earliest convenience) parser program, when supplied
> with the DTD of the XML document.

The nice thing about XML is that you can parse it without a DTD, and
that you can furthermore use the same parser for all XML documents.

Regards,
Martin

From Eugene.Leitl@lrz.uni-muenchen.de Sat Mar 10 10:29:34 2001
From: Eugene.Leitl@lrz.uni-muenchen.de (Eugene.Leitl@lrz.uni-muenchen.de)
Date: Sat, 10 Mar 2001 11:29:34 +0100
Subject: [XML-SIG] dumping an XML parser skeleton from DTD input
References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de>
 <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de>
Message-ID: <3AAA020E.335812E@lrz.uni-muenchen.de>

"Martin v. Loewis" wrote:

> I get the feeling of being dumb here, since I still cannot understand
> what you are asking for.
Let me interpret it word-by-word. That's highly unlikely. I'm just trolling for clue, being forced to learn XML in the course of a few days. The company I'm with has the following ad hoc approach to XML: whip up some XML fitting the problem, don't bother with writing a DTD, code up a parser in an OO language, which recursively reads the tags into memory, creating a hierarchy/tree of objects. Fill in methods to deal with the data sitting in the tree, finis. I looked at the way other people parse XML, and ran into DOM, which seemed to imply the company has reinvented the wheel. I'm trying to understand what Python DOM does (the regression test I ran yesterday did dump core, so I don't have a working installation up yet). > You want to program that parses XML files: Well, there are plenty of > XML parsers, I can recommend PyXML. It shall create a tree of objects > ... I recommend to use a parser that creates a DOM tree: That is a > tree of objects. Excellent. So, DOM parses the XML file (any well-formed XML file). Because it is agnostic of what tags might be coming (since, as you say, it doesn't need a DTD), it doesn't offer any hooks, calling a matching method if a given tag is encountered. So essentially, I wind up with a representation of the XML file as tree of objects, which I process after the fact, right? Iirc, DOM offers some helpful routines, allowing me to parse the tree. So, where do I put my handler, interpreting the stuff as it passes by? Let's say I have a reaction tree (molecule A is precursor of molecule B is precursor of molecule C is educt of product Z) as result of a query. So building XML as a representation of it is quite natural, as it *is* a tree. I want to transform this into a variety of formats: mapping the tree to a number of .png images layed out in a HTML table, or use a Tree Widget to paint a large bitmap, potentially with server-side clickable maps. So, where does Python DOM offer me ways I can get at the data in the object tree? > ... 
representing the structure of the XML file. That I cannot > understand: Do you want the content of the XML file being represented > by the tree of objects (i.e. the tag names of the elements, their > attributes and attribute values, and strings for the text fragments in > the elements)? That is what the DOM does. If this is not what you This is what I need, yes. > > It is a skeleton because it just does that, as lacking true > > understanding of my further intentions it has no clue as what I'm > > going to do with the data created from the parsing of the document, > > so it has to leave the action field blank, to be filled out by me. > > The DOM tree is good for that - it has no understanding of your plans > to process the document. Ok, very good, but where can I get at the data sitting there? > from xml.dom.ext.reader import Sax2 > import sys > doc = Sax.FromXmlFile(sys.argv[1]) > > This is a program that can read an XML document and build a tree of > objects. The tree of objects is stored in a variable named doc. You > can give a DTD to the first program, but it is ignored as it is not > needed. > > > No, I'm asking for a program that will dump a (skeleton of a, to be > > filled in at earliest convenience) parser program, when supplied > > with the DTD of the XML document. > > The nice thing about XML is that you can parse it without a DTD, and > that you can furthermore use the same parser for all XML documents. Good, now I only need to get Python DOM pass the regression tests, and find out how I can get at the data. From martin@loewis.home.cs.tu-berlin.de Sat Mar 10 13:24:02 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis)
Date: Sat, 10 Mar 2001 14:24:02 +0100
Subject: [XML-SIG] dumping an XML parser skeleton from DTD input
In-Reply-To: <3AAA020E.335812E@lrz.uni-muenchen.de> (Eugene.Leitl@lrz.uni-muenchen.de)
References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de>
 <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> <3AAA020E.335812E@lrz.uni-muenchen.de>
Message-ID: <200103101324.f2ADO2g03086@mira.informatik.hu-berlin.de>

> I looked at the way other people parse XML, and ran into DOM, which seemed
> to imply the company has reinvented the wheel. I'm trying to understand
> what Python DOM does (the regression test I ran yesterday did dump core,
> so I don't have a working installation up yet).

What operating system, what version of Python and PyXML? Python should
*never* coredump; at worst, you might get Python exceptions.

> Excellent. So, DOM parses the XML file (any well-formed XML file).

Indeed. You have the choice of either a validating parser (one that
looks at the DOCTYPE declaration in the document, and complains when
elements are used incorrectly), or a non-validating parser, one that
looks only for well-formedness. In either case, you get the same DOM
tree (well, almost - a validating parser may fill in DEFAULT values of
attributes from the DTD; a non-validating parser won't normally).

> Because it is agnostic of what tags might be coming (since, as you
> say, it doesn't need a DTD), it doesn't offer any hooks, calling a
> matching method if a given tag is encountered.

Yes and no. The DOM does not call any callbacks. Instead, you give the
parser the document URL, and it gives you back a DOM tree; no
application interaction during parsing.

If you want event-oriented XML processing, you should study the SAX
interface. This calls your callback for every start and end tag, text
nodes, and so on. It does not build any kind of tree.
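A minimal sketch of the event-oriented style just described, assuming Python's standard-library xml.sax (Python 3) rather than the PyXML release under discussion. TagLogger and the sample document are invented for illustration:

```python
import xml.sax


class TagLogger(xml.sax.ContentHandler):
    """Records every parsing event it is told about; no tree is built."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called once for every start tag, with its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        # Called once for every end tag.
        self.events.append(("end", name))

    def characters(self, content):
        # Called for text; the parser may deliver one text node in several chunks.
        if content.strip():
            self.events.append(("text", content))


SAMPLE = b'<reaction><molecule name="A">precursor</molecule></reaction>'

handler = TagLogger()
xml.sax.parseString(SAMPLE, handler)

for event in handler.events:
    print(event)
```

Whatever should happen "as the stuff passes by" goes into the handler methods; once parsing is finished, only what the handler chose to keep is left.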
In many XML libraries, it is possible to implement a "DOM builder" on
top of a "SAX parser"; this is in fact how PyXML operates.

> So essentially, I wind up with a representation of the XML file
> as tree of objects, which I process after the fact, right?

Exactly.

> Iirc, DOM offers some helpful routines, allowing me to parse the
> tree.

Yes, depending on what exactly you want to do with the tree; not all
routines are helpful for all applications.

> So, where do I put my handler, interpreting the stuff as it passes
> by?

You don't, unless you implement your own SAX content handler - which
might or might not choose to build a DOM tree.

> I want to transform this into a variety of formats: mapping the
> tree to a number of .png images layed out in a HTML table, or use a
> Tree Widget to paint a large bitmap, potentially with server-side
> clickable maps.
>
> So, where does Python DOM offer me ways I can get at the data in
> the object tree?

The DOM itself offers standard accessor functions - they are not only
standard across Python DOM implementations, but also standard across
programming languages.

The "DOM Core" interface only provides accessor functions to
"navigate" the tree: Give me the name of the element (elem.tagName);
give me all the children (elem.childNodes), give me the next sibling,
give me the attribute named "atomWeight". There are some query
functions: give me all element nodes with a certain element name, ...

"DOM 2 Navigation" offers traversal interfaces. You might be tempted
to use those, but I suggest working with the core interfaces only at
first; you'll find that it is quite easy to do your own traversal with
just the accessor functions.

Depending on the output format, it might be easy to write a SAX
ContentHandler. Alternatively, if you can describe the output in terms
of "for every foo element write bar, then go to the child nodes, then
write foobar", it might be that XSLT is the right transformation
language.
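A minimal sketch of a hand-written traversal using only the core accessors named above, assuming Python's standard-library xml.dom.minidom (Python 3) instead of PyXML. The document, the walk helper and the attribute values are hypothetical:

```python
from xml.dom import minidom

# Hypothetical reaction tree: A is a precursor of B.
SAMPLE = (
    '<reaction>'
    '<molecule name="A" atomWeight="12">'
    '<molecule name="B" atomWeight="16"/>'
    '</molecule>'
    '</reaction>'
)


def walk(elem, depth=0):
    """Return (depth, tagName, name attribute) for elem and all element descendants."""
    rows = [(depth, elem.tagName, elem.getAttribute("name"))]
    for child in elem.childNodes:
        # childNodes also contains text nodes; only descend into elements.
        if child.nodeType == child.ELEMENT_NODE:
            rows.extend(walk(child, depth + 1))
    return rows


doc = minidom.parseString(SAMPLE)
rows = walk(doc.documentElement)
for depth, tag, name in rows:
    print("  " * depth + tag, name)

# One of the query functions: all elements with a given element name.
weights = [m.getAttribute("atomWeight") for m in doc.getElementsByTagName("molecule")]
print(weights)
```

The same recursion could emit HTML table cells or drawing commands instead of tuples; only the body of walk would change.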
There is no single best way to process XML - the only rule is that nobody ever writes his own parser, since that's already done. > Good, now I only need to get Python DOM pass the regression tests, > and find out how I can get at the data. I'd rather recommend to look at the demos. It may be indeed that some tests fail, e.g. when running PyXML on Python 1.5, which does not support Unicode strings. Regards, Martin From tpassin@home.com Sat Mar 10 15:10:27 2001 From: tpassin@home.com (Thomas B. Passin) Date: Sat, 10 Mar 2001 10:10:27 -0500 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> <3AAA020E.335812E@lrz.uni-muenchen.de> Message-ID: <001801c0a974$3d868360$7cac1218@reston1.va.home.com> wrote - You are mixing up several concepts or processing steps. 1) Parsing xml. This means to get hold of the structural elements of the xml document and give them to another application for further processing. There are many xml parsers out there, come command line and some not. It's almost certainly not worth it to roll your own. 2) Creating a tree-like structure to represent the structure of the xml document. The DOM is an API for a tree-like representation. Most major parsers out there either include a DOM api or can work with another DOM API. (SAX is a non-DOM api, but the output of a sax processsor can be used to build a tree, too). The DOM is an object oriented api. 3) DOM manipulation, using the DOM api. There are already good processors that can use the DOM api to manipulate and actual, populated DOM trees. So don't roll your own there, either. 4) You don't need a DTD, but it's a good idea to make one anyway because then you can use a validating parser to check that the first xml examples that you build are "valid" - i.e., put together correctly from a structural point of view. 
It's amazing how easy it is to accidently create something else besides what you thought you were making. Otherwise, you can start simple with no DTD and later define one after you have some hands-on experience working with xml. As Martin said, the Python PyXML package is very good. There's also the Microsoft xml processor, which can be written to as a COM object, in VBscript, or in Javascript. There are several good java processors, and some good Perl ones. Python would be the quickest and easiest to use, especially if you are not already up to speed in one of the other languages. Even if you are, Python will be faster and easier to use than one of the strongly typed compiled languages like java. Get a good book or two, like Wrox's Professional XML and XML in a Nutshell from O'Reilly, to mention only two of the good ones out there. > > The company I'm with has the following ad hoc approach to XML: > whip up some XML fitting the problem, don't bother with writing > a DTD, code up a parser in an OO language, which recursively > reads the tags into memory, creating a hierarchy/tree of objects. > Fill in methods to deal with the data sitting in the tree, finis. > > I looked at the way other people parse XML, and ran into DOM, which seemed > to imply the company has reinvented the wheel> > Yes, the wheel has already been invented. But core dumps aren't going to be very useful. Do examples from a book or tutorial site, fix them til they run right, then start morphing them closer to what you want to do. You don't need to try to understand a DOM tree from a core dump. Learn about the api instead. 
Cheers, Tom P From Eugene.Leitl@lrz.uni-muenchen.de Sat Mar 10 15:41:09 2001 From: Eugene.Leitl@lrz.uni-muenchen.de (Eugene.Leitl@lrz.uni-muenchen.de) Date: Sat, 10 Mar 2001 16:41:09 +0100 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> <3AAA020E.335812E@lrz.uni-muenchen.de> <001801c0a974$3d868360$7cac1218@reston1.va.home.com> Message-ID: <3AAA4B15.E2D84D6F@lrz.uni-muenchen.de> "Thomas B. Passin" wrote: > You are mixing up several concepts or processing steps. I realize that. It comes from being a newbie with a deadline breathing down my neck. > 1) Parsing xml. > This means to get hold of the structural elements of the xml document and give > them to another application for further processing. There are many xml > parsers out there, come command line and some not. It's almost certainly not > worth it to roll your own. I know that, but apparently not my senior cow-orkers. It's a C/C++ shop with an occasional sprinking of Java, my choice of Python is purely personal (note to myself: not to goof up this one). Before I try selling them on the DOM thing, I'd rather know what I'm doing. It cost them three days to whip up their object tree XML parser in Java. > 2) Creating a tree-like structure to represent the structure of the xml > document. > The DOM is an API for a tree-like representation. Most major parsers out > there either include a DOM api or can work with another DOM API. (SAX is a > non-DOM api, but the output of a sax processsor can be used to build a tree, > too). The DOM is an object oriented api. They (said cow-orkers) insist on an object tree based approach. > 3) DOM manipulation, using the DOM api. There are already good processors that > can use the DOM api to manipulate and actual, populated DOM trees. So don't > roll your own there, either. 
Does http://4suite.org/download.epy fill the ticket? The regression tests of it dumped core on me at work, let's see whether I can get it running at home. > 4) You don't need a DTD, but it's a good idea to make one anyway because then > you can use a validating parser to check that the first xml examples that you > build are "valid" - i.e., put together correctly from a structural point of > view. It's amazing how easy it is to accidently create something else besides > what you thought you were making. I think Emacs psgml mode will take care of that. > Otherwise, you can start simple with no DTD and later define one after you > have some hands-on experience working with xml. > > As Martin said, the Python PyXML package is very good. There's also the Downloading it now. > Microsoft xml processor, which can be written to as a COM object, in VBscript, > or in Javascript. There are several good java processors, and some good Perl > ones. Python would be the quickest and easiest to use, especially if you are > not already up to speed in one of the other languages. Even if you are, > Python will be faster and easier to use than one of the strongly typed > compiled languages like java. > > Get a good book or two, like Wrox's Professional XML and XML in a Nutshell > from O'Reilly, to mention only two of the good ones out there. I've gotten me Learning XML from ORA, which was a fresh wind in comparision to SGML & XML Cookbook. > Yes, the wheel has already been invented. But core dumps aren't going to be > very useful. Do examples from a book or tutorial site, fix them til they run > right, then start morphing them closer to what you want to do. You don't need > to try to understand a DOM tree from a core dump. Learn about the api The 4Suite DOM package dumped core on me when I was running regression tests as part of the build. Perhaps I should try sticking with PyXML at first. > instead. Thanks for all the good advice. 
From martin@loewis.home.cs.tu-berlin.de Sat Mar 10 18:33:30 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 10 Mar 2001 19:33:30 +0100 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input In-Reply-To: <3AAA4B15.E2D84D6F@lrz.uni-muenchen.de> (Eugene.Leitl@lrz.uni-muenchen.de) References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> <3AAA020E.335812E@lrz.uni-muenchen.de> <001801c0a974$3d868360$7cac1218@reston1.va.home.com> <3AAA4B15.E2D84D6F@lrz.uni-muenchen.de> Message-ID: <200103101833.f2AIXUB04062@mira.informatik.hu-berlin.de> > Does http://4suite.org/download.epy fill the ticket? The regression > tests of it dumped core on me at work, let's see whether I can get > it running at home. To install just PyXML, the download section (Letzte Dateireleases) on http://sourceforge.net/projects/pyxml should be sufficient; 4suite.org offers the full 4Suite set of libraries. Regards, Martin From tpassin@home.com Sat Mar 10 17:53:34 2001 From: tpassin@home.com (Thomas B. Passin) Date: Sat, 10 Mar 2001 12:53:34 -0500 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> <3AAA020E.335812E@lrz.uni-muenchen.de> <001801c0a974$3d868360$7cac1218@reston1.va.home.com> <3AAA4B15.E2D84D6F@lrz.uni-muenchen.de> Message-ID: <003001c0a98b$07a9f120$7cac1218@reston1.va.home.com> wrote - > > Before I try selling them on the DOM thing, I'd rather know what I'm > doing. It cost them three days to whip up their object tree XML parser > in Java. > Yes, it's easy to make a basic xml parser, and it's easy to come up with a tree structure. Lots of us have done something like this. But there are a lot of specialized wrinkles to xml. 
If you are only ever going to work with your own xml, it may not matter. But if you want to work with xml produced by others, it may use features that require these wrinkles. Your home-grown parser and tree structure likely won't handle them all. Handling of external entities, namespaces, whitespace normalization, character encodings, and CDATA sections are some of these wrinkles that can get tricky. Also, if you use your own tree API, you won't be able to make use of other software that uses the DOM, like xslt, xpath,xpointer, etc. (I'm not sure how many of these are out yet in C++, but they will be coming). > > 2) Creating a tree-like structure to represent the structure of the xml > > document. > > The DOM is an API for a tree-like representation. Most major parsers out > > there either include a DOM api or can work with another DOM API. (SAX is a > > non-DOM api, but the output of a sax processsor can be used to build a tree, > > too). The DOM is an object oriented api. > > They (said cow-orkers) insist on an object tree based approach. > Oh, yes, a tree approach is fine for a lot of things. Takes a lot of memory if you have a large chunk of xml. It isn't so much the tree as the api for it that you probably want to concentrate on first. Cheers, Tom P From ken@bitsko.slc.ut.us Sat Mar 10 18:55:02 2001 From: ken@bitsko.slc.ut.us (Ken MacLeod) Date: 10 Mar 2001 12:55:02 -0600 Subject: [XML-SIG] dumping an XML parser skeleton from DTD input In-Reply-To: Eugene.Leitl@lrz.uni-muenchen.de's message of "Sat, 10 Mar 2001 16:41:09 +0100" References: <200103092127.f29LRlR00921@mira.informatik.hu-berlin.de> <3AA956DF.EAC34D7D@lrz.uni-muenchen.de> <200103100700.f2A70fK01248@mira.informatik.hu-berlin.de> <3AAA020E.335812E@lrz.uni-muenchen.de> <001801c0a974$3d868360$7cac1218@reston1.va.home.com> <3AAA4B15.E2D84D6F@lrz.uni-muenchen.de> Message-ID: Eugene.Leitl@lrz.uni-muenchen.de writes: > "Thomas B. Passin" wrote: > > > You are mixing up several concepts or processing steps. 
> > I realize that. It comes from being a newbie with a deadline > breathing down my neck. > > > 1) Parsing xml. > > This means to get hold of the structural elements of the xml > > document and give them to another application for further > > processing. There are many xml parsers out there, come command > > line and some not. It's almost certainly not worth it to roll > > your own. > > I know that, but apparently not my senior cow-orkers. It's a C/C++ > shop with an occasional sprinking of Java, my choice of Python is > purely personal (note to myself: not to goof up this one). > > Before I try selling them on the DOM thing, I'd rather know what I'm > doing. It cost them three days to whip up their object tree XML > parser in Java. > > > 2) Creating a tree-like structure to represent the structure of > > the xml document. The DOM is an API for a tree-like > > representation. Most major parsers out there either include a DOM > > api or can work with another DOM API. (SAX is a non-DOM api, but > > the output of a sax processsor can be used to build a tree, too). > > The DOM is an object oriented api. > > They (said cow-orkers) insist on an object tree based approach. Note that DOM objects are a raw, in-memory version of the XML document (objects representing XML elements, attributes, text nodes). What you (or your coworkers) are probably wanting are normal application objects exported and imported via XML. The way your coworkers seemed to have started is to create a unique XML format for each application object or file, and then write per-file importers and exporters for each format. As you suspected, there is probably a way to refactor this code so that you need only have one importer and exporter regardless of which application objects or file format is used. Your first post suggested having some kind of "DTD compiler" that could digest a DTD and produce a per-file "parser" for you, for reading in arbitrary XML. Practically speaking, that's a hard problem. 
The difficulty is that each XML format is being created "by hand", unique and tweaked to each application object, yet you're expecting some kind of compiler to generalize the XML and re-create usable application objects from the various uniquely designed formats.

So what's the easy way? Instead of creating a unique format by hand for each application object, create a set of generic encoding rules for converting any type of object into XML, and then write a parser to read the generic XML and convert it into objects.

SOAP is one such set of encoding rules (SOAP Section 5, to be exact), and if you're comfortable with using the SOAP libraries to read and write XML, I would highly recommend going that way. The problem is that most SOAP libraries are a little tedious to use for "just serializing objects" (thinking of Apache Java SOAP here in particular).

To roll your own, you just need a set of simple rules for encoding. Here's an example XML:

  A simple value in a record, structure, or object
  A simple value in a list
  A simple value, in a structure, in a list 12345
  A simple value, in a structure, in a structure 12345

The rules are:

  1) If an XML element contains subelements, then the value is an array or a structure.
  2) The sub-element names of structures (objects) are the field, key, or member names of the structure or object.
  3) An array is indicated by an attribute isArray="1".
  4) The sub-element names of an array are arbitrary, so you can pick something like .
  5) If an element has no sub-elements, then that element is a simple value (a string, integer, date, whatever).

I didn't put this in the example, but it's easiest to store type information for every element, whether it be a class name on a structure or list, or a simple value type (string, integer, date) on a simple value. Use an attribute like type="someType".

Here's the relevant part of a decoder for this format, converted by hand from the Orchard SOAP parser[1]; it should give you a start.
Note that it's not trying to decode the class names of objects, but when you want to do that, add the code to the endElement handler in the 'else' clause of the 'if utype is _CHAR'.

    import xml.sax
    from xml.sax.handler import ContentHandler, ErrorHandler

    # just constants
    _DICT = "dict"
    _ARRAY = "array"
    _CHAR = "char"

    # The base classes supply no-op versions of the SAX callbacks that
    # are not overridden here (startDocument, setDocumentLocator, ...).
    class Unpickler(ContentHandler, ErrorHandler):
        def __init__(self, file):
            self.file = file

        def load(self):
            self.parse_value_stack = [ {} ]
            self.parse_utype_stack = [ _DICT ]
            self.parse_type_stack = [ ]
            parser = xml.sax.make_parser()
            parser.setContentHandler(self)
            parser.setErrorHandler(self)
            # parse the file handed to the constructor
            parser.parse(self.file)
            obj = self.parse_value_stack[0]
            delattr(self, 'parse_value_stack')
            return obj

        def startElement(self, name, atts):
            self.chars = ""
            # atts.get() returns None when the attribute is absent
            self.parse_type_stack.append(atts.get('type'))
            if atts.get('isArray') is not None:
                self.parse_utype_stack.append(_ARRAY)
                self.parse_value_stack.append( [ ] )
            else:
                # will be set to _DICT if a sub-element is found
                self.parse_utype_stack.append(_CHAR)

        def endElement(self, name):
            vtype = self.parse_type_stack.pop()
            utype = self.parse_utype_stack.pop()
            if utype is _CHAR:
                if vtype == 'integer':
                    value = int(self.chars)
                elif vtype == 'float':
                    value = float(self.chars)
                else:
                    value = self.chars
            else:
                value = self.parse_value_stack.pop()

            # if we're in an element, and our parent element was defaulted
            # to _CHAR, then we're in a struct and we need to create that
            # dictionary.
            if self.parse_utype_stack[-1] is _CHAR:
                self.parse_value_stack.append( {} )
                self.parse_utype_stack[-1] = _DICT

            if self.parse_utype_stack[-1] is _DICT:
                self.parse_value_stack[-1][name] = value
            else:
                self.parse_value_stack[-1].append(value)

        def characters(self, content):
            # SAX2 passes the text itself, not an object with a .data field
            self.chars = self.chars + content

        def error(self, exc):
            raise exc

        def fatalError(self, exc):
            raise exc

        def warning(self, exc):
            pass

In C++ or Java, you might consider having each class you expect to be ex/imported from XML to have a constructor that accepts a dictionary from the XML reader (to create the new object just read from XML) and a method asDictionary() that will return the representation of the object as a dictionary (to be written to XML).

-- Ken

[1]

From nobody@sourceforge.net Sat Mar 10 20:35:36 2001
From: nobody@sourceforge.net (nobody)
Date: Sat, 10 Mar 2001 12:35:36 -0800
Subject: [XML-SIG] [ pyxml-Bugs-407587 ] ns_parse.py and ampersand
Message-ID:

Bugs #407587, was updated on 2001-03-10 12:35
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407587&group_id=6473

Category: None
Group: None
Status: Open
Priority: 5
Submitted By: Sam Lowry
Assigned to: Nobody/Anonymous
Summary: ns_parse.py and ampersand

Initial Comment: ns_parse.py fails converting NN bookmarks if it encounters ampersand sign in the href of a bookmark file, e.g. .
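A minimal sketch of the likely underlying issue, assuming the trouble is a bare ampersand ending up in an XML attribute value. The example from the report did not survive in the archive, so the URL below is hypothetical, and HrefCollector and parse_hrefs are illustrative helpers; nothing here reproduces ns_parse.py or the submitted patch. It only shows that an unescaped "&" is not well-formed XML, while the same value passed through xml.sax.saxutils.quoteattr parses and round-trips.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import quoteattr


class HrefCollector(ContentHandler):
    """Collect the href attribute of every element that carries one."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.hrefs = []

    def startElement(self, name, attrs):
        href = attrs.get("href")
        if href is not None:
            self.hrefs.append(href)


def parse_hrefs(xml_text):
    """Return the list of hrefs, or None if xml_text is not well-formed."""
    handler = HrefCollector()
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
    except xml.sax.SAXParseException:
        return None
    return handler.hrefs


# Hypothetical bookmark URL containing a bare ampersand.
URL = "http://example.com/search?a=1&b=2"

raw_xml = '<bookmark href="%s"/>' % URL             # ampersand left as-is
safe_xml = "<bookmark href=%s/>" % quoteattr(URL)   # ampersand written as &amp;
```

quoteattr adds the surrounding quotes itself, which is why the second template has none; the parser hands the handler the original URL with a plain "&" again.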
---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407587&group_id=6473 From nobody@sourceforge.net Sat Mar 10 20:39:52 2001 From: nobody@sourceforge.net (nobody) Date: Sat, 10 Mar 2001 12:39:52 -0800 Subject: [XML-SIG] [ pyxml-Bugs-407588 ] broken links on pyxml homepage Message-ID: Bugs #407588, was updated on 2001-03-10 12:39 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407588&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: Sam Lowry Assigned to: Nobody/Anonymous Summary: broken links on pyxml homepage Initial Comment: The bug is self-explanatory ;-) BTW, how can I subscribe to XML-SIG group? Link at http://www.python.org/sigs/ and at http://pyxml.sourceforge.net/ leadsto nowhere... I've made a XSL for xbel that I want to share with others. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407588&group_id=6473 From nobody@sourceforge.net Sun Mar 11 04:31:06 2001 From: nobody@sourceforge.net (nobody) Date: Sat, 10 Mar 2001 20:31:06 -0800 Subject: [XML-SIG] [ pyxml-Patches-407630 ] Fix ns_parse.py from XBEL to accept ampe Message-ID: Patches #407630, was updated on 2001-03-10 20:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=407630&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: Uche Ogbuji Assigned to: Uche Ogbuji Summary: Fix ns_parse.py from XBEL to accept ampe Initial Comment: I submitted this patch before I was hacking at PyXML itself, but I guess it vanished into the ether. 
Addresses bug at http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407587&group_id=6473

----------------------------------------------------------------------
You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=407630&group_id=6473

From uche.ogbuji@fourthought.com Sun Mar 11 04:49:54 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sat, 10 Mar 2001 21:49:54 -0700
Subject: [XML-SIG] News on Sourceforge
Message-ID: <200103110449.VAA16481@localhost.localdomain>

The latest news on

https://sourceforge.net/projects/pyxml/

Is the 0.6.1 release back in October.

--
Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python

From carlos@eberhardt.net Sun Mar 11 16:59:20 2001
From: carlos@eberhardt.net (Carlos Eberhardt)
Date: Sun, 11 Mar 2001 10:59:20 CST
Subject: [XML-SIG] PyXML-0.6.4 on BeOS
Message-ID: <20010311165641.275FE813E@conn.mc.mpls.visi.com>

Hello-

Just wanted to drop a note mentioning that the setup.py script fails under BeOS R5.0.3 (x86) due to the expat filemap stuff. BeOS doesn't have mmap (I guess), so it needs to use the readfilemap.c (like the mac setup):

    # Use either unixfilemap or readfilemap depending on the platform
    if sys.platform == 'win32':
        FILEMAP_SRC = 'extensions/expat/xmlwf/win32filemap.c'
    elif sys.platform[:3] == 'mac':
        FILEMAP_SRC = 'extensions/expat/xmlwf/readfilemap.c'
    elif sys.platform[:4] == 'beos':
        FILEMAP_SRC = 'extensions/expat/xmlwf/readfilemap.c'
    else:
        # Assume all other platforms are Unix-compatible; this is almost
        # certainly wrong. :)
        FILEMAP_SRC = 'extensions/expat/xmlwf/unixfilemap.c'

(actually, I cheated and just set the FILEMAP_SRC in the else block to use readfilemap, but I would assume adding the check for beos would do the trick as well) ...

Just FYI! Thanks for all the hard work!
Carlos carlos@eberhardt.net From guido@digicool.com Sun Mar 11 21:33:59 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 11 Mar 2001 16:33:59 -0500 Subject: [XML-SIG] News on Sourceforge In-Reply-To: Your message of "Sat, 10 Mar 2001 21:49:54 MST." <200103110449.VAA16481@localhost.localdomain> References: <200103110449.VAA16481@localhost.localdomain> Message-ID: <200103112133.QAA13056@cj20424-a.reston1.va.home.com> > The latest news on > > https://sourceforge.net/projects/pyxml/ > > Is the 0.6.1 release back in October. As a sworn-in developer, you should be able to submit a news item to fix this! Go to "News" and then click on "Submit". If you can't, one of the project admins (e.g. Fred, Andrew or Martin) should do it, or they can give you permission to submit new news items by going into the Admin page. --Guido van Rossum (home page: http://www.python.org/~guido/) From uche.ogbuji@fourthought.com Sun Mar 11 21:58:01 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sun, 11 Mar 2001 14:58:01 -0700 Subject: [XML-SIG] News on Sourceforge In-Reply-To: Message from Guido van Rossum of "Sun, 11 Mar 2001 16:33:59 EST." <200103112133.QAA13056@cj20424-a.reston1.va.home.com> Message-ID: <200103112158.OAA07682@localhost.localdomain> > > The latest news on > > > > https://sourceforge.net/projects/pyxml/ > > > > Is the 0.6.1 release back in October. > > As a sworn-in developer, you should be able to submit a news item to > fix this! Go to "News" and then click on "Submit". If you can't, > one of the project admins (e.g. Fred, Andrew or Martin) should do it, > or they can give you permission to submit new news items by going into > the Admin page. Yes. I should have completed my question. I'm never sure what only admins can do and what only mere developers can. The impression I've developed is that all I can do is check in code, which is why I didn't look to add the news items myself. If I find that I do have permissions, I'll do so. 
More importantly, it would be nice for whoever is releasing a PyXML package to update SF at the same time. Of course it's hard to remember such things, so perhaps we need to make up a release check-list. My first attempt: * Ask all developers to check in (say 72 hours before planned release) - Note: I actually had some fixes in my local repo that would have been nice to get into 0.6.4 (they're in now). I guess I should just check in more often. * Check all test suites (all are in the test directory, except for PyXML/xml/dom/ext/reader/test_suite/Benchmark.py, which looks as if it should just be nuked) * Update any docs * Draft announcement * Update SF page Anything else? -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From guido@digicool.com Sun Mar 11 22:32:47 2001 From: guido@digicool.com (Guido van Rossum) Date: Sun, 11 Mar 2001 17:32:47 -0500 Subject: [XML-SIG] News on Sourceforge In-Reply-To: Your message of "Sun, 11 Mar 2001 14:58:01 MST." <200103112158.OAA07682@localhost.localdomain> References: <200103112158.OAA07682@localhost.localdomain> Message-ID: <200103112232.RAA13985@cj20424-a.reston1.va.home.com> > > As a sworn-in developer, you should be able to submit a news item to > > fix this! Go to "News" and then click on "Submit". If you can't, > > one of the project admins (e.g. Fred, Andrew or Martin) should do it, > > or they can give you permission to submit new news items by going into > > the Admin page. > > Yes. I should have completed my question. I'm never sure what only > admins can do and what only mere developers can. The impression > I've developed is that all I can do is check in code, which is why I > didn't look to add the news items myself. Actually, it's up to the admins to give the "mere" developers additional permissions. 
In the Python project, it is a policy to give all developers all permissions -- because in our view checkin permission (which every developer has) is more powerful than any of the sourceforge admin things, so why not give everybody all permissions! --Guido van Rossum (home page: http://www.python.org/~guido/) From martin@loewis.home.cs.tu-berlin.de Sun Mar 11 18:07:27 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 11 Mar 2001 19:07:27 +0100 Subject: [XML-SIG] PyXML-0.6.4 on BeOS In-Reply-To: <20010311165641.275FE813E@conn.mc.mpls.visi.com> (carlos@eberhardt.net) References: <20010311165641.275FE813E@conn.mc.mpls.visi.com> Message-ID: <200103111807.f2BI7Rn03851@mira.informatik.hu-berlin.de> > elif sys.platform[:4] == 'beos': > FILEMAP_SRC = 'extensions/expat/xmlwf/readfilemap.c' Thanks, I've added this to my local copy of setup.py. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Sun Mar 11 22:57:41 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 11 Mar 2001 23:57:41 +0100 Subject: [XML-SIG] News on Sourceforge In-Reply-To: <200103112158.OAA07682@localhost.localdomain> (message from Uche Ogbuji on Sun, 11 Mar 2001 14:58:01 -0700) References: <200103112158.OAA07682@localhost.localdomain> Message-ID: <200103112257.f2BMvfh00991@mira.informatik.hu-berlin.de> > More importantly, it would be nice for whoever is releasing a PyXML > package to update SF at the same time. Of course it's hard to > remember such things, so perhaps we need to make up a release > check-list. I'm actually following a checklist; the one at the top of the ANNOUNCE file. So far, non of the 0.6.x releases did *all* of the release procedure steps; that was intentional on my part as I otherwise would have released nothing (release early, release often). E.g. in 0.6.4, for the first time, I put a notice on freshmeat. 
That took quite some time in itself, as I had to get a freshmeat account, find the name that freshmeat uses for the package, and update all the outdated information (the last freshmeat announcement was in the 0.5.x series, by amk). As for SF announcements, after posting the 0.6.1 one, I found that it might be pointless - only people looking at the project page will see it, and they see what the recent release is by looking just above that field. It might be useful to post other announcements there, e.g. when important check-ins occur, or related software is released :-) > * Ask all developers to check in (say 72 hours before planned release) > > - Note: I actually had some fixes in my local repo that would have > been nice to get into 0.6.4 (they're in now). I guess I should just > check in more often. For 0.6.4, I sent a message on Feb 20 that I would be releasing it a few days later. I got some useful feedback in response to that message; the release was on Feb 25. > * Check all test suites (all are in the test directory, except for > PyXML/xml/dom/ext/reader/test_suite/Benchmark.py, which looks as if > it should just be nuked) I normally run all of the test directory on Linux and Solaris; this time, I also ran in on WinNT (and noticed that the packaging would forget the output/test_ files due to a bug in distutils). > * Update any docs I normally do that before running the test suite. > * Draft announcement At least for 0.6.4, that happened quite some time before that: revisions 1.10-1.12 all deal with 0.6.4. > * Update SF page So far, this is uploading only. If people feel that I should post a news item also, I can add this to my list. > Anything else? * Place CVS tag on all files * Post announcements (the one to xml-dev always returns since only subscribers can post, and since I was not subscribed and always forgot that restriction) Regards, Martin From fdrake@acm.org Mon Mar 12 02:27:48 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Sun, 11 Mar 2001 21:27:48 -0500 (EST) Subject: [XML-SIG] PyXML-0.6.4 on BeOS In-Reply-To: <200103111807.f2BI7Rn03851@mira.informatik.hu-berlin.de> References: <20010311165641.275FE813E@conn.mc.mpls.visi.com> <200103111807.f2BI7Rn03851@mira.informatik.hu-berlin.de> Message-ID: <15020.13348.988144.898070@cj42289-a.reston1.va.home.com> Martin v. Loewis writes: > Thanks, I've added this to my local copy of setup.py. Then check it in! The first thing I did when I saw the report was to check for checkins, then make the change myself. I saw your note before checking in, but ... part of "release early, release often" is "share code base updates". ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From nobody@sourceforge.net Mon Mar 12 02:42:47 2001 From: nobody@sourceforge.net (nobody) Date: Sun, 11 Mar 2001 18:42:47 -0800 Subject: [XML-SIG] [ pyxml-Bugs-407810 ] xmlproc chokes on lengthy comments Message-ID: Bugs #407810, was updated on 2001-03-11 18:42 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407810&group_id=6473 Category: xmlproc Group: None Status: Open Priority: 5 Submitted By: A.M. Kuchling Assigned to: Lars Marius Garshol Summary: xmlproc chokes on lengthy comments Initial Comment: Lengthy comment blocks cause xmlproc to raise a RuntimeError: "maximum recursion depth exceeded" error. The problem is that a group is used to match an individual character, and SRE recurses on group repeats: '([^-]|-[^-])*'. Fix: would '(.*?)--' be an equivalent pattern? ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=407810&group_id=6473 From martin@loewis.home.cs.tu-berlin.de Mon Mar 12 07:12:32 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Mon, 12 Mar 2001 08:12:32 +0100 Subject: [XML-SIG] PyXML-0.6.4 on BeOS In-Reply-To: <15020.13348.988144.898070@cj42289-a.reston1.va.home.com> (fdrake@acm.org) References: <20010311165641.275FE813E@conn.mc.mpls.visi.com> <200103111807.f2BI7Rn03851@mira.informatik.hu-berlin.de> <15020.13348.988144.898070@cj42289-a.reston1.va.home.com> Message-ID: <200103120712.f2C7CWa01347@mira.informatik.hu-berlin.de> > Then check it in! The first thing I did when I saw the report was > to check for checkins, then make the change myself. I saw your note > before checking in, but ... part of "release early, release often" is > "share code base updates". ;-) No argument about that. Committed. Martin From lkcl@samba-tng.org Mon Mar 12 13:15:46 2001 From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton) Date: Tue, 13 Mar 2001 00:15:46 +1100 Subject: [XML-SIG] Re: tabs inside attribute values removed In-Reply-To: <3AA90FD0.B5777324@fourthought.com> Message-ID: On Fri, 9 Mar 2001, Jeremy Kloth wrote: > > > > i am having to pre-process all text, substituting > > for "\t" as a work-around for this problem. > > > > if this is not performed, then all tabs inside > > attribute's values, e.g. > > , are turned into > > spaces. > > Using PyXML 0.6.4, I didn't see this behavior. > > from xml.dom.ext.reader import Sax2 > doc = Sax2.FromXml('') > attr = doc.documentElement.attributes.item(0) > print repr(attr.value) > 'a\011tab' it's the other way round [and this was with 0.6.2] doc = Sax2.FromXml('') attr = doc.documentElement.attributes.attributes['','attr'].value and should i be using doc.documentElement.attributes['ns','name'].value, is that okay? [ just checked this] it still doesn't work, and it still doesn't work with 0.6.4. so, yes: i have to pre-process all text, substituting \t with which is _not_ something i want to have to leave in the code, long-term, as you might imagine! 
some of the documents i am parsing are over 2.5mb in size, and other people may find larger uses (see http://sourceforge.net/projects/pyxsmqll) yes, i know: i need to move to a Sax model not a DOM one. first implementation, and all that :) all best, luke ----- Luke Kenneth Casson Leighton ----- "i want a world of dreams, run by near-sighted visionaries" "good. that's them sorted out. now, on _this_ world..." From jerome.marant@free.fr Mon Mar 12 13:30:02 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 12 Mar 2001 14:30:02 +0100 Subject: [XML-SIG] 4DOM Message-ID: <7zitlfi085.fsf@amboise.ird.idealx.com> Hi, I made a diff between 4DOM in the 4Suite tarball and 4DOM in PyXML and I found many differences. What kind of changes have been made to it for its inclusion into PyXML and are these changes to be backported to its original place? Thanks. -- Jérôme Marant http://jerome.marant.free.fr From martin@loewis.home.cs.tu-berlin.de Mon Mar 12 18:42:14 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 12 Mar 2001 19:42:14 +0100 Subject: [XML-SIG] 4DOM In-Reply-To: <7zitlfi085.fsf@amboise.ird.idealx.com> (jerome.marant@free.fr) References: <7zitlfi085.fsf@amboise.ird.idealx.com> Message-ID: <200103121842.f2CIgEF01274@mira.informatik.hu-berlin.de> > I made a diff between 4DOM in the 4Suite tarball and 4DOM > in PyXML and I found many differences. What versions exactly have you been comparing? > What kind of changes have been made to it for its inclusion > into PyXML and are these changes to be backported to > its original place? PyXML *is* the original place for 4DOM. Maybe I did not say it loud enough; here is the first item of the 0.6.4 ANNOUNCEMENT: * 4DOM was integrated from 4Suite 0.10.2. 4DOM is now maintained as a part of PyXML. A detailed list of changes can be found in xml/dom/ChangeLog.
Regards, Martin From larsga@garshol.priv.no Mon Mar 12 20:56:14 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 12 Mar 2001 21:56:14 +0100 Subject: [XML-SIG] [ pyxml-Bugs-407288 ] tabs inside attribute values removed In-Reply-To: References: Message-ID: * nobody@sourceforge.net | | Bugs #407288, was updated on 2001-03-09 04:15 | [...] | Initial Comment: | | i am having to pre-process all text, substituting | for "\t" as a work-around for this problem. | | if this is not performed, then all tabs inside | attribute's values, e.g. | , are turned into | spaces. This is the correct behaviour for an XML parser, as mandated by the XML recommendation: | i am storing python code in an attribute value, so i | _must_ have my tabs!!! :) :) Then you must encode them correctly. :-) --Lars M. From msanborn@Adobe.COM Tue Mar 13 00:25:08 2001 From: msanborn@Adobe.COM (Michael Sanborn) Date: Mon, 12 Mar 2001 16:25:08 -0800 Subject: [XML-SIG] Problem installing PyXML-0.6.4 on W2K Message-ID: <4.3.2.7.2.20010312161802.01ed5ee8@mailsj-v1> I was pleased to see the binary installer for PyXML, but I'm finding that it comes to a screen that asks me to "Select python installation to use:" with a blank text pane and a greyed-out text box that I can't type into, so I'm stuck. I'm using a freshly installed Python 1.6.1 on d:\Python161, running Windows 2000. Anyone run into this problem before? If all else fails, can I just extract the PyXML-0.6.4.tar.gz files into a subdirectory of d:\Python161\Lib and ignore the compiling, since there's already a pyexpat.pyd? Thanks, Michael Sanborn From martin@loewis.home.cs.tu-berlin.de Tue Mar 13 04:38:25 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Tue, 13 Mar 2001 05:38:25 +0100 Subject: [XML-SIG] Problem installing PyXML-0.6.4 on W2K In-Reply-To: <4.3.2.7.2.20010312161802.01ed5ee8@mailsj-v1> (msanborn@Adobe.COM) References: <4.3.2.7.2.20010312161802.01ed5ee8@mailsj-v1> Message-ID: <200103130438.f2D4cPK00857@mira.informatik.hu-berlin.de> > I was pleased to see the binary installer for PyXML, but I'm finding that > it comes to a screen that asks me to "Select python installation to use:" > with a blank text pane and a greyed-out text box that I can't type into, so > I'm stuck. I'm using a freshly installed Python 1.6.1 on d:\Python161, > running Windows 2000. Anyone run into this problem before? That is no surprise. The binary installer works for 1.5.2, and 2.0, respectively. Nobody uses or should use Python 1.6, so I recommend to upgrade to 2.0. > If all else fails, can I just extract the PyXML-0.6.4.tar.gz files > into a subdirectory of d:\Python161\Lib and ignore the compiling, > since there's already a pyexpat.pyd? No. The expat.pyd of 1.6.1 is probably horribly broken, so PyXML will not work properly with it. Regards, Martin From frank@quantiva.com Tue Mar 13 22:08:01 2001 From: frank@quantiva.com (Frank Stolze) Date: Tue, 13 Mar 2001 17:08:01 -0500 (EST) Subject: [XML-SIG] SAX parsing Message-ID: Hi, I'm trying to parse an XML stream, i.e., an "infinitely long" XML document. I want to process XML entities in real time as they are being read. That's why I'm using the SAX approach. However, it seems that both the expat parser in Python 2.0 as well as the xmlproc parser in the latest PyXML don't even start to parse until they see an end-of-file. Is that a "known and intented behavior" (which would be a pity as it would make them unusable as stream parsers) or am I wrong? Thanks, Frank From martin@loewis.home.cs.tu-berlin.de Tue Mar 13 22:37:45 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Tue, 13 Mar 2001 23:37:45 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 Message-ID: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> Since a number of bug fixes have been committed to PyXML since 0.6.4, I plan to release 0.6.5 sometime next week. If you have any pending patches that you'd like to see, or if you know of bugs that you think should be (and can be) corrected, please let me know. This will be the last 0.6.x release, to be followed by 0.7, or by 1.0 if too many people complain :-) Regards, Martin From martin@loewis.home.cs.tu-berlin.de Tue Mar 13 22:34:45 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 13 Mar 2001 23:34:45 +0100 Subject: [XML-SIG] SAX parsing In-Reply-To: (message from Frank Stolze on Tue, 13 Mar 2001 17:08:01 -0500 (EST)) References: Message-ID: <200103132234.f2DMYjN02770@mira.informatik.hu-berlin.de> > I'm trying to parse an XML stream, i.e., an "infinitely long" XML > document. I want to process XML entities in real time as they are > being read. That's why I'm using the SAX approach. However, it seems > that both the expat parser in Python 2.0 as well as the xmlproc > parser in the latest PyXML don't even start to parse until they see > an end-of-file. > > Is that a "known and intented behavior" (which would be a pity as > it would make them unusable as stream parsers) or am I wrong? There is a SAX extension in use in PyXML, which is the incremental parser. Not all readers are incremental parsers, but the expat reader is. Please see xml.sax.xmlreader for details; the parse() function of that will invoke feed() every now and then, which in turn will result in content handler events. If you don't see this, it might be that you have to few data available. Or, you did something wrong, which is hard to say without seeing any source code. To get a more reliable behaviour, you can chose to invoke feed() yourself in a loop, by reading chunks of data from your stream. Regards, Martin P.S. 
If you had expected the parser to read one byte at a time, I'll have to disappoint you: that would be so inefficient that nobody has considered it. From uche.ogbuji@fourthought.com Wed Mar 14 03:06:37 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Tue, 13 Mar 2001 20:06:37 -0700 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: Message from "Martin v. Loewis" of "Tue, 13 Mar 2001 23:37:45 +0100." <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> Message-ID: <200103140306.UAA02140@localhost.localdomain> > Since a number of bug fixes have been committed to PyXML since 0.6.4, > I plan to release 0.6.5 sometime next week. If you have any pending > patches that you'd like to see, or if you know of bugs that you think > should be (and can be) corrected, please let me know. This will be the > last 0.6.x release, to be followed by 0.7, or by 1.0 if too many > people complain :-) I think it makes sense to make 0.7 the first release with 4XPath and 4XSLT built in. Then we can burn it in through an 0.7.x cycle and go 1.0 when we're happy with things? Our plans are to release 4Suite 0.10.3 this week or early next. Then it's testing, testing, testing for a month or so and 1.0 in late April. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From jerome.marant@free.fr Wed Mar 14 08:48:06 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 14 Mar 2001 09:48:06 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: Uche Ogbuji's message of "Tue, 13 Mar 2001 20:06:37 -0700" References: <200103140306.UAA02140@localhost.localdomain> Message-ID: <7zn1aopwhl.fsf@amboise.ird.idealx.com> Uche Ogbuji writes: > I think it makes sense to make 0.7 the first release with 4XPath and 4XSLT > built in.
Then we can burn it in through an 0.7.x cycle and go 1.0 when we're > happy with things? BTW, do you plan to merge 4Suite and PyXML? It seems that a growing number of 4Suite components are integrated into PyXML ... What is the future of 4Suite? Thanks. -- Jérôme Marant http://jerome.marant.free.fr From uche.ogbuji@fourthought.com Wed Mar 14 13:42:05 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Wed, 14 Mar 2001 06:42:05 -0700 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: Message from jerome.marant@free.fr (Jérôme Marant) of "14 Mar 2001 09:48:06 +0100." <7zn1aopwhl.fsf@amboise.ird.idealx.com> Message-ID: <200103141342.GAA03805@localhost.localdomain> > Uche Ogbuji writes: > > I think it makes sense to make 0.7 the first release with 4XPath and 4XSLT > > built in. Then we can burn it in through an 0.7.x cycle and go 1.0 when we're > > happy with things? > BTW, do you plan to merge 4Suite and PyXML? No. But I don't think it's a good idea to do so anyway. For one thing, not all of 4Suite is relevant to PyXML. For instance, 4ODS probably wouldn't fit. But also, I think it has worked quite well for the technology to be incubated in 4Suite, and the parts that are of broadest use for Python XML users to migrate to PyXML. I see 4Suite as a sort of PyXML++ for those who want the kittin' kaboodle of XML tools. > It seems that a growing number > of 4Suite components are integrated into PyXML ... Yes, but in some cases there is more to it than simple migration. For instance, we'll be moving 4XPath and 4XSLT to PyXML, but we'll be developing from scratch a new XSLT implementation that will live in 4Suite 1.1 and higher as an alternative to 4XSLT. That way Python will have a mature implementation, and an improved, but experimental implementation. > What is the future of 4Suite? 1.0 probably in late April, which will be mostly what's in CVS now with bug-fixes.
Then 4Suite 1.0.x is maintained as a bug-fix branch while 4XPath and 4XSLT are removed from a 1.1 development branch and the new XSLT processor introduced. So 4Suite will keep on, although we will move over to PyXML whatever makes sense and has consensus (there was much discussion about moving 4XPath and 4XSLT in almost a year ago, but the timing makes more sense now). One note is that since 4XSLT includes PyXML, all this migration should be relatively transparent to the end user (although it can make for some extra work for distributors). -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From jerome.marant@free.fr Wed Mar 14 14:40:13 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 14 Mar 2001 15:40:13 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: Uche Ogbuji's message of "Wed, 14 Mar 2001 06:42:05 -0700" References: <200103141342.GAA03805@localhost.localdomain> Message-ID: <7z8zm8pg6q.fsf@amboise.ird.idealx.com> Uche Ogbuji writes: > No. But I don't think it's a good idea to do so anyway. > > For one thing, not all of 4Suite is relevant to PyXML. For instance, 4ODS > probably wouldn't fit. I agree. > Yes, but in some cases there is more to it than simple migration. For ... > > So 4Suite will keep on, although we will move over to PyXML whatever makes > sense and has consensus (there was much discussion about moving 4XPath and > 4XSLT in almost a year ago, but the timing makes more sense now). Right. I'm the Debian maintainer of the PyXML package and I'm working on packaging 4Suite for Debian (BTW, do you agree with it?).
So, I have to remove PyXML and 4DOM sections as they are provided by the PyXML package (4Suite depends on PyXML) and I was wondering whether I'd have to remove more and more components :-) Thanks for these explanations. Cheers, -- Jérôme Marant http://jerome.marant.free.fr From Alexandre.Fayolle@logilab.fr Wed Mar 14 14:56:19 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Wed, 14 Mar 2001 15:56:19 +0100 (CET) Subject: [XML-SIG] packaging 4Suite In-Reply-To: <7z8zm8pg6q.fsf@amboise.ird.idealx.com> Message-ID: On 14 Mar 2001, Jérôme Marant wrote: > I'm the Debian maintainer of the PyXML package and I'm working on > packaging 4Suite for Debian (BTW, do you agree with it?). So, I have > to remove PyXML and 4DOM sections as they are provided by the PyXML > package (4Suite depends on PyXML) and I was wondering whether I'd > have to remove more and more components :-) Hmm, would it not be easier to have the 4Suite debian package "provide" PyXML, and maybe make both packages conflict (so that one would have to choose between PyXML and 4Suite, knowing that the latter is a strict superset of the former). This said, I'm by no means an expert of the Debian policies... Alexandre Fayolle -- http://www.logilab.com Narval is the first software agent available as free software (GPL). LOGILAB, Paris (France). From jerome.marant@free.fr Wed Mar 14 15:03:26 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 14 Mar 2001 16:03:26 +0100 Subject: [XML-SIG] Re: packaging 4Suite In-Reply-To: Alexandre Fayolle's message of "Wed, 14 Mar 2001 15:56:19 +0100 (CET)" References: Message-ID: <7zk85so0jl.fsf@amboise.ird.idealx.com> Alexandre Fayolle writes: > Hmm, would it not be easier to have the 4Suite debian package "provide" > PyXML, and maybe make both packages conflict (so that one would have to > choose between PyXML and 4Suite, knowing that the latter is a strict > superset of the former). I don't agree.
It is clear, according to the 4Suite documentation, that 4Suite depends on PyXML and extracting PyXML from 4Suite allows this package to use the latest bugfixed PyXML without needing 4Suite to be updated anytime that PyXML changes. Cheers, -- Jérôme Marant http://jerome.marant.free.fr From akuchlin@mems-exchange.org Wed Mar 14 15:33:17 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 14 Mar 2001 10:33:17 -0500 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Tue, Mar 13, 2001 at 11:37:45PM +0100 References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> Message-ID: <20010314103317.C15434@ute.cnri.reston.va.us> On Tue, Mar 13, 2001 at 11:37:45PM +0100, Martin v. Loewis wrote: >patches that you'd like to see, or if you know of bugs that you think >should be (and can be) corrected, please let me know. This will be the If my suggested fix for bug #407810 in xmlproc is correct, it would be trivial to fix. If it's not, this might be more difficult to fix. Lengthy comment blocks cause xmlproc to raise a RuntimeError: "maximum recursion depth exceeded" error. The problem is that a group is used to match an individual character, and SRE recurses on group repeats: '([^-]|-[^-])*'. Fix: would '(.*?)--' be an equivalent pattern? --amk From martin@loewis.home.cs.tu-berlin.de Wed Mar 14 20:47:57 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 14 Mar 2001 21:47:57 +0100 Subject: [XML-SIG] Re: packaging 4Suite In-Reply-To: <7zk85so0jl.fsf@amboise.ird.idealx.com> (jerome.marant@free.fr) References: <7zk85so0jl.fsf@amboise.ird.idealx.com> Message-ID: <200103142047.f2EKlvf01502@mira.informatik.hu-berlin.de> > I don't agree.
It is clear, according to the 4Suite documentation, > that 4Suite depends on PyXML and extracting PyXML from 4Suite > allows this package to use the lattest bugfixed PyXML without > needing 4Suite to be updated anytime that PyXML changes. That is up to your packaging. In theory, you are right: 4Suite is meant as a strict superset. In practice, there is more dependence between the two than we'd like, atleast at the moment. 4Suite needs *atleast* the most recent snapshot of PyXML, and I personally cannot guarantee that future releases of PyXML won't break older 4Suite releases (although there is a clear intention to be backwards compatible if possible). Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Mar 14 20:38:31 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 14 Mar 2001 21:38:31 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <7zn1aopwhl.fsf@amboise.ird.idealx.com> (jerome.marant@free.fr) References: <200103140306.UAA02140@localhost.localdomain> <7zn1aopwhl.fsf@amboise.ird.idealx.com> Message-ID: <200103142038.f2EKcV701477@mira.informatik.hu-berlin.de> > BTW, do you plan to merge 4Suite and PyXML? It seems that a > growing number of 4Suite components are integrated into PyXML ... > What is the future of 4Suite? Uche has already explained his view, so let me add mine. Personally, I feel that PyXML "owns" the xml package, and I see my responsibility in getting all the components in it to work together. That is what makes integrating xml.xpath and xml.xslt interesting (although there certainly also is the challenge of doing it in pure Python which makes it interesting). For everything in Ft.*, I won't push integration into PyXML. From the XML point of view, that means that the Domlettes may never show up in PyXML - unless Fourthought wants to contribute them. There is other stuff in 4Suite that clearly does not belong into PyXML, also. 
Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Mar 14 20:43:49 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 14 Mar 2001 21:43:49 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <7z8zm8pg6q.fsf@amboise.ird.idealx.com> (jerome.marant@free.fr) References: <200103141342.GAA03805@localhost.localdomain> <7z8zm8pg6q.fsf@amboise.ird.idealx.com> Message-ID: <200103142043.f2EKhn201479@mira.informatik.hu-berlin.de> > I'm the Debian maintainer of the PyXML package and I'm working on > packaging 4Suite for Debian (BTW, do you agree with it?). So, I have > to remove PyXML and 4DOM sections as they are provided by the PyXML > package (4Suite depends on PyXML) and I was wondering whether I'd > have to remove more and more components :-) At the moment, you have two options: a) you can declare PyXML as a prerequisite of 4Suite; in that case, I'd appreciate if you'd restrict to released versions of PyXML only - no matter how broken they are. b) you can declare PyXML and 4Suite to be conflicting packages (don't know whether this is possible in Debian packaging); your 4Suite package would then incorporate a copy of PyXML. If you follow this route, you can chose whatever state of PyXML that is useful; just make sure that either PyXML or 4Suite properly supercedes any Python 2 package that might be also available (but I know that Debian refuses to offer Python 2 for political reasons) Regards, Martin P.S. No, I don't mean to start a flame war on licensing :-) Python licensing will hopefully sort out with 2.1. From martin@loewis.home.cs.tu-berlin.de Wed Mar 14 21:32:44 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. 
Loewis) Date: Wed, 14 Mar 2001 22:32:44 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <20010314103317.C15434@ute.cnri.reston.va.us> (message from Andrew Kuchling on Wed, 14 Mar 2001 10:33:17 -0500) References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <20010314103317.C15434@ute.cnri.reston.va.us> Message-ID: <200103142132.f2ELWiM02058@mira.informatik.hu-berlin.de> > If my suggested fix for bug #407810 in xmlproc is correct, it would be > trivial to fix. If it's not, this might be more difficult to fix. > > Lengthy comment blocks cause xmlproc to raise a > RuntimeError: "maximum recursion depth exceeded" > error. The problem is that a group is used to match an > individual character, and SRE recurses > on group repeats: '([^-]|-[^-])*'. > > Fix: would '(.*?)--' be an equivalent pattern? I must admit that *? was new to me, but it appears to be extremely useful, and that appears to be the right use for it. IOW, I think your fix is correct (and probably more efficient in day-to-day use, also). Regards, Martin P.S. Could you take another look at the patches that have been assigned to you; if not, can you unassign them? P.P.S. Recently, I could not assign anything to None on SF, so the last "can" is not only "are you willing to", but also "are you capable of" :-? From akuchlin@mems-exchange.org Wed Mar 14 21:41:45 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Wed, 14 Mar 2001 16:41:45 -0500 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <200103142132.f2ELWiM02058@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, Mar 14, 2001 at 10:32:44PM +0100 References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <20010314103317.C15434@ute.cnri.reston.va.us> <200103142132.f2ELWiM02058@mira.informatik.hu-berlin.de> Message-ID: <20010314164145.K15434@ute.cnri.reston.va.us> On Wed, Mar 14, 2001 at 10:32:44PM +0100, Martin v. Loewis wrote: >P.S. 
Could you take another look at the patches that have been >assigned to you; if not, can you unassign them? I thought that "[#403408] xml/marshal/wddx.py mods" was being revised by the author. The patch is dated Jan. 24, but there are subsequent discussions about revising them further, and I thought the patch was on hold pending further changes. I've added a comment asking Robin if I should just check in the current patches. (Annoying thing about SF's new patch mailings: I have no idea who the notifications are going to; is Robin even seeing them?) >P.P.S. Recently, I could not assign anything to None on SF, so the >last "can" is not only "are you willing to", but also "are you capable >of" :-? It does work for me; I unassigned the WDDX patches, and then promptly assigned them back to me. --amk From martin@loewis.home.cs.tu-berlin.de Wed Mar 14 22:01:38 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 14 Mar 2001 23:01:38 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <20010314164145.K15434@ute.cnri.reston.va.us> (message from Andrew Kuchling on Wed, 14 Mar 2001 16:41:45 -0500) References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <20010314103317.C15434@ute.cnri.reston.va.us> <200103142132.f2ELWiM02058@mira.informatik.hu-berlin.de> <20010314164145.K15434@ute.cnri.reston.va.us> Message-ID: <200103142201.f2EM1cp02263@mira.informatik.hu-berlin.de> > Annoying thing about SF's new patch mailings: I have no idea who the > notifications are going to; is Robin even seeing them? I think so, yes: everybody who ever made a comment to the issue, plus the submitter, plus the responsible developer gets a copy (that often meant I get multiple copies - the algorithm appears to play on the safe side). I agree SF should show *whom* it send a message to. 
Regards, Martin From larsga@garshol.priv.no Wed Mar 14 23:11:14 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 15 Mar 2001 00:11:14 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <20010314103317.C15434@ute.cnri.reston.va.us> References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <20010314103317.C15434@ute.cnri.reston.va.us> Message-ID: * Andrew Kuchling | | If my suggested fix for bug #407810 in xmlproc is correct, it would | be trivial to fix. If it's not, this might be more difficult to | fix. I don't think it is correct, but I need to look more closely at it. I'm in the process of doing so now, but am somewhat hampered by not having my test suite working. --Lars M. From larsga@garshol.priv.no Thu Mar 15 00:17:52 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 15 Mar 2001 01:17:52 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <20010314103317.C15434@ute.cnri.reston.va.us> References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <20010314103317.C15434@ute.cnri.reston.va.us> Message-ID: * Andrew Kuchling | | If my suggested fix for bug #407810 in xmlproc is correct, it would | be trivial to fix. If it's not, this might be more difficult to | fix. The fix turned out to be wrong, but luckily the problem wasn't very hard to fix. I've fixed it now both in my CVS tree and in the PyXML CVS tree. I've also done most of the hard work in cleaning up the test suite and making it read for a move to the PyXML test suite. I hope to be able to do the rest soon. --Lars M. From uche.ogbuji@fourthought.com Thu Mar 15 00:28:19 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Wed, 14 Mar 2001 17:28:19 -0700 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: Message from jerome.marant@free.fr (J r me Marant) of "14 Mar 2001 15:40:13 +0100." 
<7z8zm8pg6q.fsf@amboise.ird.idealx.com> Message-ID: <200103150028.RAA18167@localhost.localdomain> > > Yes, but in some cases there is more to it than simple migration. For > ... > > So 4Suite will keep on, although we will move over to PyXML whatever makes > > sense and has consensus (there was much discussion about moving 4XPath and > > 4XSLT in almost a year ago, but the timing makes more sense now). > Right. > I'm the Debian maintainer of the PyXML package and I'm working on > packaging 4Suite for Debian (BTW, do you agree with it?). Absolutely? I thank you. > So, I have > to remove PyXML and 4DOM sections as they are provided by the PyXML > package (4Suite depends on PyXML) and I was wondering whether I'd > have to remove more and more components :-) I'm sorry the 4Suite/PyXML combo causes headaches for distributors, but as Martin suggests, I think there are workarounds. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From martin@loewis.home.cs.tu-berlin.de Thu Mar 15 06:25:36 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Thu, 15 Mar 2001 07:25:36 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: (message from Lars Marius Garshol on 15 Mar 2001 01:17:52 +0100) References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <20010314103317.C15434@ute.cnri.reston.va.us> Message-ID: <200103150625.f2F6PaT01170@mira.informatik.hu-berlin.de> > The fix turned out to be wrong, but luckily the problem wasn't very > hard to fix. Thanks for looking into this. Doing a forward string search looks like the better solution, anyway.
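The two patterns quoted in this thread, and the forward string search Martin mentions, can be compared in a small sketch. This is modern Python written for illustration only and is not xmlproc's code; the helper name comment_body_by_search is hypothetical. The newline behaviour shown is just one observable way the two patterns differ, and is not necessarily the reason the proposed fix was rejected.

```python
import re

# Pattern quoted in the thread: the group repeats once per character
# (or once per "-x" pair), mirroring the rule that "--" may not occur
# inside an XML comment.
GROUPED = re.compile(r'([^-]|-[^-])*')
# Replacement proposed in the thread: a non-greedy run up to the first "--".
LAZY = re.compile(r'(.*?)--')


def comment_body_by_search(data, start):
    """Return (body, end) for a comment whose body begins at data[start].

    Hypothetical helper: scans forward for the first "--" and requires
    it to be immediately followed by ">".
    """
    end = data.find('--', start)
    if end == -1:
        raise ValueError('unterminated comment')
    if data[end:end + 3] != '-->':
        raise ValueError('"--" is not allowed inside a comment')
    return data[start:end], end + 3


text = 'line one\nline two-->rest'

# A negated character class matches newlines, so the grouped pattern
# consumes the whole two-line body.
print(repr(GROUPED.match(text).group(0)))
# Without re.DOTALL, "." stops at the newline, so the lazy pattern
# cannot reach the "--" from the start of a multi-line body.
print(LAZY.match(text))
# The plain search needs no regex engine and no per-character group repeat.
print(comment_body_by_search(text, 0))
```

The search-based version does a single str.find call per comment, so its cost does not depend on how the regex engine handles repeated groups.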
Regards, Martin From jerome.marant@free.fr Thu Mar 15 09:31:31 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 15 Mar 2001 10:31:31 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: "Martin v. Loewis"'s message of "Wed, 14 Mar 2001 21:43:49 +0100" References: <200103141342.GAA03805@localhost.localdomain> <7z8zm8pg6q.fsf@amboise.ird.idealx.com> <200103142043.f2EKhn201479@mira.informatik.hu-berlin.de> Message-ID: <7zu24ve5u4.fsf@amboise.ird.idealx.com> "Martin v. Loewis" writes: > At the moment, you have two options: > > a) you can declare PyXML as a prerequisite of 4Suite; in that case, > I'd appreciate if you'd restrict to released versions of PyXML only > - no matter how broken they are. This option is the most elegant, IMHO, and the one I chose. Hence, you avoid bloating by providing strictly different components, and you do not forbid 4Suite users to use the latest bugfixed version of PyXML. I'm trying to follow what's happening on the list to keep informed and it is my job to make decisions when something is broken: I can easily make changes to packages. Cheers, > P.S. No, I don't mean to start a flame war on licensing :-) Python > licensing will hopefully sort out with 2.1. I don't like flame wars either. We are glad to see that this problem will be worked out in 2.1 (as it was recently with 1.6.1). Until then, we must have multiple versions of packages for both 1.5.x and 2.0. -- Jérôme Marant http://jerome.marant.free.fr From jerome.marant@free.fr Fri Mar 16 08:50:24 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 16 Mar 2001 09:50:24 +0100 Subject: [XML-SIG] setup.py question In-Reply-To: "Martin v. Loewis"'s message of "Tue, 13 Mar 2001 23:37:45 +0100" References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> Message-ID: <7z8zm66qsv.fsf@amboise.ird.idealx.com> Hi, Is there a good reason for installing PyXML in the _xmlplus for Python 2.0 rather than xml for the previous versions?
This change is breaking applications which are using import xml. Thanks. -- Jérôme Marant http://jerome.marant.free.fr From Alexandre.Fayolle@logilab.fr Fri Mar 16 09:24:29 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Fri, 16 Mar 2001 10:24:29 +0100 (CET) Subject: [XML-SIG] setup.py question In-Reply-To: <7z8zm66qsv.fsf@amboise.ird.idealx.com> Message-ID: On 16 Mar 2001, Jérôme Marant wrote: > > Hi, > > Is there a good reason for installing PyXML in the _xmlplus > for Python 2.0 rather than xml for the previous versions? > This change is breaking applications which are using > import xml. Using xml would conflict with the core xml module in Python 2.0. There is a change in the __init__.py of the core xml package which checks for _xmlplus and uses it if it is found, so this should not break Python 1.5 applications using import xml to import PyXML. The issue was discussed in August (a little) and September 2000 (a lot), and kept the list busy for quite a while. You may want to check the archives (http://mail.python.org/pipermail/xml-sig/2000-September/thread.html). The threads were 'Python Package Name', 'Uniform interface with Python 2.0', 'namespace collision between lib/xml and site-packages/xml'. Cheers. Alexandre Fayolle -- http://www.logilab.com Narval is the first software agent available as free software (GPL). LOGILAB, Paris (France).
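The mechanism described in the message above (the core xml package checking for _xmlplus and using it if found) can be sketched as follows. The package names basepkg and _basepkgplus and the function load_with_override are hypothetical stand-ins chosen so the example runs on any modern Python; the sketch illustrates the general idea of rebinding a sys.modules entry and is not a copy of the actual __init__.py.

```python
import importlib
import sys
import types

# Hypothetical stand-ins: "basepkg" plays the role of the core xml package,
# "_basepkgplus" plays the role of the add-on installed as _xmlplus.
addon = types.ModuleType('_basepkgplus')
addon.origin = 'add-on'
sys.modules['_basepkgplus'] = addon


def load_with_override(name, override_name):
    """Register the override package under the core name if it imports.

    If the override cannot be imported, a plain stand-in for the core
    package is registered instead.
    """
    try:
        override = importlib.import_module(override_name)
    except ImportError:
        core = types.ModuleType(name)
        core.origin = 'core'
        sys.modules[name] = core
    else:
        # Later "import basepkg" statements now receive the add-on.
        sys.modules[name] = override
    return sys.modules[name]


print(load_with_override('basepkg', '_basepkgplus').origin)   # add-on present
print(load_with_override('otherpkg', '_missingplus').origin)  # add-on absent
```

Under this scheme, code that says import basepkg keeps working unchanged whether or not the add-on is installed, which matches the compatibility goal described in the message.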
From johann@egenetics.com Fri Mar 16 09:23:12 2001 From: johann@egenetics.com (Johann Visagie) Date: Fri, 16 Mar 2001 11:23:12 +0200 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <7z8zm8pg6q.fsf@amboise.ird.idealx.com>; from jerome.marant@free.fr on Wed, Mar 14, 2001 at 03:40:13PM +0100 References: <200103141342.GAA03805@localhost.localdomain> <7z8zm8pg6q.fsf@amboise.ird.idealx.com> Message-ID: <20010316112311.E4464@fling.sanbi.ac.za> Jérôme Marant on 2001-03-14 (Wed) at 15:40:13 +0100: > > I'm the Debian maintainer of the PyXML package and I'm working on > packaging 4Suite for Debian (BTW, do you agree with it?). So, I have > to remove PyXML and 4DOM sections as they are provided by the PyXML > package (4Suite depends on PyXML) and I was wondering whether I'd > have to remove more and more components :-) I'm glad to see I'm not the only one having these problems. :-) I took over maintainership of the PyXML port in the FreeBSD ports tree last November. Since then, we managed to solve some subtle dependency problems caused by PyXML installing in different locations under Python 2.0 and earlier versions, but I have yet to face up to the monster that is the proper integration of 4Suite and PyXML ports. Currently, therefore, FreeBSD has no 4Suite port. I hope this will change soon. :-) This thread has been most informative, thanks. -- Johann From jerome.marant@free.fr Fri Mar 16 10:02:46 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 16 Mar 2001 11:02:46 +0100 Subject: [XML-SIG] setup.py question In-Reply-To: Alexandre Fayolle's message of "Fri, 16 Mar 2001 10:24:29 +0100 (CET)" References: Message-ID: <7zitla58vt.fsf@amboise.ird.idealx.com> Alexandre Fayolle writes: > Using xml would conflict with the core xml module in Python 2.0. There is > a change in the __init__.py of the core xml package which checks for > _xmlplus and uses it if it is found, so this should not break Python 1.5 > applications using import xml to import PyXML.
Thanks! -- Jérôme Marant http://jerome.marant.free.fr From jerome.marant@free.fr Fri Mar 16 10:05:48 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 16 Mar 2001 11:05:48 +0100 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: Johann Visagie's message of "Fri, 16 Mar 2001 11:23:12 +0200" References: <200103141342.GAA03805@localhost.localdomain> <7z8zm8pg6q.fsf@amboise.ird.idealx.com> <20010316112311.E4464@fling.sanbi.ac.za> Message-ID: <7zelvy58qr.fsf@amboise.ird.idealx.com> Johann Visagie writes: > I'm glad to see I'm not the only one having these problems. :-) I took over > maintainership of the PyXML port in the FreeBSD ports tree last November. > Since then, we managed to solve some subtle dependency problems caused by > PyXML installing in different locations under Python 2.0 and earlier At the moment we are able to install both 1.5 and 2.0 on the same Debian system. Then we do provide 1.5 and 2.0 versions of the same package. -- Jérôme Marant http://jerome.marant.free.fr From uche.ogbuji@fourthought.com Fri Mar 16 12:53:13 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Fri, 16 Mar 2001 05:53:13 -0700 Subject: [XML-SIG] Preparing for PyXML 0.6.5 References: <200103141342.GAA03805@localhost.localdomain> <7z8zm8pg6q.fsf@amboise.ird.idealx.com> <20010316112311.E4464@fling.sanbi.ac.za> Message-ID: <3AB20CB9.CBE80F17@fourthought.com> Johann Visagie wrote: > > Jérôme Marant on 2001-03-14 (Wed) at 15:40:13 +0100: > > > > I'm the Debian maintainer of the PyXML package and I'm working on > > packaging 4Suite for Debian (BTW, do you agree with it?). So, I have > > to remove PyXML and 4DOM sections as they are provided by the PyXML > > package (4Suite depends on PyXML) and I was wondering whether I'd > > have to remove more and more components :-) > > I'm glad to see I'm not the only one having these problems. :-) I took over > maintainership of the PyXML port in the FreeBSD ports tree last November.
> Since then, we managed to solve some subtle dependency problems caused by > PyXML installing in different locations under Python 2.0 and earlier > versions, but I have yet to face up to the monster that is the proper > integration of 4Suite and PyXML ports. Currently, therefore, FreeBSD has no > 4Suite port. I hope this will change soon. :-) Actually, this is not true. See http://www.4suite.org/FAQ.epy#1.1 Of course the latest version is 0.10.1, but it looks as if Peter was able to tackle the problems. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From johann@egenetics.com Fri Mar 16 13:33:16 2001 From: johann@egenetics.com (Johann Visagie) Date: Fri, 16 Mar 2001 15:33:16 +0200 Subject: [XML-SIG] Preparing for PyXML 0.6.5 In-Reply-To: <3AB20CB9.CBE80F17@fourthought.com>; from uche.ogbuji@fourthought.com on Fri, Mar 16, 2001 at 05:53:13AM -0700 References: <200103141342.GAA03805@localhost.localdomain> <7z8zm8pg6q.fsf@amboise.ird.idealx.com> <20010316112311.E4464@fling.sanbi.ac.za> <3AB20CB9.CBE80F17@fourthought.com> Message-ID: <20010316153316.A17768@fling.sanbi.ac.za> Uche Ogbuji on 2001-03-16 (Fri) at 05:53:13 -0700: > > > but I have yet to face up to the monster that is the proper > > integration of 4Suite and PyXML ports. Currently, therefore, FreeBSD has no > > 4Suite port. I hope this will change soon. :-) > > Actually, this is not true. > > See > > http://www.4suite.org/FAQ.epy#1.1 Hmm. This port has not been committed to the FreeBSD ports tree, and is therefore not an "official" FreeBSD port. (FreeBSD ports are installed as part of the OS in /usr/ports, and most FreeBSD users would update their ports tree regularly via CVSup or similar.)
I now notice that it has been submitted several times, but looking at it I would guess the reason why it has not been committed is that it suffers from the same problems Jérôme originally mentioned. For instance, it does not attempt peaceful cohabitation with the PyXML port. -- Johann From fdrake@acm.org Fri Mar 16 13:33:59 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 16 Mar 2001 08:33:59 -0500 (EST) Subject: [XML-SIG] setup.py question In-Reply-To: <7z8zm66qsv.fsf@amboise.ird.idealx.com> References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <7z8zm66qsv.fsf@amboise.ird.idealx.com> Message-ID: <15026.5703.555244.547926@cj42289-a.reston1.va.home.com> Jérôme Marant writes: > Is there a good reason for installing PyXML in the _xmlplus > for Python 2.0 rather than xml for the previous versions? > This change is breaking applications which are using > import xml. Python 2.0 provides an "xml" package already, but PyXML is an upgrade to that package. PyXML should be installing as "_xmlplus" for Python 2.0+, and as "xml" for all older versions of Python. Can you detail the combination of releases that breaks for you? Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From jerome.marant@free.fr Fri Mar 16 14:37:04 2001 From: jerome.marant@free.fr (Jérôme Marant) Date: 16 Mar 2001 15:37:04 +0100 Subject: [XML-SIG] setup.py question In-Reply-To: "Fred L. Drake, Jr."'s message of "Fri, 16 Mar 2001 08:33:59 -0500 (EST)" References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <7z8zm66qsv.fsf@amboise.ird.idealx.com> <15026.5703.555244.547926@cj42289-a.reston1.va.home.com> Message-ID: <7zbsr14w6n.fsf@amboise.ird.idealx.com> "Fred L. Drake, Jr." writes: > Python 2.0 provides an "xml" package already, but PyXML is an > upgrade to that package. PyXML should be installing as "_xmlplus" for > Python 2.0+, and as "xml" for all older versions of Python.
> Can you detail the combination of releases that breaks for you? > Thanks! Well, I can see the problem now. It is related to the way the interpreter is packaged in Debian: we usually split packages in several parts (thematically) in order not to bloat the system. For instance, python2-xmlbase contains the core xml library. My problem is that I made pyxml conflict with python2-xmlbase so that we cannot have 2 xml implementations at a time. So, it breaks applications since import xml does not work any more. After reading your remark, would you say that the core xml package is mandatory for pyxml? If so, I can peacefully remove the "conflict". If not, I'll have to rename _xmlplus to xml. Thanks! -- Jérôme Marant http://jerome.marant.free.fr From fdrake@acm.org Fri Mar 16 15:12:06 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 16 Mar 2001 10:12:06 -0500 (EST) Subject: [XML-SIG] setup.py question In-Reply-To: <7zbsr14w6n.fsf@amboise.ird.idealx.com> References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <7z8zm66qsv.fsf@amboise.ird.idealx.com> <15026.5703.555244.547926@cj42289-a.reston1.va.home.com> <7zbsr14w6n.fsf@amboise.ird.idealx.com> Message-ID: <15026.11590.178156.375665@localhost.localdomain> Jérôme Marant writes: > Well, I can see the problem now. It is related to the way the > interpreter is packaged in Debian: we usually split packages in > several parts (thematically) in order not to bloat the system. > For instance, python2-xmlbase contains the core xml library. > My problem is that I made pyxml conflict with python2-xmlbase > so that we cannot have 2 xml implementations at a time. > So, it breaks applications since import xml does not work any > more. Hmm... the reason for moving some of it into the core was to ensure that all installations have at least basic XML support if pyexpat could compile; is pyexpat part of xmlbase?
(And not all of the xml package depends on pyexpat; even using another parser, the xml.dom and xml.sax packages provide the needed exceptions and constants for a number of the XML APIs. The node type constants are one example; such things need to be found in a single location for fully general API compatibility.)

> After reading your remark, would do say that the core xml
> package is mandatory for pyxml ? If so, i can peacefully
> remove the "conflict". If not, I'll have to rename _xmlplus
> to xml.

I'd make xmlbase mandatory for PyXML; it includes the magic needed to make PyXML take precedence if present.

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From martin@loewis.home.cs.tu-berlin.de Fri Mar 16 17:41:05 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 16 Mar 2001 18:41:05 +0100
Subject: [XML-SIG] setup.py question
In-Reply-To: <7zbsr14w6n.fsf@amboise.ird.idealx.com> (jerome.marant@free.fr)
References: <200103132237.f2DMbjm02774@mira.informatik.hu-berlin.de> <7z8zm66qsv.fsf@amboise.ird.idealx.com> <15026.5703.555244.547926@cj42289-a.reston1.va.home.com> <7zbsr14w6n.fsf@amboise.ird.idealx.com>
Message-ID: <200103161741.f2GHf5I00943@mira.informatik.hu-berlin.de>

> After reading your remark, would do say that the core xml
> package is mandatory for pyxml ?

From the point of your packaging strategy, it is. But as Fred, I'd strongly discourage splitting the Core Python distribution. The major strength of Python is the "batteries included" aspect. So for the Python core, I'd rather encourage an all-or-nothing position.

Do you have any feedback how many administrators would choose a "partial" installation? How could an administrator know what kind of libraries her users need? Sometimes, offering choices only complicates matters instead of simplifying them. The people will show up on python-help or python-tutor and ask what happened to the supposed XML support of Python 2.0, since they did not get it on their system.
Regards,
Martin

From akuchlin@mems-exchange.org Sat Mar 17 05:51:42 2001
From: akuchlin@mems-exchange.org (A.M.
Kuchling)
Date: Sat, 17 Mar 2001 00:51:42 -0500
Subject: [XML-SIG] ANN: quotation-tools 0.0.3 released
Message-ID: <200103170551.AAA02154@mira.erols.com>

I've made a new release of quotation-tools, a package for processing QEL. With this release, I'm finished with hacking on the command-line for the moment; the next task is going to be a Tkinter GUI.

The package is available from http://www.amk.ca/qel/software.html .

--amk

Changes in version 0.0.3 and version 0.0.2:

        * New scripts: qtmerge for merging several QEL files into one, and fortune2qel to convert fortune's files into QEL.
        * Implemented the QELdb class, which acts as a fast cache for a (potentially large) QEL file.
        * Added docstrings so pydoc can produce some helpful output.
        * Added XML output format to qtformat; this now pretty-prints QEL.
        * Added -c option to qtgrep, to cause it to just print the number of matching quotations for each file searched.
        * Added a CSS1 stylesheet, xml/qel.css, for formatting QEL.
        * Fixed bugs in dealing with the element.
        * Fixed bugs in dealing with the
element. From noreply@sourceforge.net Sun Mar 18 23:03:08 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Sun, 18 Mar 2001 15:03:08 -0800 Subject: [XML-SIG] [ pyxml-Bugs-409605 ] reader.HtmlLib ignores optional starttag Message-ID: Bugs item #409605, was updated on 2001-03-18 15:03 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=409605&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: Martin v. Löwis (loewis) Assigned to: Nobody/Anonymous (nobody) Summary: reader.HtmlLib ignores optional starttag Initial Comment: Given the document good_html = """

I prefer (all things being equal) regularity/orthogonality and logical syntax/semantics in a language because there is less to have to remember. (Of course I know all things are NEVER really equal!)

Guido van Rossum, 6 Dec 91

The details of that silly code are irrelevant.

Tim Peters, 4 Mar 92 & < > é ö   """ the reader should imply the tag when it sees the first p element. Instead, it will drop the p element, as it is not directly allowed inside of the html element. Still, the document is valid, so the reader should build the P elements into the tree. To see the error, do from xml.dom.ext.reader import HtmlLib b = HtmlLib.FromHtml(good_html) print b.firstChild.firstChild ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=409605&group_id=6473 From cce@clarkevans.com Mon Mar 19 11:03:37 2001 From: cce@clarkevans.com (Clark C. Evans) Date: Mon, 19 Mar 2001 06:03:37 -0500 (EST) Subject: [XML-SIG] Getting namespace aware parser to work... In-Reply-To: <200103170551.AAA02154@mira.erols.com> Message-ID: I'm trying to process the following xml file, with this python script to strip all elements with a given namespace. I believe that I have a pretty recent version (0.5.2). I get the error following... ----------------------------------------------------------------- test.xml ----------------------------------------------------------------- strip keep ----------------------------------------------------------------- test.py ----------------------------------------------------------------- """Strips a particular namespace from an XML document.""" from xml.sax import saxutils class StripperFilter(saxutils.XMLFilterBase ): """Does the actual stripping""" def __init__(self,nmsp): """The namespace to strip is nmsp""" saxutils.XMLFilterBase.__init__(self) self.nmsp = nmsp def startElementNS(self, name, qname, attrs): """Ignores elements and strips attributes of nmsp""" if name[0] != self.nmsp: # # Warning: For efficiency this dives into the # underlying representation of AttributesNSImpl # and deletes attributes to be stripped. # # _attrs should be of the form {(ns_uri, lname): value, ...}. 
# _qnames of the form {(ns_uri, lname): qname, ...}.""" # for (ns_uri,lname) in attrs._attrs.keys(): if nmsp == ns_uri: del attrs._attrs[(ns_uri,lname)] saxutils.XMLFilterBase.startElementNS(self,name,qname,attrs) from xml.sax import make_parser from xml.sax.handler import feature_namespaces def testStripper(): parser = make_parser() parser.setFeature(feature_namespaces, 1) strip = StripperFilter('myuri') out = saxutils.XMLGenerator() strip.setContentHandler(out) parser.setContentHandler(strip) parser.parse("c:\\work\\xfld\\test.xml") if __name__ == '__main__': testStripper() ---------------------------------------------------------------------- The error message ---------------------------------------------------------------------- stripTraceback (most recent call last): File "", line 40, in ? File "", line 37, in testStripper File "F:\Program Files\Python\_xmlplus\sax\expatreader.py", line 43, in parse xmlreader.IncrementalParser.parse(self, source) File "F:\Program Files\Python\_xmlplus\sax\xmlreader.py", line 120, in parse self.feed(buffer) File "F:\Program Files\Python\_xmlplus\sax\expatreader.py", line 87, in feed self._parser.Parse(data, isFinal) File "F:\Program Files\Python\_xmlplus\sax\expatreader.py", line 187, in end_element_ns self._cont_handler.endElementNS(pair, None) File "F:\Program Files\Python\_xmlplus\sax\saxutils.py", line 259, in endElementNS self._cont_handler.endElementNS(name, qname) File "F:\Program Files\Python\_xmlplus\sax\saxutils.py", line 192, in endElementNS qname = self._current_context[name[0]] + ":" + name[1] TypeError: bad operand type(s) for + From cce@clarkevans.com Mon Mar 19 11:07:10 2001 From: cce@clarkevans.com (Clark C. Evans) Date: Mon, 19 Mar 2001 06:07:10 -0500 (EST) Subject: [XML-SIG] Re: Getting namespace aware parser to work... In-Reply-To: Message-ID: I believe the problem is default namespaces that do not have a prefix. 
The stripper gives the expected (and, of course incorrect as it's not finished) output when the test.xml file is changed to: one keep So... does the namespace aware code handle the case when a namespace is not in the lookup table? Clark On Mon, 19 Mar 2001, Clark C. Evans wrote: > Date: Mon, 19 Mar 2001 06:03:37 -0500 (EST) > From: Clark C. Evans > To: xml-sig@python.org > Subject: Getting namespace aware parser to work... > > I'm trying to process the following xml file, with > this python script to strip all elements with a > given namespace. I believe that I have a pretty > recent version (0.5.2). I get the error following... > > ----------------------------------------------------------------- > test.xml > ----------------------------------------------------------------- > > > strip > keep > > > ----------------------------------------------------------------- > test.py > ----------------------------------------------------------------- > > """Strips a particular namespace from an XML document.""" > from xml.sax import saxutils > > class StripperFilter(saxutils.XMLFilterBase ): > """Does the actual stripping""" > def __init__(self,nmsp): > """The namespace to strip is nmsp""" > saxutils.XMLFilterBase.__init__(self) > self.nmsp = nmsp > > def startElementNS(self, name, qname, attrs): > """Ignores elements and strips attributes of nmsp""" > if name[0] != self.nmsp: > # > # Warning: For efficiency this dives into the > # underlying representation of AttributesNSImpl > # and deletes attributes to be stripped. > # > # _attrs should be of the form {(ns_uri, lname): value, ...}. 
> # _qnames of the form {(ns_uri, lname): qname, ...}.""" > # > for (ns_uri,lname) in attrs._attrs.keys(): > if nmsp == ns_uri: del attrs._attrs[(ns_uri,lname)] > saxutils.XMLFilterBase.startElementNS(self,name,qname,attrs) > > > from xml.sax import make_parser > from xml.sax.handler import feature_namespaces > > def testStripper(): > parser = make_parser() > parser.setFeature(feature_namespaces, 1) > strip = StripperFilter('myuri') > out = saxutils.XMLGenerator() > strip.setContentHandler(out) > parser.setContentHandler(strip) > parser.parse("c:\\work\\xfld\\test.xml") > > if __name__ == '__main__': > testStripper() > > ---------------------------------------------------------------------- > The error message > ---------------------------------------------------------------------- > > > stripTraceback (most recent call last): > File "", line 40, in ? > File "", line 37, in testStripper > File "F:\Program Files\Python\_xmlplus\sax\expatreader.py", line 43, in > parse > xmlreader.IncrementalParser.parse(self, source) > File "F:\Program Files\Python\_xmlplus\sax\xmlreader.py", line 120, in > parse > self.feed(buffer) > File "F:\Program Files\Python\_xmlplus\sax\expatreader.py", line 87, in > feed > self._parser.Parse(data, isFinal) > File "F:\Program Files\Python\_xmlplus\sax\expatreader.py", line 187, in > end_element_ns > self._cont_handler.endElementNS(pair, None) > File "F:\Program Files\Python\_xmlplus\sax\saxutils.py", line 259, in > endElementNS > self._cont_handler.endElementNS(name, qname) > File "F:\Program Files\Python\_xmlplus\sax\saxutils.py", line 192, in > endElementNS > qname = self._current_context[name[0]] + ":" + name[1] > TypeError: bad operand type(s) for + > > > > > > > > > > From cce@clarkevans.com Mon Mar 19 11:38:47 2001 From: cce@clarkevans.com (Clark C. Evans) Date: Mon, 19 Mar 2001 06:38:47 -0500 (EST) Subject: [XML-SIG] (patch) Re: Getting namespace aware parser to work... 
In-Reply-To: Message-ID: It's not perfect (since it doesn't use a stack for implicit namespaces), but the errors I was getting should be fixed by this patch. Clark ........................ _xmlplus/sax/saxutils.py ......................... 171c171,172 < name = name[1] --- > qname = name[1] > self._out.write('<' + qname) 173,175c174,181 < name = self._current_context[name[0]] + ":" + name[1] < self._out.write('<' + name) < --- > prefix = self._current_context[name[0]] > if prefix is None: > self._out.write('<%s xmlns="%s"' % (name[1],name[0]) ) > qname = name[1] > else: > qname = prefix + ":" + name[1] > self._out.write('<' + qname) > 177c183,186 < self._out.write(' xmlns:%s="%s"' % pair) --- > if pair[0] is None: > pass > else: > self._out.write(' xmlns:%s="%s"' % pair) 181,182c190,194 < name = self._current_context[name[0]] + ":" + name[1] < self._out.write(' %s="%s"' % (name, escape(value))) --- > if name[0] is None: > qname = name[1] > else: > qname = self._current_context[name[0]] + ":" + name[1] > self._out.write(' %s="%s"' % (qname, escape(value))) 192c204,208 < qname = self._current_context[name[0]] + ":" + name[1] --- > prefix = self._current_context[name[0]] > if prefix is None: > qname = name[1] > else: > qname = prefix + ":" + name[1] From cce@clarkevans.com Mon Mar 19 12:16:15 2001 From: cce@clarkevans.com (Clark C. Evans) Date: Mon, 19 Mar 2001 07:16:15 -0500 (EST) Subject: [XML-SIG] (patch) Re: Getting namespace aware parser to work... In-Reply-To: Message-ID: This is a nicer patch to saxutils.py to fix the default namespace handling. Please excuse the bad python code... I'm still less than 100 lines old... so it may have stupid errors. 
--------------------- 141a142 > self._default_context = None 171c172 < name = name[1] --- > self._out.write('<' + name[1]) 173,175c174,183 < name = self._current_context[name[0]] + ":" + name[1] < self._out.write('<' + name) < --- > prefix = self._current_context[name[0]] > if prefix is None: > if self._default_context is None or self._default_context != name[0]: > self._out.write('<%s xmlns="%s"' % (name[1],name[0]) ) > self._default_context = name[0] > else: > self._out.write('<' + name[1]) > else: > self._out.write('<' + prefix + ":" + name[1]) > 177c185,188 < self._out.write(' xmlns:%s="%s"' % pair) --- > if pair[0] is None: > pass > else: > self._out.write(' xmlns:%s="%s"' % pair) 181,182c192,196 < name = self._current_context[name[0]] + ":" + name[1] < self._out.write(' %s="%s"' % (name, escape(value))) --- > if name[0] is None: > qname = name[1] > else: > qname = self._current_context[name[0]] + ":" + name[1] > self._out.write(' %s="%s"' % (qname, escape(value))) 192c206,210 < qname = self._current_context[name[0]] + ":" + name[1] --- > prefix = self._current_context[name[0]] > if prefix is None: > qname = name[1] > else: > qname = prefix + ":" + name[1] From cce@clarkevans.com Mon Mar 19 12:21:07 2001 From: cce@clarkevans.com (Clark C. Evans) Date: Mon, 19 Mar 2001 07:21:07 -0500 (EST) Subject: [XML-SIG] Namespace Stripper Filter In-Reply-To: Message-ID: Here is my first "real live" python program... anyone who'd like to comment for style, please do so as I'm a newbie. 
-----------------------------------------------------------------
"""Strips a particular namespace from an XML document."""
from xml.sax import saxutils

class StripperFilter(saxutils.XMLFilterBase ):
    """Does the actual stripping"""
    def __init__(self,nmsp):
        """The namespace to strip is nmsp"""
        saxutils.XMLFilterBase.__init__(self)
        self.nmsp = nmsp
        self.depth = 0

    def startElementNS(self, name, qname, attrs):
        """Ignores elements and strips attributes of nmsp"""
        if name[0] != self.nmsp:
            #
            # Warning: For efficiency this dives into the
            #          underlying representation of AttributesNSImpl
            #          and deletes attributes to be stripped.
            #
            # _attrs should be of the form {(ns_uri, lname): value, ...}.
            # _qnames of the form {(ns_uri, lname): qname, ...}.
            #
            for (ns_uri,lname) in attrs._attrs.keys():
                if self.nmsp == ns_uri: del attrs._attrs[(ns_uri,lname)]
            self._cont_handler.startElementNS(name,qname,attrs)
        else:
            self.depth = self.depth + 1

    def characters(self, content):
        if self.depth == 0:
            saxutils.XMLFilterBase.characters(self,content)

    def endElementNS(self, name, qname):
        if self.depth > 0:
            self.depth = self.depth - 1
        else:
            self._cont_handler.endElementNS(name,qname)

    def startPrefixMapping(self, prefix, uri):
        if self.nmsp != uri:
            self._cont_handler.startPrefixMapping(prefix, uri)

from xml.sax import make_parser
from xml.sax.handler import feature_namespaces

def testStripper():
    parser = make_parser()
    parser.setFeature(feature_namespaces, 1)
    strip = StripperFilter('namespace-to-strip')
    out = saxutils.XMLGenerator()
    strip.setContentHandler(out)
    parser.setContentHandler(strip)
    parser.parse("test.xml")

if __name__ == '__main__':
    testStripper()

From greg@itam.zabrze.pl Mon Mar 19 13:40:24 2001
From: greg@itam.zabrze.pl (Grzegorz Zegartowski)
Date: Mon, 19 Mar 2001 14:40:24 +0100
Subject: [XML-SIG] Minidom
Message-ID: <3AB60C48.C66BB433@itam.zabrze.pl>

I wish to know how to reading xml files with validation...
there's a parse method: xml.dom.minidom.parse(filename, parser) What should I put as a parser? Thanks, Zedd From stuartd@alerton.com Mon Mar 19 17:20:33 2001 From: stuartd@alerton.com (Stuart Donaldson) Date: Mon, 19 Mar 2001 09:20:33 -0800 Subject: [XML-SIG] WBXML? Message-ID: I'm new to this SIG mailing list, and have looked over the XML-SIG Status page but could not find any reference to WBXML a WAP Binary XML standard. Anyone out there working with this or another form of XML that is optimized both for space and ease of parsing? Thanks... -Stuart- From akuchlin@mems-exchange.org Mon Mar 19 19:01:18 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Mon, 19 Mar 2001 14:01:18 -0500 Subject: [XML-SIG] iso8601 module: re-creating an original date Message-ID: I've noticed that xml.utils.iso8601 doesn't provide enough information to allow parsing and then re-creating a date. iso8601.parse() takes a string and returns the value in seconds since the epoch. There's no way to tell if the original date string was '2000-01-01' or '2000' or '2000-01-01T00:00'. You also can't parse the date manually in the event you want an mxDateTime instead of just seconds, which means you can't handle very old or very futuristic dates. I'd like to add support for being precise and figuring out exactly what was provided, but we need to discuss the interface a bit. One possible API: parse_tuple(string) which returns a 9-tuple like the one from time.gmtime() or time.localtime(), except that fields not provided are represented by None, not 0. (This means you can't pass the tuple to functions like time.mktime() without first converting None to 0.) An alternative interface would be to return a dictionary of fields, or an object with attributes. Thoughts? --amk From fdrake@acm.org Mon Mar 19 19:15:26 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Mon, 19 Mar 2001 14:15:26 -0500 (EST) Subject: [XML-SIG] iso8601 module: re-creating an original date In-Reply-To: References: Message-ID: <15030.23246.842826.68806@localhost.localdomain> Andrew Kuchling writes: > One possible API: parse_tuple(string) which returns a 9-tuple like the > one from time.gmtime() or time.localtime(), except that fields not > provided are represented by None, not 0. (This means you can't pass Con: This maintains the existing "tuplized" excuse for a structure -- pure evil, and a pain to work with! > the tuple to functions like time.mktime() without first converting > None to 0.) An alternative interface would be to return a dictionary > of fields, or an object with attributes. I favor an object with attributes, and look forward to your updates to the module. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From harris41@msu.edu Mon Mar 19 21:04:48 2001 From: harris41@msu.edu (Scott Harrison) Date: Mon, 19 Mar 2001 16:04:48 -0500 Subject: [XML-SIG] error with xhtml strict dtd Message-ID: <3AB67470.15E08846@msu.edu> What should be done with this situation below? And do you have a mailing list? I'd like to contribute or at least stay in touch as to what is going on. Thanks -Scott Trying to use pyxml with xhtml (using current cvs version). 
xmlproc_val

E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:316:3: xml:space must have exactly the values 'default' and 'preserve'
E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:326:3: xml:space must have exactly the values 'default' and 'preserve'
E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:457:3: xml:space must have exactly the values 'default' and 'preserve'

These are lines 316, 326 and 457 for xhtml1-strict.dtd:
xml:space (preserve) #FIXED 'preserve'
xml:space (preserve) #FIXED 'preserve'
xml:space (preserve) #FIXED 'preserve'

And of course the part of the xmldtd.py code that is responsible is shown here:

if name=="xml:space":
    if type(self.type)==types.StringType:
        parser.report_error(2015)
        return

    if len(self.type)!=2:
        error=1
    else:
        if (self.type[0]=="default" and self.type[1]=="preserve") or \
           (self.type[1]=="default" and self.type[0]=="preserve"):
            error=0
        else:
            error=1

    if error:
        parser.report_error(2016)

From harris41@msu.edu Mon Mar 19 21:39:41 2001
From: harris41@msu.edu (Scott Harrison)
Date: Mon, 19 Mar 2001 16:39:41 -0500
Subject: [XML-SIG] Re: error with xhtml strict dtd
References: <3AB67470.15E08846@msu.edu>
Message-ID: <3AB67C9D.92DFE8CE@msu.edu>

I would recommend this patch:

Index: xml/parsers/xmlproc/xmldtd.py
===================================================================
RCS file: /cvsroot/pyxml/xml/xml/parsers/xmlproc/xmldtd.py,v
retrieving revision 1.11
diff -r1.11 xmldtd.py
408c408
< if len(self.type)!=2:
---
> if (len(self.type)!=2) and (len(self.type)!=1):
409a410,411
> elif len(self.type)==1:
>     error=0

Scott Harrison wrote:
>
> What should be done with this situation below? And do you have
> a mailing list? I'd like to contribute or at least stay
> in touch as to what is going on. Thanks -Scott
>
> Trying to use pyxml with xhtml (using current cvs version).
> xmlproc_val > > E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:316:3: xml:space > must have exactly the values 'default' and 'preserve' > E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:326:3: xml:space > must have exactly the values 'default' and 'preserve' > E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:457:3: xml:space > must have exactly the values 'default' and 'preserve' > > These are lines 316, 326 and 457 for xhtml1-strict.dtd: > xml:space (preserve) #FIXED 'preserve' > xml:space (preserve) #FIXED 'preserve' > xml:space (preserve) #FIXED 'preserve' > > And of course the part of the xmldtd.py code > that is responsible is shown here: > > if name=="xml:space": > if type(self.type)==types.StringType: > parser.report_error(2015) > return > > if len(self.type)!=2: > error=1 > else: > if (self.type[0]=="default" and > self.type[1]=="preserve") or \ > (self.type[1]=="default" and > self.type[0]=="preserve"): > error=0 > else: > error=1 > > if error: > parser.report_error(2016) From martin@loewis.home.cs.tu-berlin.de Mon Mar 19 21:35:08 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 19 Mar 2001 22:35:08 +0100 Subject: [XML-SIG] Minidom In-Reply-To: <3AB60C48.C66BB433@itam.zabrze.pl> (message from Grzegorz Zegartowski on Mon, 19 Mar 2001 14:40:24 +0100) References: <3AB60C48.C66BB433@itam.zabrze.pl> Message-ID: <200103192135.f2JLZ8H01030@mira.informatik.hu-berlin.de> > I wish to know how to reading xml files with validation... > > there's a parse method: > xml.dom.minidom.parse(filename, parser) > > What should I put as a parser? Depends on whether you only have Python 2, or PyXML. In Python 2, no validating parser is included. With PyXML, xml.sax.sax2exts.XMLValParserFactory.make_parser() will create you a validating SAX parser (namely, xmlproc, unless additional validating parsers have been registered). 
Regards, Martin From j.lee@spitech.com Fri Mar 16 10:50:50 2001 From: j.lee@spitech.com (Lee, Junmar) Date: Fri, 16 Mar 2001 18:50:50 +0800 Subject: [XML-SIG] Help Message-ID: Hi, I wonder if you could help me out. I'm new at this so please forgive my ignorance. I just downloaded BeOpen-Python-2.0.exe and installed it. I then downloaded PythonXML.exe and installed that. Then I downloaded PyXML-0.6.4.win32-py2.0.exe and installed it also. My query is, what now? How do I get the XML parser to run? I read in the docs that Python/XML has three(3) parsers. How do I run them? Can I make them into an EXE for Windows? How can I do this? How can I get an EXE to look at an XML file and its DTD and say if it is well-formed and all those other XML parsing tools? Basically, I was looking for an XML parser written in Python that will run in Windows or DOS. Sorry for all the questions and for being ignorant. I just hope that you'll be able to help me out. Regards, Junmar :-) From martin@loewis.home.cs.tu-berlin.de Mon Mar 19 22:34:05 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 19 Mar 2001 23:34:05 +0100 Subject: [XML-SIG] error with xhtml strict dtd In-Reply-To: <3AB67470.15E08846@msu.edu> (message from Scott Harrison on Mon, 19 Mar 2001 16:04:48 -0500) References: <3AB67470.15E08846@msu.edu> Message-ID: <200103192234.f2JMY5H01326@mira.informatik.hu-berlin.de> > What should be done with this situation below? In general, you might submit a bug report to sourceforge.net/projects/python. > And do you have a mailing list? Sure, xml-sig@python.org. > Trying to use pyxml with xhtml (using current cvs version). 
> xmlproc_val
>
> E:http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd:316:3: xml:space
> must have exactly the values 'default' and 'preserve'

Please have a look at the thread starting at

http://mail.python.org/pipermail/xml-sig/2000-October/003520.html

It appeared to me (and to the author of xmlproc) that XML 1.0 says that the XHTML DTD is invalid (http://mail.python.org/pipermail/xml-sig/2000-October/003523.html)

There is an erratum for XML 1.0 that says that this was a mistake in XML 1.0, which was corrected with

http://www.w3.org/XML/xml-19980210-errata#E81

Lars Marius Garshol (the xmlproc author) indicated in

http://mail.python.org/pipermail/xml-sig/2000-October/003527.html

that he has a fix for this problem; so far, he has not managed to contribute this fix into PyXML.

You propose the patch

Index: xml/parsers/xmlproc/xmldtd.py
===================================================================
RCS file: /cvsroot/pyxml/xml/xml/parsers/xmlproc/xmldtd.py,v
retrieving revision 1.11
diff -r1.11 xmldtd.py
408c408
< if len(self.type)!=2:
---
> if (len(self.type)!=2) and (len(self.type)!=1):
409a410,411
> elif len(self.type)==1:
>     error=0

As a procedural note, please always submit unified (-u) or context (-c) diffs; they are easier to read and also continue to work if the file is slightly modified.

This particular patch seems incorrect: If there is a single value to xml:space, it *still* must be either "default" or "preserve"; your patch does not perform this check.

In any case, I still hope that Lars will contribute his changes later this year.

Regards,
Martin

From martin@loewis.home.cs.tu-berlin.de Mon Mar 19 22:47:33 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 19 Mar 2001 23:47:33 +0100
Subject: [XML-SIG] Help
In-Reply-To: (j.lee@spitech.com)
References:
Message-ID: <200103192247.f2JMlXA01445@mira.informatik.hu-berlin.de>

> I just downloaded BeOpen-Python-2.0.exe and installed it.

That is a good starting point.
> I then downloaded PythonXML.exe and installed that.

I don't know what that is - where did you get it?

> Then I downloaded PyXML-0.6.4.win32-py2.0.exe and installed it also.

That is also a good thing.

> My query is, what now?

You now need to write a Python program that makes use of the packages.

> How do I get the XML parser to run? I read in the docs that
> Python/XML has three(3) parsers. How do I run them?

That is a somewhat surprising request. Why do you want to run the parsers? An XML parser, when run, typically does not do much (*). To write a program that is a simple XML parser, please try

import sys, xml.sax
parser = xml.sax.parse(sys.argv[1], xml.sax.ContentHandler())

> Can I make them into an EXE for Windows? How can I do this?

There are a number of ways to make a Python program into an executable. These are independent from PyXML; please see the Python FAQ for details.

> How can I get an EXE to look at an XML file and its DTD and say if
> it is well-formed and all those other XML parsing tools?

The script above invokes a non-validating parser, so it will only tell you if it is well-formed, not whether it is valid. To run a validating parser, you need to instantiate xmlproc. The next release of PyXML will actually include two command line utilities to run xmlproc; they offer a few more features, though (such as outputting ESIS) - i.e. they offer some specific processing.

Regards,
Martin

(*) It does perform the well-formedness check, and might even perform validation. So all you get out of it are ill-formedness and invalidity errors. If that is all you need PyXML might not be the appropriate choice of tool.

From tpassin@home.com Mon Mar 19 23:16:13 2001
From: tpassin@home.com (Thomas B.
Passin) Date: Mon, 19 Mar 2001 18:16:13 -0500 Subject: [XML-SIG] iso8601 module: re-creating an original date References: Message-ID: <001201c0b0ca$9793c4a0$7cac1218@reston1.va.home.com> Andrew Kuchling had a very good idea - > I've noticed that xml.utils.iso8601 doesn't provide enough information > to allow parsing and then re-creating a date. iso8601.parse() takes a > string and returns the value in seconds since the epoch. There's no > way to tell if the original date string was '2000-01-01' or '2000' or > '2000-01-01T00:00'. You also can't parse the date manually in the > event you want an mxDateTime instead of just seconds, which means you > can't handle very old or very futuristic dates. > > I'd like to add support for being precise and figuring out exactly > what was provided, but we need to discuss the interface a bit. > > One possible API: parse_tuple(string) which returns a 9-tuple like the > one from time.gmtime() or time.localtime(), except that fields not > provided are represented by None, not 0. (This means you can't pass > the tuple to functions like time.mktime() without first converting > None to 0.) An alternative interface would be to return a dictionary > of fields, or an object with attributes. > I favor a dictionary or an object that may contain or act like one. In favor of an object, you could add various conversion methods as it seems they are needed, and still by backwards compatible with older methods. Cheers, Tom P From martin@loewis.home.cs.tu-berlin.de Tue Mar 20 07:17:31 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 20 Mar 2001 08:17:31 +0100 Subject: [XML-SIG] Getting namespace aware parser to work... In-Reply-To: (cce@clarkevans.com) References: Message-ID: <200103200717.f2K7HVG01327@mira.informatik.hu-berlin.de> > I'm trying to process the following xml file, with > this python script to strip all elements with a > given namespace. I believe that I have a pretty > recent version (0.5.2). 
> I get the error following...

Thanks for your bug report. It would be interesting to find out what
version you are using; 0.5.x is not fairly recent - 0.6.2 would be. In
any case, I cannot reproduce the problem with 0.6.4, and I doubt
anything relevant has changed since 0.6.2 in this respect.

What version of Expat are you using (the one included with PyXML or a
different one)?

Looking at the error you get

> qname = self._current_context[name[0]] + ":" + name[1]
> TypeError: bad operand type(s) for +

I would really like to know what self._current_context[name[0]] and
name[1] are at this point. I found a problem with default namespaces,
but otherwise, the code appears to be correct.

Regards,
Martin

P.S. Please send patches as unified (-u) or context (-c) diffs.

From larsga@garshol.priv.no Tue Mar 20 08:11:56 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 20 Mar 2001 09:11:56 +0100
Subject: [XML-SIG] Namespace Stripper Filter
In-Reply-To:
References:
Message-ID:

* Clark C. Evans
|
| def startElementNS(self, name, qname, attrs):
|     """Ignores elements and strips attributes of nmsp"""
|     if name[0] != self.nmsp:
|         #
|         # Warning: For efficiency this dives into the
|         #          underlying representation of AttributesNSImpl
|         #          and deletes attributes to be stripped.
|         #
|         # _attrs should be of the form {(ns_uri, lname): value, ...}.
|         # _qnames of the form {(ns_uri, lname): qname, ...}."""
|         #
|         for (ns_uri,lname) in attrs._attrs.keys():
|             if self.nmsp == ns_uri: del attrs._attrs[(ns_uri,lname)]
|         self._cont_handler.startElementNS(name,qname,attrs)
|     else:
|         self.depth = self.depth + 1

This isn't really a good idea, since there is no guarantee that you
will in fact get AttributesNSImpl instances. The only thing that is
guaranteed is that the objects you get will follow that interface. It
is very likely that many SAX drivers, such as the Jython SAX driver,
will not use this class, but reimplement the interface in a class
specific to themselves.
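One way to avoid depending on the private _attrs dictionary is to copy
the attributes to be kept into a fresh AttributesNSImpl, using only the
methods of the attributes interface. What follows is a minimal sketch
under that assumption, not Clark's code: the class name
NamespaceStripper and the helper strip_namespace are hypothetical, the
depth handling for skipped elements is a guess at what the rest of the
filter does, and it assumes a Python whose standard library ships
xml.sax.saxutils.XMLFilterBase and xml.sax.xmlreader.AttributesNSImpl.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces
from xml.sax.saxutils import XMLFilterBase
from xml.sax.xmlreader import AttributesNSImpl


class NamespaceStripper(XMLFilterBase):
    """Hypothetical filter: drops elements and attributes in namespace nmsp."""

    def __init__(self, parent, nmsp):
        XMLFilterBase.__init__(self, parent)
        self.nmsp = nmsp
        self.depth = 0  # greater than 0 while inside a stripped element

    def startElementNS(self, name, qname, attrs):
        if self.depth or name[0] == self.nmsp:
            self.depth += 1
            return
        # Copy the surviving attributes through the public interface only,
        # so any driver's attributes object works, not just AttributesNSImpl.
        kept, qnames = {}, {}
        for key in attrs.getNames():
            if key[0] != self.nmsp:
                kept[key] = attrs.getValue(key)
                qnames[key] = attrs.getQNameByName(key)
        XMLFilterBase.startElementNS(self, name, qname,
                                     AttributesNSImpl(kept, qnames))

    def endElementNS(self, name, qname):
        if self.depth:
            self.depth -= 1
        else:
            XMLFilterBase.endElementNS(self, name, qname)

    def characters(self, content):
        if not self.depth:
            XMLFilterBase.characters(self, content)


def strip_namespace(data, nmsp, handler):
    """Parse the bytes in data, sending the filtered events to handler."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    stripper = NamespaceStripper(parser, nmsp)
    stripper.setContentHandler(handler)
    stripper.parse(io.BytesIO(data))
```

Building a new attributes object costs a copy per element, but it never
touches the object handed over by the driver.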
Otherwise it looked fine to me. --Lars M. From larsga@garshol.priv.no Tue Mar 20 08:13:26 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 20 Mar 2001 09:13:26 +0100 Subject: [XML-SIG] Minidom In-Reply-To: <200103192135.f2JLZ8H01030@mira.informatik.hu-berlin.de> References: <3AB60C48.C66BB433@itam.zabrze.pl> <200103192135.f2JLZ8H01030@mira.informatik.hu-berlin.de> Message-ID: * Martin v. Loewis | | With PyXML, xml.sax.sax2exts.XMLValParserFactory.make_parser() will | create you a validating SAX parser (namely, xmlproc, unless | additional validating parsers have been registered). We shouldn't be using sax2exts any more, since that is just a legacy thing left over from an old SAX 2.0 version. In fact, we should aim to rip all that stuff out before too long. --Lars M. From martin@loewis.home.cs.tu-berlin.de Tue Mar 20 08:26:57 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 20 Mar 2001 09:26:57 +0100 Subject: [XML-SIG] Minidom In-Reply-To: (message from Lars Marius Garshol on 20 Mar 2001 09:13:26 +0100) References: <3AB60C48.C66BB433@itam.zabrze.pl> <200103192135.f2JLZ8H01030@mira.informatik.hu-berlin.de> Message-ID: <200103200826.f2K8QvC01632@mira.informatik.hu-berlin.de> > | With PyXML, xml.sax.sax2exts.XMLValParserFactory.make_parser() will > | create you a validating SAX parser (namely, xmlproc, unless > | additional validating parsers have been registered). > > We shouldn't be using sax2exts any more, since that is just a legacy > thing left over from an old SAX 2.0 version. In fact, we should aim to > rip all that stuff out before too long. Then, of course, the question is: How do you create a parser that supports validation? 
Regards,
Martin

From loewis@informatik.hu-berlin.de Tue Mar 20 08:51:27 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 20 Mar 2001 09:51:27 +0100 (MET)
Subject: [XML-SIG] PyXML 0.6.5 is released
Message-ID: <200103200851.JAA14502@pandora.informatik.hu-berlin.de>

Version 0.6.5 of the Python/XML distribution is now available. It
should be considered a beta release, and can be downloaded from the
following URLs:

http://download.sourceforge.net/pyxml/PyXML-0.6.5.tar.gz
http://download.sourceforge.net/pyxml/PyXML-0.6.5.win32-py1.5.exe
http://download.sourceforge.net/pyxml/PyXML-0.6.5.win32-py2.0.exe
http://download.sourceforge.net/pyxml/PyXML-0.6.5-1.5.2.i386.rpm
http://download.sourceforge.net/pyxml/PyXML-0.6.5-2.0.i386.rpm

Changes in this version, compared to 0.6.4:

* setup supports two command line options, --with-libexpat and
  --ldflags to specify an alternative Expat installation.
* Fourthought has contributed a new type xml.utils.boolean to
  distinguish boolean from integral values.
* The scripts xmlproc_parse and xmlproc_val, which allow command-line
  interaction with xmlproc, are now included.
* The WDDX marshalling now supports a "strict" and a "loose" mode of
  operation.
* minidom now supports the DocumentFragment interface, and correctly
  sets the ownerDocument property.
* A SAX exception now retrieves line number information when it is
  created, not when it is printed.
* Invoking sax2lib.ValidatingReaderFactory.make_parser creates a
  reader object that is already set to validating mode.
* A number of callback errors in the SAX2 xmlproc driver have been
  corrected.

The Python/XML distribution contains the basic tools required for
processing XML data using the Python programming language, assembled
into one easy-to-install package. The distribution includes parsers
and standard interfaces such as SAX and DOM, along with various other
useful modules.

The package currently contains:

* XML parsers: Pyexpat (Jack Jansen), xmlproc (Lars Marius Garshol),
  sgmlop (Fredrik Lundh).
* SAX interface (Lars Marius Garshol)
* minidom DOM implementation (Paul Prescod)
* 4DOM from Fourthought (Uche Ogbuji, Mike Olson)
* Various utility modules and functions (various people)
* Documentation and example programs (various people)

The code is being developed bazaar-style by contributors from the
Python XML Special Interest Group, so please send comments, questions,
or bug reports to .

For more information about Python and XML, see:

http://www.python.org/topics/xml/

--
Martin v. Löwis http://www.informatik.hu-berlin.de/~loewis

From frank63@ms5.hinet.net Tue Mar 20 09:37:19 2001
From: frank63@ms5.hinet.net (Frank Chen)
Date: Tue, 20 Mar 2001 17:37:19 +0800
Subject: [XML-SIG] Re:WBXML
References:
Message-ID: <000c01c0b121$ea6b1fa0$f5a01ea3@MiTACUser>

> > I'm new to this SIG mailing list, and have looked over the XML-SIG Status
> > page but could not find any reference to WBXML, a WAP Binary XML standard.
> >
> > Anyone out there working with this or another form of XML that is optimized
> > both for space and ease of parsing?
>

Maybe you should look at Java Xerces for WAP. I remember that there is
some work on that.

Frank

From fdrake@acm.org Tue Mar 20 14:23:06 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 20 Mar 2001 09:23:06 -0500 (EST)
Subject: [XML-SIG] Minidom
In-Reply-To:
References: <3AB60C48.C66BB433@itam.zabrze.pl> <200103192135.f2JLZ8H01030@mira.informatik.hu-berlin.de>
Message-ID: <15031.26570.616169.648499@cj42289-a.reston1.va.home.com>

Lars Marius Garshol writes:
> We shouldn't be using sax2exts any more, since that is just a legacy
> thing left over from an old SAX 2.0 version. In fact, we should aim to
> rip all that stuff out before too long.

Perhaps with Python 2.1 a DeprecationWarning should be issued?

    try:
        import warnings
    except ImportError:
        # Python older than 2.1 has no warnings module; stay silent
        pass
    else:
        warnings.warn("sax2exts has been deprecated; use...",
                      DeprecationWarning)

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From martin@loewis.home.cs.tu-berlin.de Tue Mar 20 16:18:26 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 20 Mar 2001 17:18:26 +0100
Subject: [XML-SIG] xml.xpath and xml.xslt are available
Message-ID: <200103201618.f2KGIQT08759@mira.informatik.hu-berlin.de>

I've added two packages to the XML package, xml.xpath and
xml.xslt. These are heavily based on 4XPath and 4XSLT, but use PyXPath
as the expression parser.

In theory, it should be possible to plug them into a 4Suite
installation, or use them stand-alone (without 4Suite). In practice,
much of the test suite passes, but there are still some issues left.

On the plus side, this has the chance of fixing the 4XSLT bugs related
to character sets, as the packages fully support Unicode.

To get an overview of what has been taken literally from 4Suite and
what has been adapted, please have a look at
xml/xpath/README.4XPath. Over the next few months, we will strive to
reduce the dependency on a particular parser, and on Ft.Lib, so that
really most of the files become the same eventually.

If you find any problems, please let me know.

Regards,
Martin

From noreply@sourceforge.net Tue Mar 20 17:14:14 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 20 Mar 2001 09:14:14 -0800
Subject: [XML-SIG] [ pyxml-Patches-410065 ] Range.surroundContents()
Message-ID:

Patches item #410065, was updated on 2001-03-20 09:14
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=410065&group_id=6473

Category: 4Suite
Group: None
Status: Open
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Range.surroundContents()

Initial Comment:
Hi. I just started using your lib. First of all: congratulations, good job.
However, the surroundContents method of the Range object is broken (at
least in my version [0.10.2], maybe you fixed it already).

First of all, calling it surround instead of surrond would keep
newbies to Python like me from getting serious problems with their
self-esteem ;-)

The major issue is that you called insertNode after having removed the
Range's contents by calling extractContents, which has to fail, since
then arbitrary siblings of the Range are at self.startOffset,
sometimes None.

Here is a fix; maybe there is a more elegant solution, I already
mentioned I'm a newbie to Python.

Regards
Henrik Motakef

884,885c884,885
< def surroundContents(self,newParent):
< """Surround the range with this node"""
---
> def surrondContents(self,newParent):
> """Surrond the range with this node"""
916c916
< df = self.cloneContents()
---
> df = self.extractContents()
918,922c918
< newParent.appendChild(df)
<
< refNode = self.startContainer.childNodes [self.startOffset]
<
< self.startContainer.insertBefore(newParent, refNode)
---
> self.insertNode(newParent)
924c920
< self.startContainer.removeChild (newParent.nextSibling)
---
> newParent.appendChild(df)

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=410065&group_id=6473

From daniel pearson via RT Tue Mar 20 17:24:00 2001
From: daniel pearson via RT (daniel pearson via RT)
Date: Tue, 20 Mar 2001 12:24:00 -0500 (EST)
Subject: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML update
Message-ID: <20010320172400.4D3C582FFF@mail.freshmeat.net>

The following notes are in response to a recent freshmeat.net submission:

- Martin v. Löwis has requested ownership of
  the freshmeat listing for Python/XML. Do you approve of this?

This contribution cannot be processed until you take appropriate action on your
part and get back to us.
Sincerely, daniel pearson Note: Make sure you include the prefix '[fm #6671]' in the subject when replying to this email. --- Headers Follow --- >From nobody@freshmeat.net Tue Mar 20 12:23:59 2001 Return-Path: Delivered-To: news-admins@freshmeat.net Received: from www2.freshmeat.net (freshmeat.net [64.28.67.35]) by mail.freshmeat.net (Postfix) with ESMTP id CC63182FAE for ; Tue, 20 Mar 2001 12:23:59 -0500 (EST) Received: by www2.freshmeat.net (Postfix, from userid 65534) id E592ED6561; Tue, 20 Mar 2001 12:23:59 -0500 (EST) To: news-admins@freshmeat.net Subject: [fm #6671] (news-admins) Submission report - Python/XML update From: daniel pearson Message-Id: <20010320172359.E592ED6561@www2.freshmeat.net> Date: Tue, 20 Mar 2001 12:23:59 -0500 (EST) Sender: nobody@freshmeat.net -------------------------------------------- Managed by Request Tracker From stuartd@alerton.com Tue Mar 20 17:22:30 2001 From: stuartd@alerton.com (Stuart Donaldson) Date: Tue, 20 Mar 2001 09:22:30 -0800 Subject: [XML-SIG] Re: WBXML Message-ID: >From: "Frank Chen" >To: >Date: Tue, 20 Mar 2001 17:37:19 +0800 >Subject: [XML-SIG] Re:WBXML >> >> I'm new to this SIG mailing list, and have looked over the XML-SIG Status >> page but could not find any reference to WBXML a WAP Binary XML standard. >> >> Anyone out there working with this or another form of XML that is >optimized >> both for space and ease of parsing? >> > >Maybe you should look for Java Xerces about WAP. I remembered that there are >some works for that. > >Frank Thus far everything I have found regarding WBXML and most everything for WAP has been Java based. But I have a python application that I would like to incorporate these features in. And since the entire reason for looking at WBXML is performance and simplicity, the idea of using a Java layer in between just doesn't make sense. 
-Stuart- From Rich Salz via RT Tue Mar 20 17:52:40 2001 From: Rich Salz via RT (Rich Salz via RT) Date: Tue, 20 Mar 2001 12:52:40 -0500 (EST) Subject: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML Message-ID: <20010320175240.61EF083043@mail.freshmeat.net> yes! --- Headers Follow --- >From rsalz@zolera.com Tue Mar 20 12:52:40 2001 Return-Path: Delivered-To: news-admins@freshmeat.net Received: from zolera.com (unknown [63.142.188.177]) by mail.freshmeat.net (Postfix) with ESMTP id 2A85C82FFF for ; Tue, 20 Mar 2001 12:52:39 -0500 (EST) Received: from zolera.com (os390.zolera.com [10.0.1.9]) by zolera.com (8.9.3/8.9.3) with ESMTP id MAA02243 for ; Tue, 20 Mar 2001 12:55:09 -0500 Sender: rsalz@zolera.com Message-ID: <3AB7997D.623044BB@zolera.com> Date: Tue, 20 Mar 2001 12:55:09 -0500 From: Rich Salz X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686) X-Accept-Language: en MIME-Version: 1.0 To: daniel pearson via RT Subject: Re: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML update References: <20010320172400.4D3C582FFF@mail.freshmeat.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit -------------------------------------------- Managed by Request Tracker From Request Tracker Tue Mar 20 17:55:05 2001 From: Request Tracker (Request Tracker) Date: Tue, 20 Mar 2001 12:55:05 -0500 (EST) Subject: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML update Message-ID: <20010320175505.DB0F183043@mail.freshmeat.net> On Tue, Mar 20, 2001 at 12:24:00PM -0500, daniel pearson via RT wrote: >The following notes are in response to a recent freshmeat.net submission: >- Martin v. Löwis has requested ownership of > the freshmeat listing for Python/XML. Do you approve of this? Yes, I approve; Martin has taken over maintenance of Python/XML from me. 
--amk --- Headers Follow --- >From akuchlin@mems-exchange.org Tue Mar 20 12:55:05 2001 Return-Path: Delivered-To: news-admins@freshmeat.net Received: from ute.cnri.reston.va.us (cnri44.cnri.reston.va.us [132.151.1.44]) by mail.freshmeat.net (Postfix) with ESMTP id 1A1B682FFF for ; Tue, 20 Mar 2001 12:55:05 -0500 (EST) Received: from akuchlin by ute.cnri.reston.va.us with local (Exim 3.20 #1) id 14fQLP-0003c7-00 for news-admins@freshmeat.net; Tue, 20 Mar 2001 12:54:55 -0500 Date: Tue, 20 Mar 2001 12:54:55 -0500 From: Andrew Kuchling To: daniel pearson via RT Subject: Re: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML update Message-ID: <20010320125455.A13770@ute.cnri.reston.va.us> Reply-To: akuchlin@mems-exchange.org References: <20010320172400.4D3C582FFF@mail.freshmeat.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.2i In-Reply-To: <20010320172400.4D3C582FFF@mail.freshmeat.net>; from news-admins@freshmeat.net on Tue, Mar 20, 2001 at 12:24:00PM -0500 -------------------------------------------- Managed by Request Tracker From martin@loewis.home.cs.tu-berlin.de Tue Mar 20 18:16:15 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Tue, 20 Mar 2001 19:16:15 +0100 Subject: [XML-SIG] Re: WBXML In-Reply-To: (message from Stuart Donaldson on Tue, 20 Mar 2001 09:22:30 -0800) References: Message-ID: <200103201816.f2KIGFo09464@mira.informatik.hu-berlin.de> > Thus far everything I have found regarding WBXML and most everything > for WAP has been Java based. But I have a python application that I > would like to incorporate these features in. And since the entire > reason for looking at WBXML is performance and simplicity, the idea > of using a Java layer in between just doesn't make sense. I'm not aware of any WBXML libraries for Python. What kind of support are you looking for? 
For a parser, it would probably be most meaningful if a SAX reader was implemented; for generating WBXML, an algorithm operating on a DOM tree is probably most useful. Are you interested in contributing any code to that respect? Regards, Martin From stuartd@alerton.com Tue Mar 20 18:54:32 2001 From: stuartd@alerton.com (Stuart Donaldson) Date: Tue, 20 Mar 2001 10:54:32 -0800 Subject: [XML-SIG] Re: WBXML Message-ID: >-----Original Message----- >From: Martin v. Loewis [mailto:martin@loewis.home.cs.tu-berlin.de] >Sent: Tuesday, March 20, 2001 10:16 AM >To: Stuart Donaldson >Cc: xml-sig@python.org >Subject: Re: [XML-SIG] Re: WBXML > >> Thus far everything I have found regarding WBXML and most everything >> for WAP has been Java based. But I have a python application that I >> would like to incorporate these features in. And since the entire >> reason for looking at WBXML is performance and simplicity, the idea >> of using a Java layer in between just doesn't make sense. > >I'm not aware of any WBXML libraries for Python. What kind of support >are you looking for? > >For a parser, it would probably be most meaningful if a SAX reader was >implemented; for generating WBXML, an algorithm operating on a DOM >tree is probably most useful. > >Are you interested in contributing any code to that respect? > >Regards, >Martin I'm looking for both reading and generating. I would certainly be willing to contribute anything I generate if I decide to go this route. Currently I am looking at WBXML as one possible solution, with the hoped for advantage being an existing code base. If I have to write it all then much of that advantage goes out the window. Is there much interest in a WBXML SAX reader implementation? -Stuart- From Thomas B. 
Passin" via RT yes ----- Original Message ----- From: "daniel pearson via RT" To: "Python XML-SIG" Sent: Tuesday, March 20, 2001 12:24 PM Subject: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML update > The following notes are in response to a recent freshmeat.net submission: > > - Martin v. Löwis has requested ownership of > the freshmeat listing for Python/XML. Do you approve of this? > > This contribution cannot be processed until you take appropriate action on your > part and get back to us. > > Sincerely, > daniel pearson > > > > Note: Make sure you include the prefix '[fm #6671]' > in the subject when replying to this email. > > > --- Headers Follow --- > > >From nobody@freshmeat.net Tue Mar 20 12:23:59 2001 > Return-Path: > Delivered-To: news-admins@freshmeat.net > Received: from www2.freshmeat.net (freshmeat.net [64.28.67.35]) > by mail.freshmeat.net (Postfix) with ESMTP id CC63182FAE > for ; Tue, 20 Mar 2001 12:23:59 -0500 (EST) > Received: by www2.freshmeat.net (Postfix, from userid 65534) > id E592ED6561; Tue, 20 Mar 2001 12:23:59 -0500 (EST) > To: news-admins@freshmeat.net > Subject: [fm #6671] (news-admins) Submission report - Python/XML update > From: daniel pearson > Message-Id: <20010320172359.E592ED6561@www2.freshmeat.net> > Date: Tue, 20 Mar 2001 12:23:59 -0500 (EST) > Sender: nobody@freshmeat.net > > -------------------------------------------- Managed by Request Tracker > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig > --- Headers Follow --- >From tpassin@home.com Tue Mar 20 18:27:32 2001 Return-Path: Delivered-To: news-admins@freshmeat.net Received: from femail15.sdc1.sfba.home.com (femail15.sdc1.sfba.home.com [24.0.95.142]) by mail.freshmeat.net (Postfix) with ESMTP id 4354382FFB for ; Tue, 20 Mar 2001 18:27:32 -0500 (EST) Received: from cj64132b ([24.18.172.124]) by femail15.sdc1.sfba.home.com (InterMail vM.4.01.03.20 
201-229-121-120-20010223) with SMTP id <20010320232732.XSGT23165.femail15.sdc1.sfba.home.com@cj64132b> for ; Tue, 20 Mar 2001 15:27:32 -0800 Message-ID: <001001c0b195$c26953e0$7cac1218@reston1.va.home.com> From: "Thomas B. Passin" To: "daniel pearson via RT" References: <20010320172400.4D3C582FFF@mail.freshmeat.net> Subject: Re: [XML-SIG] [fm #6671] (news-admins) Submission report - Python/XML update Date: Tue, 20 Mar 2001 18:30:32 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4522.1200 X-MIMEOLE: Produced By Microsoft MimeOLE V5.50.4522.1200 -------------------------------------------- Managed by Request Tracker From ravi.nagaraja@wipro.com Wed Mar 21 09:33:09 2001 From: ravi.nagaraja@wipro.com (Ravi Nagaraja) Date: Wed, 21 Mar 2001 15:03:09 +0530 Subject: [XML-SIG] Cannot install XML package Message-ID: <3AB87555.E860D2C9@wipro.com> Hi, I downloaded the Python version 2.0 and installed. I then downloaded the python XML package PyXML-0_6_2_win32-py2_0.exe I also installed the XML package. I also downloaded the file - PyXML-0_6_2_tar.gz and extracted it to a dir. When i tried to run the command: python setup.py build , It stops with an error message : No such command : cl.exe What other files should i have to install the XML package ? Thanks and regards Ravi.N From guenter.radestock@sap.com Wed Mar 21 11:05:51 2001 From: guenter.radestock@sap.com (Radestock, Guenter) Date: Wed, 21 Mar 2001 12:05:51 +0100 Subject: [XML-SIG] Error handling in PyExpat Message-ID: Hello, I am using PyExpat to parse XML files and sometimes these files are not correct. If I find an error in my handler (start_element, end_element or characters), I raise an exception and abort processing the XML file. 
If I raise the exception myself in the handler, parser.ErrorLineNumber
(and other variables describing the error position) are not available
to my code (ErrorLineNumber contains a random value); that is, inside
the exception handler that catches my exception.

It should be possible to detect the exception in the expat parser
module and call set_error() in pyexpat.c if the information is
available from expat. I could not check the expat documentation right
now (sourceforge is currently unavailable and I don't have it locally)
but I hope somebody has thought of this. Unfortunately the (C level)
handlers are void functions so there must be another way to tell expat
that processing has failed.

I have checked my (between PyXML-0.6.3 and 0.6.4) PyExpat source and
the xmlplus sources for the SAX implementation but did not find the
code I am looking for. Are there plans to implement this or should I
do it myself?

What I need is: If I raise an exception inside a handler,
pyexpat.c.set_error() should be called (or some other function that
gets line number, column number, byte position etc.). I am not sure if
this should be done for every exception or only for subclasses of
expat.error.

Thanks in advance for any help.

- Guenter

From martin@loewis.home.cs.tu-berlin.de Wed Mar 21 12:07:21 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 21 Mar 2001 13:07:21 +0100
Subject: [XML-SIG] Cannot install XML package
In-Reply-To: <3AB87555.E860D2C9@wipro.com> (ravi.nagaraja@wipro.com)
References: <3AB87555.E860D2C9@wipro.com>
Message-ID: <200103211207.f2LC7LU01830@mira.informatik.hu-berlin.de>

> I downloaded the Python version 2.0 and installed.
> I then downloaded the python XML package PyXML-0_6_2_win32-py2_0.exe

Did you try to run this file? Also, I'd recommend using 0.6.5 instead
of 0.6.2.

> I also installed the XML package.
> I also downloaded the file - PyXML-0_6_2_tar.gz and extracted it to a
> dir.
> When i tried to run the command: python setup.py build , > It stops with an error message : No such command : cl.exe > > What other files should i have to install the XML package ? To install it from sources, you need a C++ compiler (Visual C++). To install the binary distribution (.exe), you need only Python 2.0. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Mar 21 13:58:34 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 21 Mar 2001 14:58:34 +0100 Subject: [XML-SIG] Error handling in PyExpat In-Reply-To: (guenter.radestock@sap.com) References: Message-ID: <200103211358.f2LDwYK02510@mira.informatik.hu-berlin.de> > I am using PyExpat to parse XML files and sometimes these files are > not correct. If I find an error in my handler (start_element, > end_element or characters), I raise an exception and abort > processing the XML file. If I raise the exception my self in the > handler, parser.ErrorLineNumber (and other variables describing the > error position) are not available to my code (ErrorLineNumber > contains a random value); that is in the exception handler that > catches my exception. Yes, expat does not support user-identified error lines. However, it should be possible to propagate such information with the exception that you raise. > It should be possible to detect the exception in the expat parser > module and set call set_error() in pyexpat.c if the information is > available from expat. Not sure what you mean. set_error generates a Python exception when the expat parser has produced an error. That has nothing to do with errors that callback functions might have found. > Unfortunately the (C level) handlers are void functions so there > must be another way to tell expat that processing has failed. I don't think so. This is C, so there is no means of exception handling. Once a callback is invoked, it is safe to assume that the XML in itself is correct. 
You have to let expat finish parsing before it returns to you (AFAIK).
Of course, once pyexpat has seen a Python exception, all callbacks are
cleared, so no further events get reported.

> I have checked my (between PyXML-0.6.3 and 0.6.4) PyExpat source and
> the xmlplus sources for the SAX implementation but did not find the
> code I am looking for. Are there plans to implement this or should
> I do it myself?

In expat proper? Not my plan, certainly. In pyexpat? Don't know
how. If you can come up with some code to do what you want, that would
be good.

> If I raise an exception inside a handler, pyexpat.c.set_error()
> should be called
> (or some other function that gets line number, column number, byte position
> etc.).

flag_error is called in that case; I don't think it should manipulate
the user's exception object.

Regards,
Martin

From guenter.radestock@sap.com Wed Mar 21 15:57:23 2001
From: guenter.radestock@sap.com (Radestock, Guenter)
Date: Wed, 21 Mar 2001 16:57:23 +0100
Subject: [XML-SIG] Error handling in PyExpat
Message-ID:

> From: Martin v. Loewis [mailto:martin@loewis.home.cs.tu-berlin.de]
> Sent: Mittwoch, 21. März 2001 14:59
> To: Radestock, Guenter
> Cc: XML-SIG@python.org; Faerber, Franz
> Subject: Re: [XML-SIG] Error handling in PyExpat
>
>
> > I am using PyExpat to parse XML files and sometimes these files are
> > not correct. If I find an error in my handler (start_element,
> > end_element or characters), I raise an exception and abort
> > processing the XML file. If I raise the exception myself in the
> > handler, parser.ErrorLineNumber (and other variables describing the
> > error position) are not available to my code (ErrorLineNumber
> > contains a random value); that is in the exception handler that
> > catches my exception.
>
> Yes, expat does not support user-identified error lines. However, it
> should be possible to propagate such information with the exception
> that you raise.

Sorry - I missed it somehow.
ErrorLineNumber gave me numbers outside the document - probably
because I called it only after parsing, but ErrorByteIndex has the
right value, at least before I raise the exception. The values will be
incorrect in the exception handler because the parser continues, I
guess. Probably the parsing will continue, but my handlers will not be
called anymore because PyExpat (not Expat itself) knows about the
exception?

> > Unfortunately the (C level) handlers are void functions so there
> > must be another way to tell expat that processing has failed.
>
> I don't think so. This is C, so there is no means of exception
> handling. Once a callback is invoked, it is safe to assume that the
> XML in itself is correct. You have to let expat finish parsing before
> it returns to you (AFAIK).

OK so there is no way to stop Expat when things go south in the C
level handler (they could have defined handlers int instead of void
and stopped parsing when somebody returned -1 ...). Seems PyExpat
can't do any better this way.

Thanks a lot.

- Guenter

PS: if you would stop calling handlers after a handler has raised an
exception, you could freeze ErrorLine, ErrorColumn and ErrorByteIndex
to the values they had when the (Python) handler returned to you. But
it seems you don't stop calling handlers. Probably I should do
something like this in my script.

From martin@loewis.home.cs.tu-berlin.de Wed Mar 21 16:24:41 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 21 Mar 2001 17:24:41 +0100
Subject: [XML-SIG] Error handling in PyExpat
In-Reply-To: (guenter.radestock@sap.com)
References:
Message-ID: <200103211624.f2LGOfH03075@mira.informatik.hu-berlin.de>

> Sorry - I missed it somehow. ErrorLineNumber gave me numbers
> outside the document - probably because I called it only after
> parsing, but ErrorByteIndex has the right value, at least before I
> raise the exception.
> The values will be incorrect in the exception
> handler because the parser continues, I guess. Probably the parsing
> will continue, but my handlers will not be called anymore because
> PyExpat (not Expat itself) knows about the exception?

All correct, AFAICT.

> OK so there is no way to stop Expat when things go south in the C level
> handler (they could have defined handlers int instead of void and stopped
> parsing when somebody returned -1 ...).

It looks like that. You may want to report that as a bug, at
sourceforge.net/projects/expat.

> PS: if you would stop calling handlers after a handler has raised an
> exception, you could freeze ErrorLine, ErrorColumn and ErrorByteIndex
> to the values they had when the (Python) handler returned to you.
> But it seems you don't stop calling handlers.

All handlers are cleared in case of an error, so expat should not call
anything anymore. It will still continue to operate until it runs out
of data, or gets to the end of the document, or finds an XML error.

Freezing the error location would be an option, but might not do what
you expect - it would freeze the location of the last error that expat
found, which is not necessarily related to what the application
considers an error. If the real problem was a division by zero, or a
NameError because of a typo in the callback - should that propagate
into the state of the expat object?

What you should do is record the current position in the exception
object. It appears that pyexpat does not support retrieving the
*Current* information - any patch to that respect would be appreciated
(*).

Regards,
Martin

(*) I don't know *why* it does not expose XML_GetCurrentLineNumber
etc; perhaps earlier versions did not support it? That might need some
investigation.
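
The advice to record the current position in the exception object can
be sketched as follows. This is only an illustration under a stated
assumption: it relies on the CurrentLineNumber, CurrentColumnNumber and
CurrentByteIndex attributes that later versions of xml.parsers.expat
provide on parser objects, which the pyexpat discussed in this thread
did not yet expose. The names HandlerError and parse_checked are
hypothetical.

```python
import xml.parsers.expat


class HandlerError(Exception):
    """Hypothetical application error that remembers the parse position."""

    def __init__(self, message, line, column, byte_index):
        Exception.__init__(self, "%s (line %d, column %d)"
                           % (message, line, column))
        self.line = line
        self.column = column
        self.byte_index = byte_index


def parse_checked(data, check_element):
    """Parse data; check_element(name, attrs) returns an error text or None."""
    parser = xml.parsers.expat.ParserCreate()

    def start_element(name, attrs):
        problem = check_element(name, attrs)
        if problem:
            # Read the position now, while it still refers to this element;
            # the Error* attributes only describe errors found by expat itself.
            raise HandlerError(problem,
                               parser.CurrentLineNumber,
                               parser.CurrentColumnNumber,
                               parser.CurrentByteIndex)

    parser.StartElementHandler = start_element
    parser.Parse(data, True)
```

The caller then catches HandlerError and reads the position from the
exception itself, so nothing depends on the state of the parser after
the handler has returned.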
From noreply@sourceforge.net Thu Mar 22 00:35:31 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 21 Mar 2001 16:35:31 -0800
Subject: [XML-SIG] [ pyxml-Patches-410416 ] Minor C fixes to PyXML 0.6.5
Message-ID:

Patches item #410416, was updated on 2001-03-21 16:35
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=410416&group_id=6473

Category: None
Group: None
Status: Open
Priority: 5
Submitted By: The Written Word (china) (tww-china)
Assigned to: Nobody/Anonymous (nobody)
Summary: Minor C fixes to PyXML 0.6.5

Initial Comment:
Some of the C functions have a semicolon at the end. The patch at
ftp://ftp.thewrittenword.com/outgoing/pub/PyXML-0.6.5.patch
fixes them.

From represearch@yahoo.com Wed Mar 21 19:52:33 2001
From: represearch@yahoo.com (reptile research)
Date: Wed, 21 Mar 2001 19:52:33
Subject: [XML-SIG] (no subject)
Message-ID:

From alexandre.fayolle@free.fr Thu Mar 22 11:33:19 2001
From: alexandre.fayolle@free.fr (Alexandre Fayolle)
Date: Thu, 22 Mar 2001 12:33:19 +0100 (MET)
Subject: [XML-SIG] 4DOM compliance potential problem
Message-ID: <985260799.3ab9e2ff426d3@imp.free.fr>

Sourceforge seems to be down right now, so I post this directly to the
list.

I am working on an XML-Java course, and the existence of the specified
attribute of the Attr interface was just brought to my attention.

It seems to me that 4DOM does not comply with the spec regarding this
point. OTOH, the intended behaviour seems a real pain to implement
(requires an access to the DTD when using validation, since the
required info is not available from a SAX interface).

I'm quite happy with the current implementation, but maybe this
non-compliance should be documented somewhere.

Alexandre 'freezing in London' Fayolle
--
http://alexandre.fayolle.free.fr
http://www.logilab.org
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).

From martin@loewis.home.cs.tu-berlin.de Thu Mar 22 13:32:24 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 22 Mar 2001 14:32:24 +0100
Subject: [XML-SIG] 4DOM compliance potential problem
In-Reply-To: <985260799.3ab9e2ff426d3@imp.free.fr> (message from Alexandre Fayolle on Thu, 22 Mar 2001 12:33:19 +0100 (MET))
References: <985260799.3ab9e2ff426d3@imp.free.fr>
Message-ID: <200103221332.f2MDWOK03193@mira.informatik.hu-berlin.de>

> It seems to me that 4DOM does not comply with the spec regarding this
> point. OTOH, the intended behaviour seems a real pain to implement
> (requires an access to the DTD when using validation, since the
> required info is not available from a SAX interface)

A primary problem is that SAX does not support reporting whether the
information came from the DTD or from the document, see

http://lists.xml.org/archives/xml-dev/200102/msg00761.html

David Megginson has no intent to add it to SAX (or to continue
development of SAX, for that matter).

Even *if* that information was available through SAX, you still need a
validating parser to properly build the DOM tree - a non-validating
parser would not guess that an absent attribute might need to appear
in the tree.

So it appears that the DOM requires a parser to read the DTD. However,
they also write

# XML does not mandate that a non-validating XML processor read and
# process entity declarations made in the external subset or declared
# in external parameter entities.

In turn, I'd say that it is actually a bug in the DOM spec to mandate
that the specified attribute "works" - it should be a three-state
value: yes, no, maybe, and Attr nodes for unspecified but defaulted
attributes should not be mandated.
Regards,
Martin

From uche.ogbuji@fourthought.com Thu Mar 22 14:25:44 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 22 Mar 2001 07:25:44 -0700
Subject: [XML-SIG] 4DOM compliance potential problem
In-Reply-To: Message from "Martin v. Loewis" of "Thu, 22 Mar 2001 14:32:24 +0100." <200103221332.f2MDWOK03193@mira.informatik.hu-berlin.de>
Message-ID: <200103221425.HAA01411@localhost.localdomain>

> > It seems to me that 4DOM does not comply with the spec regarding this
> > point. OTOH, the intended behaviour seems a real pain to implement
> > (requires an access to the DTD when using validation, since the
> > required info is not available from a SAX interface)
>
> A primary problem is that SAX does not support reporting whether the
> information came from the DTD or from the document, see
>
> http://lists.xml.org/archives/xml-dev/200102/msg00761.html

Yes. Lack of info from the low-level parsers has always been the
problem here (we haven't written a dom.ext.readers.Xmlproc yet).

> So it appears that the DOM requires a parser to read the DTD. However,
> they also write
>
> # XML does not mandate that a non-validating XML processor read and
> # process entity declarations made in the external subset or declared
> # in external parameter entities.
>
> In turn, I'd say that it is actually a bug in the DOM spec to mandate
> that the specified attribute "works" - it should be a three-state
> value: yes, no, maybe, and Attr nodes for unspecified but defaulted
> attributes should not be mandated.

We complained to www-dom about this years ago, but all the discussion
didn't, apparently, lead them to reconsider this. I must confess that
I've tended to wave off that particular corner of DOM madness since
then.

--
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python

From Juergen Hermann"

Hi!

Is there any means in PyXML or other sources to serialize a SAX stream
(i.e. w/o building an intermediary DOM tree)?

Ciao, Jürgen

--
Jürgen Hermann, Developer (jhe@webde-ag.de)
WEB.DE AG, http://webde-ag.de/

From martin@loewis.home.cs.tu-berlin.de Fri Mar 23 11:12:36 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 23 Mar 2001 12:12:36 +0100
Subject: [XML-SIG] SAX Serializer
In-Reply-To: (jh@web.de)
References:
Message-ID: <200103231112.f2NBCaF00792@mira.informatik.hu-berlin.de>

> Is there any means in PyXML or other sources to serialize a SAX stream
> (i.e. w/o building an intermediary DOM tree)?

Sure. Pass it to an xml.sax.saxutils.XMLGenerator, and save the XML
document. Not sure what kind of serialization you had in mind; this
might actually be one of the more efficient and compact options
(compared to, say, pickling something).

Regards,
Martin

From akuchlin@mems-exchange.org Sat Mar 24 03:08:08 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 23 Mar 2001 22:08:08 -0500
Subject: [XML-SIG] iso8601: re-creating an original date II
Message-ID: <20010323220808.A13896@newcnri.cnri.reston.va.us>

I'm about halfway through my proposed course of enhancing iso8601.py to
note which portions of a date were supplied, but clearly it's
reinventing the wheel. The ISO8601Date class needs a converter to and
from 9-tuples, seconds since the epoch, and string format, and it all
feels like I'm reimplementing the C library or mxDateTime -- badly --
so I'm abandoning the effort.

mxDateTime already includes an ISO-8601 class; from the docs it doesn't
seem to support round-tripping, but that could be added, and it's
probably less work than recreating lots of complicated time handling
code.

--amk

From mal@lemburg.com Sat Mar 24 14:21:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 24 Mar 2001 15:21:00 +0100
Subject: [XML-SIG] iso8601: re-creating an original date II
References: <20010323220808.A13896@newcnri.cnri.reston.va.us>
Message-ID: <3ABCAD4C.7462AE91@lemburg.com>

Andrew Kuchling wrote:
>
> I'm about halfway through my proposed course of enhancing iso8601.py to note
> which portions of a date were supplied, but clearly it's reinventing the
> wheel. The ISO8601Date class needs a converter to and from 9-tuples, seconds
> since the epoch, and string format, and it all feels like I'm reimplementing
> the C library or mxDateTime -- badly -- so I'm abandoning the effort.
>
> mxDateTime already includes an ISO-8601 class; from the docs it doesn't seem
> to support round-tripping, but that could be added, and it's probably less
> work than recreating lots of complicated time handling code.

mxDateTime has an ISO 8601 parser, not a special ISO 8601 class.

I am not sure what you mean by "round-tripping" -- could you explain?

--
Marc-Andre Lemburg
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Pages: http://www.lemburg.com/python/

From noreply@sourceforge.net Mon Mar 26 12:37:47 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 26 Mar 2001 04:37:47 -0800
Subject: [XML-SIG] [ pyxml-Bugs-411350 ] 4XSLT xsl:attribute name not required
Message-ID:

Bugs item #411350, was updated on 2001-03-26 04:37
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=411350&group_id=6473

Category: 4Suite
Group: None
Status: Open
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Nobody/Anonymous (nobody)
Summary: 4XSLT xsl:attribute name not required

Initial Comment:
Version used : 4Suite 0.10.2

Using

banzai

When applied to any well formed document with 4XSLT, the following
output is given:

4XSLT should raise an exception.
Cheers,

Alexandre 'back from the UK' Fayolle

From stuff4gary@hotmail.com Mon Mar 26 23:43:49 2001
From: stuff4gary@hotmail.com (gary cor)
Date: Mon, 26 Mar 2001 23:43:49
Subject: [XML-SIG] After installing python2.0 what other packages should I intsall for XML?
Message-ID:

Hello people,

I want to produce a web image database using XML and python.

If anyone has already done this I would be grateful if they could
recommend what I should install on Win 2000 and learn how to use?

I want to use XML as a way of identifying my images and I want people
to be able to edit my image descriptions from simple forms... And to be
able to collect them to add comments etc.

Gary
_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.

From jsydik@virtualparadigm.com Tue Mar 27 02:48:51 2001
From: jsydik@virtualparadigm.com (Jeremy J. Sydik)
Date: Mon, 26 Mar 2001 20:48:51 -0600
Subject: [XML-SIG] After installing python2.0 what other packages should I intsall for XML?
In-Reply-To:
Message-ID:

It depends on whether you've installed Python already or not. If you
have, the package at pyxml.sourceforge.net should get you up and
running. If not, I've found that the ActiveState distribution has been
nice to work with and maintain on Win2K (particularly because of the
package database accessible through their site).

-----Original Message-----
From: xml-sig-admin@python.org [mailto:xml-sig-admin@python.org]On
Behalf Of gary cor
Sent: Monday, March 26, 2001 11:44 PM
To: xml-sig@python.org
Subject: [XML-SIG] After installing python2.0 what other packages
should I intsall for XML?

Hello people,

I want to produce a web image database using XML and python.

If anyone has already done this I would be grateful if they could
recommend what I should install on Win 2000 and learn how to use?

I want to use XML as a way of identifying my images and I want people
to be able to edit my image descriptions from simple forms... And to be
able to collect them to add comments etc.

Gary

_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig

From martin@loewis.home.cs.tu-berlin.de Tue Mar 27 13:52:31 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 27 Mar 2001 15:52:31 +0200
Subject: [XML-SIG] Unicode support in xmlproc
Message-ID: <200103271352.f2RDqVx01477@mira.informatik.hu-berlin.de>

I have committed a few changes to xmlproc which make it generate
Unicode strings, and deal with most aspects of character sets in XML
correctly (with respect to the recommendation). In particular, it
honors the encoding attribute of the xml declaration and performs the
optional autodetection of an encoding. Encoding information provided
from a higher level (e.g. MIME content type) is still for further
study (offering a set_input_encoding on the XMLCommonParser might be
appropriate).

On Python 1.5, a fallback procedure is used which only supports a
subset of the character sets (namely, US-ASCII, UTF-8, and Latin-1);
the application then receives UTF-8 encoded byte strings from xmlproc.

AFAIK, the only missing aspect is proper support for Unicode in tag
and attribute names; XML allows for a quite long list of characters,
and I'm not sure how to best implement that. If anybody has an sre
regular expression that correctly matches the Name production of XML,
please let me know.

This code has seen only little testing, so I'm pretty sure that there
are bugs in it. If you find any problems, please post them to the list
or on SF; ideally, the major problems should be resolved before 0.7 is
released.

Unfortunately, running the testsuite with xmlproc as the default
parser does no good: many test cases expect an IncrementalParser, and
drv_xmlproc is not incremental.

Regards,
Martin

From larsga@garshol.priv.no Tue Mar 27 14:38:02 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 27 Mar 2001 16:38:02 +0200
Subject: [XML-SIG] Unicode support in xmlproc
In-Reply-To: <200103271352.f2RDqVx01477@mira.informatik.hu-berlin.de>
References: <200103271352.f2RDqVx01477@mira.informatik.hu-berlin.de>
Message-ID:

* Martin v. Loewis
|
| AFAIK, the only missing aspect is proper support for Unicode in tag
| and attribute names; XML allows for a quite long list of characters,
| and I'm not sure how to best implement that. If anybody has an sre
| regular expression that correctly matches the Name production of XML,
| please let me know.

The question is also what the performance of that would be. Name
matching is performed very very often, so any changes here strongly
affect the overall performance of xmlproc.

It may also be that we want to use a dictionary of characters for
this. I think several avenues need to be explored here to find the
best approach.

| Unfortunately, running the testsuite with xmlproc as the default
| parser does no good: many test cases expect an IncrementalParser,
| and drv_xmlproc is not incremental.

That's probably easy to fix, since xmlproc is incremental.

--Lars M.

From martin@loewis.home.cs.tu-berlin.de Tue Mar 27 16:57:43 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: 27 Mar 2001 18:57:43 +0200
Subject: [XML-SIG] Unicode support in xmlproc
In-Reply-To: (message from Lars Marius Garshol on 27 Mar 2001 16:37:55 +0200)
Message-ID: <200103271711.f2RHBtY02942@mira.informatik.hu-berlin.de>

> The question is also what the performance of that would be.
> Name matching is performed very very often, so any changes here strongly
> affect the overall performance of xmlproc.

That is certainly a problem. I had the hope that the Unicode character
classes of Python 2.0 are related to what a BaseChar is in XML, but
that turned out to be wrong: XML uses Unicode 2.0; the Python tables
are based on Unicode 3.0. Also, many letters have been excluded from
BaseChar which count as letters in Unicode.

> It may also be that we want to use a dictionary of characters for
> this. I think several avenues need to be explored here to find the
> best approach.

Indeed; I'll see what I can come up with.

> That's probably easy to fix, since xmlproc is incremental.

I'll look into that as well.

Regards,
Martin

From Lance_Hill/OLS.OLS@olsinc.net Tue Mar 27 18:00:59 2001
From: Lance_Hill/OLS.OLS@olsinc.net (Lance_Hill/OLS.OLS@olsinc.net)
Date: Tue, 27 Mar 2001 13:00:59 -0500
Subject: [XML-SIG] Swap images for text elements using xsl?
Message-ID:

Hi all,

I have a page generated from a database which I would like to use to
show status information.

Currently, I am using a table to display the text wrapped in each tag
(generally a Y or N), but I would prefer to use an image selected
depending on the text element in each tag.

I had thought that using a choose/when would work well, but I cannot
figure out where to put it in the included code.

Also, can I just substitute a "

...etc. ...etc.
New Client Status

Thanks,
Lance M. Hill

From r.burton@180sw.com Tue Mar 27 20:19:14 2001
From: r.burton@180sw.com (Ross Burton)
Date: 27 Mar 2001 21:19:14 +0100
Subject: [XML-SIG] Metadata in XBEL
Message-ID: <985724354.4243.0.camel@eddie>

Hi,

I am involved in adding support for XBEL to Galeon, the GNOME Mozilla
Gecko based browser. Initial export is working, but there are several
issues related to the metadata elements which I would like some
clarification about.

The specification for metadata is vague, <metadata> elements have a
"owner" attribute which should be a URI. But what forms of metadata are
valid? The DTD implies that there can be no children of metadata
elements (the content is EMPTY).

Currently Galeon-specific attributes are exported as follows:

/home/users/ross/pictures/slashdot.png

It does seem that the DTD is in error as requiring all metadata to be in
the owner attribute is rather limiting. But what content is allowed as
children of the metadata element? Just text? Or could the content of a
metadata element be a free-form XML tree? For example:

<site ...>
  <info>
    <metadata owner="http://galeon.sourceforge.net">
      <pixmap>/home/users/ross/pictures/slashdot.org</pixmap>
      <toolbar>true</toolbar>
    </metadata>
  </info>
</site>

This, although allowing more free-form data, is heavier for the
application. We intend to be "polite" to XBEL data and store any
unknown metadata so that it can be written out again - if only text is
allowed this is trivial, otherwise tree fragments have to be stored.

I hope this can be cleared up,

Regards,
Ross Burton

From martin@loewis.home.cs.tu-berlin.de Tue Mar 27 21:30:51 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 27 Mar 2001 23:30:51 +0200
Subject: [XML-SIG] Metadata in XBEL
In-Reply-To: <985724354.4243.0.camel@eddie> (message from Ross Burton on 27 Mar 2001 21:19:14 +0100)
References: <985724354.4243.0.camel@eddie>
Message-ID: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de>

> The specification for metadata is vague, <metadata> elements have a
> "owner" attribute which should be a URI. But what forms of metadata are
> valid? The DTD implies that there can be no children of metadata
> elements (the content is EMPTY).
[...]
> It does seem that the DTD is in error as requiring all metadata to be in
> the owner attribute is rather limiting.

That is clearly not the intent, so I'd agree that the DTD is in error.
You'll have to ask Fred Drake to be sure; I *think* the idea was that
metadata has a content model of ANY. The documentation makes it clear
that "owner" is just to tell apart the various sources which may put
metadata into the bookmark list:

  The \element{metadata} element is used as a container for all
  auxiliary information related to a node which belongs to a single
  metadata scheme. The specific contents of \element{metadata} is
  highly dependent on the metadata scheme which applies; XML
  namespaces should be used to identify explicit markup used within
  the element.

So the intent clearly is that content within the metadata element is
possible, and may use XML markup.

That, of course, would mean that a version 1.1 of XBEL needs to be
issued, so perhaps this is the time to think about other pending
improvements.

> This, although allowing more free-form data, is heavier for the
> application. We intend to be "polite" to XBEL data and store any
> unknown metadata so that it can be written out again - if only text
> is allowed this is trivial, otherwise tree fragments have to be
> stored.

You don't necessarily have to store tree fragments; you just need to
find the matching closing tag. I don't know how your parsing
technology works, but it seems that restricting it to text cannot be
such a big simplification: you have to do normal XML parsing,
otherwise you won't properly deal with CDATA sections and other XML
"features".
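
Martin's last point can be illustrated with a small sketch. It is not
from the thread: it uses Python's standard xml.etree.ElementTree module,
and the owner URI is made up. Even when a metadata element holds only
text, that text may be written with a CDATA section or with entity
references, so an XML parser is still needed to read it back correctly.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata element whose content is "just text", written
# with a CDATA section followed by an entity reference.
SOURCE = (
    '<metadata owner="http://example.org/some-app">'
    "<![CDATA[a < b]]> &amp; c"
    "</metadata>"
)

meta = ET.fromstring(SOURCE)

# The parser hands back the logical text: the CDATA wrapper and the
# entity reference are resolved.
text = meta.text

# Writing the element back out escapes the text again.
written = ET.tostring(meta, encoding="unicode")

if __name__ == "__main__":
    print(repr(text))
    print(written)
```

A plain string search for the closing tag would have kept the CDATA
markers and the raw "&amp;" as part of the value, which is the kind of
mistake normal XML parsing avoids.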
Regards,
Martin

From dieter@handshake.de Tue Mar 27 20:57:09 2001
From: dieter@handshake.de (Dieter Maurer)
Date: Tue, 27 Mar 2001 22:57:09 +0200 (CEST)
Subject: [XML-SIG] Unicode support in xmlproc
In-Reply-To: <138016683@toto.iv>
Message-ID: <15040.65189.27587.140130@lindm.dm>

Great!

Dieter

From ross@180sw.com Tue Mar 27 22:26:40 2001
From: ross@180sw.com (Ross Burton)
Date: Tue, 27 Mar 2001 23:26:40 +0100 (BST)
Subject: [XML-SIG] Metadata in XBEL
In-Reply-To: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de>
Message-ID:

On Tue, 27 Mar 2001, Martin v. Loewis wrote:

> > The specification for metadata is vague, <metadata> elements have a
> > "owner" attribute which should be a URI. But what forms of metadata are
> > valid? The DTD implies that there can be no children of metadata
> > elements (the content is EMPTY).
> [...]
> > It does seem that the DTD is in error as requiring all metadata to be in
> > the owner attribute is rather limiting.
>
> That is clearly not the intent, so I'd agree that the DTD is in error.
> You'll have to ask Fred Drake to be sure; I *think* the idea was that
> metadata has a content model of ANY. The documentation makes it clear
> that "owner" is just to tell apart the various sources which may put
> metadata into the bookmark list:

> So the intent clearly is that content within the metadata element is
> possible, and may use XML markup.

Thought so.

> That, of course, would mean that a version 1.1 of XBEL needs to be
> issued, so perhaps this is the time to think about other pending
> improvements.

Fewer child nodes (such as ) and more attributes? :-) Only kidding but
libxml (aka gnome-xml) isn't that good with child nodes. I miss W3C
DOM...

Thanks for the mail, it's confirmed what I thought (and feared... :-)

Ross
--
Ross Burton                     Software Engineer
OneEighty Software Ltd          Tel: +44 20 8263 2332
The Lansdowne Building          Fax: +44 20 8263 6314
2 Lansdowne Road                r.burton@180sw.com
Croydon, Surrey CR9 2ER, UK     http://www.180sw.com./
====================================================================
Under the Regulation of Investigatory Powers (RIP) Act 2000 together
with any and all Regulations in force pursuant to the Act OneEighty
Software Ltd reserves the right to monitor any or all incoming or
outgoing communications as provided for under the Act

From cce@clarkevans.com Wed Mar 28 06:33:11 2001
From: cce@clarkevans.com (Clark C. Evans)
Date: Wed, 28 Mar 2001 01:33:11 -0500 (EST)
Subject: [XML-SIG] Simple wxPython XSLT Testing Tool using MSXML
Message-ID: <Pine.LNX.4.21.0103280128060.13057-100000@clarkevans.com>

Title says it all... I was wondering if Uche or someone else would
replace "TransformStage" so that it used the Fourthought XSLT
processor...

One item... the file "msxml" is generated using makepy, I'm opting for
the "very static" approach; this could be written to use the dynamic
methods as detailed on P204 of the Python Programming On Windows. I'm
not using this method since I have a very small stripped down "msxml"
that only has the items I need for my program.

This is a development tool that I've been using to test XSLT.

Share & Enjoy,

Clark

P.S. coding comments would be greatly appreciated as I'm relatively
new to python....

...
import sys, os import msxml import pythoncom from wxPython.wx import * from wxPython.html import * from wxPython.lib import wxpTag class OutputHtmlWindow(wxHtmlWindow): def __init__(self, parent, id): wxHtmlWindow.__init__(self, parent, id) def OnLinkClicked(self, linkinfo): self.base_OnLinkClicked(linkinfo) def OnSetTitle(self, title): self.base_OnSetTitle(title) class OpenFileLine(wxWindow): def __init__(self,parent,id,pos,mode,label): wxWindow.__init__(self,parent,id,pos,wxSize(400,30)) self.mode = mode wxStaticText(self, -1, label, wxPoint(0,5),wxSize(75,25),wxALIGN_RIGHT) self.text = wxTextCtrl(self,-1,"",wxPoint(75,0),wxSize(200,25)) wxButton(self,10,"Change",wxPoint(275,0)) EVT_BUTTON(self, 10, self.OnClick) EVT_SET_FOCUS def OnClick(self, event): dlg = wxFileDialog(self, "Choose a file", ".", "*.*", "*.xml,*.xslt,*.xsl,*.html,*.xhtml", self.mode) if dlg.ShowModal() == wxID_OK: if os.path.exists(dlg.GetPath()): self.text.SetValue(dlg.GetPath()) dlg.Destroy() def OnSetFocus(self,event): wxMessageBox("Focus!") self.text.SetFocus() def SetValue(self,str): self.text.SetValue(str) self.text.SetInsertionPoint(0) def GetValue(self): return self.text.GetValue() class OpenFileDlg(wxDialog): def __init__(self, parent,mode): if mode is wxOPEN: wxDialog.__init__(self,parent, -1, "Open Files", wxDefaultPosition, wxSize(450, 250)) else: if mode is wxSAVE: wxDialog.__init__(self,parent, -1, "Save Files", wxDefaultPosition, wxSize(450, 250)) else: raise TypeError("Expected wxOpen or wxSave") self.mode = mode self.metaxslt = OpenFileLine(self,-1,wxPoint(20,5),self.mode,"Meta X&SLT:") self.metadata = OpenFileLine(self,-1,wxPoint(20,35),self.mode,"Meta D&ATA:") self.mainxslt = OpenFileLine(self,-1,wxPoint(20,65),self.mode,"Main &XSLT:") self.maindata = OpenFileLine(self,-1,wxPoint(20,95),self.mode,"Main &DATA:") self.output = OpenFileLine(self,-1,wxPoint(20,125),self.mode,"&Output:") wxButton(self, wxID_OK, " OK ", wxPoint(75, 175), wxDefaultSize).SetDefault() wxButton(self, 
wxID_CANCEL, " Cancel ", wxPoint(200, 175), wxDefaultSize) def ReadNames(self,conf): self.metaxslt.SetValue(conf.Read("metaxslt")) self.metadata.SetValue(conf.Read("metadata")) self.mainxslt.SetValue(conf.Read("mainxslt")) self.maindata.SetValue(conf.Read("maindata")) self.output.SetValue(conf.Read("output")) def WriteNames(self,conf): conf.Write("metaxslt",self.metaxslt.GetValue()) conf.Write("metadata",self.metadata.GetValue()) conf.Write("mainxslt",self.mainxslt.GetValue()) conf.Write("maindata",self.maindata.GetValue()) conf.Write("output",self.output.GetValue()) class MainFrame(wxFrame): def __init__(self, parent, ID, title): ID_ABOUT = 101 ID_OPEN = 102 ID_SAVE = 103 ID_META = 104 ID_TRAN = 105 ID_PRINT = 106 ID_EXIT = 107 wxFrame.__init__(self, parent, ID, title, wxDefaultPosition, wxSize(600, 400)) self.CreateStatusBar() self.SetStatusText("This is the statusbar") menu = wxMenu() menu.Append(ID_ABOUT, "&About", "More information about this program") menu.Append(ID_OPEN, "&Open\tCtrl+O", "Open files.") menu.Append(ID_SAVE, "&Save\tCtrl+S", "Save files." ) menu.Append(ID_TRAN, "&Transform\tCtrl+T", "Run XSLT Transform") menu.Append(ID_META, "&Meta\tCtrl+M", "Show meta frame") menu.Append(ID_PRINT, "&Print\tCtrl+P", "Print HTML." 
) menu.AppendSeparator() menu.Append(ID_EXIT, "E&xit", "Terminate the program") menuBar = wxMenuBar() menuBar.Append(menu, "&File"); self.SetMenuBar(menuBar) EVT_MENU(self, ID_ABOUT, self.OnAbout) EVT_MENU(self, ID_OPEN, self.OnOpen) EVT_MENU(self, ID_SAVE, self.OnSave) EVT_MENU(self, ID_META, self.ShowMeta) EVT_MENU(self, ID_EXIT, self.TimeToQuit) EVT_MENU(self, ID_PRINT, self.OnPrint) EVT_MENU(self, ID_TRAN, self.OnTransform) self.primary = wxSplitterWindow(self,-1,wxDefaultPosition, wxDefaultSize, wxSP_3D) self.meta = wxSplitterWindow(self.primary,-1,wxDefaultPosition, wxDefaultSize, wxSP_3D) self.metaxslt = wxTextCtrl(self.meta, -1, "", wxDefaultPosition, wxDefaultSize, wxTE_MULTILINE|wxSUNKEN_BORDER) self.metadata = wxTextCtrl(self.meta, -1, "", wxDefaultPosition, wxDefaultSize, wxTE_MULTILINE|wxSUNKEN_BORDER) self.meta.SplitHorizontally(self.metaxslt,self.metadata) self.meta.SetMinimumPaneSize(100) self.meta.Show(0) self.secondary = wxSplitterWindow(self.primary,-1,wxDefaultPosition, wxDefaultSize, wxSP_3D) self.main = wxSplitterWindow(self.secondary,-1,wxDefaultPosition, wxDefaultSize, wxSP_3D) self.mainxslt = wxTextCtrl(self.main, -1, "", wxDefaultPosition, wxDefaultSize, wxTE_MULTILINE|wxSUNKEN_BORDER) self.maindata = wxTextCtrl(self.main, -1, "", wxDefaultPosition, wxDefaultSize, wxTE_MULTILINE|wxSUNKEN_BORDER) self.main.SplitHorizontally(self.mainxslt,self.maindata) self.main.SetMinimumPaneSize(100) self.out = wxSplitterWindow(self.secondary,-1,wxDefaultPosition, wxDefaultSize, wxSP_3D) self.output = wxTextCtrl(self.out, -1, "", wxDefaultPosition, wxDefaultSize, wxTE_MULTILINE|wxSUNKEN_BORDER) self.html = OutputHtmlWindow(self.out, -1) self.html.SetRelatedFrame(self, "wxXSLT: %s") self.html.SetRelatedStatusBar(0) self.out.SplitHorizontally(self.output,self.html) self.out.SetMinimumPaneSize(100) self.primary.SplitVertically(self.meta,self.secondary) self.primary.SetMinimumPaneSize(0) self.primary.SetSashPosition(0) 
self.secondary.SplitVertically(self.main,self.out) self.secondary.SetMinimumPaneSize(50) self.secondary.SetSashPosition(300) self.main.SetSashPosition(200) self.out.SetSashPosition(200) self.meta.SetSashPosition(200) self.LoadFrames() def ShowMeta(self,event): if self.meta.IsShown(): self.meta.Show(0) self.primary.SetSashPosition(0) else: self.meta.Show(1) size = self.GetSize() self.primary.SetSashPosition(size.width/3) def OnAbout(self, event): dlg = wxMessageDialog(self, "This program can be used to\n" "test XSLT within a Python Environment.", "About Me", wxOK | wxICON_INFORMATION) dlg.ShowModal() dlg.Destroy() def TimeToQuit(self, event): self.Close(true) def OnPrint(self, event): printer = wxHtmlEasyPrinting() printer.PrintFile(self.html.GetOpenedPage()) def LoadFrames(self): conf = wxConfig("PythonXSLTTester") def Load(ctl,conf,str,bad): try: if conf.Read(str): file = open(conf.Read(str),"r") ctl.SetValue(file.read()) file.close() else: ctl.SetValue("") return bad except IOError, value: return "%s\n%s" % (bad,value) bad = Load(self.metaxslt,conf,"metaxslt","") bad = Load(self.metadata,conf,"metadata",bad) bad = Load(self.mainxslt,conf,"mainxslt",bad) bad = Load(self.maindata,conf,"maindata",bad) bad = Load(self.output,conf,"output",bad) if bad: wxMessageBox(bad,"Could Not Open One Or More Files") def OnOpen(self,event): win = OpenFileDlg(self,wxOPEN) conf = wxConfig("PythonXSLTTester") win.ReadNames(conf) val = win.ShowModal() if val == wxID_OK: win.WriteNames(conf) self.LoadFrames() def OnSave(self,event): conf = wxConfig("PythonXSLTTester") def Save(ctl,conf,str,bad): try: if conf.Read(str): val = ctl.GetValue() if val: file = open(conf.Read(str),"w") file.write(val) file.close() return bad except IOError, value: return "%s\n%s" % (bad,value) bad = Save(self.metaxslt,conf,"metaxslt","") bad = Save(self.metadata,conf,"metadata",bad) bad = Save(self.mainxslt,conf,"mainxslt",bad) bad = Save(self.maindata,conf,"maindata",bad) bad = 
Save(self.output,conf,"output",bad) if bad: wxMessageBox(bad,"Could Not Save One Or More Files") def ErrorToString(self,hr,msg,exc,arg): ret = "%d: %s" % (hr,msg) if exc: wcode, source, text, helpFile, helpID, scode = exc ret = "%s\nSource: %s\nText: %s" % (ret, source,text) return ret def OnTransform(self,event): if self.meta.IsShown(): temp = self.TransformStage(self.metaxslt.GetValue(),self.metadata.GetValue()) if temp: self.mainxslt.SetValue(temp) else: return temp = self.TransformStage(self.mainxslt.GetValue(),self.maindata.GetValue()) if temp: self.output.SetValue(temp) try: self.html.SetPage(temp) except: pass def TransformStage(self,xslt,data): try: domxslt = msxml.DOMDocument.default_interface(msxml.FreeThreadedDOMDocument()) domdata = msxml.DOMDocument.default_interface(msxml.DOMDocument()) domdata.validateOnParse = 0 domdata.async = 0 domdata.preserveWhiteSpace = 1 domxslt.validateOnParse = 1 domxslt.async = 0 domxslt.preserveWhiteSpace = 1 try: domdata.loadXML(data) except pythoncom.com_error, ( hr, msg, exc, arg ): wxMessageBox("%s\n\n%s\nline:%d col:%d\ntext: %s" % ( self.ErrorToString(hr,msg,exc,arg), domdata.parseError.reason, domdata.parseError.line, domdata.parseError.linepos, domdata.parseError.srcText), "Error Parsing Data") return None try: domxslt.loadXML(xslt) except pythoncom.com_error, ( hr, msg, exc, arg): wxMessageBox("%s\n\n%s\nline:%d col:%d\ntext: %s" % ( self.ErrorToString(hr,msg,exc,arg), domdata.parseError.reason, domdata.parseError.line, domdata.parseError.linepos, domdata.parseError.srcText), "Error Parsing XSLT") return None templ = msxml.XSLTemplate.default_interface(msxml.XSLTemplate()) templ.stylesheet = domxslt proc = templ.createProcessor() proc.input = domdata proc.transform() return proc.output.encode("ISO8859-1") except pythoncom.com_error, ( hr, msg, exc, arg): wxMessageBox(self.ErrorToString(hr,msg,exc,arg), "Error Transforming") return None class TheApp(wxApp): def OnInit(self): frame = MainFrame(NULL, -1, "Python XSLT 
Testing Tool") frame.Show(true) self.SetTopWindow(frame) return true if __name__ == '__main__': app = TheApp(0) app.MainLoop()

From Olivier.Cayrol@logilab.fr Wed Mar 28 08:33:06 2001
From: Olivier.Cayrol@logilab.fr (Olivier CAYROL (Logilab))
Date: Wed, 28 Mar 2001 10:33:06 +0200 (CEST)
Subject: [XML-SIG] Swap images for text elements using xsl?
In-Reply-To: <OFE83DE0B4.75622DE6-ON85256A1C.0062F790@olsinc.net>
Message-ID: <Pine.LNX.4.21.0103281028170.2971-100000@sagittarius.logilab.fr>

Hello,

On Tue, 27 Mar 2001 Lance_Hill/OLS.OLS@olsinc.net wrote:
> Currently, I am using a table to display the text wrapped in each tag
> (generally a Y or N), but I would prefer to use an image selected
> depending on the text element in each tag.

A solution is to use an <xsl:choose>, <xsl:when>, <xsl:otherwise>
statement (see example below).

<xsl:template match="/">
<HTML>
<BODY>
<table border="1">
  <tr>
    <th>New Client Status</th>
    ...etc.
  </tr>

  <xsl:for-each select="status_report/agency">
  <tr>
    <td>
      <xsl:choose>
        <xsl:when test="new-client_status = 'Y'">
          <IMG SRC="image_yes.gif"/>
        </xsl:when>
        <xsl:otherwise>
          <IMG SRC="image_no.gif"/>
        </xsl:otherwise>
      </xsl:choose>
    </td>
    ...etc.
  </tr>
  </xsl:for-each>
</table>
</BODY>
</HTML>
</xsl:template>
</xsl:stylesheet>

Regards,

O. CAYROL.
_________________________________________________________________________
Olivier CAYROL LOGILAB - Paris (France) http://www.logilab.com/
Change your millenium, try NARVAL the Intelligent Personal Assistant.
Changez de millénaire, essayez NARVAL l'Assistant Personnel Intelligent.
_________________________________________________________________________

From fdrake@acm.org Wed Mar 28 17:29:59 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 28 Mar 2001 12:29:59 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <985724354.4243.0.camel@eddie> References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> Message-ID: <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> [Adding David Faure to the recipients list.] Ross Burton writes: > I am involved in adding support for XBEL to Galeon, the GNOME Mozilla > Gecko based browser. Initial export is working, but there are several Cool! I've been meaning to play with Galeon; I guess I've just gotten a better excuse. ;-) > The specification for metadata is vague, <metadata> elements have a > "owner" attribute which should be a URI. But what forms of metadata are > valid? The DTD implies that there can be no children of metadata > elements (the content is EMPTY). Here's the problem: What we want is to be able to say "ANY-and-we-really-mean-it", not ANY as defined in the DTD language. That definition tells us that ANY means anything *defined in the DTD*, which is pretty limited -- this is an inherited SGML wart. I don't know how to express what we actually want in the DTD language; if anyone can tell me, I'd be glad to change the DTD for revision 1.1. If anyone can tell me how to do it in XSchema, I'd be happy to use that for the schema language instead of using the DTD language. > Currently Galeon-specific attributes are exported as follows: ...ugh!... Don't do that. > It does seem that the DTD is in error as requiring all metadata to be in > the owner attribute is rather limiting. But what content is allowed as > children of the metadata element? Just text? Or could the content of a > metadata element be a free-form XML tree? For example: > > <site ...> > <info> > <metadata owner="http://galeon.sourceforge.net"> > <pixmap>/home/users/ross/pictures/slashdot.org</pixmap> > <toolbar>true</toolbar> > </metadata> > </info> > </site> This is *much* better! It also matches the intent. 
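
[Illustration, not part of the original message.] The free-form layout quoted above can be read back with very little code. The following is only a minimal sketch using Python's standard xml.dom.minidom; the surrounding <xbel>/<bookmark> wrapper, the bookmark URL and title, and the helper name metadata_for are assumptions made up for the example, and only simple text-only children are handled.

```python
from xml.dom import minidom

# Sample document mirroring the example in the thread. The <xbel> and
# <bookmark> wrapper, the href and the title are invented for illustration.
XBEL = """<xbel>
  <bookmark href="http://slashdot.org/">
    <title>Slashdot</title>
    <info>
      <metadata owner="http://galeon.sourceforge.net">
        <pixmap>/home/users/ross/pictures/slashdot.org</pixmap>
        <toolbar>true</toolbar>
      </metadata>
    </info>
  </bookmark>
</xbel>"""


def metadata_for(bookmark, owner):
    """Return {child tag name: text} for the <metadata> element of one owner.

    Hypothetical helper: it only looks at direct element children of
    <metadata> and concatenates their immediate text nodes.
    """
    result = {}
    for meta in bookmark.getElementsByTagName("metadata"):
        if meta.getAttribute("owner") != owner:
            continue  # metadata belonging to some other application
        for child in meta.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                text = "".join(n.data for n in child.childNodes
                               if n.nodeType == n.TEXT_NODE)
                result[child.tagName] = text
    return result


doc = minidom.parseString(XBEL)
bookmark = doc.getElementsByTagName("bookmark")[0]
galeon = metadata_for(bookmark, "http://galeon.sourceforge.net")
print(galeon)
```

An application that wants to be "polite" to metadata it does not own could keep the foreign <metadata> nodes themselves rather than a flattened dictionary, so they can be written out again unchanged.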
> This, although allowing more free-form data, is heavier for the > application. We intent to be "polite" to XBEL data and store any unknown > metadata so that it can be written out again - if only text is allowed > this is trivial, otherwise tree fragments have to be stored. This wasn't hard to do for Grail, which also supported this use. But Python data types make this pretty trivial as long as I can get all the interesting parse events. Martin v. Loewis writes: > That, of course, would mean that a version 1.1 of XBEL needs to be > issued, so perhaps this is the time to think about other pending > improvements. I'm very happy with doing this. In fact, I've made a couple of changes to the DTD and documentation based on comments from David Faure (from the Konqueror development group). In particular, I've added the "icon" attribute to the <bookmark/> and <folder/> elements, and the "toolbar" attribute to the <folder/> element. The later is intended to mark which folder should be used as the "Personal Toolbar" -- my tentative change allows it to have the values "yes" or "no", with "no" as the default. This may need some reconsideration; I can envision having software that supports multiple toolbars, but I'm not sure of the best way to encode that information. (It may even be appropriate to push that into application-specific metadata inside the <metadata/> element.) Another idea I've thought about from time to time is of linking to other bookmark collections, so that a folder-like thing about be used to refer to another (possibly remote) XBEL document by URI, or to RSS or other documents that could be used to store bookmarks (possibly including Netscape-style HTML bookmarks). I think this would be easy to support in XBEL, and it only takes software to make it useful. ;) Martin, were there other warts you were thinking about? (Anyone?) -Fred -- Fred L. Drake, Jr. 
<fdrake at acm.org> PythonLabs at Digital Creations From fdrake@acm.org Wed Mar 28 17:35:06 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 28 Mar 2001 12:35:06 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <Pine.LNX.4.21.0103272323310.2124-100000@gallahad.180sw.com> References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <Pine.LNX.4.21.0103272323310.2124-100000@gallahad.180sw.com> Message-ID: <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> Ross Burton writes: > Less child nodes (such as <title>) and more attributes? :-) Only kidding > but libxml (aka gnome-xml) isn't that good with child nodes. I miss W3C > DOM... Having joined the ranks of DOM implementors myself, I can only wish I missed it. ;-( Is the libxml API really that hard to work with, or does it have implementation limitations that cause it not to work with deeply nested documents? I keep meaning to look at it more, but just haven't had the time. Is there any reason the GNOME people are writing their own XML parser instead of using Expat? Thanks! -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Digital Creations From r.burton@180sw.com Wed Mar 28 19:06:18 2001 From: r.burton@180sw.com (Ross Burton) Date: Wed, 28 Mar 2001 20:06:18 +0100 Subject: [XML-SIG] Metadata in XBEL References: <985724354.4243.0.camel@eddie><200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> Message-ID: <004901c0b7ba$2dbf84a0$2a01a8c0@eddie> > > I am involved in adding support for XBEL to Galeon, the GNOME Mozilla > > Gecko based browser. Initial export is working, but there are several > > The specification for metadata is vague, <metadata> elements have a > > "owner" attribute which should be a URI. But what forms of metadata are > > valid? The DTD implies that there can be no children of metadata > > elements (the content is EMPTY). 
> > Here's the problem: What we want is to be able to say > "ANY-and-we-really-mean-it", not ANY as defined in the DTD language. > That definition tells us that ANY means anything *defined in the DTD*, > which is pretty limited -- this is an inherited SGML wart. I don't > know how to express what we actually want in the DTD language; if > anyone can tell me, I'd be glad to change the DTD for revision 1.1. > If anyone can tell me how to do it in XSchema, I'd be happy to use > that for the schema language instead of using the DTD language. Ah. I'm not a DTD expert so though that ANY meant literally anything. > > Currently Galeon-specific attributes are exported as follows: > > ...ugh!... Don't do that. Okay. > > It does seem that the DTD is in error as requiring all metadata to be in > > the owner attribute is rather limiting. But what content is allowed as > > children of the metadata element? Just text? Or could the content of a > > metadata element be a free-form XML tree? For example: > > > > <site ...> > > <info> > > <metadata owner="http://galeon.sourceforge.net"> > > <pixmap>/home/users/ross/pictures/slashdot.org</pixmap> > > <toolbar>true</toolbar> > > </metadata> > > </info> > > </site> > > This is *much* better! It also matches the intent. Right. I'll change the code soon. Maybe there should be an example of use for the metadata elements in XBEL 1.1. > Martin v. Loewis writes: > > That, of course, would mean that a version 1.1 of XBEL needs to be > > issued, so perhaps this is the time to think about other pending > > improvements. > I'm very happy with doing this. In fact, I've made a couple of > changes to the DTD and documentation based on comments from David > Faure (from the Konqueror development group). > In particular, I've added the "icon" attribute to the <bookmark/> > and <folder/> elements, and the "toolbar" attribute to the <folder/> > element. 
The later is intended to mark which folder should be used as > the "Personal Toolbar" -- my tentative change allows it to have the > values "yes" or "no", with "no" as the default. This may need some > reconsideration; I can envision having software that supports multiple > toolbars, but I'm not sure of the best way to encode that > information. (It may even be appropriate to push that into > application-specific metadata inside the <metadata/> element.) I like those additions... because they are the main reason Galeon has to use metadata! Galeon does allow multiple toolbars to be displayed (thinking about it, just the one is a limitation really), but I think that a simple "yes|no" with no as the default is good enough for that. Pushing that into metadata is not using the potential of the toolbar attribute. > Another idea I've thought about from time to time is of linking to > other bookmark collections, so that a folder-like thing about be used > to refer to another (possibly remote) XBEL document by URI, or to RSS > or other documents that could be used to store bookmarks (possibly > including Netscape-style HTML bookmarks). I think this would be easy > to support in XBEL, and it only takes software to make it useful. ;) Sounds like the future plans for Gnobog (a GNOME bookmark organiser). At the moment it just reads/wites Netscape/IE and can re-organise, but they are planning on moving to a "filesystem" like architecture, where bookmarks are stored seperately to the tree view. This way aliases are taken to the logical extension and the entire system behaves just like an ext2 filesystem with hard links everywhere. Nodes in the tree will be allowed to point at a bookmark entry, or another (possibly remote) set of folders. I have no say whatsoever here, but I'm +1 for adding the toolbar and icon attributes, clarifying the metadata elements and releasing 1.1 of the XBEL spec. 
:-) Regards, and thanks for the mails, Ross Burton From martin@loewis.home.cs.tu-berlin.de Wed Mar 28 20:15:48 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 28 Mar 2001 22:15:48 +0200 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> (fdrake@acm.org) References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> Message-ID: <200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de> > Here's the problem: What we want is to be able to say > "ANY-and-we-really-mean-it", not ANY as defined in the DTD language. > That definition tells us that ANY means anything *defined in the DTD*, > which is pretty limited -- this is an inherited SGML wart. I don't > know how to express what we actually want in the DTD language; if > anyone can tell me, I'd be glad to change the DTD for revision 1.1. I think you are right: it cannot be expressed. Looking at the four options of "Element Valid", none of them applies. > Martin v. Loewis writes: > > That, of course, would mean that a version 1.1 of XBEL needs to be > > issued, so perhaps this is the time to think about other pending > > improvements. > > I'm very happy with doing this. In fact, I've made a couple of > changes to the DTD and documentation based on comments from David > Faure (from the Konqueror development group). So I'd reverse my previous comment: *If* a 1.1 release of XBEL is issued for good reasons, it should probably show ANY as the element contents, to avoid confusion; and the documentation should be clear that any well-formed element (plus text) is accepted as contents > Martin, were there other warts you were thinking about? None specifically. I was suggesting that any missing features that came up during in the Galeon project should be worth consideration. 
Regards, Martin From fdrake@acm.org Wed Mar 28 20:34:41 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 28 Mar 2001 15:34:41 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de> References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> <200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de> Message-ID: <15042.19169.974191.457571@cj42289-a.reston1.va.home.com> Martin v. Loewis writes: > So I'd reverse my previous comment: *If* a 1.1 release of XBEL is > issued for good reasons, it should probably show ANY as the element > contents, to avoid confusion; and the documentation should be clear > that any well-formed element (plus text) is accepted as contents Agreed; I've made this change in my tentative 1.1 DTD and documentation. > None specifically. I was suggesting that any missing features that > came up during in the Galeon project should be worth consideration. Sounds good. Ross, David: please speak up if you have any additional considerations for us! -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Digital Creations From r.burton@180sw.com Wed Mar 28 21:41:42 2001 From: r.burton@180sw.com (Ross Burton) Date: Wed, 28 Mar 2001 22:41:42 +0100 Subject: [XML-SIG] Metadata in XBEL References: <985724354.4243.0.camel@eddie><200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de><15042.8087.392696.683721@cj42289-a.reston1.va.home.com><200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de> <15042.19169.974191.457571@cj42289-a.reston1.va.home.com> Message-ID: <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie> > > None specifically. I was suggesting that any missing features that > > came up during in the Galeon project should be worth consideration. > > Sounds good. Ross, David: please speak up if you have any > additional considerations for us! 
The only features which XBEL was missing for Galeon which required
metadata are:

1) icon for bookmark
2) toolbar for folder
3) notes on bookmark
4) nick-name (shortcut name for typing into location box)
5) add to context menu

Of these 1 and 2 are already in XBEL 1.1. The question is are 3-5
general enough to be in the spec?

I think that 3 and 4 possibly are. Several browsers allow free-form
notes to be attached to sites, although that does overlap somewhat with
the metadata tags. Maybe standard owners for metadata elements can be
defined for optional metadata, so that an owner of
"python.org/xbel/notes" (say) could be used for notes on an item. This
way the data is confined to the <info> node where it belongs, but it is
still identified.

Also, some browsers allow short names to be assigned to sites, so that
typing in the short name is sufficient to navigate to the URL. Of course
nick names are similar to the ID attribute, in that they are both short
names. However, I can see systems where the ID is generated by the
system itself, not the user.

I'm not so convinced about 5 (it adds the selected folder/bookmark to
the default context menu, so it is always available). Personally I'm not
so convinced of its usefulness, so it should not be in XBEL 1.1.

In summary, I'd like to see 4 in the spec, and possibly 3. Comments
anyone?

Regards,
Ross Burton

From fdrake@acm.org Wed Mar 28 22:05:55 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 28 Mar 2001 17:05:55 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie> References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> <200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de> <15042.19169.974191.457571@cj42289-a.reston1.va.home.com> <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie> Message-ID: <15042.24643.532405.135455@cj42289-a.reston1.va.home.com> Ross Burton writes: > 3) notes on bookmark How is this different from the <desc/> element? In Grail, this is filled in by a multi-line type-in box in the bookmark's or folder's Properties dialog -- it is equivalent to the same feature in Navigator. > 4) nick-name (shortcut name for typing into location box) Presumably Galeon supports this. Anyone else? I'm not at all sure I've ever seen it. > 5) add to context menu This sounds strangely like the "personal toolbar" -- how are they different? -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Digital Creations From srn@coolheads.com Wed Mar 28 09:10:19 2001 From: srn@coolheads.com (Steven R. Newcomb) Date: Wed, 28 Mar 2001 03:10:19 -0600 Subject: [XML-SIG] Extreme Markup Languages 2001 Conference Message-ID: <200103280910.DAA25844@bruno.techno.com> Reminder: It's time to send in your paper proposal for the Extreme Markup Languages conference in Montreal next August. Papers/proposals are due this weekend. Call for Participation for Extreme Markup Languages 2001 NOTE - CONFERENCE DATES AND LOCATION HAVE CHANGED! 
Highlights: - highly technical peer-reviewed 3.7-day conference preceded by 2 days of tutorials - SGML, XML, Topic Maps, query languages, linking, schemas, transformations, inference engines, formatting and behavior, and more - Submissions due by March 31, 2001 - For more information visit www.gca.org Extreme Markup Languages 2001 There's Nothing so Practical as a Good Theory >From GCA (Alexandria, Va.) - Extreme Markup Languages brings >together software developers, markup theorists, information >visionaries, and other assorted geeks for formal presentations, >poster sessions, question and answer sessions, hallway discussions, >arguments and gesticulations in front of flip charts, table-top >software demos, coffee, and the cuisine, ambience, and charm of >Montréal in August. Extreme conference participants include thought >leaders from corporate and academic information management, >knowledge engineering, enterprise integration/corporate memory, >science, and technical and cultural research. There will be four types of presentations at Extreme: peer reviewed technical papers, late breaking news, posters, and invited keynotes. All will be new material, address some aspect of information management from a theoretical or practical standpoint, and be detailed and rigorous. Come join us to discuss information alchemy: making documents into information and data into gold. WHEN: August 12-17, 2001 WHERE: Le Centre Sheraton, Montréal, Canada SPONSOR: Graphic Communications Association (GCA) Chairs: Steven R. Newcomb B. Tommie Usdin, Mulberry Technologies, Inc. Co-Chairs: Deborah A. Lapeyre, Mulberry Technologies, Inc. C. M. Sperberg-McQueen, World Wide Web Consortium/MIT Laboratory for Computer Sciences WHAT: Call for Papers, Peer Reviewers, Posters, and Tutorials HOW: Submit full papers or paper proposals to the conference secretariat in SGML or XML according to one of the submission DTDs and sent via email to: extreme@mulberrytech.com. 
Guidelines for Submission and the DTDs are available by email: extreme@mulberrytech.com or at http://www.mulberrytech.com/Extreme Apply to the Peer Review panel using the form at: http://www.mulberrytech.com/Extreme/Peer/ Submit tutorial proposals according to the instructions at: http://www.mulberrytech.com/Extreme/Tutorial SCHEDULE: Peer Review Applications Due. . March 2, 2001 Tutorial Proposals Due . . . . March 16, 2001 Paper Submission Deadline . . . March 31, 2001 Speakers Notified . . . . . . . May 14, 2001 Revised Papers Due. . . . . . . June 18, 2001 Tutorials . . . . . . . . . . . August 12-13, 2001 Conference . . . . . . . . . . August 14-17, 2001 QUESTIONS: Email to Extreme@mulberrytech.com or call Tommie Usdin +1 301/315-9631 MORE INFORMATION: For updated information on the program and plans for the conference as they develop, see http://www2.gca.org/extreme/ -Steve -- Steven R. Newcomb, Consultant srn@coolheads.com voice: +1 972 359 8160 fax: +1 972 359 0270 405 Flagler Court Allen, Texas 75013-2821 USA From r.burton@180sw.com Wed Mar 28 22:24:43 2001 From: r.burton@180sw.com (Ross Burton) Date: Wed, 28 Mar 2001 23:24:43 +0100 Subject: [XML-SIG] Metadata in XBEL References: <985724354.4243.0.camel@eddie><200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de><15042.8087.392696.683721@cj42289-a.reston1.va.home.com><200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de><15042.19169.974191.457571@cj42289-a.reston1.va.home.com><008b01c0b7d0$1f51a9a0$2a01a8c0@eddie> <15042.24643.532405.135455@cj42289-a.reston1.va.home.com> Message-ID: <00a301c0b7d5$e433f020$2a01a8c0@eddie> > > 3) notes on bookmark > How is this different from the <desc/> element? In Grail, this is > filled in by a multi-line type-in box in the bookmark's or folder's > Properties dialog -- it is equivalent to the same feature in > Navigator. Erm... D'oh! > > 4) nick-name (shortcut name for typing into location box) > Presumably Galeon supports this. Anyone else? 
I'm not at all sure > I've ever seen it. I though Netscape did this? Just checked, it doesn't. Damn. I'm not doing well tonight, am I? Maybe I should get more sleep. :-) Hey, ignore that too unless people can quote other browsers which support this. > > 5) add to context menu > This sounds strangely like the "personal toolbar" -- how are they > different? That was my thought... Don't think about putting it in the spec, I'll encode it as Galeon-specific metadata. Ross From fdrake@acm.org Wed Mar 28 22:32:36 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Wed, 28 Mar 2001 17:32:36 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <00a301c0b7d5$e433f020$2a01a8c0@eddie> References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> <200103282015.f2SKFmx04164@mira.informatik.hu-berlin.de> <15042.19169.974191.457571@cj42289-a.reston1.va.home.com> <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie> <15042.24643.532405.135455@cj42289-a.reston1.va.home.com> <00a301c0b7d5$e433f020$2a01a8c0@eddie> Message-ID: <15042.26244.446881.208379@cj42289-a.reston1.va.home.com> Ross Burton writes: > well tonight, am I? Maybe I should get more sleep. :-) No, you just need a better grade of caffeine! > That was my thought... Don't think about putting it in the spec, I'll > encode it as Galeon-specific metadata. So I'll presume that the current changes to XBEL are good for Galeon. I've not heard from David Faure yet; I'll wait to see if he chimes in before checking in a new DTD and documentation. -Fred -- Fred L. Drake, Jr. 
<fdrake at acm.org> PythonLabs at Digital Creations From noreply@sourceforge.net Thu Mar 29 08:31:24 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 29 Mar 2001 00:31:24 -0800 Subject: [XML-SIG] [ pyxml-Bugs-412141 ] pDomlette fails on cloneNode Message-ID: <E14iXq0-0007ga-00@usw-sf-web2.sourceforge.net> Bugs item #412141, was updated on 2001-03-29 00:31 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=412141&group_id=6473 Category: 4Suite Group: None Status: Open Priority: 5 Submitted By: Alexandre Fayolle (afayolle) Assigned to: Nobody/Anonymous (nobody) Summary: pDomlette fails on cloneNode Initial Comment: Not all classes in pDomlette implement the cloneNode method. PI, Attributes and Comments are notable exceptions. However the cloneNode implementation in the Element class calls cloneNode on all the children of the current Element, which can result in attribute errors. Here's a patch against pDomlette from 4Suite 0.10.2. It will skip children that are not elements. This behaviour seems acceptable to me, but should be documented somewhere if the patch was to be included in the main distribution. 
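
[Illustration, not part of the original report.] One consequence of testing for ELEMENT_NODE is worth spelling out: a loop of that shape skips every non-element child, text included, not only the node types that lack cloneNode. The sketch below shows the effect with the standard library's xml.dom.minidom, not with pDomlette itself, so it says nothing about pDomlette's internals; the helper name is made up for the example.

```python
from xml.dom import minidom


def clone_children_skipping_non_elements(src, dst):
    """Mimic the shape of the patched loop: clone element children only."""
    for c in src.childNodes:
        if c.nodeType == c.ELEMENT_NODE:
            dst.appendChild(c.cloneNode(True))


doc = minidom.parseString("<a>text<b/><!-- note --></a>")
root = doc.documentElement

# minidom's own deep clone copies text and comment children as well.
full = root.cloneNode(True)

# A clone built with the element-only loop loses them.
partial = doc.createElement("a")
clone_children_skipping_non_elements(root, partial)

full_xml = full.toxml()
partial_xml = partial.toxml()
print(full_xml)
print(partial_xml)
```

If dropping text is not wanted, an alternative under the same assumptions would be to skip only the children that do not provide a cloneNode method.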
--- /home/alf/tmp/pDomlette.py Thu Mar 29 10:27:23 2001
+++ pDomlette.py Thu Mar 29 10:11:50 2001
@@ -289,8 +289,9 @@
             newElement.setAttributeNS(attr.namespaceURI,attr.name,attr.value)
         if deep:
             for c in self.childNodes:
-                nc = c.cloneNode(deep)
-                newElement.appendChild(nc)
+                if c.nodeType == c.ELEMENT_NODE:
+                    nc = c.cloneNode(deep)
+                    newElement.appendChild(nc)
         return newElement

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=412141&group_id=6473

From Alexandre.Fayolle@logilab.fr Thu Mar 29 09:04:00 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Thu, 29 Mar 2001 11:04:00 +0200 (CEST)
Subject: [XML-SIG] User Friendly and XML
Message-ID: <Pine.LNX.4.21.0103291101260.21238-100000@leo.logilab.fr>

For those of you who do not already know User Friendly (the comic
strip), I advise you to give a look at yesterday and today's cartoons,
which illustrate the evilness of XML books.

http://ars.userfriendly.org/cartoons/?id=20010328&mode=classic
http://ars.userfriendly.org/cartoons/?id=20010329&mode=classic

Cheers,

Alexandre Fayolle
--
http://www.logilab.com
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).

From david@mandrakesoft.com Thu Mar 29 16:35:54 2001
From: david@mandrakesoft.com (David Faure)
Date: Thu, 29 Mar 2001 17:35:54 +0100
Subject: [XML-SIG] Metadata in XBEL
In-Reply-To: <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie>
References: <985724354.4243.0.camel@eddie> <15042.19169.974191.457571@cj42289-a.reston1.va.home.com> <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie>
Message-ID: <200103291635.f2TGZsr27038@faure.worldonline.co.uk>

Hello everyone,

First, I'm glad that Galeon uses XBEL too, I didn't know that.

> 1) icon for bookmark
> 2) toolbar for folder

Those are the two things I asked for in XBEL too.
Good to see it will be in the 1.1 version of the spec.
However there is a small concern about how icons are designated - as
usual between KDE and Gnome, since the same problem exists in the
.desktop files.

The way it currently works in Konqueror is the following.
<bookmark icon="www" href="http://www.kde.org/" >
 <title>KDE Home Page</title>
</bookmark>

The icon is an attribute of the bookmark element, and of the folder element,
and
* either the icon name is the base name of a globally
available icon; no extension is written, and no directory either.
The icon loading looks for icons of that name + ".png" or ".xpm",
under the standard (for KDE) directories, e.g. /usr/share/icons/hicolor/16x16/*/
This makes it possible to have a different icon for 8-bit displays
(locolor instead of hicolor), and gives access to different icon sizes.
* or the icon name is like "favicons/www-1.ibm.com" to designate a
"favourite icon" for a given site, which has been stored under the
~/.kde/share/icons/favicons/ directory (with .png appended).
* obviously full paths are supported too.

I realize that all this is very hard to standardize !!
The only practical solution is to add the search paths of the other
environment in each, as was done for .desktop files. But that doesn't
constitute a clean spec. I'm afraid I have no solution to offer,
I guess I'm just pointing out that sharing the same attribute might
not be enough for users to use the same bookmark collection with
both browsers.

I saw in another mail on the subject, this piece of XML :
> <site ...>
> <info>
> <metadata owner="http://galeon.sourceforge.net">
> <pixmap>/home/users/ross/pictures/slashdot.org</pixmap>
> <toolbar>true</toolbar>
> </metadata>
> </info>
> </site>

Is this a concrete case of XML used by Galeon, or is it more like a
theoretical example ?
I'm surprised by <site>, etc. Is that part of XBEL ?
I guess not :)
Surely jumping in the middle of a discussion doesn't help :-)
Anyway, back to the icon issue, this seems to suggest that Galeon
uses full paths ?

> 3) notes on bookmark
That, and many other things associated with bookmarks, will end up being necessary.
Juergen (who plans to contribute to Konqueror's bookmarks) mentioned
scoring: "to give the site a score (out of 10, for example)... then you could
search for "linux kde development" with a score >= 7 for example".
Especially useful if merging is done, see end of mail.
Other things that users mentioned were: list of keywords
(still for searches), and, hmm, how often a given bookmark was
visited. Not very important, given that we still don't support the
added/visited/modified dates yet.

> 4) nick-name (shortcut name for typing into location box)
Interesting idea :)
In fact this is possible in Konqueror, but via a separate module
(the "short-URI filter"), so it's currently unrelated with bookmarks.

One often requested feature, is for merging. For instance, in a company,
there could be a "company-global" set of bookmarks, to be merged with
the user's bookmarks - much like everything else in KDE already has
a global and a local directory, possibly with even more levels (e.g. for
groups of people).

To make that possible, XBEL could have a sort of "include this other
bookmark collection" tag, and it could be up to the application to
create aliases towards those global bookmarks in the user's bookmark
file. Well, that's just one solution - it allows to change the order,
to remove a global bookmark, to insert its own anywhere... but it
doesn't notice new bookmarks in the global collection, unless some
timestamp is used.

Another way could be that including another set of bookmarks simply
means that all those bookmarks appear first, then those in the user's
file. This way, changes to the global collection are automatically
taken into account, but it's impossible to modify/remove/reorder/change
anything in the global collection. It's probably much easier to
implement too, and has the exact semantic of a #include.
I suggest to add this to XBEL then: a simple .
There's still the issue of relative paths vs absolute paths, but, well...
no solution here either :} In summary, despite the compatibility problem with icon names (and paths), I'm very happy if icon="..." and toolbar="yes" are added to XBEL (given that Konqueror already uses those), I suggest to add an possibility, and the few other things that are not in XBEL and that might be in konqueror one day (keywords, scoring), can certainly be done as konq-specific metadata - unless others want to share the same data. -- David FAURE, david@mandrakesoft.com, faure@kde.org http://perso.mandrakesoft.com/~david/, http://www.konqueror.org/ KDE, Making The Future of Computing Available Today From r.burton@180sw.com Thu Mar 29 16:56:57 2001 From: r.burton@180sw.com (Ross Burton) Date: Thu, 29 Mar 2001 17:56:57 +0100 Subject: [XML-SIG] Metadata in XBEL References: <985724354.4243.0.camel@eddie> <15042.19169.974191.457571@cj42289-a.reston1.va.home.com> <008b01c0b7d0$1f51a9a0$2a01a8c0@eddie> <200103291635.f2TGZsr27038@faure.worldonline.co.uk> Message-ID: <005601c0b871$4a632280$1501a8c0@180sw.com> Hi, > First, I'm glad that Galeon uses XBEL too, I didn't know that. Well, the lack of XBEL is what prompted me to start work on it. There is now a branch were work by me and Ricardo is slowly but surely progressing. > The way it currently works in Konqueror is the following. > > KDE Home Page > > > The icon is an attribute of the bookmark element, and of the folder element, > and > * either the icon name is the base name of a globally > available icon; no extension is written, and no directory either. > The icon loading looks for icons of that name + ".png" or ".xpm", > under the standard (for KDE) directories, e.g. /usr/share/icons/hicolor/16x16/*/ > This makes it possible to have a different icon for 8-bit displays > (locolor instead of hicolor), and gives access to different icon sizes. 
> * or the icon name is like "favicons/www-1.ibm.com" to designate a > "favourite icon" for a given site, which has been stored under the > ~/.kde/share/icons/favicons/ directory (with .png appended). > * obviously full paths are supported too. > I realize that all this is very hard to standardize !! > The only practical solution is to add the search paths of the other > environment in each, as was done for .desktop files. But that doesn't > constitute a clean spec. I'm afraid I have no solution to offer, > I guess I'm just pointing out that sharing the same attribute might > not be enough for users to use the same bookmark collection with > both browsers. Hmm... I didn't know KDE had lists of icons. That could be an issue. > I saw in another mail on the subject, this piece of XML : > > > > > > > > /home/users/ross/pictures/slashdot.org > > true > > > > > > > Is this a concrete case of XML used by Galeon, or is it more like a > theoretical example ? > I'm surprised by , etc. Is that part of XBEL ? > I guess not :) That's the Lack Of Caffeine And Sleep problem again. site == bookmark. Galeon now (my local copy, anyway) exports its metadata in that form, so there is only one metadata element owned by Galeon in each bookmark/folder. > Surely jumping in the middle of a discussion doesn't help :-) > Anyway, back to the icon issue, this seems to suggest that Galeon > uses full paths ? Yes, it does. > > 3) notes on bookmark > That, and many other things associated with bookmarks, will end > up being necessary. > Juergen (who plans to contribute to Konqueror's bookmarks) mentioned > scoring: "to give the site a score (out of 10, for example)... then you could > search for "linux kde development" with a score >= 7 for example". > Especially useful if merging is done, see end of mail. > Other things that users mentioned were: list of keywords > (still for searches), and, hmm, how often a given bookmark was > visited.
Not very important, given that we still don't support the > added/visited/modified dates yet. Nice ideas. You're not alone with the added/visited/modified dates, BTW. :-) > One often-requested feature is merging. For instance, in a company, > there could be a "company-global" set of bookmarks, to be merged with > the user's bookmarks - much like everything else in KDE already has > a global and a local directory, possibly with even more levels (e.g. for > groups of people). I like that idea too, it could be very handy. > In summary, despite the compatibility problem with icon names (and paths), > I'm very happy if icon="..." and toolbar="yes" are added to XBEL > (given that Konqueror already uses those), I suggest to add an > possibility, and the few other things that are not in XBEL and that might > be in konqueror one day (keywords, scoring), can certainly be done as > konq-specific metadata - unless others want to share the same data. I'm for creating a set of metadata owners which can be considered "standard" in that they are defined on common ground, in the open. Not part of the actual specification (as it's best if that is kept small) but a catalogue of owners and expected content which would allow sharing of data. Into this could be added all of the useful attributes which could be shared without making XBEL overly complex, such as keywords and scoring.
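To make the "catalogue of owners" idea a little more concrete, here is a rough sketch of how an application might consume such a catalogue. It assumes that metadata elements sit under info and carry an owner attribute, as this thread implies; the owner URIs, the keyword and score child elements, and every function name below are hypothetical and come from neither XBEL nor any browser.

```python
import xml.etree.ElementTree as ET


def read_keywords(metadata):
    # Hypothetical content model: one <keyword> child per keyword.
    return [kw.text for kw in metadata.findall("keyword")]


def read_score(metadata):
    # Hypothetical content model: a single <score> child holding an integer.
    return int(metadata.findtext("score"))


# Hypothetical catalogue: owner identifier -> reader for that owner's content.
CATALOGUE = {
    "http://example.org/xbel/keywords": read_keywords,
    "http://example.org/xbel/score": read_score,
}


def shared_metadata(bookmark):
    """Return {owner: parsed data} for every owner found in CATALOGUE."""
    found = {}
    for metadata in bookmark.findall("info/metadata"):
        owner = metadata.get("owner")
        reader = CATALOGUE.get(owner)
        if reader is not None:  # owners we do not know are simply skipped
            found[owner] = reader(metadata)
    return found


SAMPLE = """
<bookmark href="http://www.kde.org/">
  <title>KDE Home Page</title>
  <info>
    <metadata owner="http://example.org/xbel/keywords">
      <keyword>linux</keyword><keyword>kde</keyword>
    </metadata>
    <metadata owner="http://example.org/xbel/score"><score>8</score></metadata>
    <metadata owner="http://example.org/private"><anything/></metadata>
  </info>
</bookmark>
"""

result = shared_metadata(ET.fromstring(SAMPLE))
```

The point of the design is that an application only interprets owners it recognises and leaves the rest of the file alone, so browser-specific metadata can coexist with the shared kind.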
Regards,
Ross Burton

---
Ross Burton                  Software Engineer
OneEighty Software Ltd       Tel: +44 20 8263 2332
The Lansdowne Building       Fax: +44 20 8263 6314
2 Lansdowne Road             r.burton@180sw.com
Croydon, Surrey CR9 2ER, UK  http://www.180sw.com./
====================================================================
Under the Regulation of Investigatory Powers (RIP) Act 2000 together
with any and all Regulations in force pursuant to the Act OneEighty
Software Ltd reserves the right to monitor any or all incoming or
outgoing communications as provided for under the Act

From noreply@sourceforge.net Thu Mar 29 17:31:23 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 29 Mar 2001 09:31:23 -0800
Subject: [XML-SIG] [ pyxml-Bugs-412235 ] xml.xslt.RtfWriter broken
Message-ID:

Bugs item #412235, was updated on 2001-03-29 09:31
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=412235&group_id=6473

Category: 4Suite
Group: None
Status: Open
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.xslt.RtfWriter broken

Initial Comment:
When trying to process an XSLT outputting a pDomlette document
fragment, runNode will fail with the following traceback:

>>> frag = p.runNode(element,1,{},RtfWriter(None,d2))
Traceback (innermost last):
  File "", line 1, in ?
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 186, in runNode
    baseUri, outputStream)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 244, in execute
    self.writers[-1].startDocument()
AttributeError: startDocument

The problem lies in RtfWriter, which should inherit from NullWriter
(which provides default implementations for all writer methods).
Here's a patch:

--- /tmp/RtfWriter.py Thu Mar 29 19:25:01 2001
+++ RtfWriter.py Thu Mar 29 19:26:05 2001
@@ -19,8 +19,9 @@
 from Ft.Lib import pDomlette
 from xml.dom.ext import SplitQName
 from xml.dom import XMLNS_NAMESPACE
+from xml.xslt import NullWriter
 
-class RtfWriter:
+class RtfWriter(NullWriter.NullWriter):
     def __init__(self, outputParams, ownerDoc):
         self.__ownerDoc = ownerDoc
         self.__root = pDomlette.DocumentFragment(ownerDoc)

Cheers

Alexandre

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=412235&group_id=6473

From martin@loewis.home.cs.tu-berlin.de Thu Mar 29 17:39:12 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 29 Mar 2001 19:39:12 +0200
Subject: [XML-SIG] Matching NameChars
Message-ID: <200103291739.f2THdCe01821@mira.informatik.hu-berlin.de>

I have now committed two new modules, utils/xmlchargen.py and
xml/utils/characters.py (generated from the former). These represent
common regular expressions: specifically, expressions for the
productions in sections B and 2.3, Names and Tokens. For each of them,
there is a string constant Foo representing a regular expression, and
a compiled regular expression re_Foo.

I've changed xmlproc to use those. As it turns out, this will slow
down parsing on an example document (the XSLT spec) by 3%, contrary to
my earlier (more optimistic) measurements. Marc-André suggested
writing C code to speed this up.

So here is a revised challenge for any prospective contributor: write
a C module that emulates xml.utils.characters, by providing objects
with the same methods as the compiled regular expressions, but faster
matching algorithms.

Alternatively, come up with a patch to sre that performs faster
matching when presented with Unicode character classes - that would
help more Python users than the former approach.
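For readers who have not looked at the module, the Foo / re_Foo convention described above might look roughly like the sketch below. This is an illustration only, not the contents of xml.utils.characters: the character classes are cut down to ASCII (the real Letter, Digit, CombiningChar and Extender productions in Appendix B cover many Unicode ranges that are left out here), and the helper is_name is made up for the example.

```python
import re

# Simplified, ASCII-only stand-ins for the XML 1.0 productions.
# The real definitions list many more Unicode ranges.
Letter = "[A-Za-z]"
Digit = "[0-9]"

# NameChar ::= Letter | Digit | '.' | '-' | '_' | ':'   (ASCII subset only)
NameChar = "(?:%s|%s|[._:-])" % (Letter, Digit)

# Name ::= (Letter | '_' | ':') (NameChar)*
Name = "(?:%s|[_:])%s*" % (Letter, NameChar)

# Nmtoken ::= (NameChar)+
Nmtoken = "%s+" % NameChar

# One compiled expression per string constant, following the re_Foo naming.
re_Name = re.compile(Name)
re_Nmtoken = re.compile(Nmtoken)


def is_name(text):
    """Return True if the whole string matches the simplified Name pattern."""
    return re_Name.fullmatch(text) is not None
```

Building the larger patterns out of the smaller string constants is what makes it practical to generate the module from the productions, and it is also why matching cost grows once the full Unicode ranges are substituted in.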
Hint: Please have a look at how expat represents the bitmaps, that appears to be quite efficient. I'd discourage outright copying of those tables, though - somebody should verify that they are still correct for XML 1.0 2nd edition. Regards, Martin From noreply@sourceforge.net Thu Mar 29 17:47:54 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Thu, 29 Mar 2001 09:47:54 -0800 Subject: [XML-SIG] [ pyxml-Patches-412237 ] sgmlop returns Unicode Message-ID: Patches item #412237, was updated on 2001-03-29 09:47 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=412237&group_id=6473 Category: None Group: None Status: Open Priority: 5 Submitted By: Walter Dörwald (doerwalter) Assigned to: Nobody/Anonymous (nobody) Summary: sgmlop returns Unicode Initial Comment: This patch enhances sgmlop: It adds a third parser type (XMLUnicodeParser) that returns Unicode objects to the application. The parser recognizes all 8bit encodings in the XML header and decodes the 8bit characters accordingly. The encoding defaults to UTF-8. (This could be changed easily or made customizable) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=412237&group_id=6473 From larsga@garshol.priv.no Thu Mar 29 21:38:58 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 29 Mar 2001 23:38:58 +0200 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> Message-ID: * Fred L. Drake, Jr. | | Here's the problem: What we want is to be able to say | "ANY-and-we-really-mean-it", not ANY as defined in the DTD language. 
| That definition tells us that ANY means anything *defined in the | DTD*, which is pretty limited -- this is an inherited SGML wart. I | don't know how to express what we actually want in the DTD language; | if anyone can tell me, I'd be glad to change the DTD for revision | 1.1. I would do it like this: That would allow anyone creating an extended DTD to first define their elements, then redefine %any;, then refer to the XBEL DTD and have it interpreted correctly. ANY would also work, in the sense that another DTD could define the extra elements and then refer to the XBEL DTD, but I think it would be too loose, and that a PE is better. --Lars M. From larsga@garshol.priv.no Thu Mar 29 21:40:39 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 29 Mar 2001 23:40:39 +0200 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> Message-ID: * Fred L. Drake, Jr. | | Having joined the ranks of DOM implementors myself, I can only wish | I missed it. ;-( So do I. In fact, if you're writing a DOM implementation I suggest that you stop and that we design an API more suitable to whatever it is we want to do. Python really could use a better tree XML API than the DOM. Pyxie looks good, but it needs work. JDOM also looks good. --Lars M. From fdrake@acm.org Thu Mar 29 21:46:52 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 29 Mar 2001 16:46:52 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: References: <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> Message-ID: <15043.44364.217328.30550@cj42289-a.reston1.va.home.com> Lars Marius Garshol writes: > I would do it like this: > > > > > I like this much better! 
I've named this metadata.mix to be consistent with other PEs in XBEL, but otherwise used this directly. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From fdrake@acm.org Thu Mar 29 21:51:07 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 29 Mar 2001 16:51:07 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> Message-ID: <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> [Removed Ross Burton from the list of recipients; the Python DOM issue isn't relevant to galeon.] Lars Marius Garshol writes: > So do I. In fact, if you're writing a DOM implementation I suggest > that you stop and that we design an API more suitable to whatever it > is we want to do. Python really could use a better tree XML API than > the DOM. Pyxie looks good, but it needs work. JDOM also looks good. Alas, I'm afraid there was an element of "buzzword compliance" in the motivation for the implementation I'm involved in. I'd be very interested in developing a new API to use instead, though. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From larsga@garshol.priv.no Thu Mar 29 21:57:54 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 29 Mar 2001 23:57:54 +0200 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> Message-ID: * Fred L. Drake, Jr. | | Alas, I'm afraid there was an element of "buzzword compliance" in | the motivation for the implementation I'm involved in. I'd be very | interested in developing a new API to use instead, though. Then I think we should put it in the roadmap, unless we have some people ready to start on it now. 
I am not able to, since I'll be very heavily loaded until Easter and (what bliss!!!) on holiday after that. --Lars M. From fdrake@acm.org Thu Mar 29 21:59:24 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 29 Mar 2001 16:59:24 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> Message-ID: <15043.45116.937337.669625@cj42289-a.reston1.va.home.com> Lars Marius Garshol writes: > Then I think we should put it in the roadmap, unless we have some > people ready to start on it now. I am not able to, since I'll be very > heavily loaded until Easter and (what bliss!!!) on holiday after that. Sounds good to me. I won't be able to work on it until after Python 2.1 is out. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From rsalz@zolera.com Thu Mar 29 23:43:43 2001 From: rsalz@zolera.com (Rich Salz) Date: Thu, 29 Mar 2001 18:43:43 -0500 Subject: [XML-SIG] Metadata in XBEL References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> Message-ID: <3AC3C8AF.28D02DCC@zolera.com> > I'd be very > interested in developing a new API to use instead, though. I'd rather have the current stuff documented. :) /r$ From fdrake@acm.org Fri Mar 30 00:11:39 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 29 Mar 2001 19:11:39 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <3AC3C8AF.28D02DCC@zolera.com> References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> <3AC3C8AF.28D02DCC@zolera.com> Message-ID: <15043.53051.338378.384859@cj42289-a.reston1.va.home.com> Rich Salz writes: > I'd rather have the current stuff documented. 
:)

Are you aware of the DOM documentation in the development version of
the Python docs? See:

    http://python.sourceforge.net/devel-docs/lib/markup.html

-Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From martin@loewis.home.cs.tu-berlin.de Fri Mar 30 06:14:03 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 30 Mar 2001 08:14:03 +0200
Subject: [XML-SIG] Documentation
In-Reply-To: <15043.53051.338378.384859@cj42289-a.reston1.va.home.com> (fdrake@acm.org)
References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de>
    <15042.8394.88965.369473@cj42289-a.reston1.va.home.com>
    <15043.44619.964310.42196@cj42289-a.reston1.va.home.com>
    <3AC3C8AF.28D02DCC@zolera.com>
    <15043.53051.338378.384859@cj42289-a.reston1.va.home.com>
Message-ID: <200103300614.f2U6E3W01048@mira.informatik.hu-berlin.de>

> Rich Salz writes:
> > I'd rather have the current stuff documented. :)
>
> Are you aware of the DOM documentation in the development version of
> the Python docs? See:
>
> http://python.sourceforge.net/devel-docs/lib/markup.html

There is still a lot of stuff missing, though:
- saxexts/sax2exts. I think Lars Marius claims that these are
  obsolete, but I can't see how to live without them.
- saxlib.{DeclHandler, LexicalHandler}
- saxutils.{ErrorPrinter, ErrorRaiser, Location}
- xml.dom.javadom
- DOM interfaces beyond Core in 4DOM
- xml.dom.ext
- xml.marshal
- xml.utils.qp_xml

Regards,
Martin

From Eugene.Leitl@lrz.uni-muenchen.de Fri Mar 30 09:12:18 2001
From: Eugene.Leitl@lrz.uni-muenchen.de (Eugene Leitl)
Date: Fri, 30 Mar 2001 11:12:18 +0200 (MET DST)
Subject: [XML-SIG] Documentation
In-Reply-To: <200103300614.f2U6E3W01048@mira.informatik.hu-berlin.de>
Message-ID:

On Fri, 30 Mar 2001, Martin v. Loewis wrote:

> There is still a lot of stuff missing, though:

Any way XML-RPC will make it into PyXML?

From martin@loewis.home.cs.tu-berlin.de Fri Mar 30 11:27:14 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v.
Loewis) Date: Fri, 30 Mar 2001 13:27:14 +0200 Subject: [XML-SIG] Documentation In-Reply-To: (message from Eugene Leitl on Fri, 30 Mar 2001 11:12:18 +0200 (MET DST)) References: Message-ID: <200103301127.f2UBREH08494@mira.informatik.hu-berlin.de> > On Fri, 30 Mar 2001, Martin v. Loewis wrote: > > > There is still a lot of stuff missing, though: > > Any way XML-RPC will make it into PyXML? Due to contributions of code, of course. I was talking about missing documentation, though, not about missing code. Regards, Martin From support@internetdiscovery.com Fri Mar 30 14:58:57 2001 From: support@internetdiscovery.com (Mike Clarkson) Date: Fri, 30 Mar 2001 06:58:57 -0800 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <15042.8087.392696.683721@cj42289-a.reston1.va.home.com> References: <985724354.4243.0.camel@eddie> <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> Message-ID: <3.0.6.32.20010330065857.007c7140@popd.ix.netcom.com> At 12:29 PM 3/28/01 -0500, you wrote: > >[Adding David Faure to the recipients list.] > Here's the problem: What we want is to be able to say >"ANY-and-we-really-mean-it", not ANY as defined in the DTD language. >That definition tells us that ANY means anything *defined in the DTD*, >which is pretty limited -- this is an inherited SGML wart. I don't >know how to express what we actually want in the DTD language; if >anyone can tell me, I'd be glad to change the DTD for revision 1.1. Isn't the canonical "solution" to this: . ]]> That's legal in terms of an ANY definition of isn't it? We do this because we also don't want to parse the contents of the metadata. Or if the contents of the tag stores non-conforming HTML, such as a user-generated description or comment, is that not the same problem: in HTML 2.0

. ]]> Even if it were conforming XML, we'd want to mask it off anyway to protect it from being parsed by the DOM; we want to treat it as a chunk that gets replaced en masse when the user decides to. Mike. From ken@bitsko.slc.ut.us Fri Mar 30 15:28:55 2001 From: ken@bitsko.slc.ut.us (Ken MacLeod) Date: 30 Mar 2001 09:28:55 -0600 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: Lars Marius Garshol's message of "29 Mar 2001 23:40:39 +0200" References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> Message-ID: Lars Marius Garshol writes: > * Fred L. Drake, Jr. > | > | Having joined the ranks of DOM implementors myself, I can only wish > | I missed it. ;-( > > So do I. In fact, if you're writing a DOM implementation I suggest > that you stop and that we design an API more suitable to whatever it > is we want to do. Python really could use a better tree XML API than > the DOM. Pyxie looks good, but it needs work. JDOM also looks good. I have one such API implemented in Orchard. The API is described in [1], and the Python implementation available from [2]. I also have a C implementation there as well, but the C <-> Python bridge is not available yet. I've briefly mentioned Orchard here a couple of times, but not in the context of its DOM or SAX APIs, because I'd presumed, apparently incorrectly, that there'd be little interest in less Java-ish APIs when Python has very solid SAX and DOM bindings with several implementations. Orchard implements the "node based" SAX we've discussed here before, where the nodes used in SAX are the same nodes used to build a tree. Implementing a pull-parser with nodes is a minor addition, but hasn't been specced yet. Orchard has a "grove-like" feel to it (intentionally) and allows for compatible subsets (like Common XML) or supersets (like Jonathan Borden's XSet or parsed-syntax information). 
Orchard's XML node semantics are intended to be compatible with DOM, such that bi-directional wrappers are both possible and shouldn't be too difficult. For example, I would like to use the W3C DOM Test Suite, via a wrapper, to certify the Orchard tree implementation. Orchard's SAX-like interface has been lightly tested against Java-style SAX parsers using SAX<->Orchard filters, and is also intended to be fully compatible. Orchard grew out of the need to implement this style of SAX and DOM for Perl's bindings. Over the last couple of years my writing has been split fairly evenly between Perl and Python and I wanted to be able to use this style of API in my Python applications as well. I figured even if it were just for myself, I'd be happy ;-) Let me know what you think, -- Ken [1] [2] From rsalz@zolera.com Fri Mar 30 17:17:45 2001 From: rsalz@zolera.com (Rich Salz) Date: Fri, 30 Mar 2001 12:17:45 -0500 Subject: [XML-SIG] Re: Documentation References: <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <15042.8394.88965.369473@cj42289-a.reston1.va.home.com> <15043.44619.964310.42196@cj42289-a.reston1.va.home.com> <3AC3C8AF.28D02DCC@zolera.com> <15043.53051.338378.384859@cj42289-a.reston1.va.home.com> <200103300614.f2U6E3W01048@mira.informatik.hu-berlin.de> Message-ID: <3AC4BFB9.F7E7E69F@zolera.com> > There is still a lot of stuff missing, though: and, for those of us living on the bleeding edge, xpath and xslt. :) And just in case it needs to be said, "I mean no disrespect." I'd much rather have undocumented code than no code. Fred's link is very useful -- it shows that there is an overall harness in which to put the docs. Probably early next week I'll have XML Canonicalization code ready to contribute. And now I know what to write up, too.
/r$ From larsga@garshol.priv.no Fri Mar 30 18:23:59 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 30 Mar 2001 20:23:59 +0200 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <3.0.6.32.20010330065857.007c7140@popd.ix.netcom.com> References: <985724354.4243.0.camel@eddie> <985724354.4243.0.camel@eddie> <200103272130.f2RLUpu04246@mira.informatik.hu-berlin.de> <3.0.6.32.20010330065857.007c7140@popd.ix.netcom.com> Message-ID: * Mike Clarkson | | Isn't the canonical "solution" to this: | | | . | ]]> | No, this is not very nice, I think, because the information inside the CDATA marked section will then have to be reparsed. It is much better to redefine the DTD and just use an extended DTD. Well-written XBEL applications that don't understand this stuff should just ignore it, and those that understand it would much prefer to have the content parsed as part of the document. | That's legal in terms of an ANY definition of isn't it? Yes, it is. | We do this because we also don't want parse the contents of the | metadara. Well, if you _really_ don't it works, but... --Lars M. From martin@loewis.home.cs.tu-berlin.de Fri Mar 30 19:13:21 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Fri, 30 Mar 2001 21:13:21 +0200 Subject: [XML-SIG] New Developer Message-ID: <200103301913.f2UJDLC02061@mira.informatik.hu-berlin.de> Please welcome Rich Salz as a PyXML developer. He'll look into the xml.xslt and xml.xpath packages, as well as into authoring documentation. Regards, Martin From jtauber@bowstreet.com Sat Mar 31 00:58:29 2001 From: jtauber@bowstreet.com (James Tauber) Date: Fri, 30 Mar 2001 19:58:29 -0500 Subject: [XML-SIG] PyTREX as xml.schema.trex? Message-ID: Now that PyTREX (http://pytrex.sourceforge.net/) is beta, is there any interest in making it part of PyXML? James From fdrake@acm.org Sat Mar 31 06:38:34 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Sat, 31 Mar 2001 01:38:34 -0500 (EST) Subject: [XML-SIG] New Developer In-Reply-To: <200103301913.f2UJDLC02061@mira.informatik.hu-berlin.de> References: <200103301913.f2UJDLC02061@mira.informatik.hu-berlin.de> Message-ID: <15045.31594.466578.261510@beowolf.pythonlabs.org> Martin v. Loewis writes: > Please welcome Rich Salz as a PyXML > developer. He'll look into the xml.xslt and xml.xpath packages, as > well as into authoring documentation. Hurray, documentation! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From martin@loewis.home.cs.tu-berlin.de Sat Mar 31 12:29:53 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 31 Mar 2001 14:29:53 +0200 Subject: [XML-SIG] PyTREX as xml.schema.trex? In-Reply-To: (message from James Tauber on Fri, 30 Mar 2001 19:58:29 -0500) References: Message-ID: <200103311229.f2VCTrY07215@mira.informatik.hu-berlin.de> > Now that PyTREX (http://pytrex.sourceforge.net/) is beta, is there any > interest in making it part of PyXML? Certainly. I have imported it into /xml/xml/schema/trex.py, as you've proposed. I also made you a developer, so you can update it as needed. I have not incorporated the test suite. If you want to ship it with PyXML, you should import it into xml/test. If you merely want to provide a copy to PyXML, you can also import it into /test/trex. Or, you can leave that alone, so that everybody would get the test suite from the pytrex CVS. Thanks for the contribution, Martin From larsga@garshol.priv.no Sat Mar 31 13:12:30 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 31 Mar 2001 15:12:30 +0200 Subject: [XML-SIG] xmlproc in PyXML CVS tree Message-ID: I have now, finally, moved xmlproc into the PyXML CVS tree, where it will be maintained from now on. I will no longer maintain it separately in my own CVS tree, but use the PyXML tree for this. To this end the xmlproc test suite has been added to the PyXML CVS tree as a separate top-level project called 'test'. 
This test suite, and especially the one named 'oasis' should be used to verify any changes made to xmlproc to ensure that they do not break anything. Please note that the test suite is about 20 MB, so it's a substantial download. --Lars M. From uche.ogbuji@fourthought.com Sat Mar 31 13:37:14 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 31 Mar 2001 06:37:14 -0700 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: Message from David Faure of "Thu, 29 Mar 2001 17:35:54 +0100." <200103291635.f2TGZsr27038@faure.worldonline.co.uk> Message-ID: <200103311337.GAA06268@localhost.localdomain> > One often requested feature, is for merging. For instance, in a company, > there could be a "company-global" set of bookmarks, to be merged with > the user's bookmarks - much like everything else in KDE already has > a global and a local directory, possibly with even more levels (e.g. for > groups of people). Yes. I actually implemented an off-line merge earlier, but I think a standardized merge indicator would be useful. > To make that possible, XBEL could have a sort of "include this other > bookmark collection" tag, and it could be up to the application to create > aliases towards those global bookmarks in the user's bookmark file. > Well, that's just one solution - it allows to change the order, to remove > a global bookmark, to insert its own anywhere... but it doesn't notice new > bookmarks in the global collection, unless some timestamp is used. > > Another way could be that including another set of bookmarks simply means > that all those bookmarks appear first, then those in the user's file. > This way, changes to the global collection are automatically taken into account, > but it's impossible to modify/remove/reorder/change anything in the global > collection. It's probably much easier to implement too, and has the exact semantic > of a #include. I suggest to add this to XBEL then: a simple > . 
> There's still the issue of relative paths vs absolute paths, but, well... > no solution here either :} That should instead be spelled Or such, so that processors that don't have first-class merge support can still include the other file through xinclude. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From uche.ogbuji@fourthought.com Sat Mar 31 13:40:21 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 31 Mar 2001 06:40:21 -0700 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: Message from "Ross Burton" of "Thu, 29 Mar 2001 17:56:57 +0100." <005601c0b871$4a632280$1501a8c0@180sw.com> Message-ID: <200103311340.GAA06282@localhost.localdomain> > > In summary, despite the compatibility problem with icon names (and paths), > > I'm very happy if icon="..." and toolbar="yes" are added to XBEL > > (given that Konqueror already uses those), I suggest to add an > > possibility, and the few other things that are not in XBEL and that might > > be in konqueror one day (keywords, scoring), can certainly be done as > > konq-specific metadata - unless others want to share the same data. > > I'm for creating a set of metadata owners which can be considered "standard" > in that they are defined under common grounds in the open. Not part of the > actual specification (as it's best if that is kept small) but a catalogue of > owners and expected content which would allow sharing of data. Into this > could be added all of the usefull attributes which could be shared without > making XBEL overly complex, such as keywords and scoring. Though some might not like it, this sounds like a job for namespaces, and for m12n such as that provided by RSS (which is a rocking success, BTW for open-source interoperability). Speaking of RSS, I think XBEL is properly a job for RDF. 
It could be converted to RDF without a lot of damage to its structure. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From uche.ogbuji@fourthought.com Sat Mar 31 13:54:35 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 31 Mar 2001 06:54:35 -0700 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: Message from Ken MacLeod of "30 Mar 2001 09:28:55 CST." Message-ID: <200103311354.GAA06314@localhost.localdomain> > I have one such API implemented in Orchard. The API is described in > [1], and the Python implementation available from [2]. I also have a > C implementation there as well, but the C <-> Python bridge is not > available yet. > > I've briefly mentioned Orchard here a couple of times, but not in the > context of its DOM or SAX APIs, because I'd presumed, apparently > incorrectly, that there'd be little interest in less Java-ish APIs > when Python has very solid SAX and DOM bindings with several > implementations. Oh come now. Fred and I are both DOM implementors who have expressed strong discontent with the DOM. I'm all for a better tree API. However, I'd like one, but *not* based on JDOM, but rather 100% Pythonic. I think to do otherwise is to risk continuing the performance and resource-hogging properties of straightforward DOM ports. > Orchard implements the "node based" SAX we've discussed here before, > where the nodes used in SAX are the same nodes used to build a tree. > Implementing a pull-parser with nodes is a minor addition, but hasn't > been specced yet. I think a Lisp approach to storing the nodes is an interesting idea, given Python's strong list processing. Basically, just a straightforward translation of the parameters of SAX events (plus node-type) into nested lists. 
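The nested-list idea just described can be sketched roughly as follows. This is only a straw man under stated assumptions: the node layout [node-type, name, attributes, children] and the names NestedListBuilder and parse_to_lists are made up for illustration and are not part of any proposal in this thread. It uses the standard xml.sax module.

```python
import xml.sax


class NestedListBuilder(xml.sax.ContentHandler):
    """Turn SAX events into nested lists.

    Element nodes become ["element", name, attributes, children];
    character data becomes ["text", content]. The layout is an
    assumption made for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # currently open element nodes, innermost last

    def startElement(self, name, attrs):
        node = ["element", name, dict(attrs.items()), []]
        if self._stack:
            self._stack[-1][3].append(node)  # attach to the open parent
        else:
            self.root = node
        self._stack.append(node)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # A parser may deliver one run of text in several chunks;
        # each chunk is stored as its own text node here.
        if self._stack:
            self._stack[-1][3].append(["text", content])


def parse_to_lists(text):
    """Parse an XML string and return the root node as nested lists."""
    handler = NestedListBuilder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.root


tree = parse_to_lists('<folder id="f1"><title>Python</title></folder>')
```

Because every node is a plain list, ordinary indexing, slicing and list comprehensions are all that is needed to walk the result, which is the appeal of the approach; namespaces, comments and processing instructions are ignored in this sketch.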
Probably not exactly what we'd want for a PyDOM, but an easy straw man to build. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From jtauber@bowstreet.com Sat Mar 31 14:01:11 2001 From: jtauber@bowstreet.com (James Tauber) Date: Sat, 31 Mar 2001 09:01:11 -0500 Subject: [XML-SIG] Metadata in XBEL Message-ID: > Oh come now. Fred and I are both DOM implementors who have > expressed strong > discontent with the DOM. I'm all for a better tree API. > > However, I'd like one, but *not* based on JDOM, but rather > 100% Pythonic. I > think to do otherwise is to risk continuing the performance and > resource-hogging properties of straightforward DOM ports. Agreed. JDOM came about because DOM's language neutrality led to inefficiencies for particular languages. JDOM attempted to take advantage of the specifics of Java. A tree API should take advantage of the specifics of Python. James From uche.ogbuji@fourthought.com Sat Mar 31 14:03:04 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 31 Mar 2001 07:03:04 -0700 Subject: [XML-SIG] PyTREX as xml.schema.trex? In-Reply-To: Message from James Tauber of "Fri, 30 Mar 2001 19:58:29 EST." Message-ID: <200103311403.HAA06336@localhost.localdomain> > Now that PyTREX (http://pytrex.sourceforge.net/) is beta, is there any > interest in making it part of PyXML? Are you kidding? Absolutely! Disclaimer: I haven't had a moment to try it yet, though I hope to soon. I swotted over the TREX specs without the benefit of a friendly implementation to play with, and PyTREX should be fun to poke at. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. 
C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From jtauber@bowstreet.com Sat Mar 31 14:08:45 2001 From: jtauber@bowstreet.com (James Tauber) Date: Sat, 31 Mar 2001 09:08:45 -0500 Subject: [XML-SIG] PyTREX as xml.schema.trex? Message-ID: > > Now that PyTREX (http://pytrex.sourceforge.net/) is beta, is there any > > interest in making it part of PyXML? > > Certainly. I have imported it into /xml/xml/schema/trex.py, as you've > proposed. I also made you a developer, so you can update it as > needed. I have not incorporated the test suite. If you want to ship it > with PyXML, you should import it into xml/test. If you merely want to > provide a copy to PyXML, you can also import it into /test/trex. Or, > you can leave that alone, so that everybody would get the test suite > from the pytrex CVS. Thank you! I think I'll leave the test suite separate for now. I'll probably import it later, though. Have people typically maintained a parallel CVS and done separate releases for a while? i.e. should I make changes to both pytrex/pytrex.py and /xml/xml/schema/trex.py and continue to do releases from pytrex? James From jtauber@bowstreet.com Sat Mar 31 14:13:47 2001 From: jtauber@bowstreet.com (James Tauber) Date: Sat, 31 Mar 2001 09:13:47 -0500 Subject: [XML-SIG] PyTREX as xml.schema.trex? Message-ID: One more thing... Hints on how best to whip up some documentation? I would only need to write up a single page saying how to invoke the validator and interpret the return object. James From jtauber@bowstreet.com Sat Mar 31 14:14:07 2001 From: jtauber@bowstreet.com (James Tauber) Date: Sat, 31 Mar 2001 09:14:07 -0500 Subject: [XML-SIG] RDF? Message-ID: Where are we with standard RDF support in Python? Any work being done? Any interest in Dan Krech and I donating the RDF library within Redfoot (http://redfoot.sourceforge.net/)? Possibly merging with other RDF implementations. Uche? James From uche.ogbuji@fourthought.com Sat Mar 31 14:23:59 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 31 Mar 2001 07:23:59 -0700 Subject: [XML-SIG] PyTREX as xml.schema.trex? In-Reply-To: Message from James Tauber of "Sat, 31 Mar 2001 09:08:45 EST." Message-ID: <200103311423.HAA06459@localhost.localdomain> > Have people typically maintained a parallel CVS and done separate releases > for a while? i.e. should I make changes to both pytrex/pytrex.py and > /xml/xml/schema/trex.py and continue to do releases from pytrex? Yes. This is how 4DOM, and now 4XSLT and 4XPath have migrated into PyXML core.
4DOM had parallel CVS for almost 6 months, and according to current schedule 4XSLT/XPath will have parallel CVS for about 2 months (until the 4Suite 1.0 release ca. June 1). -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From uche.ogbuji@fourthought.com Sat Mar 31 14:28:50 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sat, 31 Mar 2001 07:28:50 -0700 Subject: [XML-SIG] RDF? In-Reply-To: Message from James Tauber of "Sat, 31 Mar 2001 09:14:07 EST." Message-ID: <200103311428.HAA06470@localhost.localdomain> > > Where are we with standard RDF support in Python? Any work being done? Any > interest in Dan Krech and I donating the RDF library within Redfoot > (http://redfoot.sourceforge.net/)? Possibly merging with other RDF > implementations. Well, we have 4RDF as well, but there's no reason why we can't have multiple RDF implementations. We could merge implementations, and I really have no problem with that, but since the core RDF model is subject to so many interpretations, I think it's an especially good idea to have parallel implementations. The question is: who wants RDF in PyXML? And would you prefer to start with a lightweight solution, or one with all the trimmings? -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA Software-engineering, knowledge-management, XML, CORBA, Linux, Python From tpassin@home.com Sat Mar 31 15:02:29 2001 From: tpassin@home.com (Thomas B. Passin) Date: Sat, 31 Mar 2001 10:02:29 -0500 Subject: [XML-SIG] RDF? 
References: <200103311428.HAA06470@localhost.localdomain> Message-ID: <001c01c0b9f3$9bc5d900$7cac1218@reston1.va.home.com> Uche Ogbuji > > Well, we have 4RDF as well, but there's no reason why we can't have multiple > RDF implementations. > Yes, yes > We could merge implementations, and I really have no problem with that, but > since the core RDF model is subject to so many interpretations, I think it's > an especially good idea to have parallel implementations. > yes again. > The question is: who wants RDF in PyXML? And would you prefer to start with a > lightweight solution, or one with all the trimmings? > It's a good question: should PyXML be only the "real" core XML infrastructure (parsing, DOM, and so on), or also cover major support areas for applications like RDF? If not, that sounds like a new SIG. That might be a good idea in the long run, but we would need more keen minds blasting stuff out first. I think that the SIG shouldn't be split at this time, so I'd say to keep RDF in it. And many thanks to James for this work and his generous offer to share it. Cheers, Tom P From ken@bitsko.slc.ut.us Sat Mar 31 15:03:11 2001 From: ken@bitsko.slc.ut.us (Ken MacLeod) Date: 31 Mar 2001 09:03:11 -0600 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: Uche Ogbuji's message of "Sat, 31 Mar 2001 06:54:35 -0700" References: <200103311354.GAA06314@localhost.localdomain> Message-ID: Uche Ogbuji writes: > However, I'd like [a Pythonic DOM], but *not* based on JDOM, but > rather 100% Pythonic. I think to do otherwise is to risk continuing > the performance and resource-hogging properties of straightforward > DOM ports. The Orchard API is "near pure" Pythonic, using only objects for nodes, and arrays and mappings for NodeLists and NamedNodeLists. No DOM-style iterators and manipulation. Other behaviors, which do not replicate built-in Python features (like normalize()) are retained.
"Near pure" means two things: 1) Orchard uses accessor overrides on attribute lookups, to make things like element.tag_name map properly to element.prefix and element.local_name, and 2) in refactoring the node base class to work across non-XML nodes (like RSS or MPEG), certain "intrinsic" properties, like Parent and NodeType are moved into a separate namespace. Orchard provides namespaced-attributes on all nodes. Not a necessity for just XML nodes, but a big win for other formats. Namespaced attributes are accessed using a tuple key, accessing the node with mapping syntax:

    dublin_core = "http://purl.org/dc/elements/1.1/"
    print rss_channel[(dublin_core, 'creator')]

I've recently come up with an idea for an Orchard.namespace() function which creates a name generator, so the above is more simply now:

    DC = Orchard.namespace("http://purl.org/dc/elements/1.1/")
    print rss_channel[DC.creator]

for all XML local-names which are valid Python tokens. I've got to run now, so I'll do some more evangelizing later ;-) -- Ken From fdrake@acm.org Sat Mar 31 16:14:08 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Sat, 31 Mar 2001 11:14:08 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <200103311354.GAA06314@localhost.localdomain> References: <200103311354.GAA06314@localhost.localdomain> Message-ID: <15046.592.612106.548532@beowolf.pythonlabs.org> Uche Ogbuji writes: > I think a Lisp approach to storing the nodes is an interesting idea, given > Python's strong list processing. Basically, just a straightforward > translation of the parameters of SAX events (plus node-type) into nested > lists. Probably not exactly what we'd want for a PyDOM, but an easy straw > man to build. There is xml.utils.qp_xml, which is very lightweight. Perhaps that should be examined more carefully? I think one thing we need to consider before we settle on any particular API is, how much abstraction should we provide, and how much should we expose the lexical details?
One thing we've found while working with Zope is that while abstract is nice, we usually want to work our transformations in a near surgical manner -- the less we change about the input, the better. This is especially important if we're feeding a WebDAV client, where a human is relatively likely to want to view or edit the source text we generate. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From martin@loewis.home.cs.tu-berlin.de Sat Mar 31 16:54:57 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 31 Mar 2001 18:54:57 +0200 Subject: [XML-SIG] PyTREX as xml.schema.trex? In-Reply-To: (message from James Tauber on Sat, 31 Mar 2001 09:13:47 -0500) References: Message-ID: <200103311654.f2VGsvw08220@mira.informatik.hu-berlin.de> > Hints on how best to whip up some documentation? I would only need > to write up a single page saying how to invoke the validator and > interpret the return object. I think you've got two options: you can either commit something into the PyXML www pages (checkout www from pyxml, on commit, a cron job should pick it up automatically after 6 hours); alternatively, you could write a section in doc/xml-howto.tex. That will take some more time to propagate, since I (or Andrew) has to forward this change to the Python HOWTOs. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Sat Mar 31 16:50:55 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 31 Mar 2001 18:50:55 +0200 Subject: [XML-SIG] PyTREX as xml.schema.trex? In-Reply-To: (message from James Tauber on Sat, 31 Mar 2001 09:08:45 -0500) References: Message-ID: <200103311650.f2VGotu08217@mira.informatik.hu-berlin.de> > Have people typically maintained a parallel CVS and done separate releases > for a while? i.e. should I make changes to both pytrex/pytrex.py and > /xml/xml/schema/trex.py and continue to do releases from pytrex? People maintain stuff in parallel all the time.
If you don't want to lose your hair over it, you'd better use CVS tags to indicate when you've copied changes from one tree to the other. That lets you find out whether there have been independent or overlapping changes in either tree. Specifically for PyXML, at least the following pieces had different homes at some time: - xmlproc (Lars gave up his own CVS tree just now) - 4DOM (has been long in the Fourthought CVS) - pyexpat.c (primarily lives in Python CVS, with copies in PyXML and Zope) - expat (lives in expat CVS, and is updated in PyXML only occasionally) - xml-howto.tex/xml-ref.tex (lives primarily in the Python HOWTOs) - the core of xml.sax, and minidom (maintained in Python CVS, copied into PyXML - sometimes vice versa) As for what you should do with PyTREX: that's your own decision. If you expect to move on a fast pace, I recommend to keep your own project - it might take some time until PyXML 0.7 is released. There is then no need to commit every single change into the PyXML copy as well. I'll give advance warning of a 0.7 release (likely after the 4XSLT issues have been settled). Regards, Martin From fdrake@acm.org Sat Mar 31 16:59:52 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Sat, 31 Mar 2001 11:59:52 -0500 (EST) Subject: [XML-SIG] PyTREX as xml.schema.trex? In-Reply-To: References: Message-ID: <15046.3336.163741.378763@beowolf.pythonlabs.org> James Tauber writes: > Hints on how best to whip up some documentation? I would only need to write > up a single page saying how to invoke the validator and interpret the return > object. My recommendation is to use the Python LaTeX format. Not because I think it's a good format (though I don't think it's particularly bad), but because it will be easy to integrate with other parts of the Python documentation.
I'm starting to move once more on turning all the documentation into an XML format for authoring, and there will be a tool to convert the Python LaTeX markup into XML with very little manual intervention. I'm also starting to actually learn XSLT so I can start making use of an XML version of the docs. So there is hope. Regarding PyXML documentation, what I'd like to do is to create two documents (based in part on the existing documentation in PyXML and the Python Library Reference). The first document (ok, probably a set of smaller docs) will give the Python bindings of common XML APIs: DOM, SAX2, etc. The second will be reference documentation for utility modules and implementation-specific extensions of the standard APIs. The tutorial HOWTO will remain a separate document. -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From akuchlin@mems-exchange.org Sat Mar 31 18:34:14 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Sat, 31 Mar 2001 13:34:14 -0500 Subject: [XML-SIG] RDF? In-Reply-To: <001c01c0b9f3$9bc5d900$7cac1218@reston1.va.home.com>; from tpassin@home.com on Sat, Mar 31, 2001 at 10:02:29AM -0500 References: <200103311428.HAA06470@localhost.localdomain> <001c01c0b9f3$9bc5d900$7cac1218@reston1.va.home.com> Message-ID: <20010331133414.A32562@ute.cnri.reston.va.us> On Sat, Mar 31, 2001 at 10:02:29AM -0500, Thomas B. Passin wrote: >It's a good question: should PyXML be only the "real" core XML >infrastructure (parsing, DOM, and so on), or also cover major support areas for >applications like RDF? If not, that sounds like a new SIG. That might be a We've discussed applications on the XML-SIG before; just in the last few days we've had all that discussion of XBEL, for example. Discussion of building XML-related things in Python is on-topic for this SIG, in my view, even if the system isn't a candidate for inclusion in the PyXML distribution.
RDF looks like it's going to be common and fundamental enough that IMHO there should be some support for it in the basic package. I'd lean toward making it minimal instead of full-featured, but it doesn't look like RDF needs that large an API; neither 4RDF nor Redfoot has a very large API, though I haven't looked at them very closely. --amk From akuchlin@mems-exchange.org Sat Mar 31 18:36:15 2001 From: akuchlin@mems-exchange.org (Andrew Kuchling) Date: Sat, 31 Mar 2001 13:36:15 -0500 Subject: [XML-SIG] Documentation In-Reply-To: <200103311654.f2VGsvw08220@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Mar 31, 2001 at 06:54:57PM +0200 References: <200103311654.f2VGsvw08220@mira.informatik.hu-berlin.de> Message-ID: <20010331133615.B32562@ute.cnri.reston.va.us> On Sat, Mar 31, 2001 at 06:54:57PM +0200, Martin v. Loewis wrote: >could write a section in doc/xml-howto.tex. That will take some more >time to propagate, since I (or Andrew) has to forward this change to >the Python HOWTOs. We should really change that, though; the master copy should live in the pyxml CVS on SourceForge, close to the code it documents. --amk From fdrake@acm.org Sat Mar 31 19:21:47 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Sat, 31 Mar 2001 14:21:47 -0500 (EST) Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <200103311337.GAA06268@localhost.localdomain> References: <200103291635.f2TGZsr27038@faure.worldonline.co.uk> <200103311337.GAA06268@localhost.localdomain> Message-ID: <15046.11851.341533.770037@beowolf.pythonlabs.org> Uche Ogbuji writes: > Yes. I actually implemented an off-line merge earlier, but I think a > standardized merge indicator would be useful. To make this meaningful, do we need more discussion of what "merge" means, or should this be left entirely to clients? I'm inclined to think we need a good description of the expected range of application and motivation, and the rest can be left to specific applications.
> That should instead be spelled > > > > Or such, so that processors that don't have first-class merge support can > still include the other file through xinclude. This syntax seems reasonable; I presume we'll want to include some way to mark multiple sources with priorities to determine "who wins" in the presence of multiple sources for a bookmark; some applications will present all versions of a bookmark and others will only want to present one but make the determination based on the bookmark data. I presume this element should be allowed in both and elements. Do we want to do this in XBEL 1.1 or wait for more experience before adding it? -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From rsalz@zolera.com Sat Mar 31 19:30:53 2001 From: rsalz@zolera.com (Rich Salz) Date: Sat, 31 Mar 2001 14:30:53 -0500 Subject: [XML-SIG] RDF? Message-ID: <200103311930.OAA14630@zolera.com> I agree that anything having to do with Python implementation of XML can be discussed here. As for what should be in PyXML... I used to be quite sure: core technology only. Sure, there were those nagging doubts -- what is 'core,' other than 'I know it when I see it' -- but I was pretty confident. Now, I'm not so sure. (It was consideration of xmlrpc.) For the end-user, what are the reasons to not include a package under PyXML? I understand the developer issues -- see James's recent thread about PyTrex, for example -- but what does an end-user (i.e., me :) care? Hoping to spark some discussion. /r$ From tpassin@home.com Sat Mar 31 19:59:30 2001 From: tpassin@home.com (Thomas B. Passin) Date: Sat, 31 Mar 2001 14:59:30 -0500 Subject: [XML-SIG] RDF?
References: <200103311428.HAA06470@localhost.localdomain> <001c01c0b9f3$9bc5d900$7cac1218@reston1.va.home.com> <20010331133414.A32562@ute.cnri.reston.va.us> Message-ID: <004201c0ba1d$1945da00$7cac1218@reston1.va.home.com> Andrew Kuchling said - > We've discussed applications on the XML-SIG before; just in the last > few days we've had all that discussion of XBEL, for example. > Discussion of building XML-related things in Python is on-topic for > this SIG, in my view, even if the system isn't a candidate for > inclusion in the PyXML distribution. > Yes, me too. Tom P From martin@loewis.home.cs.tu-berlin.de Sat Mar 31 20:45:50 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sat, 31 Mar 2001 22:45:50 +0200 Subject: [XML-SIG] RDF? In-Reply-To: <200103311930.OAA14630@zolera.com> (message from Rich Salz on Sat, 31 Mar 2001 14:30:53 -0500) References: <200103311930.OAA14630@zolera.com> Message-ID: <200103312045.f2VKjoc09005@mira.informatik.hu-berlin.de> > Now, I'm not so sure. (It was consideration of xmlrpc.) For the > end-user, what are the reasons to not include a package under PyXML? > I understand the developer issues -- see James's recent thread about > PyTrex, for example -- but what does an end-user (i.e., me :) care? End users will only complain about "too much" if the size of the distribution grows unacceptably. Since bandwidth and disk space are going up all the time, this is not a real danger - although a 20MB test suite for xmlproc probably would have been a little too much. They will also complain if the stuff that is there does not work. That, indirectly, limits growth, and means that sometimes things have to be taken out just because the original author ran away, and nobody cares to maintain it (or because the original author did not want to take the blame anymore :-).
Regards, Martin From david@mandrakesoft.com Sat Mar 31 22:21:13 2001 From: david@mandrakesoft.com (David Faure) Date: Sat, 31 Mar 2001 23:21:13 +0100 Subject: [XML-SIG] Metadata in XBEL In-Reply-To: <15046.11851.341533.770037@beowolf.pythonlabs.org> References: <200103311337.GAA06268@localhost.localdomain> <15046.11851.341533.770037@beowolf.pythonlabs.org> Message-ID: <200103312221.f2VMLEX02984@faure.worldonline.co.uk> On Saturday 31 March 2001 20:21, Fred L. Drake, Jr. wrote: > Uche Ogbuji writes: > > Yes. I actually implemented an off-line merge earlier, but I think a > > standardized merge indicator would be useful. What's an off-line merge? > To make this meaningful, do we need more discussion of what "merge" > means, or should this be left entirely to clients? I'm inclined to > think we need a good description of the expected range of application > and motivation, and the rest can be left to specific applications. > > > That should instead be spelled > > > > > > > > Or such, so that processors that don't have first-class merge support can > > still include the other file through xinclude. > > This syntax seems reasonable; I presume we'll want to include some > way to mark multiple sources with priorities to determine > "who wins" in the presence of multiple sources for a bookmark; some > applications will present all versions of a bookmark and others will > only want to present one but make the determination based on the > bookmark data. > I presume this element should be allowed in both > and elements. Do we want to do this in XBEL 1.1 or wait for > more experience before adding it? I agree that the whole issue probably needs more thinking if we want to get it right and devise a complete merging mechanism. I was simply suggesting an easy solution (including another file) - but that definitely doesn't go as far as a full merging, plus the possibility to "hide" included bookmarks, etc.
I'm fine with this being left out of XBEL 1.1, and we can come back to it when someone starts implementing it, or if someone has a mechanism to suggest. -- David FAURE, david@mandrakesoft.com, faure@kde.org http://perso.mandrakesoft.com/~david/, http://www.konqueror.org/ KDE, Making The Future of Computing Available Today