From akuchlin@cnri.reston.va.us Mon Mar 16 20:28:37 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 16 Mar 1998 15:28:37 -0500 (EST)
Subject: [XML-SIG] saxlib.py, package structure, & HOWTO outline
Message-ID: <199803162028.PAA06111@cnri.reston.va.us>

This message bounces around all over the place.

The current status of the XML-SIG is quite promising; we've already
got prototype implementations of the two XML APIs (SAX and DOM), and a
prototype interface to XMLTok.

A bit of explanation (that will probably get recycled into the HOWTO):
SAX and DOM are two sides of the same coin; they're different ways to
access representations of XML documents. DOM is a tree-based
representation, so you have the whole document in memory at once
(unless you either do something extremely clever with lazy
construction of the tree, or place constraints on how you can traverse
the tree and build it on the fly). SAX is an event-based API, so you
write callbacks, which get called by the XML parser as elements begin
and end. Both are useful for different tasks; you can wander all over
the tree at random with DOM, but SAX is lower-level and lets you
construct only the data structures you require--perhaps none at all.

This distinction is nicely explained at . I'll add a link to this page
to the XML-SIG's Resources page, at . Suggestions for more links are
welcome.

I've taken a brief look at saxlib.py, and it looks very neat and
understandable; I quite agree with Paul Prescod's favorable impression
of it. What's missing from it? As far as I can tell, documentation is
the only thing missing, but I'm no XML expert. Tutorial information on
SAX seems hard to come by, but that's what the HOWTO will be for...

One minor nit: saxdemo.py has a problem with the following lines.

    import xmlproc
    p=xmlproc.Parser()

There doesn't seem to be a Parser class or function in xmlproc.py, so
an AttributeError is raised. Have I messed something up?
I've done no more than download Stefane Fermigier's DOM code; haven't
actually looked at it yet. One thing I've noticed is that it uses
packages ("from dom.transformer import *") while the SAX library just
uses top-level modules. Perhaps we should try to pin down the layout
of the XML package first. Should there be subpackages (XML.SAX.foo,
XML.DOM.foo, ...) or is it enough to put everything in a package named
'XML'?

Fuzzily thinking about the organization of an XML-HOWTO, my outline
looks like:

Overview: (a few paragraphs)
    What is XML? Why do you care?

Introduction to XML: (a few pages)
    Extremely brief intro to XML syntax & ideas, w/ pointers to
    complete resources

Glossary:
    Glossaries usually come at the end, but there are enough
    acronyms and concepts that it might be better placed here.

DOM:
    The tree-based interface to XML documents.
    Explanations, sample code, ...

SAX:
    The event-based interface.
    Explanations, sample code, ...

A.M. Kuchling http://starship.skyport.net/crew/amk/
Technology is a gift of God. After the gift of life it is perhaps the
greatest of God's gifts. It is the mother of civilizations, of arts
and of sciences.
-- Freeman Dyson, _Infinite in All Directions_

From papresco@technologist.com Mon Mar 16 23:13:54 1998
From: papresco@technologist.com (Paul Prescod)
Date: Mon, 16 Mar 1998 18:13:54 -0500
Subject: [XML-SIG] saxlib and enumerations
Message-ID: <350DB232.F882058D@technologist.com>

Okay, I've translated a Java SAX program to Python to make sure that our
interface is the same as theirs. After some fiddling with my JPython
installation, I successfully parsed an XML document based on a Java
parser (Microstar's AELFred). I was hoping to be able to parse using
Python programs with no change, but failed in my first test, with
xmllib.py and xlsax.py.

Here's what my Java-translated code to handle attributes looks like:

    attNames = atts.getAttributeNames()
    while (attNames.hasMoreElements()):
        aname = attNames.nextElement()
        ...
attNames is a Java "enumeration object". Of course the Python equivalent
is a sequence. So the code would look like this if I was using a Python
parser:

    attNames = atts.getAttributeNames()
    for aname in attNames:
        ...

I can think of three ways to address this:

#1. We could abandon the idea of using Java parsers directly in
JPython, and always require a thin "wrapper" that translates Java
enumerations to Python sequences.

#2. We could ask Jim to wrap *all* Java enumerations in Python
sequences.

#3. We could port the Java "enumeration" concept to Python for use with
saxlib.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night." -- Robertson Davies, "The Cunning Man"

From fermigie@math.jussieu.fr Mon Mar 16 23:26:56 1998
From: fermigie@math.jussieu.fr (Stefane Fermigier)
Date: Tue, 17 Mar 1998 00:26:56 +0100
Subject: [XML-SIG] saxlib and enumerations
In-Reply-To: <350DB232.F882058D@technologist.com>; from Paul Prescod on Mon, Mar 16, 1998 at 06:13:54PM -0500
References: <350DB232.F882058D@technologist.com>
Message-ID: <19980317002656.56302@riemann.math.jussieu.fr>

On Mon, Mar 16, 1998 at 06:13:54PM -0500, Paul Prescod wrote:
> Okay, I've translated a Java SAX program to Python to make sure that our
> interface is the same as theirs. After some fiddling with my JPython
> installation, I successfully parsed an XML document based on a Java
> parser (Microstar's AELFred). I was hoping to be able to parse using
> Python programs with no change, but failed in my first test,
> with xmllib.py and xlsax.py.
>
> Here's what my Java-translated code to handle attributes looks like:
>
>     attNames = atts.getAttributeNames()
>     while (attNames.hasMoreElements()):
>         aname = attNames.nextElement()
>         ...
> attNames is a Java "enumeration object". Of course the Python equivalent
> is a sequence. So the code would look like this if I was using a Python
> parser:
>
>     attNames = atts.getAttributeNames()
>     for aname in attNames:
>         ...

Attribute maps are currently implemented, in either saxlib or PyDOM, as
dictionaries, so this would be coded as:

    for aname, avalue in atts.items():

Of course this is not good for other reasons, and should be fixed before
public release.

My point is that I don't want to use a language with sophisticated
builtin data structures like python and just mimic the low-level
interface of a Java library.

> I can think of three ways to address this:
>
> #1. We could abandon the idea of using Java parsers directly in
> JPython, and always require a thin "wrapper" that translates Java
> enumerations to Python sequences.
>
> #2. We could ask Jim to wrap *all* Java enumerations in Python
> sequences.
>
> #3. We could port the Java "enumeration" concept to Python for use with
> saxlib.

There is once again the issue of iterators in python that could be
raised here.

The #1 solution is the only valid one for the short term.

Cheers,

S.

--
Stéfane Fermigier, MdC à l'Université Paris 7. Tel: 01.44.27.61.01 (Bureau).
Mathematician, hacker, bassist. http://www.math.jussieu.fr/~fermigie/
"Life is good for only two things, discovering mathematics and teaching
mathematics." Siméon Poisson.

From digitome@iol.ie Tue Mar 17 12:29:40 1998
From: digitome@iol.ie (Sean Mc Grath)
Date: Tue, 17 Mar 1998 12:29:40 GMT
Subject: [XML-SIG] Showing off the power of Python for XML processing
Message-ID: <199803171229.MAA26765@mail.iol.ie>

First up - I am delighted to see this list come into being!
I look forward to plenty of traffic on this list.

As anyone who has read the article in Dobbs in Feb. will know,
I made a stab at inventing a native Python data structure for
representing the tree structure of an XML document.
I would like to see some discussion as to how best to expose this
tree structure for Python applications. Obviously, DOM will
be one interface but should we limit ourselves to it?

Whether we like it or not, developers selecting scripting
languages for XML processing are going to perform line count
comparisons. I think it would be great to be able to show
how Python code can be a) succinct, b) understandable and
c) maintainable for XML processing.

Some arbitrary notions:-

1) Iterators

In the Python article for Dobbs I provided a __getitem__
at the XML tree level to allow:-

    for ANode in ATree:
        Do something

Good or bad? Would it be better Pythoneze to create a list
of nodes and iterate over that?

    MyNodeList = ATree.GetDescendants()
    for n in MyNodeList:
        Do something

2) Slice operations

I think this is one of the areas where Python can really
shine for XML processing. I write a lot of XML processing
apps and a lot of the processing is driven by context:

"if my parent is a SECT and my grandparent is a CHAP":

    if GetAncestors()[1:3] == ("SECT","CHAP"):
        do something

A healthy collection of primitives for creating such lists
combined with Python's list processing and slicing functionality
is *mouth watering*.

3) Collection Processing

Rarely do any of my XML processing apps stand alone. By
that I mean that they tend to process a collection of XML docs.

    # Process all chap*.xml docs. Print data content of
    # foo elements

    for f in glob.glob ("chap*.xml"):
        for t in LoadXML(f):
            if t.AtElement ("FOO"):
                print GetDataDescendants()

How best to do it?

4) TreeApply

One trick that I have found very useful is an apply() style
helper function for trees.

I have an XMLTreeApply helper function that walks an XML
tree applying the supplied function to all the nodes in the tree.
It proves particularly useful for throwaway lambda functions

    XMLTreeApply (lambda x:if x.AtElement("FOO"): print GetDataDescendants())

5) Exposing XML from non-XML data sources

It is only a matter of time before relational databases and so on natively
provide functionality to expose their data as XML. In the mean-time
wouldn't it be useful if dbm, glob, pstats and even calendar exposed
XML?

Comments???

Sean

From fermigie@math.jussieu.fr Tue Mar 17 13:08:42 1998
From: fermigie@math.jussieu.fr (Stefane Fermigier)
Date: Tue, 17 Mar 1998 14:08:42 +0100
Subject: [XML-SIG] Showing off the power of Python for XML processing
In-Reply-To: <199803171229.MAA26765@mail.iol.ie>; from Sean Mc Grath on Tue, Mar 17, 1998 at 12:29:40PM +0000
References: <199803171229.MAA26765@mail.iol.ie>
Message-ID: <19980317140842.40469@riemann.math.jussieu.fr>

On Tue, Mar 17, 1998 at 12:29:40PM +0000, Sean Mc Grath wrote:
>
> As anyone who has read the article in Dobbs in Feb. will know,

I did, and your book (Parseme.1st) too.

> I made a stab at inventing a native Python data structure for
> representing the tree structure of an XML document. I would
> like to see some discussion as to how best to expose this
> tree structure for Python applications. Obviously, DOM will
> be one interface but should we limit ourselves to it?

In my DOM package, I did represent the tree structure a bit differently
than you did (I use lists where you use pointers to previous and next
sibling). I guess that once an API has been chosen, the difference is
minor.

> Whether we like it or not, developers selecting scripting
> languages for XML processing are going to perform line count
> comparisons. I think it would be great to be able to show
> how Python code can be a) succinct, b) understandable and
> c) maintainable for XML processing.

Absolutely.
> Some arbitrary notions:-
>
> 1) Iterators
>
> In the Python article for Dobbs I provided a __getitem__
> at the XML tree level to allow:-
>
>     for ANode in ATree:
>         Do something
>
> Good or bad? Would it be better Pythoneze to create a list
> of nodes and iterate over that?
>     MyNodeList = ATree.GetDescendants()
>     for n in MyNodeList:
>         Do something

The order in which you traverse a tree is significant; there is no
reason (I guess) to promote one instead of another. I would rather use
an iterator:

    for node in tree.top_down_iterator():
        do_something(node)

> 2) Slice operations
>
> I think this is one of the areas where Python can really
> shine for XML processing. I write a lot of XML processing
> apps and a lot of the processing is driven by context:
>
> "if my parent is a SECT and my grandparent is a CHAP":
>
>     if GetAncestors()[1:3] == ("SECT","CHAP"):
>         do something
>
> A healthy collection of primitives for creating such lists
> combined with Python's list processing and slicing functionality
> is *mouth watering*.

Do you mean:

    node.GetAncestors()

or:

    GetAncestors(node)

? (the difference is only syntactical, but means

> 3) Collection Processing
>
> Rarely do any of my XML processing apps stand alone. By
> that I mean that they tend to process a collection of XML docs.
>
>     # Process all chap*.xml docs. Print data content of
>     # foo elements
>
>     for f in glob.glob ("chap*.xml"):
>         for t in LoadXML(f):
>             if t.AtElement ("FOO"):
>                 print GetDataDescendants()
>
> How best to do it?

Here's how you would do it in my current framework:

    for f in glob.glob ("chap*.xml"):
        p = XmlParser()
        document = p.parse('', f)
        # or: p.parse('', f); document = p.document
        t = MyTranformer()
        t.tranform(document)
        # or: document = t.tranform(document)
        etc...

> 4) TreeApply
>
> One trick that I have found very useful is an apply() style
> helper function for trees.
>
> I have an XMLTreeApply helper function that walks an XML
> tree applying the supplied function to all the nodes in the tree.
> It proves particularly useful for throwaway lambda functions
>
>     XMLTreeApply (lambda x:if x.AtElement("FOO"): print GetDataDescendants())

Yes, that's cool, but it exposes one of the drawbacks of Python wrt
Scheme: you can't use statements in lambda expressions.

Another option is to use a query function:

    for node in document.query_descendants('this.GI == "FOO"'):
        ...

or:

    for node in document.query_descendants(lambda x: x.GI == 'FOO'):
        ...

(not implemented currently)

or:

    for node in document.getElementsByTagName('FOO'):
        ...

^-- this one is in the DOM core specs, and I guess the W3C is working on
an extension of this mechanism.

> 5) Exposing XML from non-XML data sources
>
> It is only a matter of time before relational databases and so on natively
> provide functionality to expose their data as XML. In the mean-time
> wouldn't it be useful if dbm, glob, pstats and even calendar exposed
> XML?

Probably.

One related point: I'm still working on the transformation engine included
in my DOM package. I have two options: use XSL (i.e. compile XSL stylesheets
into Transformer classes), but I personally dislike XSL, or invent a new
transformation language.

What do you think? (I can guess the answer: XSL is a standard, blah, blah,
but ECMAScript is the standard scripting language of XSL, and there is no
current implementation of ECMAScript in Python that I'm aware of. So even
if we can compile basic XSL in Python, we can't cope with embedded JavaScript
without hand-translation).

Cheers,

S.

--
Stéfane Fermigier, MdC à l'Université Paris 7. Tel: 01.44.27.61.01 (Bureau).
Mathematician, hacker, bassist. http://www.math.jussieu.fr/~fermigie/
"Life is good for only two things, discovering mathematics and teaching
mathematics." Siméon Poisson.
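
The traversal ideas raised in this thread (a top-down iterator, an
apply-style helper, and a predicate query over descendants) can be
sketched in a few lines of present-day Python. This is only an
illustration under stated assumptions: the Node class and the names
top_down, tree_apply and query_descendants are hypothetical and are not
the API of PyDOM or of the code from the Dobbs article.

```python
# Hypothetical sketch: Node, top_down, tree_apply and query_descendants are
# illustrative names, not the API of PyDOM or of the Dobbs article code.

class Node:
    def __init__(self, gi, children=None):
        self.gi = gi                      # generic identifier (element name)
        self.children = children or []    # child nodes kept in a plain list


def top_down(node):
    """Yield node, then all of its descendants, in pre-order."""
    yield node
    for child in node.children:
        yield from top_down(child)


def tree_apply(func, node):
    """Call func on every node of the tree, parents before children."""
    for n in top_down(node):
        func(n)


def query_descendants(node, predicate):
    """Return the descendants of node (node itself excluded) that match."""
    return [n for child in node.children
            for n in top_down(child) if predicate(n)]


# A tiny tree: CHAP contains a SECT (holding FOO and P) and another FOO.
tree = Node("CHAP", [Node("SECT", [Node("FOO"), Node("P")]), Node("FOO")])
names = [n.gi for n in top_down(tree)]
foos = query_descendants(tree, lambda n: n.gi == "FOO")
```

Because the traversal order is fixed by the generator rather than by the
caller, a bottom-up variant could be offered alongside it without
changing tree_apply or query_descendants.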
From akuchlin@cnri.reston.va.us Tue Mar 17 16:50:08 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Tue, 17 Mar 1998 11:50:08 -0500 (EST)
Subject: [XML-SIG] saxlib.py, package structure, & HOWTO outline (repost)
Message-ID: <199803171650.LAA26488@newcnri.cnri.reston.va.us>

[I'm reposting this earlier message of mine, because I think many
people joined the list after it went out and haven't seen it. This
message bounces over several topics; some bits are questions that we
should discuss, others are just random musings of mine.]

The current status of the XML-SIG is quite promising; we've already
got prototype implementations of the two XML APIs (SAX and DOM), and a
prototype interface to XMLTok.

A bit of explanation (that will probably get recycled into the HOWTO):
SAX and DOM are two sides of the same coin; they're different ways to
access representations of XML documents. DOM is a tree-based
representation, so you have the whole document in memory at once
(unless you either do something extremely clever with lazy
construction of the tree, or place constraints on how you can traverse
the tree and then build it on the fly). SAX is an event-based API, so
you write callbacks, which get called by the XML parser as elements
begin and end. Both are useful for different tasks; you can wander all
over the tree at random with DOM, but SAX is lower-level and lets you
construct only the data structures you require--perhaps none at all.

This distinction is nicely explained at . I'll add a link to this page
to the XML-SIG's Resources page, at . Suggestions for more links are
welcome.

I've taken a brief look at saxlib.py, and it looks very neat and
understandable; I quite agree with Paul Prescod's favorable impression
of it. What's missing from it? As far as I can tell, documentation is
the only thing missing, but I'm no XML expert. Tutorial information on
SAX seems hard to come by, but that's what the HOWTO will be for...
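
The event-based versus tree-based distinction described above can be
made concrete with a small sketch. Note the assumption: it uses the
xml.sax and xml.dom.minidom modules from today's Python standard
library, not the 1998 saxlib.py or PyDOM prototypes under discussion,
and the sample document, AuthorCollector and the two helper functions
are made up for illustration.

```python
# Sketch only: uses today's standard-library xml.sax and xml.dom.minidom,
# not the 1998 saxlib.py or PyDOM prototypes discussed in this thread.
import xml.sax
from xml.dom import minidom

# Made-up sample document for the illustration.
DOC = ("<quotes>"
       "<quote author='Dyson'>Technology is a gift.</quote>"
       "<quote author='Brock'>Simpler, direct.</quote>"
       "</quotes>")


class AuthorCollector(xml.sax.ContentHandler):
    """Event-based: the parser calls startElement as each element begins."""

    def __init__(self):
        super().__init__()
        self.authors = []   # the only data structure we choose to build

    def startElement(self, name, attrs):
        if name == "quote":
            self.authors.append(attrs.get("author"))


def authors_via_sax(text):
    """Collect author attributes from callbacks; no tree is kept."""
    handler = AuthorCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.authors


def authors_via_dom(text):
    """Tree-based: parse the whole document into memory, then query it."""
    document = minidom.parseString(text)
    return [q.getAttribute("author")
            for q in document.getElementsByTagName("quote")]
```

Both functions return the same list of authors; the SAX version never
holds more than the list itself, while the DOM version keeps the full
tree available for further wandering.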
One minor nit: saxdemo.py has a problem with the following lines.

    import xmlproc
    p=xmlproc.Parser()

There doesn't seem to be a Parser class or function in xmlproc.py, so
an AttributeError is raised. Have I messed something up?

I've done no more than download Stefane Fermigier's DOM code; haven't
actually looked at it yet. One thing I've noticed is that it uses
packages ("from dom.transformer import *") while the SAX library just
uses top-level modules. Perhaps we should try to pin down the layout
of the XML package first. Should there be subpackages (XML.SAX.foo,
XML.DOM.foo, ...) or is it enough to put everything in a package named
'XML'? I'd like to settle the question of package organization first,
so the SAX and DOM implementations can be modified accordingly. Then
I'll try rewriting my quotation-file handling using the new code, and
see what problems I run into.

Fuzzily thinking about the organization of an XML-HOWTO, my outline
looks like:

Overview: (a few paragraphs)
    What is XML? Why do you care?

Introduction to XML: (a few pages)
    Extremely brief intro to XML syntax & ideas, w/ pointers to
    complete resources

Glossary:
    Glossaries usually come at the end, but there are enough
    acronyms and concepts that it might be better placed here.

DOM:
    The tree-based interface to XML documents.
    Explanations, sample code, ...

SAX:
    The event-based interface.
    Explanations, sample code, ...

A.M. Kuchling http://starship.skyport.net/crew/amk/
Technology is a gift of God. After the gift of life it is perhaps the
greatest of God's gifts. It is the mother of civilizations, of arts
and of sciences.
-- Freeman Dyson, _Infinite in All Directions_

From djad022@uce.ac.uk Tue Mar 17 20:08:36 1998
From: djad022@uce.ac.uk (Daniel Biddle)
Date: Tue, 17 Mar 1998 20:08:36 +0000 (GMT)
Subject: [XML-SIG] Showing off the power of Python for XML processing
In-Reply-To: <19980317140842.40469@riemann.math.jussieu.fr> from "Stefane Fermigier" at Mar 17, 98 02:08:42 pm
Message-ID: <199803172010.PAA08021@python.org>

Stefane Fermigier wrote:
> Message-ID: <19980317140842.40469@riemann.math.jussieu.fr>
> Date: Tue, 17 Mar 1998 14:08:42 +0100
> From: Stefane Fermigier
> To: XML-SIG@python.org
> Subject: Re: [XML-SIG] Showing off the power of Python for XML processing
> > 4) TreeApply
> >
> > One trick that I have found very useful is an apply() style
> > helper function for trees.
> >
> > I have an XMLTreeApply helper function that walks an XML
> > tree applying the supplied function to all the nodes in the tree.
> > It proves particularly useful for throwaway lambda functions
> >
> >     XMLTreeApply (lambda x:if x.AtElement("FOO"): print GetDataDescendants())
>
> Yes, that's cool, but it exposes one of the drawbacks of Python wrt Scheme:
> you can't use statements in lambda expressions.

True, but this isn't much of a drawback as Python supports first-class
functions, which can be locally defined and deleted after use:

    def temp(x):
        if x.AtElement("FOO"):
            print GetDataDescendants()
    XMLTreeApply(temp)
    del temp

(I think you've missed some things out of your example, but I've not
tried to add them.)

> for node in document.getElementsByTagName('FOO'):
>     ...
>
> ^-- this one is in the DOM core specs, and I guess the W3C is working on
> an extension of this mechanism.

If it's in the DOM spec, people will probably expect it.

> > 5) Exposing XML from non-XML data sources
> >
> > It is only a matter of time before relational databases and so on natively
> > provide functionality to expose their data as XML.
> > In the mean-time wouldn't it be useful if dbm, glob, pstats and
> > even calendar exposed XML?
>
> Probably.
>

Something like a toXML() method on the objects these packages
return?

> One related point: I'm still working on the transformation engine included
> in my DOM package. I have two options: use XSL (i.e. compile XSL stylesheets
> into Transformer classes), but I personally dislike XSL, or invent a new
> transformation language.

What in particular do you dislike about it?

> What do you think? (I can guess the answer: XSL is a standard, blah, blah,
> but ECMAScript is the standard scripting language of XSL, and there is no
> current implementation of ECMAScript in Python that I'm aware of. So even
> if we can compile basic XSL in Python, we can't cope with embedded JavaScript
> without hand-translation).

Actually the scripting language in XSL is *based* on ECMAScript but
with a few modifications to the scoping rules and an additional bit of
syntax to make measurements slightly neater. A perfect ECMAScript
implementation would be not quite right. (Grr!)

Of course, we should complain loudly until XSL supports Python. B-)

--
Daniel Biddle djad022@uce.ac.uk author of pyfunge

From larsga@ifi.uio.no Tue Mar 17 22:58:17 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 17 Mar 1998 23:58:17 +0100
Subject: [XML-SIG] saxlib.py, package structure, & HOWTO outline
Message-ID:

* Andrew Kuchling
|
| I've taken a brief look at saxlib.py, and it looks very neat and
| understandable; I quite agree with Paul Prescod's favorable
| impression of it. What's missing from it?

My understanding of the entity handling is still not complete, so
there may be something missing in that area. I will clear that up as
soon as possible.

| One minor nit: saxdemo.py has a problem with the following lines.
|
|     import xmlproc
|     p=xmlproc.Parser()
|
| There doesn't seem to be a Parser class or function in xmlproc.py,
| so an AttributeError is raised. Have I messed something up?
No, it's me who has messed up. I renamed the parser in xmlproc, but
didn't rerelease saxlib. saxlib has been updated now with a fix for
this as well as HTML documentation generated by gendoc. (I've also
added documentation to xmlproc.)

(I'm not replying to the stuff about package organization until I've
had time to look more closely at XMLTok and PyDOM.)

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there." -- Edwin Brock
http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From larsga@ifi.uio.no Tue Mar 17 23:09:13 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 18 Mar 1998 00:09:13 +0100
Subject: [XML-SIG] saxlib and enumerations
Message-ID:

* Paul Prescod
|
| [Java enumerators vs Python lists]
|
| #2. We could ask Jim to wrap *all* Java enumerations in Python
| sequences.

I think we should try this approach. This problem will reappear in all
Java/Python integration attempts because of the fundamental
differences between these two languages. Java has enumerators because
of things that are missing in the language design (parametric types)
and Python does not have them because they aren't necessary. Ie: they
could be implemented in Python, but that would take away part of the
point of using Python in the first place.

So if Jim thinks this is possible I think that's what we should do. It
would solve the problem for DOM and other packages as well, it will
require no wrapper code and it won't force us into having to use
clumsy constructs just to remain compatible with JPython.

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there."
-- Edwin Brock
http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From papresco@technologist.com Wed Mar 18 13:28:35 1998
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 18 Mar 1998 08:28:35 -0500
Subject: [XML-SIG] [Fwd: Alpha release of XED: A smart XML instance editor]
Message-ID: <350FCC03.F9094199@technologist.com>

Check this out...

Henry S. Thompson wrote:
>
> I'm pleased to announce the availability of the alpha release of XED,
> a WYSIWYG XML instance editor. XED uses the LT XML toolset integrated
> with a Python-Tk user interface, to provide a free, cross-platform,
> well-formedness preserving editor for XML document instances.
>
> I very much welcome test users at this point, bearing in mind the
> alpha-status of this release.
>
> You can download XED for Windows95/NT from
>
> ftp://ftp.cogsci.ed.ac.uk/pub/ht/xed.zip
>
> A binary for Solaris will be available later today.
>
> Feedback VERY much wanted.
>
> ht
>
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

--
Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night."
-- Robertson Davies, "The Cunning Man"

From papresco@technologist.com Wed Mar 18 13:35:54 1998
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 18 Mar 1998 08:35:54 -0500
Subject: [XML-SIG] Re: Alpha release of XED: A smart XML instance editor
References: <788.199803181245@naomi.cogsci.ed.ac.uk>
Message-ID: <350FCDBA.A96C7566@technologist.com>

Henry, you may have heard that there is a lot of activity going on with
regards to Python and XML. One of the things we will be doing is
embedding a C toolkit. Would LT XML be appropriate for that? That
question has a couple of parts:

* can the software be released under a license as free as Python's?
* do you know whether it is as fast as James Clark's tokenizer?
* is it written in ANSI C?

We are primarily interested in a parser at this point, so if the answer
for the parser alone is different from that for the whole toolkit, that
would be worth knowing.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night." -- Robertson Davies, "The Cunning Man"

From ht@cogsci.ed.ac.uk Wed Mar 18 13:50:57 1998
From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
Date: Wed, 18 Mar 1998 13:50:57 GMT
Subject: [XML-SIG] Re: Alpha release of XED: A smart XML instance editor
In-Reply-To: <350FCDBA.A96C7566@technologist.com> (message from Paul Prescod on Wed, 18 Mar 1998 08:35:54 -0500)
Message-ID: <1392.199803181350@naomi.cogsci.ed.ac.uk>

At the moment XED is based on the toolkit, which does NOT have as free
a license as Python's. In due course what I plan to do is release XED
with the sources, but with a no-commercial-exploitation license. This
would give Python programs access to our PyXML interface.
It may well be that at some point we shift from basing it on the
toolkit to basing it on the underlying tokeniser, RXP, which probably
WILL be freely available without restriction, but that isn't clear
yet, and certainly there's no timetable for it.

RXP is currently not as fast as James's tokenizer.

ht

From papresco@technologist.com Wed Mar 18 14:14:01 1998
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 18 Mar 1998 09:14:01 -0500
Subject: [XML-SIG] Re: saxlib and enumerations
References:
Message-ID: <350FD6A9.42504F49@technologist.com>

Lars Marius Garshol wrote:
>
> I think we should try this approach. This problem will reappear in all
> Java/Python integration attempts because of the fundamental
> differences between these two languages.

I agree!

> Java has enumerators because
> of things that are missing in the language design (parametric types)
> and Python does not have them because they aren't necessary.

Java has enumerators because they are useful, just as C++ has
enumerators (in STL) and Python has "defacto" enumerators (objects that
implement the sequence protocol). The question is just whether the Java
heterogenous enumerator construct is similar enough to the Python
heterogenous sequence protocol to map one into the other the way Jim has
mapped strings.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the
taxation is mortal. You come when you can, and leave when you must. The
show is continuous. Good-night." -- Robertson Davies, "The Cunning Man"

From ht@cogsci.ed.ac.uk Wed Mar 18 15:41:33 1998
From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
Date: Wed, 18 Mar 1998 15:41:33 GMT
Subject: [XML-SIG] So where is UNICODE support?
Message-ID: <1651.199803181541@naomi.cogsci.ed.ac.uk>

The THIRD message I got from the public about XED was as follows:

> I tried XED alpha version on Japanese-Windows95.
> But I colud not input Japanese-Code(Double Byte Character).
> Do you have schedule support Japanese-Code and XML-DTD ?

So what's the story, folks? The best way to really spike the perl/xml
people's guns is to get 16-bit support out ASAP.

ht

From hugunin@cnri.reston.va.us Wed Mar 18 15:44:30 1998
From: hugunin@cnri.reston.va.us (Jim Hugunin)
Date: Wed, 18 Mar 1998 10:44:30 -0500
Subject: [XML-SIG] Re: [JPYTHON] saxlib and enumerations
References: <350FDDBA.6C3CF7BF@cnri.reston.va.us>
Message-ID: <350FEBDE.F14D4FB6@cnri.reston.va.us>

> Here's what my Java-translated code to handle attributes looks like:
>
>     attNames = atts.getAttributeNames()
>     while (attNames.hasMoreElements()):
>         aname = attNames.nextElement()
>         ...
> attNames is a Java "enumeration object". Of course the Python equivalent
> is a sequence. So the code would look like this if I was using a Python
> parser:
>
>     attNames = atts.getAttributeNames()
>     for aname in attNames:
>         ...
>
> I can think of three ways to address this:
>
> #1. We could abandon the idea of using Java parsers directly in
> JPython, and always require a thin "wrapper" that translates Java
> enumerations to Python sequences.
>
> #2. We could ask Jim to wrap *all* Java enumerations in Python
> sequences.
>
> #3. We could port the Java "enumeration" concept to Python for use with
> saxlib.

I feel very strongly that either 1 or 2 are the right options. This is
clearly an area where Python's syntax is far superior to Java's and
there's no reason to cripple it.

Which one depends on a few factors. I am willing to wrap all Java
objects that implement the java.util.Enumeration interface so that they
behave reasonably under JPython. I've been willing to do this for quite
some time, but nobody has convinced me it was important enough to invest
the time until now.

There is one important issue here though: An Enumeration is not the same
as a Python list.
The only thing you'd be able to do with an Enumeration object is to
iterate over its items one time in the forward direction.

So, this would work:

    attNames = atts.getAttributeNames()
    for aname in attNames:
        #do something

But these wouldn't:

    len(attNames)
    attNames[-1]
    ...

This is an important characteristic of enumerations, and it's not
something I plan to get rid of. I don't know enough about SAX to be able
to say whether or not this is the appropriate abstraction here. If what
is really wanted is a Python list, then you will need to go with
option #1.

I hope this makes sense, and I'm eager to learn whether or not
Enumerations are the appropriate abstraction here (and thus something I
should implement before I get the next JPython release out the door).

-Jim

PS - In JDK 1.2 Java defines a number of much better generic container
interfaces. I have every intention of making all Java objects that
support these container interfaces as easy to use as Python's builtin
containers.

From akuchlin@cnri.reston.va.us Wed Mar 18 16:02:38 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Wed, 18 Mar 1998 11:02:38 -0500 (EST)
Subject: [XML-SIG] So where is UNICODE support?
In-Reply-To: <1651.199803181541@naomi.cogsci.ed.ac.uk>
References: <1651.199803181541@naomi.cogsci.ed.ac.uk>
Message-ID: <199803181602.LAA22125@newcnri.cnri.reston.va.us>

Henry S. Thompson writes:
>The THIRD message I got from the public about XED was as follows:
>
>> I tried XED alpha version on Japanese-Windows95.
>> But I colud not input Japanese-Code(Double Byte Character).
>> Do you have schedule support Japanese-Code and XML-DTD ?
>
>So what's the story, folks? The best way to really spike the perl/xml
>people's guns is to get 16-bit support out ASAP.

There's been some discussion on the String-SIG
(string-sig@python.org) of how to implement fairly transparent Unicode
support, and a fairly detailed proposal, but no one has done a testbed
implementation yet.
An implementation is needed to allow experimenting with the proposal to see what breaks, what needs to be fixed, etc. Follow-ups to the String-SIG, please. A.M. Kuchling http://starship.skyport.net/crew/amk/ It begins to rain fish. Mackerel. Herring. Sea bass. Pike. Sturgeon. Tench. Plaice. Salmon. From a clear sky. Trout. No cod. -- Father McGarry's last sight, in DOOM PATROL #20. From papresco@technologist.com Wed Mar 18 17:16:21 1998 From: papresco@technologist.com (Paul Prescod) Date: Wed, 18 Mar 1998 12:16:21 -0500 Subject: [XML-SIG] Re: [JPYTHON] saxlib and enumerations References: <350FDDBA.6C3CF7BF@cnri.reston.va.us> <350FEBDE.F14D4FB6@cnri.reston.va.us> Message-ID: <35100165.B96ACE57@technologist.com> Jim Hugunin wrote: > > This is an important characteristic of enumerations, and it's not something I > plan to get rid of. I don't know enough about SAX to be able to say whether > or not this is the appropriate abstraction here. attribute lists are essentially a dictionary. Getting the "attnames" enumeration is like getting the keys() of the dictionary. Order shouldn't matter. Indexed access is more or less meaningless. > If what is really wanted is a Python list, then you will need to go with > option #1. No. We just want an enumeration. If Python had an enumeration protocol (hopefully it will soon!) we would use that instead. > PS - In JDK 1.2 Java defines a number of much better generic container > interfaces. I have every intention of making all Java objects that support > these container interfaces as easy to use as Python's builtin containers. Great! Paul Prescod - http://itrc.uwaterloo.ca/~papresco "You have the wrong number." "Eh? Isn't that the Odeon?" "No, this is the Great Theater of Life. Admission is free, but the taxation is mortal. You come when you can, and leave when you must. The show is continuous. Good-night." 
-- Robertson Davies, "The Cunning Man"

From larsga@ifi.uio.no Fri Mar 20 21:43:11 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 20 Mar 1998 22:43:11 +0100
Subject: [XML-SIG] SAX revision
Message-ID:

It looks as though the SAX specification will be changed soon. David
Megginson (the SAX "maintainer") posted a proposal this morning, which can
be viewed at and the ensuing discussion is available from

One item of particular interest is that it looks as though AttributeMap
will be substantially changed, so that the JPython integration problem may
perhaps be resolved that way.

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From papresco@technologist.com Fri Mar 20 23:14:04 1998
From: papresco@technologist.com (Paul Prescod)
Date: Fri, 20 Mar 1998 18:14:04 -0500
Subject: [XML-SIG] SAX revision and JPython progress
References:
Message-ID: <3512F83B.6A7F88FA@technologist.com>

I've just posted my comments on this new SAX revision. My comments
basically come down to: "Don't take out the enumerations! In the next
version of JPython they will be very naturally handled!" and "Don't
introduce a dependency on java.io.InputStream. Non-Java languages don't
have it and it isn't internationalized anyhow!"

This is as good a time as any for a progress report from me. Integrating
JPython and SAX is as easy as integrating JPython with any other Java code
(i.e. easy once you find your way around python paths, class paths and
packages). I've written a small example applet (which becomes big once all
of the required files are sucked in) but I won't release it while SAX and
JPython are still up in the air. Once JPython supports enumeration mapping
automatically and the SAX API allows me to read input from a textarea
(i.e. not a URL!)
I'll make it into a very (very!) simple editable tree view. JPython is shaping up to be a really wonderful XML processing system. It isn't perfect yet, though. There are three problems from my point of view: * start up time * difficulty of finding packages * error reporting I know Jim is doing everything he can to solve these. Error reporting is probably not too hard, but I think that the nature of the JVM makes the other two tricky. Paul Prescod - http://itrc.uwaterloo.ca/~papresco The United Nations Declaration of Human Rights will be 50 years old on December 10, 1998. These are your fundamental rights: http://www.udhr.org/history/default.htm From larsga@ifi.uio.no Sat Mar 21 17:03:34 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 21 Mar 1998 18:03:34 +0100 Subject: [XML-SIG] xmlproc & saxlib updates Message-ID: xmlproc version 0.20 has now been released. It's now very nearly a full well-formedness parser[1]. I've also added a command-line interface and some more documentation. I'm going to start on adding validation right away. I've updated saxlib (and xmlproc) to handle URLs as system identifiers. They're still at: [1] Some far-out things like whitespace normalization in formal public identifiers and attribute values and references to external entities in entity declarations still remain. They will eventually be added, but are not given priority right now. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ From akuchlin@cnri.reston.va.us Mon Mar 23 14:15:26 1998 From: akuchlin@cnri.reston.va.us (Andrew Kuchling) Date: Mon, 23 Mar 1998 09:15:26 -0500 (EST) Subject: [XML-SIG] XML package organization Message-ID: <199803231415.JAA15521@newcnri.cnri.reston.va.us> This issue was raised a little while ago, but got no responses. 
If the tools are all going to be part of a Python package named 'XML', how
should the package be organized? Into XML.DOM.something,
XML.SAX.something? I'd really like to see some discussion of this, so that
the authors can start supporting this organization.

A.M. Kuchling http://starship.skyport.net/crew/amk/
Well, to be fair I did have a couple of gadgets he probably didn't, like a
teaspoon and an open mind.
-- The Doctor, in David Fisher's _The Creature From the Pit_

From jeff@Digicool.com Mon Mar 23 17:29:17 1998
From: jeff@Digicool.com (Jeffrey P Shell)
Date: Mon, 23 Mar 1998 12:29:17 -0500
Subject: [XML-SIG] what is SAX?
Message-ID: <199803231833.NAA28076@albert.digicool.com>

I'm apparently even more out of the loop than I thought, but: what is SAX?
I thought I knew most of the modern XML-related TLA's (Three Letter
Acronyms) such as XSL and XLL. Is SAX part of the XML spec that I've
missed? Or is it just something really groovy that I'm just not aware of
yet? :-)

--
"Green Tony squeeled and I'm off to Galaxy X"
.jPS jeff@Digicool.com Digital Creations http://www.digicool.com/

From larsga@ifi.uio.no Mon Mar 23 17:55:56 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Mar 1998 18:55:56 +0100
Subject: [XML-SIG] what is SAX?
In-Reply-To: <199803231833.NAA28076@albert.digicool.com>
References: <199803231833.NAA28076@albert.digicool.com>
Message-ID:

* Jeffrey P. Shell
|
| I'm apparently even more out of the loop than I thought, but: what is
| SAX?

It's a common API for XML parsers, hence the name: Simple API for XML.
It's event-driven and rather straightforward to use. There are now Java
and Python implementations with drivers for most of the available parsers
in those two languages.

| Is SAX part of the XML spec that I've missed?

No, it was developed by the members of the xml-dev list to solve the
parser API problem before DOM[1] arrived. (And SAX is also useful for many
simpler applications where DOM would be overkill.)
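
To make "event-driven" concrete, here is a minimal sketch of the callback
style described above. It is an illustration only: it assumes the xml.sax
module shipped with present-day Python rather than the 1998 saxlib, whose
interface may differ, and the ElementCounter class and count_elements
helper are hypothetical names made up for this example.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts elements and tracks nesting depth."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.depth = 0
        self.max_depth = 0

    def startElement(self, name, attrs):
        # Called by the parser each time an element begins.
        self.counts[name] = self.counts.get(name, 0) + 1
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        # Called by the parser each time an element ends.
        self.depth -= 1


def count_elements(document):
    """Parse a bytes document and return (counts, max_depth)."""
    handler = ElementCounter()
    xml.sax.parseString(document, handler)
    return handler.counts, handler.max_depth


counts, depth = count_elements(b"<doc><item/><item><note/></item></doc>")
print(counts, depth)
```

No tree is ever built here; the handler keeps only the two small pieces of
state it needs, which is the sense in which SAX suits simpler applications.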
| Or is it just something really groovy that I'm just not aware of | yet? :-) It's really groovy, yes. :) Especially since it means that you may write your Python XML application without caring whether the user uses the C XML parser module or a pure Python one. [1] Document Object Model, the W3Cs common tree-based parser API. (Well, it's really a bit more than that since it allows you to change the document as well.) -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ From larsga@ifi.uio.no Mon Mar 23 18:00:32 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 23 Mar 1998 19:00:32 +0100 Subject: [XML-SIG] XML package organization In-Reply-To: <199803231415.JAA15521@newcnri.cnri.reston.va.us> References: <199803231415.JAA15521@newcnri.cnri.reston.va.us> Message-ID: * Andrew Kuchling | | I'd really like to see some discussion of this, so that the authors | can start supporting this organization. I must admit that I haven't the faintest idea what I mean about this as long as the issue of distribution isn't settled. What I mean is: should all the XML tools be gathered into a single package and distributed together? Who should maintain it? What do we do when only one of the pieces is updated? Who decides what's a part of the package and what isn't? Should it be a part of the standard Python distribution? And so on. Did I miss something obvious here, since nobody else seems to wonder about this? -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." 
-- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From dkuhlman@enterpriselink.com Mon Mar 23 17:54:25 1998
From: dkuhlman@enterpriselink.com (Dave Kuhlman)
Date: Mon, 23 Mar 1998 09:54:25 -0800
Subject: [XML-SIG] what is SAX?
References: <199803231833.NAA28076@albert.digicool.com>
Message-ID: <3516A1D1.A4467CA6@EnterpriseLink.com>

Take a look at:

http://www.microstar.com/XML/SAX/

Dave

Jeffrey P Shell wrote:
>
> I'm apparently even more out of the loop than I thought, but: what is
> SAX? I thought I knew most of the modern XML-related TLA's (Three Letter
> Acronyms) such as XSL and XLL. Is SAX part of the XML spec that I've
> missed? Or is it just something really groovy that I'm just not aware of
> yet? :-)
>
> --
> "Green Tony squeeled and I'm off to Galaxy X"
> .jPS jeff@Digicool.com Digital Creations
> http://www.digicool.com/
>
> ------------------------------------------------------
> XML-SIG maillist - XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig

--
Dave Kuhlman
EnterpriseLink Technology Corp
2542 S. Bascom Ave., Suite #203
Campbell, CA 95008
dkuhlman@EnterpriseLink.com
408-558-2011

From larsga@ifi.uio.no Mon Mar 23 18:10:54 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Mar 1998 19:10:54 +0100
Subject: [XML-SIG] XML-SIG wish-list?
Message-ID:

Just an idea: should we add to the SIG status page a list of projects we'd
like to see done, but which no-one has volunteered for yet? Some things I
can think of are:

 - an XSL implementation with Python scripting instead of JavaScript
 - an XLL implementation that navigates Stephane's DOM tree
 - an overview of the different XML-based standards and which of those
   it would be interesting to have Python support for (RDF, WIDL, CDF,
   MathML etc) This overview could obviously spawn some new
   entries... :)
 - UNICODE support

--
"These are, as I began, cumbersome ways / to kill a man.
Simpler, direct, and much more neat / is to see that he is living
somewhere in the middle / of the twentieth century, and leave him there."
-- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From akuchlin@cnri.reston.va.us Mon Mar 23 18:48:24 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 23 Mar 1998 13:48:24 -0500 (EST)
Subject: [XML-SIG] XML package organization
In-Reply-To:
References: <199803231415.JAA15521@newcnri.cnri.reston.va.us>
Message-ID: <199803231848.NAA24718@newcnri.cnri.reston.va.us>

Lars Marius Garshol writes:
>What I mean is: should all the XML tools be gathered into a single
>package and distributed together? Who should maintain it? What do we
>do when only one of the pieces is updated? Who decides what's a part
>of the package and what isn't? Should it be a part of the standard
>Python distribution? And so on.

IMHO, the authors of various components such as the XMLTok interface, or
SAX, or DOM, will write the code and make it available. Someone else will
maintain a complete distribution containing all the components; as SIG
owner I'm willing to do that, though I'd also be willing to have someone
else volunteer to do that. When a component gets updated, the author will
send a note to this mailing list, or to the maintainer of the complete
distribution, who'll update the master package.

I don't know if we'll lobby to make it a part of the standard
distribution; that depends on the size of the code, and on whether Guido
is interested. Perhaps a usable Python-only subset can be produced for the
standard distribution, with optimizations such as the C XMLTok interface
available in the master package for those who need them.

>Did I miss something obvious here, since nobody else seems to wonder
>about this?

No; no one's brought it up before.

A.M.
Kuchling http://starship.skyport.net/crew/amk/
The shortest unit of time in the multiverse is the New York Second,
defined as the period of time between the traffic lights turning green and
the cab behind you honking.
-- Terry Pratchett, _Lords and Ladies_

From akuchlin@cnri.reston.va.us Mon Mar 23 19:11:03 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 23 Mar 1998 14:11:03 -0500 (EST)
Subject: [XML-SIG] XML-SIG wish-list?
In-Reply-To:
References:
Message-ID: <199803231911.OAA25559@newcnri.cnri.reston.va.us>

Lars Marius Garshol writes:
>Just an idea: should we add to the SIG status page a list of projects
>we'd like to see done, but which no-one has volunteered for yet? Some

Good idea. I've added the first three ideas. Is anyone in the XML
community maintaining a list of DTDs?

> - UNICODE support

This isn't really XML-specific. Jim Hugunin posted a proposal to the
String-SIG, suggesting a way to add Unicode support to CPython. Guido
thought it was interesting, but doesn't have time to try implementing it.
If you want to see Unicode support happen, it would really help to look at
Jim's proposal (see the String-SIG archives) and do a trial
implementation. That would give us something to experiment with, and see
what breaks, what deficiencies turn up, etc.

A.M. Kuchling http://starship.skyport.net/crew/amk/
Sorry about the writing. Robot fingers, you know?
-- Cliff Steele in DOOM PATROL #23

From larsga@ifi.uio.no Mon Mar 23 20:10:05 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Mar 1998 21:10:05 +0100
Subject: [XML-SIG] XML-SIG wish-list?
In-Reply-To: <199803231911.OAA25559@newcnri.cnri.reston.va.us>
References: <199803231911.OAA25559@newcnri.cnri.reston.va.us>
Message-ID:

* Andrew Kuchling
|
| Is anyone in the XML community maintaining a list of DTDs?
Robin Cover has a fairly extensive list, but the DTDs are mixed with SGML
DTDs:

James Tauber has a pure XML list which is probably easier to use:

| [Unicode]
|
| This isn't really XML-specific.

I agree, but perhaps it should be listed anyway since it's so important.
(Possibly with a pointer to the String-SIG.)

| If you want to see Unicode support happen, it would really help to
| look at Jim's proposal (see the String-SIG archives) and do a trial
| implementation.

Alas, I don't have the required skills to do that. But perhaps you should
add what you wrote in this email on the page and link to that proposal so
people know the current status?

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From Jack.Jansen@cwi.nl Mon Mar 23 21:39:52 1998
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 23 Mar 1998 22:39:52 +0100
Subject: [XML-SIG] XML package organization
In-Reply-To: Message by Andrew Kuchling , Mon, 23 Mar 1998 13:48:24 -0500 (EST) , <199803231848.NAA24718@newcnri.cnri.reston.va.us>
Message-ID:

Even if the people maintaining the various distributions would do so
separately it would be nice to have a general overall structure. After
all, the various SAX implementations should all be API-equivalent, so it
would be nice if you could do something like

    try:
        from xml.sax.xmllib import Parser
    except ImportError:
        from xml.sax import Parser

which would give you the xmllib-based sax parser if available, and
otherwise any available sax parser.
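
The try/except fallback above generalizes to an ordered list of
candidates. The sketch below is only an illustration of that idea under
stated assumptions: it uses importlib from present-day Python (not
available in 1998), first_available is a hypothetical helper, and the
first module name is deliberately made up so that the fallback path is
exercised.

```python
import importlib


def first_available(candidates):
    """Return the first module in `candidates` that imports cleanly.

    Hypothetical helper, not part of any SAX library.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # this candidate is not installed; try the next one
    raise ImportError("no candidate module could be imported: %r"
                      % (list(candidates),))


# The first name is invented for the example and should not exist;
# xml.sax is the standard-library package in present-day Python.
module = first_available(["xml.sax.hypothetical_xmllib_driver", "xml.sax"])
print(module.__name__)
```

Keeping the preference order in one list means an application can prefer a
fast parser when it is installed without hard-coding a chain of nested
try/except blocks.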
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

From boyko@mail.bc.rogers.wave.ca Mon Mar 23 22:07:06 1998
From: boyko@mail.bc.rogers.wave.ca (Adrian Boyko)
Date: Mon, 23 Mar 1998 14:07:06 -0800
Subject: [XML-SIG] Python + XSL
In-Reply-To:
Message-ID: <009201bd56a8$04349010$0a357118@boyko>

 - an XSL implementation with Python scripting instead of JavaScript

I don't think a new implementation of XSL is the way to go. I'd rather
see us making sure that:

(1) The language definition for XSL allows the stylesheet author to
specify a scripting language. I don't have the spec in front of me, but I
don't think it does.

(2) Existing XSL efforts (e.g. MSXSL) hook into their platform's
language-neutral scripting system (like ASP on Windows hooks into the
Windows scripting system).

(3) Python is available via the scripting system of various platforms
(this is already the case under Windows).

On the other hand, a non-standard implementation of XSL would serve all
those who work in an environment lacking a generic scripting system.

Adrian

From papresco@technologist.com Mon Mar 23 22:52:26 1998
From: papresco@technologist.com (Paul Prescod)
Date: Mon, 23 Mar 1998 17:52:26 -0500
Subject: [XML-SIG] Python + XSL
References: <009201bd56a8$04349010$0a357118@boyko>
Message-ID: <3516E7AA.D0A83C04@technologist.com>

Adrian Boyko wrote:
>
> I don't think a new implementation of XSL is the way to go. I'd rather
> see us making sure that:
> (1) The language definition for XSL allows the stylesheet author to
> specify a scripting language. I don't have the spec in front of me, but
> I don't think it does.

It doesn't and I wouldn't be surprised if it never does. Web page
designers don't want to require every surfer to install half a dozen XSL
processors for every different language.
Nevertheless, XSL is straightforwardly translatable into Python and such a tool would be useful to Python programmers. It wouldn't be "XSL", but we could probably call it "PyXSL" without getting sued. It might also be an interesting project to implement a "real" XSL parser in Python, since Python's features are a superset of ECMAScript's. > (2) Existing XSL efforts (e.g. MSXSL) hook into their platform's > language-neutral scripting system (like ASP on Windows hooks into the > Windows scripting system). That's possible. A command line switch could change the parser from "XSL" to "PyXSL". > (3) Python is available via the scripting system of various platforms (this > is already the case under Windows). Unfortunately, Unix doesn't have a "scripting system" in that sense. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "I want to give beauty pageants the respectability they deserve." - Brooke Ross, Miss Canada International From larsga@ifi.uio.no Mon Mar 23 23:00:03 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 24 Mar 1998 00:00:03 +0100 Subject: [XML-SIG] PyDOM Message-ID: I've finally gotten round to reading the core DOM spec as well as DOM/core.py. Overall I think it looks quite good, and I really like builder.py, which makes it quite easy to add new builder classes should they be needed. But perhaps the classes now implemented as lists/hashes should be implemented as classes, but obey the standard list/hash protocols in Python so as to not cause inconveniences for users? This would help JPython integration and make it consistent with the way SAX has been translated. Even so it certainly has my blessing, FWIW. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." 
-- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From jtauber@jtauber.com Tue Mar 24 08:08:43 1998
From: jtauber@jtauber.com (James K. Tauber)
Date: Tue, 24 Mar 1998 16:08:43 +0800
Subject: [XML-SIG] XML-SIG wish-list?
Message-ID: <01BD573F.220745A0.jtauber@jtauber.com>

On Tuesday, 24 March 1998 4:10, Lars Marius Garshol
[SMTP:larsga@ifi.uio.no] wrote:
> * Andrew Kuchling
> |
> | Is anyone in the XML community maintaining a list of DTDs?
>
> Robin Cover has a fairly extensive list, but the DTDs are mixed with
> SGML DTDs:
>
>
>
> James Tauber has a pure XML list which is probably easier to use:
>
>

I am actually wanting to expand this and include copies of the DTDs on my
site along with metadata about the DTDs. Please let me know in what way I
could do this to make it helpful to you guys.

BTW, I've just added a link about this mailing list to my pages :-)

James
--
James K. Tauber / jtauber@jtauber.com
Now working at AlphaWest and Curtin University
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/
XML Tutorial: http://www7.conf.au/tutorialsday.html

From digitome@iol.ie Tue Mar 24 11:05:53 1998
From: digitome@iol.ie (Sean McGrath)
Date: Tue, 24 Mar 1998 11:05:53 GMT
Subject: [XML-SIG] XML-SIG wish-list?
Message-ID: <199803241105.LAA11852@mail.iol.ie>

At 19:10 23/03/98 +0100, you wrote:
>Just an idea: should we add to the SIG status page a list of projects
>we'd like to see done, but which no-one has volunteered for yet? Some
>things I can think of are:
>
> - an XSL implementation with Python scripting instead of JavaScript
> - an XLL implementation that navigates Stephane's DOM tree
> - an overview of the different XML-based standards and which of those
>   it would be interesting to have Python support for (RDF, WIDL, CDF,
>   MathML etc) This overview could obviously spawn some new
>   entries... :)
> - UNICODE support
>
>

I would like to add one to this list :- An XML/XSL/XLL *browser* in
Python.
Linked to XED for editing of course. Let's throw in WEBDAV for read/write
work while we are at it:-)

No harm to aim high right?

From fleck@informatik.uni-bonn.de Tue Mar 24 16:19:02 1998
From: fleck@informatik.uni-bonn.de (Markus Fleck)
Date: Tue, 24 Mar 1998 17:19:02 +0100
Subject: [XML-SIG] XML-SIG wish-list?
References: <199803241105.LAA11852@mail.iol.ie>
Message-ID: <3517DCF6.4BC1@informatik.uni-bonn.de>

Sean McGrath wrote:
> I would like to add one to this list :- An XML/XSL/XLL *browser*
> in Python. Linked to XED for editing of course. Let's throw in
> WEBDAV for read/write work while we are at it:-)

Wow. Please do. I have started a Python-based free groupware project, and
I am planning to include WebDAV server functionality using mod_pyapache
and/or FastCGI. I could use all the help that I can get. :-)

> No harm to aim high right?

Not for the *wish list*, anyway. :-)

Yours,
Markus.

--
////////////////////////////////////////////////////////////////////////////
Markus B Fleck - University of Bonn - CS Department IV - fleck@isoc.de
UNIX Administrator - comp.lang.python.announce Moderator
Mediator Free Groupware Project - http://mediator.cs.uni-bonn.de/mediator/
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

From fleck@informatik.uni-bonn.de Mon Mar 30 16:01:28 1998
From: fleck@informatik.uni-bonn.de (Markus Fleck)
Date: Mon, 30 Mar 1998 18:01:28 +0200
Subject: [XML-SIG] New working drafts: XML Namespaces, XLink, XPointer
Message-ID: <351FC1D8.51AF@informatik.uni-bonn.de>

Hi!

FYI: XML Namespaces are now a "working draft", finally giving them some
official status.

There are also two other new working drafts:

"XML Linking Language (XLink)"

"XML Pointer Language (XPointer)"

While XLink is supposed to work like a generalized form of HTML's
hypertext links, XPointer is used to refer to internal structures of XML
documents on different levels of detail (entity, characters, ...).
I'm not sure if I really understand the utility of these two working drafts. It would be nice if someone with deeper insight could give a short summary of what they are for and how they relate to other standardization efforts such as DOM, HyTime etc. More info at W3C's XML site, . Yours, Markus. From larsga@ifi.uio.no Mon Mar 30 18:18:29 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 30 Mar 1998 20:18:29 +0200 Subject: [XML-SIG] New working drafts: XML Namespaces, XLink, XPointer In-Reply-To: <351FC1D8.51AF@informatik.uni-bonn.de> References: <351FC1D8.51AF@informatik.uni-bonn.de> Message-ID: * Markus Fleck | | [about XLink and XPointer] | | I'm not sure if I really understand the utility of these two working | drafts. What they are for is simple enough: since XML lets you make your own elements an XML browser has no way of knowing which elements describe links in your DTD. These two WDs give XML authors a standard way of describing links. | [...] how they relate to other standardization efforts such as DOM, They are pretty much orthogonal to DOM. | HyTime etc. HyTime describes both a linking language and a structuring language for time-based media (like sound, MIDI, video etc). These two WDs are more or less a simplified XML version of the linking parts of HyTime. I hope that helped. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ From boyko@mail.bc.rogers.wave.ca Mon Mar 30 19:38:23 1998 From: boyko@mail.bc.rogers.wave.ca (Adrian Boyko) Date: Mon, 30 Mar 1998 11:38:23 -0800 Subject: [XML-SIG] New working drafts: XML Namespaces, XLink, XPointer In-Reply-To: <351FC1D8.51AF@informatik.uni-bonn.de> Message-ID: <001101bd5c13$6718bb50$0a357118@boyko> >I'm not sure if I really understand the utility of >these two working drafts. 
It would be nice if someone
>with deeper insight could give a short summary of
>what they are for and how they relate to other
>standardization efforts such as DOM, HyTime etc.

The XPointer material used to be part of the XLink document. It was only
recently pulled out into a separate working draft.

Lars already mentioned that XLink is a simplified version of the Linking
part of HyTime. XPointer is based on the work of the Text Encoding
Initiative (TEI).

From larsga@ifi.uio.no Mon Mar 30 20:10:33 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 30 Mar 1998 22:10:33 +0200
Subject: [XML-SIG] saxlib
Message-ID:

I'm currently polishing xmlproc before release 0.30, which will have XML
validation. I plan to release a new saxlib version at the same time and
was wondering whether to add some new methods to it. (I'm not updating to
the proposed revision until there is agreement on the revision.)

These are the methods I'm thinking of:

 - module methods:
   - sax_version, returns the version number of the SAX version
     implemented
   - saxlib_version, returns the version number of saxlib
   - create_parser, imports a parser module (much like whichdb)
   - get_parser_list, returns a list of the parser modules known to be
     supported in the order they will be tried by create_parser
   - set_parser_list, lets the user decide the parser order

 - parser methods:
   - parser_module, returns the name of the parser used (to allow
     users to check which parser they are using)
   - parser_version, returns the parser version number

Although one should be veeery careful with extending standards these
extensions seem harmless enough to me. Might they cause any JPython
problems? Anyone who can think of a reason why this should not be
implemented or who has an opinion on this?

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there."
-- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From larsga@ifi.uio.no Mon Mar 30 20:19:16 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 30 Mar 1998 22:19:16 +0200
Subject: [XML-SIG] XML package organization
In-Reply-To: <199803231848.NAA24718@newcnri.cnri.reston.va.us>
References: <199803231415.JAA15521@newcnri.cnri.reston.va.us> <199803231848.NAA24718@newcnri.cnri.reston.va.us>
Message-ID:

I've now read the documentation of the package system in Python 1.5.
Something like this might be the way to go:

xml.
    dom.*           -> DOM stuff
    sax.            -> SAX stuff
        parsers.*   -> one package for each parser

The "parsers" package is proposed because I think it will make some of the
features in my proposed saxlib update easier to implement and generally
make the whole thing more tidy.

Oh, and the XML browser should of course have its own package. :)

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From akuchlin@cnri.reston.va.us Mon Mar 30 20:27:14 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 30 Mar 1998 15:27:14 -0500 (EST)
Subject: [XML-SIG] Experimental Unicode/Python
Message-ID: <199803302027.PAA06425@newcnri.cnri.reston.va.us>

[Follow-ups set to string-sig@python.org]

Earlier today I told the String-SIG about an experimental version of
Python that's been modified to support Unicode strings. I won't repeat
that announcement here; consult the String-SIG archives for the full
announcement: http://www.python.org/pipermail/1998q1.string-sig/

The code is in /pub/tmp/ on ftp.python.org.

The experimental interpreter has a new built-in function, unicode(), that
takes a regular string and returns a wide string object containing it.
Wide strings and regular 8-bit strings are mostly interchangeable in
Python code. C extensions that use PyArg_ParseTuple("s") will attempt to
collapse a wide string down to 8 bits, raising an exception if it can't be
done because of characters >255. So code like this will work:

    filename = unicode('/tmp/py-unicode')
    f = open(filename, 'w')

stropmodule and pcremodule haven't yet been modified to support wide
strings; string.py has been changed to not use strop, but no such simple
fix is possible for re. We'll worry about those for later releases of the
code.

So, please experiment with this hacked interpreter, and send bug reports
and API shortcomings to the String-SIG.

A.M. Kuchling http://starship.skyport.net/crew/amk/
I am afraid of the worst, but I am not sure what that is.
-- Abraham Rotstein

From akuchlin@cnri.reston.va.us Mon Mar 30 21:03:51 1998
From: akuchlin@cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 30 Mar 1998 16:03:51 -0500 (EST)
Subject: [XML-SIG] XML package organization
In-Reply-To:
References: <199803231415.JAA15521@newcnri.cnri.reston.va.us> <199803231848.NAA24718@newcnri.cnri.reston.va.us>
Message-ID: <199803302103.QAA07349@newcnri.cnri.reston.va.us>

Lars Marius Garshol writes:
>xml.
>    dom.*           -> DOM stuff
>    sax.            -> SAX stuff
>        parsers.*   -> one package for each parser
>The "parsers" package is proposed because I think it will make some of
>the features in my proposed saxlib update easier to implement and
>generally make the whole thing more tidy.

OK, but should it be in sax, as in your diagram, or as a separate
subpackage under xml? It seems to make sense to change the dom subpackage
to use such alternate parsers, and if other XML representations become
useful, they'd use these parsers as well.

>Oh, and the XML browser should of course have its own package. :)

xml.browser, of course.

Lars, unless someone speaks up and points out flaws in this proposal, you
can go ahead and implement it. We can't wait forever for feedback that
never comes.

A.M.
Kuchling http://starship.skyport.net/crew/amk/ [On applications for Microsoft Windows] Testing? That's scheduled for first thing after 3.0 ships. Quality is job Floating Point Error; Execution Terminated. -- Benjamin Ketcham, in _comp.os.unix.advocacy_. From akuchlin@cnri.reston.va.us Mon Mar 30 21:16:28 1998 From: akuchlin@cnri.reston.va.us (Andrew Kuchling) Date: Mon, 30 Mar 1998 16:16:28 -0500 (EST) Subject: [XML-SIG] saxlib In-Reply-To: References: Message-ID: <199803302116.QAA07665@newcnri.cnri.reston.va.us> [I sent this to the main list by mistake, an error which makes my .sig quote sort of amusing.] Lars Marius Garshol writes: >Although one should be veeery careful with extending standards these >extensions seem harmless enough to me. Might they cause any JPython >problems? Anyone who can think of a reason why this should not be >implemented or who has an opinion on this? The proposed extensions don't seem to be very difficult ones to implement, and fairly harmless to SAX compatibility. Paul Prescod will doubtless let us know if they present problems for JPython. I assume that you're not standardizing the low-level interfaces to modules in the parsers package; rather, the modules in xml.parsers will implement what interface they like, and xml.sax and xml.dom are stable adapters on top of these varying modules. Correct? A.M. Kuchling http://starship.skyport.net/crew/amk/ Science itself, therefore, may be regarded as a minimal problem, consisting of the completest possible presentment of facts with the least possible expenditure of thought. 
-- Ernst Mach From larsga@ifi.uio.no Mon Mar 30 21:18:54 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 30 Mar 1998 23:18:54 +0200 Subject: [XML-SIG] XML package organization In-Reply-To: <199803302103.QAA07349@newcnri.cnri.reston.va.us> References: <199803231415.JAA15521@newcnri.cnri.reston.va.us> <199803231848.NAA24718@newcnri.cnri.reston.va.us> <199803302103.QAA07349@newcnri.cnri.reston.va.us> Message-ID: * Andrew Kuchling | | OK, but should it be in sax, as in your diagram, or as a separate | subpackage under xml? It could be as a separate subpackage, in fact I think that's better as we don't get quite as many very.long.and.tedious.package.names that way. I had some vague notions about how that might cause problems for the parser auto-detection, but I now see that there won't be any more problems that way. | It seems to make sense to change the dom subpackage to use such | alternate parsers, and if other XML representations become useful, | they'd use these parsers as well. Unless DOM should build on SAX. I know Stephane has several different | Lars, unless someone speaks up and points out flaws in this | proposal, you can go ahead and implement it. OK. I hope to be able to release both xmlproc 0.30 and a new version of saxlib this week. Both will then support this package scheme. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." 
-- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ From larsga@ifi.uio.no Mon Mar 30 21:32:28 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 30 Mar 1998 23:32:28 +0200 Subject: [XML-SIG] saxlib In-Reply-To: <199803302116.QAA07665@newcnri.cnri.reston.va.us> References: <199803302116.QAA07665@newcnri.cnri.reston.va.us> Message-ID: * Andrew Kuchling | | The proposed extensions don't seem to be very difficult ones to | implement, I don't think they will be, either, but I think they may be quite useful. | I assume that you're not standardizing the low-level interfaces to | modules in the parsers package; rather, the modules in xml.parsers | will implement what interface they like, and xml.sax and xml.dom are | stable adapters on top of these varying modules. Correct? Absolutely correct. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ From papresco@technologist.com Mon Mar 30 21:43:01 1998 From: papresco@technologist.com (Paul Prescod) Date: Mon, 30 Mar 1998 16:43:01 -0500 (EST) Subject: [XML-SIG] saxlib In-Reply-To: Message-ID: Nothing you propose will break JPython, for the simple reason that JPython is Python. But none of them will work with Java parsers "out of the box." And some of them are not even possible to implement for Java parsers (I guess you could call that "breaking JPython"): On 30 Mar 1998, Lars Marius Garshol wrote: > - module methods: > - sax_version, returns the version number of the SAX version implemented > - saxlib_version, returns the version number of saxlib You should propose this to David for all SAX implementations. > - create_parser, imports a parser module (much like whichdb) This would be possible to implement in a wrapper layer. 
> - get_parser_list, returns a list of the parser modules known to > be supported in the order they will be tried by create_parser This is not implementable for Java parsers because Java packages are not first class. There would have to be a list of valid parsers "hanging around". > - parser methods: > - parser_module, returns the name of the parser used (to allow > users to check which parser they are using) This is not too bad because classes are first class in Java. You would just ask the class for its name (as you would in Python). > - parser_version, returns the parser version number Java parsers won't have version numbers. You should ask for this to be added to the SAX API. Paul Prescod From akuchlin@cnri.reston.va.us Mon Mar 30 21:54:07 1998 From: akuchlin@cnri.reston.va.us (Andrew Kuchling) Date: Mon, 30 Mar 1998 16:54:07 -0500 (EST) Subject: [XML-SIG] saxlib In-Reply-To: References: Message-ID: <199803302154.QAA08762@newcnri.cnri.reston.va.us> Paul Prescod writes: >> - get_parser_list, returns a list of the parser modules known to >> be supported in the order they will be tried by create_parser >This is not implementable for Java parsers because Java packages are not >first class. There would have to be a list of valid parsers "hanging around". I would assume that this returns a list of module names, not the actual module objects, since the latter requires importing every single XML parser available (unless you do something sneaky with dummy module objects). So this would simply return a list of strings, naming the parser classes that create_parser knows about, and that number is finite and known ahead of time. A.M. Kuchling http://starship.skyport.net/crew/amk/ I must confess, I have always wondered what lay beyond life, my dear. Yeah, everybody wonders. And sooner or later everybody gets to find out. 
-- Norton I and Death, in SANDMAN: "Three Septembers and a January" From papresco@technologist.com Mon Mar 30 22:07:27 1998 From: papresco@technologist.com (Paul Prescod) Date: Mon, 30 Mar 1998 17:07:27 -0500 (EST) Subject: [XML-SIG] saxlib In-Reply-To: <199803302154.QAA08762@newcnri.cnri.reston.va.us> Message-ID: On Mon, 30 Mar 1998, Andrew Kuchling wrote: > I would assume that this returns a list of module names, not > the actual module objects, since the latter requires importing every > single XML parser available (unless you do something sneaky with dummy > module objects). So this would simply return a list of strings, > naming the parser classes that create_parser knows about, and that > number is finite and known ahead of time. Right, but unless the list is made by a human being, there is no way to generate it in JPython. I would have presumed that the CPython version would "peek" in the xml/sax/parsers directory, but JPython can't do that (AFAIK). I'm reasonably confident that JPython can't "peek" at Java package directories and modules. Paul Prescod From larsga@ifi.uio.no Mon Mar 30 22:22:53 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 31 Mar 1998 00:22:53 +0200 Subject: [XML-SIG] saxlib In-Reply-To: References: Message-ID: * Andrew Kuchling | | [on get_parser_list] | | I would assume that this returns a list of module names, not the | actual module objects, Aha! Thanks! Rereading the method explanation I see that this is really two different methods: get_parser_list and get_present_parsers. The first should be implementable anywhere, while the second is only available in CPython (if Paul is right, and he usually is :). Consider this as two methods from now on. (They will be in the proposal I'm submitting to xml-dev in a moment.) Both should return a list of module names. 
* Paul Prescod | | I would have presumed that the CPython version would "peek" in the | xml/sax/parsers directory, Yes, this is how I planned to implement get_present_parsers. | but JPython can't do that (AFAIK). The JPython version of get_present_parsers should then return None and get_parser_list should just return the predetermined list regardless of whether CPython or JPython is used. -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/
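A minimal sketch of the two-method split discussed above, for illustration only. It is not the saxlib implementation: the module names in KNOWN_PARSERS are placeholders, the optional `candidates` argument is an assumption added so the functions can be tried against any list of names, and presence is checked with the modern `importlib.util.find_spec` rather than by peeking in a directory as proposed in the thread. A platform that cannot do such a lookup could, as suggested above, have get_present_parsers return None instead.

```python
import importlib
import importlib.util

# Hypothetical, hand-maintained list of parser module names, in the order
# create_parser would try them. These are placeholder names, not the real
# saxlib driver names.
KNOWN_PARSERS = ["hypothetical_fast_parser", "xml.parsers.expat"]


def get_parser_list():
    """Return the predetermined list of parser module names.

    This needs no inspection of the installation, so it can work anywhere.
    """
    return list(KNOWN_PARSERS)


def get_present_parsers(candidates=None):
    """Return the listed names that can be located on this installation.

    The parser modules themselves are not imported; only their parent
    packages are, as a side effect of find_spec on a dotted name.
    """
    names = get_parser_list() if candidates is None else candidates
    present = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A missing parent package or a module without a spec
            # counts as "not present".
            found = False
        if found:
            present.append(name)
    return present


def create_parser(candidates=None):
    """Import and return the first parser module that loads (whichdb-style)."""
    names = get_parser_list() if candidates is None else candidates
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("no supported parser module could be imported")
```

Returning plain strings from both listing functions keeps them cheap: nothing is imported until create_parser is actually asked for a parser.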