From junkc at fh-trier.de Mon Jan 3 16:40:52 2005
From: junkc at fh-trier.de (Christian Junk)
Date: Mon Jan 3 16:40:56 2005
Subject: [XML-SIG] XBEL xslt stylesheet
In-Reply-To: <41D49DEA.7020806@machina.no>
References: <41D49DEA.7020806@machina.no>
Message-ID: <200501031641.00210.junkc@fh-trier.de>

On Friday, 31 December 2004 01:31, Narve Saetre wrote:
> There seems to be an invalid URL (another web page gone dead) on the XBEL
> page:
> > Joris Graaumans (joris@cs.uu.nl) has developed a couple of XSLT
> > stylesheets for XBEL.

Yes, this URL is no longer valid, but Joris sent me his XSLT stylesheets a
month ago. If you like, I can upload them to 'secure' webspace. I can
provide webspace for other stylesheets, too. So if you're interested, we
could collect good stylesheets and offer them under a single URL.

> In my hunt for a better XBEL stylesheet I discovered a superior one
> (nice, recursive, graphic, icons, fast, cross-browser):
>
> http://cgi29.plala.or.jp/mozzarel/xbel/
>
> so I suggest the web master changes the URL to this one (and hope that
> one stays for a while).

Thanks for the link. It's a really good stylesheet!

> [..]

Regards,
Christian

From andy at itasoftware.com Mon Jan 3 19:29:43 2005
From: andy at itasoftware.com (Andy Meyer)
Date: Mon Jan 3 19:29:56 2005
Subject: [XML-SIG] xml.dom.minidom.toprettyxml whitespace question
Message-ID: <41D98F17.9080805@itasoftware.com>

Hello all,

I have a question about the xml.dom.minidom.toprettyxml method's
insertion of whitespace into text elements, e.g.
'<foo><bar>Hello!</bar></foo>' getting transformed by toprettyxml to:

<foo>
	<bar>
		Hello!
	</bar>
</foo>

with the addition of tabs and newlines around 'Hello!', instead of:

<foo>
	<bar>Hello!</bar>
</foo>

Since a SAX-style parser would read the second example as identical to
the raw XML, to me the second way is more correct than the first, but
I'm new to XML and handling whitespace seems to be an unresolved issue.
Is this behavior by design?
Andy Meyer

From rsalz at datapower.com Mon Jan 3 20:02:27 2005
From: rsalz at datapower.com (Rich Salz)
Date: Mon Jan 3 19:52:23 2005
Subject: [XML-SIG] xml.dom.minidom.toprettyxml whitespace question
In-Reply-To: <41D98F17.9080805@itasoftware.com>
References: <41D98F17.9080805@itasoftware.com>
Message-ID: <41D996C3.5040103@datapower.com>

> I'm new to XML and handling whitespace seems to be an unresolved issue.

You might want to look up the "xml:space" attribute, and "whitespace
normalization" (e.g., XPath 1.0).

> Is this behavior by design?

Yes. Note that the name is "toprettyxml", not just "dump" or "print" --
it's designed to make the output pretty, not to make the output be
exactly like the input.

	/r$

-- 
Rich Salz, Chief Security Architect
DataPower Technology          http://www.datapower.com
XS40 XML Security Gateway     http://www.datapower.com/products/xs40.html
XML Security Overview         http://www.datapower.com/xmldev/xmlsecurity.html

From walter at livinglogic.de Mon Jan 3 23:26:14 2005
From: walter at livinglogic.de (Walter Dörwald)
Date: Mon Jan 3 23:26:18 2005
Subject: [XML-SIG] XIST 2.8 has been released
Message-ID: <41D9C686.7090809@livinglogic.de>

XIST 2.8 has been released!

What is it?
===========

XIST is an extensible HTML/XML generator written in Python. XIST is also
a DOM parser (built on top of SAX2) with a very simple and Pythonesque
tree API. Every XML element type corresponds to a Python class, and these
Python classes provide a conversion method to transform the XML tree
(e.g. into HTML). XIST can be considered "object oriented XSL".

What's new in version 2.8?
==========================

* XIST requires Python 2.4 now.

* ll.xist.ns.specials.x has been renamed to ll.xist.ns.specials.ignore.
  ll.xist.utils.findAttr has been renamed to ll.xist.utils.findattr.

* ll.xist.xfind.item no longer handles slices.

* XFind has been enhanced to support item and slice operators, i.e.
  if foo is an XFind operator, foo[0] is an operator that will produce
  the first node from foo (if there is one). Negative values and slices
  are supported too.

* Operators can be chained via division: html.a/html.b is an operator
  that can be passed around and applied to a node.

* XIST requires the new core module and makes use of the new
  "cooperative displayhook" functionality defined there: If you install
  the displayhook you can tweak or replace
  ll.xist.presenters.hookpresenter to change the output.

For changes in older versions see:
http://www.livinglogic.de/Python/xist/History.html

As the package structure has changed, there is a new version of every
other LivingLogic package too. Furthermore the web pages have been
redesigned.

Where can I get it?
===================

XIST can be downloaded from http://ftp.livinglogic.de/xist/
or ftp://ftp.livinglogic.de/pub/livinglogic/xist/

Web pages are at http://www.livinglogic.de/Python/xist/

ViewCVS access is available at http://www.livinglogic.de/viewcvs/

For information about the mailing lists go to
http://www.livinglogic.de/Python/xist/Mailinglists.html

Bye,
   Walter Dörwald

From malcolm at commsecure.com.au Tue Jan 4 04:23:31 2005
From: malcolm at commsecure.com.au (Malcolm Tredinnick)
Date: Tue Jan 4 04:23:38 2005
Subject: [XML-SIG] xml.dom.minidom.toprettyxml whitespace question
In-Reply-To: <41D98F17.9080805@itasoftware.com>
References: <41D98F17.9080805@itasoftware.com>
Message-ID: <1104809011.11379.23.camel@ws14.commsecure.com.au>

On Mon, 2005-01-03 at 13:29 -0500, Andy Meyer wrote:
> Hello all,
>
> I have a question about the xml.dom.minidom.toprettyxml method's
> insertion of whitespace into text elements, e.g.
> '<foo><bar>Hello!</bar></foo>' getting transformed by toprettyxml to:
>
> <foo>
> 	<bar>
> 		Hello!
> 	</bar>
> </foo>
>
> with the addition of tabs and newlines around 'Hello!', instead of:
>
> <foo>
> 	<bar>Hello!</bar>
> </foo>
>
> Since a SAX-style parser would read the second example as identical to
> the raw XML, to me the second way is more correct than the first, but
> I'm new to XML and handling whitespace seems to be an unresolved issue.
> Is this behavior by design?

Rich has already answered your question, but I thought I would just point
out that, in fact, the second example would not generally produce the
same SAX events as the raw XML. For the raw XML, you would see (using a
bad summary of SAX events):

  - start "foo" element
  - start "bar" element
  - characters ("Hello!")
  - end "bar" element
  - end "foo" element

whereas the second layout will produce

  - start "foo" element
  - characters (newline + tabs or spaces)
  - start "bar" element
  - characters ("Hello!")
  - end "bar" element
  - characters (newline)
  - end "foo" element

In other words, the SAX parser will not normally discard things your eye
will gloss over; it cannot tell that they are not significant.

Cheers,
Malcolm

From noreply at sourceforge.net Thu Jan 6 03:27:35 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu Jan 6 03:27:48 2005
Subject: [XML-SIG] [ pyxml-Bugs-1096906 ] SAX2 wrecks marshal.generic
Message-ID:

Bugs item #1096906, was opened at 2005-01-05 18:27
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1096906&group_id=6473

Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: James S (laskovortex)
Assigned to: Nobody/Anonymous (nobody)
Summary: SAX2 wrecks marshal.generic

Initial Comment:
Using PyXML-0.8.4, Mandrake Linux 10.1 and Python 2.3.4.

Not sure what is happening, but if this is the XML file (called
"eraseme.prefs"):

color blue max 3 min 0

Then the following code does not work:

---
#! /usr/bin/env python
from xml.marshal import generic
from xml.dom.ext.reader import Sax2
Sax2.Reader()
print generic.load(open("eraseme.prefs"))
---

However, the following code works fine:

#! /usr/bin/env python
from xml.marshal import generic
from xml.dom.ext.reader import Sax
Sax.Reader()
print generic.load(open("eraseme.prefs"))
---

The error messages follow:

Traceback (most recent call last):
  File "./test.py", line 7, in ?
    print generic.load(open("eraseme.prefs"))
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/marshal/generic.py", line 312, in load
    return m._load(file)
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/marshal/generic.py", line 329, in _load
    p = saxexts.make_parser()
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/sax/saxexts.py", line 168, in make_parser
    return XMLParserFactory.make_parser(parser_list)
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/sax/saxexts.py", line 64, in make_parser
    return self._create_parser(parser_name)
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/sax/saxexts.py", line 43, in _create_parser
    return drv_module.create_parser()
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/sax/drivers/drv_pyexpat.py", line 228, in create_parser
    return SAX_expat()
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/sax/drivers/drv_pyexpat.py", line 31, in __init__
    self.reset()
  File "/data1/users/jstroud/Programs/lib/python2.3/site-packages/_xmlplus/sax/drivers/drv_pyexpat.py", line 117, in reset
    self.parser=expat.ParserCreate()
AttributeError: 'module' object has no attribute 'ParserCreate'

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1096906&group_id=6473

From narve at machina.no Thu Jan 6 21:46:40 2005
From: narve at machina.no (Narve Saetre)
Date: Thu Jan 6 21:42:41 2005
Subject: [XML-SIG] XBEL xslt stylesheet
In-Reply-To: <200501031641.00210.junkc@fh-trier.de>
References: <41D49DEA.7020806@machina.no> <200501031641.00210.junkc@fh-trier.de>
Message-ID: <41DDA3B0.4080103@machina.no>

>Yes, this URL is no longer valid, but Joris sent me his XSLT stylesheets a
>month ago. If you like, I can upload them to 'secure' webspace. I'm able to
>provide webspace for other stylesheets, too. So if you're interested, we can
>collect good styles and offer them under a unique url?

Good idea -- having nice, working stylesheets available is always good if
you have XML files and want to display them quickly. Since your web site
seems to be the main page for XBEL-related work, it would be the natural
place to host XBEL stylesheets. And of course, broken links are evil, so
it is better to host the stylesheets yourself :)

-- 
Narve Sætre
Partner / Machina Networks as / +47 41915331

From n.poppelier at xs4all.nl Fri Jan 7 19:32:43 2005
From: n.poppelier at xs4all.nl (Nico Poppelier)
Date: Fri Jan 7 19:32:45 2005
Subject: [XML-SIG] SAX2 support
Message-ID: <41DED5CB.3030002@xs4all.nl>

Dear SIG members,

I have some XML tools written in Perl that I would like to rewrite in
Python and make namespace-aware at the same time. What I now have is
based on xml.sax, but yesterday I noticed that some of the namespace
features are not supported (yet).

The section on xml.sax.handler in the library reference says: "In
addition to these classes, xml.sax.handler provides symbolic constants
for the feature and property names" and then gives a list of features
and properties. The source for xml/sax/xmlreader.py, however, does not
implement features and properties, and instead throws an exception when
you call e.g. getFeature or setFeature.

When will features and properties of SAX2 be properly supported in
xml.sax? Or should I forget about xml.sax and move to xml.dom, for
example? The latter doesn't appeal to me, since I prefer minimal XML APIs
and I find DOM too big and too cluttered.

Regards,
Nico Poppelier

P.S. Some years ago, I was editor of the W3C Math Working Group and did
a lot of XML-related work, but with my present job XML is an activity for
the occasional free weekend.
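
For readers following the SAX2 question above, here is a minimal sketch
of switching on namespace processing through setFeature. It assumes a
current Python 3 and the standard-library reader returned by
xml.sax.make_parser() (normally the expat-based one, which overrides the
exception-raising defaults in xml/sax/xmlreader.py). The NamespaceLogger
class, the element_names helper and the sample document are hypothetical
names made up for illustration; only make_parser, ContentHandler,
setFeature and feature_namespaces come from xml.sax.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceLogger(ContentHandler):
    """Hypothetical handler: records the name of every element start."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With feature_namespaces enabled, name is a (uri, localname)
        # tuple; uri is None for elements outside any namespace.
        self.names.append(name)


def element_names(xml_bytes):
    """Parse xml_bytes with namespace processing turned on."""
    parser = make_parser()  # assumption: the default expat-based reader
    parser.setFeature(feature_namespaces, True)
    handler = NamespaceLogger()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


# Illustrative input only; the MathML namespace URI is used as an example.
SAMPLE = b'<doc xmlns:m="http://www.w3.org/1998/Math/MathML"><m:math/></doc>'

if __name__ == "__main__":
    print(element_names(SAMPLE))
```

The feature is set on the parser object itself, before parse() is
called; the handler then receives startElementNS events instead of
startElement events.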

From nathan at byu.edu Fri Jan 7 23:52:08 2005
From: nathan at byu.edu (Nathan Given)
Date: Fri Jan 7 23:52:10 2005
Subject: [XML-SIG] Trouble installing PyXML-0.8.4
Message-ID: <41DF1298.4010209@byu.edu>

Dear Python Representative,

I am having trouble installing PyXML-0.8.4. Here is some output from my
console:

ebiz:/usr/local/include:! > python -V
Python 2.3.3
ebiz:/usr/local/include:! >

ebiz:/home/ng32/downloads/PyXML-0.8.4:! > python setup.py build
Traceback (most recent call last):
  File "setup.py", line 127, in ?
    config_h_vars = parse_config_h(open(config_h))
IOError: [Errno 2] No such file or directory: '/usr/local/include/python2.3/pyconfig.h'
ebiz:/home/ng32/downloads/PyXML-0.8.4:! >

ebiz:/home/ng32/downloads/PyXML-0.8.4:! > which python
/usr/local/bin/python
ebiz:/home/ng32/downloads/PyXML-0.8.4:! >

ebiz:/usr/local/include:! > ls -l | grep python
drwxr-xr-x   2 root       sys     2048 Jan  6 15:05 python/
drwxr-xr-x   2 swliddle   deg     2048 Sep  9  2002 python2.2/
ebiz:/usr/local/include:! >

I can't seem to figure it out... I guess it thinks python is in
/usr/local/include/python2.3... but it doesn't appear like that
directory exists.

Any ideas?

Thanks!
-- 
Nathan

--
Check out our online forum!          http://www.bethandnathan.com
- Check out the BYU Book Exchange!   http://bookexchange.byu.edu
- Check out a National Book Exchange! http://www.booksoncampus.com

From martin at v.loewis.de Sat Jan 8 00:49:27 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat Jan 8 00:49:19 2005
Subject: [XML-SIG] Trouble installing PyXML-0.8.4
In-Reply-To: <41DF1298.4010209@byu.edu>
References: <41DF1298.4010209@byu.edu>
Message-ID: <41DF2007.1030501@v.loewis.de>

Nathan Given wrote:
> ebiz:/usr/local/include:! > ls -l | grep python
> drwxr-xr-x   2 root       sys     2048 Jan  6 15:05 python/
> drwxr-xr-x   2 swliddle   deg     2048 Sep  9  2002 python2.2/
> ebiz:/usr/local/include:! >
>
> I can't seem to figure it out... I guess it thinks python is in
> /usr/local/include/python2.3... but it doesn't appear like that
> directory exists.
>
> Any ideas?

There is something wrong with your Python installation. The include
directory is supposed to be called /usr/local/include/python2.3, not
/usr/local/include/python. It might be that your sysadmin has messed
with the installation, renaming the directory. Please ask her to undo
this change.

Regards,
Martin

From dieter at handshake.de Sat Jan 8 19:36:43 2005
From: dieter at handshake.de (Dieter Maurer)
Date: Sat Jan 8 21:00:54 2005
Subject: [XML-SIG] SAX2 support
In-Reply-To: <41DED5CB.3030002@xs4all.nl>
References: <41DED5CB.3030002@xs4all.nl>
Message-ID: <16864.10299.398194.495984@gargle.gargle.HOWL>

Nico Poppelier wrote at 2005-1-7 19:32 +0100:
>The section on xml.sax.handler in the library reference says: " In
>addition to these classes, xml.sax.handler provides symbolic constants
>for the feature and property names" and then gives a list of features
>and properties. The source for xml/sax/xmlreader.py, however, does not
>implement features and properties, and instead throws an exception when
>you call e.g. getFeature or setFeature.

"setFeature" and "getFeature" are methods of the parser (and
apparently not of the "xmlreader").

I use them this way:

  from xml.sax import make_parser
  from xml.sax.handler import feature_external_ges
  ...
  parser = make_parser()
  parser.setDTDHandler(None)
  parser.setFeature(feature_external_ges, 0)

-- 
Dieter

From bills2018 at hotmail.com Sun Jan 9 02:44:24 2005
From: bills2018 at hotmail.com (bill)
Date: Sun Jan 9 02:45:35 2005
Subject: [XML-SIG] The limitation of the Photon Hypothesis
Message-ID: <20050109014534.9FD6F1E4008@bag.python.org>

Please reply to hdgbyi@public.guangzhou.gd.cn, thank you!

The limitation of the Photon Hypothesis

According to the electromagnetic theory of light, its energy is related
to the amplitude of the electric field of the electromagnetic wave,
W=eE^2V (where E is the amplitude and V is the volume).
It apparently has nothing to do with the light's frequency f.

To explain the photoelectric effect, Einstein put forward the photon
hypothesis. His paper hypothesized light was made of quantum packets of
energy called photons. Each photon carried a specific energy related to
its frequency f, W=hf. This has nothing to do with the amplitude of the
electromagnetic wave E.

For an electromagnetic wave whose amplitude E has nothing to do with the
light's frequency f, if the light's frequency f is high enough, the
energy of the photon in the light is greater than the light's energy,
hf>eE^2V. Apparently, this is incompatible with the electromagnetic
theory of light.

THE UNCERTAINTY PRINCIPLE IS UNTENABLE

By re-analysing Heisenberg's Gamma-Ray Microscope experiment and one of
the thought experiments from which the uncertainty principle is
demonstrated, it is found that the uncertainty principle cannot be
demonstrated by them. It is therefore found to be untenable.

Key words: uncertainty principle; Heisenberg's Gamma-Ray Microscope
Experiment; thought experiment

The History Of The Uncertainty Principle

  If one wants to be clear about what is meant by "position of an
  object," for example of an electron, then one has to specify definite
  experiments by which the "position of an electron" can be measured;
  otherwise this term has no meaning at all.
  --Heisenberg, in uncertainty paper, 1927

Are the uncertainty relations that Heisenberg discovered in 1927 just the
result of the equations used, or are they really built into every
measurement? Heisenberg turned to a thought experiment, since he believed
that all concepts in science require a definition based on actual, or
possible, experimental observations.

Heisenberg pictured a microscope that obtains very high resolution by
using high-energy gamma rays for illumination. No such microscope exists
at present, but it could be constructed in principle. Heisenberg imagined
using this microscope to see an electron and to measure its position. He
found that the electron's position and momentum did indeed obey the
uncertainty relation he had derived mathematically. Bohr pointed out some
flaws in the experiment, but once these were corrected the demonstration
was fully convincing.

Thought Experiment 1

The corrected version of the thought experiment

Heisenberg's Gamma-Ray Microscope Experiment

A free electron sits directly beneath the center of the microscope's lens
(please see AIP page http://www.aip.org/history/heisenberg/p08b.htm or
diagram below). The circular lens forms a cone of angle 2A from the
electron. The electron is then illuminated from the left by gamma rays --
high-energy light which has the shortest wavelength. These yield the
highest resolution, for according to a principle of wave optics, the
microscope can resolve (that is, "see" or distinguish) objects to a size
of dx, which is related to the angle A and to the wavelength L of the
gamma ray by the expression:

  dx = L/(2sinA)                                        (1)

However, in quantum mechanics, where a light wave can act like a
particle, a gamma ray striking an electron gives it a kick. At the moment
the light is diffracted by the electron into the microscope lens, the
electron is thrust to the right. To be observed by the microscope, the
gamma ray must be scattered into any angle within the cone of angle 2A.
In quantum mechanics, the gamma ray carries momentum as if it were a
particle. The total momentum p is related to the wavelength by the
formula

  p = h / L, where h is Planck's constant.              (2)

In the extreme case of diffraction of the gamma ray to the right edge of
the lens, the total momentum would be the sum of the electron's momentum
P'x in the X direction and the gamma ray's momentum in the X direction:

  P'x + (h sinA) / L', where L' is the wavelength of the deflected
  gamma ray.

In the other extreme, the observed gamma ray recoils backward, just
hitting the left edge of the lens. In this case, the total momentum in
the X direction is:

  P''x - (h sinA) / L''.

The final X momentum in each case must equal the initial X momentum,
since momentum is conserved. Therefore, the final X momenta are equal to
each other:

  P'x + (h sinA) / L' = P''x - (h sinA) / L''           (3)

If A is small, then the wavelengths are approximately the same,
L' ~ L'' ~ L. So we have

  P''x - P'x = dPx ~ 2h sinA / L                        (4)

Since dx = L/(2 sinA), we obtain a reciprocal relationship between the
minimum uncertainty in the measured position, dx, of the electron along
the X axis and the uncertainty in its momentum, dPx, in the X direction:

  dPx ~ h / dx    or    dPx dx ~ h.                     (5)

For more than minimum uncertainty, the "greater than" sign may be added.
Except for the factor of 4pi and an equal sign, this is Heisenberg's
uncertainty relation for the simultaneous measurement of the position and
momentum of an object.

Re-analysis

The original analysis of Heisenberg's Gamma-Ray Microscope Experiment
overlooked that the microscope cannot see an object whose size is smaller
than its resolving limit, dx, thereby overlooking that the electron which
relates to dx and the electron which relates to dPx are not the same.

Given that the microscope cannot see an object whose size is smaller than
its resolving limit, dx, what we can see is an electron whose size is
larger than or equal to the resolving limit dx and which has a certain
position, dx = 0.

The microscope can resolve (that is, "see" or distinguish) objects to a
size of dx, which is related to the angle A and to the wavelength L of
the gamma ray by the expression:

  dx = L/(2sinA)                                        (1)

This is the resolving limit of the microscope and it is the uncertain
quantity of the object's position.

The microscope cannot see an object whose size is smaller than its
resolving limit, dx. Therefore, to be seen by the microscope, the size of
the electron must be larger than or equal to the resolving limit.
But if the size of the electron is larger than or equal to the resolving
limit dx, the electron will not be in the range dx. Therefore, dx cannot
be deemed to be the uncertain quantity of the position of an electron
which can be seen by the microscope, but must be deemed to be the
uncertain quantity of the position of an electron which cannot be seen by
the microscope. To repeat, dx is the uncertainty in the position of an
electron which cannot be seen by the microscope.

To be seen by the microscope, the gamma ray must be scattered into any
angle within the cone of angle 2A, so we can measure the momentum of the
electron. But if the size of the electron is smaller than the resolving
limit dx, the electron cannot be seen by the microscope, and we cannot
measure the momentum of the electron. Only if the size of the electron is
larger than or equal to the resolving limit dx can the electron be seen
by the microscope and its momentum be measured.

According to Heisenberg's Gamma-Ray Microscope Experiment, the electron's
momentum is uncertain, and the uncertainty in its momentum is dPx. dPx is
the uncertainty in the momentum of an electron which can be seen by the
microscope.

What relates to dx is an electron whose size is smaller than the
resolving limit. When the electron is in the range dx, it cannot be seen
by the microscope, so its position is uncertain, and its momentum is not
measurable, because to be seen by the microscope, the gamma ray must be
scattered into any angle within the cone of angle 2A, so we can measure
the momentum of the electron. If the electron cannot be seen by the
microscope, we cannot measure the momentum of the electron.

What relates to dPx is an electron whose size is larger than or equal to
the resolving limit dx. The electron is not in the range dx, so it can be
seen by the microscope and its position is certain, and its momentum is
measurable.

Apparently, the electron which relates to dx and the electron which
relates to dPx are not the same. What we can see is an electron whose
size is larger than or equal to the resolving limit dx and which has a
certain position, dx = 0.

Quantum mechanics does not rely on the size of the object, but on
Heisenberg's Gamma-Ray Microscope experiment. The use of the microscope
must relate to the size of the object. The size of an object which can be
seen by the microscope must be larger than or equal to the resolving
limit dx of the microscope, thus the uncertain quantity of the electron's
position does not exist. The gamma ray which is diffracted by the
electron can be scattered into any angle within the cone of angle 2A,
where we can measure the momentum of the electron.

What we can see is an electron which has a certain position, dx = 0, so
that in no other position can we measure the momentum of the electron. In
quantum mechanics, the momentum of the electron can be measured
accurately when we measure the momentum of the electron only; therefore,
we have dPx = 0. And,

  dPx dx = 0.                                           (6)

Thought Experiment 2

Single Slit Diffraction Experiment

Suppose a particle moves in the Y direction originally and then passes a
slit with width dx (please see diagram below). The uncertain quantity of
the particle's position in the X direction is dx, and interference occurs
behind the slit. According to wave optics, the angle of the first minimum
of the interference pattern can be calculated by the following formula:

  sinA = L/2dx                                          (1)

and

  L = h/p, where h is Planck's constant.                (2)

So the uncertainty principle can be obtained:

  dPx dx ~ h                                            (5)

Re-analysis

The original analysis of the Single Slit Diffraction Experiment
overlooked the corpuscular property of the particle and the
Energy-Momentum conservation laws, and mistook the slit's width dx for
the uncertain quantity of the particle's position in the X direction.
According to Newton's first law, if an external force in the X direction
does not affect the particle, it will move in a uniform straight line
(Motion State or Static State), and the motion in the Y direction is
unchanged. Therefore, we can learn its position in the slit from its
starting point.

The particle can have a certain position in the slit, and the uncertain
quantity of the position is dx = 0. According to Newton's first law, if
an external force in the X direction does not affect the particle, and
the original motion in the Y direction is not changed, the momentum of
the particle in the X direction will be Px = 0 and the uncertain quantity
of the momentum will be dPx = 0.

This gives:

  dPx dx = 0.                                           (6)

No experiment negates Newton's first law. Whether in quantum mechanics or
classical mechanics, it applies to the microcosmic world and is of the
form of the Energy-Momentum conservation laws. If an external force does
not affect the particle and it does not remain static or in uniform
motion, it has disobeyed the Energy-Momentum conservation laws. Under the
above thought experiment, it is considered that the width of the slit is
the uncertain quantity of the particle's position. But there is certainly
no reason for us to consider that the particle in the above experiment
has an uncertain position, and no reason for us to consider that the
slit's width is the uncertain quantity of the particle. Therefore, the
uncertainty principle,

  dPx dx ~ h                                            (5)

which is demonstrated by the above experiment is unreasonable.

Conclusion

Every physical principle is based on experiments, not on mathematics,
including Heisenberg's uncertainty principle.

Einstein said that one experiment is enough to negate a physical
principle.

From the above re-analysis, it is realized that the thought experiment
demonstration for the uncertainty principle is untenable.

Therefore, the uncertainty principle is untenable.

Reference:
1. Max Jammer. (1974) The philosophy of quantum mechanics (John Wiley &
   Sons, Inc., New York) Page 65
2. Ibid, Page 67
3. http://www.aip.org/history/heisenberg/p08b.htm

Single Particles Do Not Exhibit Wave-Like Behavior

Through a qualitative analysis of the experiment, it is shown that the
presumed wave-like behavior of a single particle contradicts the
energy-momentum conservation laws and may be explained solely through
particle interactions.

DUAL SLIT INTERFERENCE EXPERIMENT

PART I

If a single particle has wave-like behavior, it will create an
interference image when it has passed through a single slit. But the
experimental result shows that this is not the case. Only a large number
of particles can create an interference image when they pass through the
two slits.

PART II

In the dual slit interference experiment, the single particle is thought
to pass through both slits and interfere with itself at the same time due
to its wave-like behavior. The motion of the wave is in the same
direction as the particle. If the particle passes through a single slit
only, it cannot be assumed that it has wave-like behavior. If it passes
through two slits, it, and also the accompanying wave, must be assumed to
have motion in two directions. But a wave only has one direction of
motion.

PART III

If one slit is obstructed in the dual slit interference experiment and a
particle is launched in this direction, then according to Newton's first
law (assuming no external forces), it will travel in a uniform straight
line. It will not pass through the closed slit and will not make contact
with the screen. If it has wave-like behavior, there is a probability
that it will make contact. But this will negate Newton's first law and
the law of conservation of energy and momentum. Both quantum mechanics
and classical mechanics are dependent on this law.

THE EXPLANATION FOR THE WAVE-LIKE BEHAVIOR OF THE PARTICLE

In the dual slit interference experiment, if one slit is unobstructed,
particles will impact at certain positions on the screen. But when two
slits are open, the particles cannot reach these positions. This
phenomenon brings us to the greatest puzzle regarding the image of the
particle. But when we consider that the particle may experience two or
more reflections, the puzzle may be resolved.

As indicated, when one of the slits is obstructed, the particles that
move towards this slit cannot get to the screen. However, they can return
to the particle source by reflection and then pass through the open slit
and reach the above positions, since they have different paths when one
or two slits are open. This indicates that wave-like behavior may be
explained solely on the basis of particle interactions.

EXPERIMENTAL TEST

The above may be tested by an experiment that can absorb all the
particles that move towards the closed slit. If one slit is obstructed by
material that can absorb all the particles that move towards it, the
intensity at some positions on the screen should decrease.

THE CONCLUSION

Single particles do not exhibit wave-like behavior. The similarity of
wave and particle behavior may be attributed to initial impulse and path.
The quantum mechanical explanation is suspect, since the probability of
one particle and one particle among a large quantity reaching the screen
is equal in mathematics and physics.

Author: BingXin Gong
Postal address: P.O.Box A111 YongFa XiaoQu XinHua HuaDu GuangZhou 510800
P.R.China
E-mail: hdgbyi@public.guangzhou.gd.cn
Tel: 86---20---86856616

From bills2018 at hotmail.com Mon Jan 10 03:25:10 2005
From: bills2018 at hotmail.com (bill)
Date: Mon Jan 10 03:26:31 2005
Subject: [XML-SIG] The limitation of the Photon Hypothesis
Message-ID: <20050110022630.274771E400A@bag.python.org>

Please reply to hdgbyi@public.guangzhou.gd.cn, thank you!
The microscope can resolve (that is, "see" or distinguish) objects to a size of dx, which is related to and to the wavelength L of the gamma ray, by the expression: dx = L/(2sinA) (1) This is the resolving limit of the microscope and it is the uncertain quantity of the object's position. The microscope cannot see the object whose size is smaller than its resolving limit, dx. Therefore, to be seen by the microscope, the size of the electron must be larger than or equal to the resolving limit. But if the size of the electron is larger than or equal to the resolving limit dx, the electron will not be in the range dx. Therefore, dx cannot be deemed to be the uncertain quantity of the electron's position which can be seen by the microscope, but deemed to be the uncertain quantity of the electron's position which can not be seen by the microscope. To repeat, dx is uncertainty in the electron's position which cannot be seen by the microscope. To be seen by the microscope, the gamma ray must be scattered into any angle within the cone of angle 2A, so we can measure the momentum of the electron. But if the size of the electron is smaller than the resolving limit dx, the electron cannot be seen by the microscope, we cannot measure the momentum of the electron. Only the size of the electron is larger than or equal to the resolving limit dx, the electron can be seen by the microscope, we can measure the momentum of the electron. According to Heisenberg's Gamma-Ray Microscope Experiment, the electron's momentum is uncertain, the uncertainty in its momentum is dPx. dPx is the uncertainty in the electron's momentum which can be seen by microscope. What relates to dx is the electron where the size is smaller than the resolving limit.
When the electron is in the range dx, it cannot be seen by the microscope, so its position is uncertain, and its momentum is not measurable, because to be seen by the microscope, the gamma ray must be scattered into any angle within the cone of angle 2A, so we can measure the momentum of the electron. If the electron cannot be seen by the microscope, we cannot measure the momentum of the electron. What relates to dPx is the electron where the size is larger than or equal to the resolving limit dx .The electron is not in the range dx, so it can be seen by the microscope and its position is certain, its momentum is measurable. Apparently, the electron which relates to dx and dPx respectively is not the same. What we can see is the electron where the size is larger than or equal to the resolving limit dx and has a certain position, dx = 0. Quantum mechanics does not rely on the size of the object, but on Heisenberg's Gamma-Ray Microscope experiment. The use of the microscope must relate to the size of the object. The size of the object which can be seen by the microscope must be larger than or equal to the resolving limit dx of the microscope, thus the uncertain quantity of the electron's position does not exist. The gamma ray which is diffracted by the electron can be scattered into any angle within the cone of angle 2A, where we can measure the momentum of the electron. What we can see is the electron which has a certain position, dx = 0, so that in no other position can we measure the momentum of the electron. In Quantum mechanics, the momentum of the electron can be measured accurately when we measure the momentum of the electron only, therefore, we have gained dPx = 0. And, dPx dx =0. (6) Thought Experiment 2 Single Slit Diffraction Experiment Suppose a particle moves in the Y direction originally and then passes a slit with width dx(Please see diagram below) . 
The uncertain quantity of the particle's position in the X direction is dx, and interference occurs at the back slit . According to Wave Optics , the angle where No.1 min of interference pattern can be calculated by following formula: sinA=L/2dx (1) and L=h/p where h is Planck's constant. (2) So the uncertainty principle can be obtained dPx dx ~ h (5) Re-analysis The original analysis of Single Slit Diffraction Experiment overlooked the corpuscular property of the particle and the Energy-Momentum conservation laws and mistook the uncertain quantity of the particle's position in the X direction is the slit's width dx. According to Newton first law , if an external force in the X direction does not affect the particle, it will move in a uniform straight line, ( Motion State or Static State) , and the motion in the Y direction is unchanged .Therefore , we can learn its position in the slit from its starting point. The particle can have a certain position in the slit and the uncertain quantity of the position is dx =0. According to Newton first law , if the external force at the X direction does not affect particle, and the original motion in the Y direction is not changed , the momentum of the particle in the X direction will be Px=0 and the uncertain quantity of the momentum will be dPx =0. This gives: dPx dx =0. (6) No experiment negates NEWTON FIRST LAW. Whether in quantum mechanics or classical mechanics, it applies to the microcosmic world and is of the form of the Energy-Momentum conservation laws. If an external force does not affect the particle and it does not remain static or in uniform motion, it has disobeyed the Energy-Momentum conservation laws. Under the above thought experiment , it is considered that the width of the slit is the uncertain quantity of the particle's position. 
But there is certainly no reason for us to consider that the particle in the above experiment has an uncertain position, and no reason for us to consider that the slit's width is the uncertain quantity of the particle. Therefore, the uncertainty principle, dPx dx ~ h (5) which is demonstrated by the above experiment is unreasonable. Conclusion Every physical principle is based on the Experiments, not based on MATHEMATICS, including heisenberg uncertainty principle. Einstein said, One Experiment is enough to negate a physical principle. >From the above re-analysis , it is realized that the thought experiment demonstration for the uncertainty principle is untenable. Therefore, the uncertainty principle is untenable. Reference: 1. Max Jammer. (1974) The philosophy of quantum mechanics (John wiley & sons , Inc New York ) Page 65 2. Ibid, Page 67 3. http://www.aip.org/history/heisenberg/p08b.htm Single Particles Do Not Exhibit Wave-Like Behavior Through a qualitative analysis of the experiment, it is shown that the presumed wave-like behavior of a single particle contradicts the energy-momentum conservation laws and may be expained solely through particle interactions. DUAL SLIT INTERFERENCE EXPERIMENT PART I If a single particle has wave-like behavior, it will create an interference image when it has passed through a single slit. But the experimental result shows that this is not the case Only a large number of particles can create an interference image when they pass through the two slits. PART II In the dual slit interference experiment, the single particle is thought to pass through both slits and interfere with itself at the same time due to its wave-like behavior. The motion of the wave is the same direction as the particle. If the particle passes through a single slit only, it can not be assumed that it has wave-like behavior. If it passes through two slits, it, and also the acompanying wave must be assumed to have motion in two directions. 
But a wave only has one direction of motion. PART III If one slit is obstructed in the dual slit interference experiment and a particle is launched in this direction, then according to Newton's first law, (assuming no external forces,) it will travel in a uniform straight line. It will not pass through the closed slit and will not make contact with the screen. If it has wavelike behavior, there is a probability that it will make contact. But this will negate Newton's first law and the law of conservation of energy and momentum. Both quantum mechanics and classical mechanics are dependent on this law. THE EXPLANATION FOR THE WAVE-LIKE BEHAVIOR OF THE PARTICLE In the dual slit interference experiment, if one slit is unobstructed, particles will impact at certain positions on the screen. But when two slits are open, the particles cannot reach these positions. This phenomenon brings us to the greatest puzzle regarding the image of the particle. But when we consider that the particle may experience two or more reflections, the puzzle may be resolved. As indicated, when one of the slits is obstructed, the particles that move towards this slit cannot get to the screen. However, they can return to the particle source by reflection and then pass through the open slit and reach the above positions since they have different paths when one or two slits are open. This indicates that wave-like behavior may be explained solely on the basis of particle interactions. EXPERIMENTAL TEST The above may be tested by an experiment that can absorb all the particles that move towards the closed slit. If one slit is obstructed by the stuff that can absorb all the particles that move towards it, the intensity of some positions on the screen should decrease. THE CONCLUSION Single particles do not exhibit wave-like behavior. The similarity of wave and particle behavior may be attributed to initial impulse and path.
The quantum mechanical explanation is suspect, since the probability of one particle and one particle among a large quantity reaching the screen is equal in mathematics and physics. Author : BingXin Gong Postal address : P.O.Box A111 YongFa XiaoQu XinHua HuaDu GuangZhou 510800 P.R.China E-mail: hdgbyi@public.guangzhou.gd.cn Tel: 86---20---86856616 From junkc at fh-trier.de Mon Jan 10 13:58:29 2005 From: junkc at fh-trier.de (Christian Junk) Date: Mon Jan 10 13:58:20 2005 Subject: [XML-SIG] XBEL xslt stylesheet In-Reply-To: <41DDA3B0.4080103@machina.no> References: <41D49DEA.7020806@machina.no> <200501031641.00210.junkc@fh-trier.de> <41DDA3B0.4080103@machina.no> Message-ID: <200501101358.29131.junkc@fh-trier.de> Am Donnerstag, 6. Januar 2005 21:46 schrieb Narve Saetre: > >Yes, this URL is no longer valid, but Joris sent me his XSLT stylesheets a > >month ago. If you like, I can upload them to 'secure' webspace. I'm able > > to provide webspace for other stylesheets, too. So if you're interested, > > we can collect good styles and offer them under a unique url? > > Good idea -- having nice, working stylesheets available is always good > if you have xml files and you want to display them quickly. Since your > web site seems to be the main page for XBEL related work, it would be > the natural place to host XBEL stylesheets. And of course, broken links > are evil, so it is better to host the stylesheets yourself:) I arranged webspace under the subdomain http://xbel.webinternals.de/ and uploaded the stylesheets. If you have other stylesheets please send them to me. Regards, Christian From junkc at fh-trier.de Mon Jan 10 14:03:50 2005 From: junkc at fh-trier.de (Christian Junk) Date: Mon Jan 10 14:03:41 2005 Subject: [XML-SIG] XBEL (dis-)continued? 
In-Reply-To: <200412182240.37392.junkc@fh-trier.de>
References: <200412182240.37392.junkc@fh-trier.de>
Message-ID: <200501101403.50532.junkc@fh-trier.de>

Perhaps we can use my page http://xbel.webinternals.de , which was arranged for the discussion "[XML-SIG] XBEL xslt stylesheet" in this mailing list, for the work on XBEL?

I designed a logo for XBEL, too. If you like it, we can use it for banner creation and revitalise the project?

Looking forward to reading your comments ... ;)

Regards,
Christian

From narve at machina.no Mon Jan 10 14:15:41 2005
From: narve at machina.no (=?ISO-8859-1?Q?Narve_S=E6tre?=)
Date: Mon Jan 10 14:15:38 2005
Subject: [XML-SIG] XBEL xslt stylesheet
In-Reply-To: <200501101358.29131.junkc@fh-trier.de>
References: <41D49DEA.7020806@machina.no> <200501031641.00210.junkc@fh-trier.de> <41DDA3B0.4080103@machina.no> <200501101358.29131.junkc@fh-trier.de>
Message-ID: <41E27FFD.1090808@machina.no>

>I arranged webspace under the subdomain
>
>http://xbel.webinternals.de/
>
>and uploaded the stylesheets. If you have other stylesheets please send them
>to me.
>
>
I am happy with the one I've got :) But if I find more, I'll let you know

--
Narve sætre
Partner / Machina Networks as / +47 41915331
-------------- next part --------------
A non-text attachment was scrubbed...
Name: narve.vcf
Type: text/x-vcard
Size: 255 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050110/4666fe59/narve.vcf

From administrator at sohomail.co.uk Tue Jan 11 09:44:56 2005
From: administrator at sohomail.co.uk (administrator@sohomail.co.uk)
Date: Tue Jan 11 09:41:38 2005
Subject: [XML-SIG] An inbound email security scan detected unscannable content in a message sent from your address (SYM:42621808434095427799)
Message-ID: <009601c4f7b9$d380a390$0200a8c0@SOHOMail.local>

Subject of the message: Re: Your text
Recipient of the message: info

From fredrik at pythonware.com Tue Jan 11 21:17:11 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Jan 11 21:17:06 2005
Subject: [XML-SIG] ANN: cElementTree 0.8 (january 11, 2005)
Message-ID:

effbot.org proudly presents the cElementTree library, a fast and very efficient implementation of the ElementTree API, for Python 2.1 and later. On typical documents, it's 15-20 times faster than the Python version of ElementTree, and uses 2-5 times less memory.

Here are some benchmark figures, using a number of popular XML toolkits to parse a 3405k document-style XML file from disk.

library                       memory    time
------------------------------------------------------------
minidom (python 2.1)          80000k    6.5s
minidom (python 2.4)          53000k    1.4s
ElementTree 1.3               14500k    1.1s
pyRXPU                        11500k    0.22s
cElementTree 0.8               5700k    0.058s
------------------------------------------------------------
readlines (read as text)       5050k    0.032s
------------------------------------------------------------

The library is available as C source code, and as Windows installers for all recent Python versions. Get your copy here:

    http://effbot.org/downloads#celementtree

The cElementTree module uses some support functions from the standard ElementTree library, and will not work properly without it.
If you haven't installed it already, you can get it from:

    http://effbot.org/downloads#elementtree

enjoy /F

From willismonroe at verizon.net Tue Jan 11 23:07:14 2005
From: willismonroe at verizon.net (Willis)
Date: Tue Jan 11 23:07:18 2005
Subject: [XML-SIG] Help writing out xml
Message-ID: <20050111220714.IHKE7873.out006.verizon.net@outgoing.verizon.net>

I'm writing a script that edits and writes xml to a gaim (http://gaim.sourceforge.net) configuration file. However, the problem I'm having is that I've found no way of outputting xml in python that preserves text nodes.

i.e. gaim writes
...
willis
...

in its xml file somehow, and the best I can get python to do is:
...

willis

...
which gaim won't parse.

but I was curious how the toprettyxml() function can change the xml so drastically that it parses itself differently, as illustrated in these two lines, where the only difference is the writing function (toprettyxml and toxml):

xml.dom.minidom.parseString(xml.dom.minidom.parseString("willis").toprettyxml("","\n","UTF-8")).getElementsByTagName("account")[0].childNodes[0].data = u'\n willis\n'

xml.dom.minidom.parseString(xml.dom.minidom.parseString("willis").toxml("UTF-8")).getElementsByTagName("account")[0].childNodes[0].data = u'willis'

I know that printing with toxml() works, however toxml() also writes everything on one line, something that is very impractical for supposedly readable files.

basically I'm looking for a way to print xml where the output is nice looking, but the text nodes are preserved perfectly, as in no extra line breaks and tabs that toprettyxml() inserts on its own.
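One way to get what is asked for here, sketched under assumptions: indent only the elements that contain nothing but other elements, and write any element that carries text exactly as toxml() would. The helper names pretty, _has_text and _start_tag are made up for this illustration, and the element names in the sample are hypothetical, not gaim's real configuration format.

```python
from xml.dom import minidom, Node


def _has_text(elem):
    """True if elem directly contains non-whitespace text or CDATA."""
    return any(
        child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
        and child.data.strip()
        for child in elem.childNodes
    )


def _start_tag(elem):
    # Serialize a childless shallow copy so minidom does the attribute
    # escaping, then turn the resulting "<tag .../>" into "<tag ...>".
    return elem.cloneNode(False).toxml()[:-2] + ">"


def pretty(elem, indent="", step="  "):
    """Return elem as indented XML without touching any text node."""
    if not elem.childNodes or _has_text(elem):
        # Empty or text-bearing element: keep it on one line, unchanged.
        return indent + elem.toxml() + "\n"
    parts = [indent + _start_tag(elem) + "\n"]
    for child in elem.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            parts.append(pretty(child, indent + step, step))
        elif child.nodeType != Node.TEXT_NODE:
            # Comments and processing instructions get a line of their own.
            parts.append(indent + step + child.toxml() + "\n")
        # Whitespace-only text between elements is dropped.
    parts.append(indent + "</" + elem.tagName + ">\n")
    return "".join(parts)


# Hypothetical sample document, for illustration only.
doc = minidom.parseString(
    "<accounts><account><name>willis</name>"
    "<protocol>prpl-oscar</protocol></account></accounts>"
)
text = pretty(doc.documentElement)
print(text)
```

Parsing the result again should give back the same text nodes, because no whitespace is ever written inside an element that holds text. Mixed content (text next to child elements) is simply left on one line rather than indented.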
thanks -Willis From hgg9140 at seanet.com Wed Jan 12 03:33:39 2005 From: hgg9140 at seanet.com (Harry George) Date: Wed Jan 12 02:34:35 2005 Subject: [XML-SIG] Help writing out xml In-Reply-To: <20050111220714.IHKE7873.out006.verizon.net@outgoing.verizon.net> References: <20050111220714.IHKE7873.out006.verizon.net@outgoing.verizon.net> Message-ID: <20050111183339.79d94688@fred.site> An option is to write the xml manually. That is, each object in your "model" (as in model-view-controller) has a "to_xml" method which takes a fileobject (sys.stdout is default), and writes itself in XML to that file. Indents are controlled by some formatting mechanism. I personally use my "tabbedwriter": http://www.seanet.com/~hgg9140/comp/index.html I've done this with a lot of XML's and find it is easy to control and trivial to setup. Where there is inheritance in the "model", a lot of the XML output can be done by the base object(s). On Tue, 11 Jan 2005 17:07:14 -0500 Willis wrote: > I'm writing a script that edits and writes xml to a > gaim(http://gaim.sourceforge.net) configuration file. However the > problem I'm having is I've found no way of outputting xml in python > that preserves text nodes. > > i.e. gaim writes > ... > willis > ... > > in it's xml file somehow, and the best I can get python to do is: > ... > > willis > > ... > which gaim won't parse. > > but I was curious how can the toprettyxml() function change the xml so > drastically that it parses itself differently as illustrated in these > to lines, where the only difference is the writing function > (toprettyxml, and toxml. 
> > > xml.dom.minidom.parseString(xml.dom.minidom.parseString("wil > lis").toprettyxml("","\n","UTF-8")).getElementsByTagName("a > ccount")[0].childNodes[0].data = u'\n willis\n' > > xml.dom.minidom.parseString(xml.dom.minidom.parseString("wil > lis").toxml("UTF-8")).getElementsByTagName("account")[0].ch > ildNodes[0].data = u'willis' > > I know that printing with toxml() works, however toxml() also writes > everything on one line, something that is very impractical for > supposedly readable files. > > basically I'm looking for a way to print xml, where the output is nice > looking, but the text nodes are preserved perfectly, as in no extra > line breaks and tabs that toprettyxml() inserts on it's own. > > thanks > -Willis > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig > -- Harry George hgg9140@seanet.com www.seanet.com/~hgg9140 From fredrik at pythonware.com Wed Jan 12 20:12:57 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Jan 12 20:12:58 2005 Subject: [XML-SIG] Re: cElementTree 0.8 (january 11, 2005) References: Message-ID: several people have asked for libxml2 figures, since libxml2 is known as the fastest parser under the sun (with the possible exception of RXP, which is known as quite possibly the fastest parser anywhere). 
here's an updated table:

library                       memory    time
------------------------------------------------------------
minidom (python 2.1)          80000k    6.5s
minidom (python 2.4)          53000k    1.4s
ElementTree 1.3               14500k    1.1s
pyRXPU                        11500k    0.22s
libxml2                       16000k    0.098s
cElementTree 0.8               5700k    0.058s
------------------------------------------------------------
readlines (read as text)       5050k    0.032s
------------------------------------------------------------

(gnosis.objectify and pyrxp both failed to parse the source file, and pyrxpu seems to ignore namespaces -- setting the namespace flag only makes it throw away the xmlns attributes, which isn't exactly helpful...)

(btw, soon-to-be-released cET 0.9 is faster than 0.8. more about that one later).

From ping at pingyeh.net Wed Jan 12 20:28:18 2005
From: ping at pingyeh.net (Ping Yeh)
Date: Wed Jan 12 20:30:05 2005
Subject: [XML-SIG] XML for scientific data storage and search
Message-ID: <41E57A52.1010502@pingyeh.net>

Hello,

I'm a newbie to XML, and just wrote a program that can store my scientific data objects as an XML file and restore them later (like marshaling). However, I found it is extremely slow... I changed the implementation from minidom to sax. That speeds it up somewhat (30% or so) for small files, but not enough. If I go back to using binary data the speed is ~ 5 times faster or more. Are there widely used ways to speed up parsing?

Another problem is memory footprint. My XML data file can be large: tens of megabytes with hundreds of thousands of objects. If I use xml.sax.parseString() it parses the whole string into memory objects, which inflates. I only need to loop over the objects in the XML file once. Are there common ways to do a delayed read? I'm looking for something like

    xml.sax.parseFile('data0.xml', myContentHandler)
    objects = myContentHandler.getObjects()  # returns an iterator
    for obj in objects:  # reading occurs here (delayed reading)
        # do something with obj...

But I haven't found any.
I'm not sure this is possible with current architecture of parsers. Any advise is highly appreciated. Thanks, Ping From fredrik at pythonware.com Wed Jan 12 20:54:52 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Jan 12 20:54:46 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search References: <41E57A52.1010502@pingyeh.net> Message-ID: "Ping Yeh" wrote: > I'm looking for something like > > xml.sax.parseFile('data0.xml', myContentHandler) > objects = myContentHandler.getObjects() # returns an iterator > for obj in objects: # reading occurs here (delayed reading) > # do something with obj... > > But I haven't found any. I'm not sure this is possible with current > architecture of parsers. Any advise is highly appreciated. http://online.effbot.org/2004_12_01_archive.htm#element-generator http://online.effbot.org/2004_12_01_archive.htm#element-generator-2 From fredrik at pythonware.com Wed Jan 12 21:00:21 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Jan 12 21:00:39 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search References: <41E57A52.1010502@pingyeh.net> Message-ID: >> But I haven't found any. I'm not sure this is possible with current >> architecture of parsers. Any advise is highly appreciated. 
> > http://online.effbot.org/2004_12_01_archive.htm#element-generator > http://online.effbot.org/2004_12_01_archive.htm#element-generator-2 also: http://www-106.ibm.com/developerworks/xml/library/x-tipulldom.html http://cvs.sourceforge.net/viewcvs.py/splice/kid/pulltree.py?view=markup From rtomayko at gmail.com Thu Jan 13 00:47:15 2005 From: rtomayko at gmail.com (Ryan Tomayko) Date: Thu Jan 13 00:47:20 2005 Subject: [XML-SIG] Re: cElementTree 0.8 (january 11, 2005) In-Reply-To: References: Message-ID: <49CEA9A7-64F4-11D9-B15C-000D9336497A@gmail.com> On Jan 12, 2005, at 2:12 PM, Fredrik Lundh wrote: > pyRXPU 11500k 0.22s > libxml2 16000k 0.098s > cElementTree 0.8 5700k 0.058s > ------------------------------------------------------------ > readlines (read as text) 5050k 0.032s > ------------------------------------------------------------ > :) Looks great, Fredrik. I should be able to start hitting this pretty hard within the next month or so. btw, the readlines comparison is a nice touch. Ryan From rtomayko at gmail.com Thu Jan 13 00:51:34 2005 From: rtomayko at gmail.com (Ryan Tomayko) Date: Thu Jan 13 00:51:38 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search In-Reply-To: References: <41E57A52.1010502@pingyeh.net> Message-ID: On Jan 12, 2005, at 3:00 PM, Fredrik Lundh wrote: > http://cvs.sourceforge.net/viewcvs.py/splice/kid/pulltree.py? > view=markup Just a heads up regarding this module... I'm not extremely happy with it right now. It could use a few rounds of simplification and I have a feeling some basic function/method names and signatures will change. Ryan From fredrik at pythonware.com Thu Jan 13 01:05:43 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Jan 13 01:05:36 2005 Subject: [XML-SIG] Re: Re: cElementTree 0.8 (january 11, 2005) References: <49CEA9A7-64F4-11D9-B15C-000D9336497A@gmail.com> Message-ID: Ryan Tomayko wrote: > btw, the readlines comparison is a nice touch. 
yeah, but I should really have used codecs.open instead of just a plain open...

readlines (read as utf-8 text) 8850k    0.093s
cElementTree 0.9               4900k    0.047s

From uche.ogbuji at fourthought.com Thu Jan 13 01:36:17 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu Jan 13 01:36:26 2005
Subject: [XML-SIG] ANN: Amara XML Toolkit 0.9.2
Message-ID: <1105576577.32542.165.camel@borgia>

http://uche.ogbuji.net/tech/4Suite/amara
ftp://ftp.4suite.org/pub/Amara/

Changes in this release:

* Use local names rather than QNames for default bindings
* Add attribute support to XPath
* Add amara.binderytools.preserve_attribute_details rule
* Reorg and fix demos and tests
* Add Flextyper DTLL implementation
* Add parsing functions binderytools.bind_string, binderytools.bind_stream
* Add binderytools.pushbind. This one deserves some elaboration. The following is complete code for iterating through address labels in an XML document, while never loading more memory than needed to hold one label element:

    from amara import binderytools
    for subtree in binderytools.pushbind('/labels/label', source='labels.xml'):
        print subtree.label.name, 'of', subtree.label.address.city

Amara XML Toolkit is a collection of Python tools for XML processing-- not just tools that happen to be written in Python, but tools built from the ground up to use Python idioms and take advantage of the many advantages of Python.

Amara builds on 4Suite [http://4Suite.org], but whereas 4Suite focuses more on literal implementation of XML standards in Python, Amara focuses on Pythonic idiom. It provides tools you can trust to conform with XML standards without losing the familiar Python feel.
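For comparison, the one-element-at-a-time idea behind the pushbind loop in this announcement can be approximated with the standard library alone. What follows is only a rough sketch, assuming a Python version in which xml.etree.ElementTree is part of the standard library (2.5 or later); it is not how Amara works internally. The helper iter_elements and the sample data are made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_elements(source, tag):
    """Yield each completed element named `tag`, then discard its content."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            # The caller is done with this element once the generator
            # resumes, so free its children and text. An empty shell
            # stays attached to the root, which is small by comparison.
            elem.clear()


# Hypothetical sample data, for illustration only.
SAMPLE = b"""<labels>
  <label><name>Alice</name><address><city>Springfield</city></address></label>
  <label><name>Bob</name><address><city>Shelbyville</city></address></label>
</labels>"""

pairs = [
    (label.findtext("name"), label.findtext("address/city"))
    for label in iter_elements(io.BytesIO(SAMPLE), "label")
]
print(pairs)
```

Passing a file name or an open file instead of the BytesIO object works the same way. Each element has to be used inside the loop body, because it is emptied as soon as the generator moves on to the next one.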
The components of Amara are: * Bindery: data binding tool (fancy way of saying: a very Pythonic XML API) * Scimitar: implementation of the ISO Schematron schema language for XML; converts Schematron files to Python scripts * domtools: set of tools to augment Python DOMs * saxtools: set of tools to make SAX easier to use in Python * Flextyper: user-defined datatypes in Python for XML processing There's a lot in Amara, but here are highlights: Amara Bindery: XML as easy as py -------------------------------- Based on the retired project Anobind, but updated to use SAX rather than DOM to create bindings. Bindery reads an XML document and returns a data structure of Python objects corresponding to the vocabulary used in the XML document, for maximum clarity. Bindery turns the document What do you mean "bleh" But I was looking for argument Into a set of objects such that you can write binding.monty.python.spam In order to get the value "eggs" or binding.monty.python[1] In order to get the value "But I was looking for argument". There are other such tools for Python, and what makes Anobind unique is that it's driven by a very declarative rules-based system for binding XML to the Python data. You can register rules that are triggered by XPattern expressions specialized binding behavior. It includes XPath support and supports mutation. Bindery is very efficient, using SAX to generate bindings. Scimitar: exceptional schema language for an exceptional programming language ----------------------------------------------------------------------------- Merged in from a separate project, Scimitar is an implementation of ISO Schematron that compiles a Schematron schema into a Python validator script. You typically use scimitar in two phases. Say you have a schematron schema schema1.stron and you want to validate multiple XML files against it, instance1.xml, instance2.xml, instance3.xml. 
First you run schema1.stron through the scimitar compiler script, scimitar.py: scimitar.py schema1.stron A file, schema1.py is generated and can be used to validate XML instances: python schema1.py instance1.xml Which emits a validation report. Amara DOM Tools: giving DOM a more Pythonic face ------------------------------------------------ DOM came from the Java world, hardly the most Pythonic API possible. Some DOM-like implementations such as 4Suite's Domlettes mix in some Pythonic idiom. Amara DOM Tools goes even further. Amara DOM Tools feature pushdom, similar to xml.dom.pulldom, but easier to use. It also includes Python generator-based tools for DOM processing, and a function to return an XPath location for any DOM node. Amara SAX Tools: SAX without the brain explosion ------------------------------------------------ Tenorsax (amara.saxtools.tenorsax) is a framework for "linerarizing" SAX logic so that it flows more naturally, and needs a lot less state machine wizardry. License ------- Amara is open source, provided under the 4Suite variant of the Apache license. See the file COPYING for details. Installation ------------ Amara requires Python 2.3 or more recent and 4Suite 1.0a3 or more recent. Make sure these are installed, unpack Amara to a convenient location and run python setup.py install -- Uche Ogbuji Fourthought, Inc. 
http://uche.ogbuji.net http://4Suite.org http://fourthought.com Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html Full XML Indexes with Gnosis - http://www.xml.com/pub/a/2004/12/08/py-xml.html Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286 UBL 1.0 - http://www-106.ibm.com/developerworks/xml/library/x-think28.html Use Universal Feed Parser to tame RSS - http://www.ibm.com/developerworks/xml/library/x-tipufp.html Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/ The State of Python-XML in 2004 - http://www.xml.com/pub/a/2004/10/13/py-xml.html From ping at pingyeh.net Thu Jan 13 08:33:49 2005 From: ping at pingyeh.net (Ping Yeh) Date: Thu Jan 13 08:34:04 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search In-Reply-To: References: <41E57A52.1010502@pingyeh.net> Message-ID: <41E6245D.5070500@pingyeh.net> Thanks a lot for the reference pointers! I'm now studying pull DOM, and will go through other modules later. I'll make performance comparisons available just in case they might be useful. cheers, Ping Fredrik Lundh wrote: >>>But I haven't found any. I'm not sure this is possible with current >>>architecture of parsers. Any advise is highly appreciated. 
>> >>http://online.effbot.org/2004_12_01_archive.htm#element-generator >>http://online.effbot.org/2004_12_01_archive.htm#element-generator-2 > > > also: > > http://www-106.ibm.com/developerworks/xml/library/x-tipulldom.html > http://cvs.sourceforge.net/viewcvs.py/splice/kid/pulltree.py?view=markup > > > > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig
From wunder at verity.com Thu Jan 13 17:16:16 2005 From: wunder at verity.com (Walter Underwood) Date: Thu Jan 13 17:16:35 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search In-Reply-To: <41E6245D.5070500@pingyeh.net> References: <41E57A52.1010502@pingyeh.net> <41E6245D.5070500@pingyeh.net> Message-ID: <7645741383DBD38784EE7B0B@adsl-64-166-133-243.dsl.snfc21.pacbell.net> There are some XML speed issues that won't go away with a better parser. Sending floating point numbers as formatted ASCII is never going to be really fast. wunder --On January 13, 2005 3:33:49 PM +0800 Ping Yeh wrote: > Thanks a lot for the reference pointers! I'm now studying > pull DOM, and will go through other modules later. I'll make > performance comparisons available just in case they might be useful. > > cheers, > Ping > > Fredrik Lundh wrote: >>>> But I haven't found any. I'm not sure this is possible with current >>>> architecture of parsers. Any advise is highly appreciated.
>>> >>> http://online.effbot.org/2004_12_01_archive.htm#element-generator >>> http://online.effbot.org/2004_12_01_archive.htm#element-generator-2 >> >> >> also: >> >> http://www-106.ibm.com/developerworks/xml/library/x-tipulldom.html >> http://cvs.sourceforge.net/viewcvs.py/splice/kid/pulltree.py?view=markup >> >> >> >> >> >> _______________________________________________ >> XML-SIG maillist - XML-SIG@python.org >> http://mail.python.org/mailman/listinfo/xml-sig > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig -- Walter Underwood Principal Architect, Verity From paul at boddie.org.uk Thu Jan 13 17:37:59 2005 From: paul at boddie.org.uk (Paul Boddie) Date: Thu Jan 13 17:41:13 2005 Subject: [XML-SIG] Re: cElementTree 0.8 (january 11, 2005) In-Reply-To: References: Message-ID: <200501131737.59247.paul@boddie.org.uk> On Wednesday 12 January 2005 20:12, Fredrik Lundh wrote: > several people have asked for libxml2 figures, since libxml2 is known as > the fastest parser under the sun (with the possible exception of RXP, which > is known as quite possibly the fastest parser anywhere). > > here's an updated table: [...] > libxml2 16000k 0.098s > cElementTree 0.8 5700k 0.058s cElementTree looks really impressive, but having run various tests comparing libxml2 and cElementTree with some of the larger test documents in the libxml2 distribution, libxml2 still seems faster. I've used GNU time to report things like the elapsed, system and user times as well as measuring the elapsed time in Python, but I couldn't get the memory usage. How should one go about getting these figures under Linux? Should I turn process accounting on or something like that? 
One thing that may explain the discrepancy between the above results and the ones I've been getting is the unfortunate need to explicitly free each libxml2 document after finishing with it - I found that otherwise libxml2 does indeed get slower after loading a few documents, and I'd imagine that the memory requirements start to affect my resource-challenged laptop as a result. Of course, this depends on how one does the tests, but in order to diminish start-up times and to time a single process loading many documents, I looped over a number of files, parsing each one, looping over this entire process many times. Still, cElementTree looks like a very promising addition to the range of Python XML tools, especially given the uncomplicated installation process (compared to some of the other top performers, notably libxml2 and cDomlette). Paul From fredrik at pythonware.com Thu Jan 13 21:35:18 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Jan 13 21:35:11 2005 Subject: [XML-SIG] Re: Re: cElementTree 0.8 (january 11, 2005) References: <200501131737.59247.paul@boddie.org.uk> Message-ID: Paul Boddie wrote: > > libxml2 16000k 0.098s > > cElementTree 0.8 5700k 0.058s > > cElementTree looks really impressive, but having run various tests comparing > libxml2 and cElementTree with some of the larger test documents in the > libxml2 distribution, libxml2 still seems faster. After chatting a little with people who've benchmarked cElementTree and other toolkits on a variety of platforms, I think the general conclusion seems to be that both libraries can parse most stuff in about the same number of milliseconds. The main differences seems to come from 1) compilers, and 2) what Python version you're using (2.4 can be a lot faster). My benchmarks all use "official" binary distributions, and I have no idea what compilers the other developers have used. Nor has an ordinary user, of course. 
If people want better results for their favourite toolkit, they should release better binaries ;-)

> I've used GNU time to report things like the elapsed, system and user times
> as well as measuring the elapsed time in Python, but I couldn't get the memory
> usage.

My test harness is basically:

    import stuff

    raw_input("check process size")

    t0 = time.clock() # use time.time() on unix
    parse(file)
    t1 = time.clock() # see above
    print t1 - t0

    raw_input("check process size")

    clean up

where the process size is checked in the usual way, and the "memory used by the dom" is the difference between the two values.

To check for anomalies, I also run the above in a loop (minus the raw_input calls), and watch how performance and memory use vary over time. Some toolkits are extremely unstable, timewise (GC issues?). And I run the tests several times over a day, to make sure the system load doesn't impact too much.

Benchmarking stuff is always hard, and when you're dealing with things that take 0.0-0.2 seconds *and* consume lots of memory, it's even harder. When comparing such benchmarks from different machines, you better use a rather large fudge factor...

> Still, cElementTree looks like a very promising addition to the range of
> Python XML tools, especially given the uncomplicated installation process
> (compared to some of the other top performers, notably libxml2 and
> cDomlette).

As someone just pointed in private mail, libxml2 may be on par on the parsing side, but since cElementTree creates *Python* objects, it has a major advantage over libxml2 once you start digging into the tree from Python. cElementTree doesn't have to create any proxy objects; everything you can reach is already a Python object.

But sure, it's hard to beat libxml2 if you want both speed *and* support for every XML standard you've ever heard of (and then some)...
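For readers who want to repeat this kind of measurement with a current Python, the timing part of the harness above might be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's xml.etree.ElementTree and time.perf_counter (time.clock and raw_input are not available in Python 3), it parses a small in-memory document built by a made-up make_sample() helper rather than the 3405k file from the benchmarks, and it does not look at process size at all.

```python
import io
import time
import xml.etree.ElementTree as ET


def make_sample(n_items=1000):
    """Build a small XML document in memory (hypothetical stand-in for a real test file)."""
    items = "".join('<item id="%d">value %d</item>' % (i, i) for i in range(n_items))
    return ("<root>%s</root>" % items).encode("utf-8")


def time_parse(data, repeats=5):
    """Parse the same bytes several times and return the elapsed time of each run."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        tree = ET.parse(io.BytesIO(data))
        t1 = time.perf_counter()
        timings.append(t1 - t0)
        del tree  # drop the tree before the next run
    return timings


if __name__ == "__main__":
    runs = time_parse(make_sample())
    print("best: %.4fs  worst: %.4fs" % (min(runs), max(runs)))
```

Running the parse in a loop and looking at the spread between runs follows the advice in the message; memory would still have to be checked separately, for example by watching the process size as described.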
From veillard at redhat.com Thu Jan 13 22:44:49 2005
From: veillard at redhat.com (Daniel Veillard)
Date: Thu Jan 13 22:44:52 2005
Subject: [XML-SIG] Re: Re: cElementTree 0.8 (january 11, 2005)
In-Reply-To:
References: <200501131737.59247.paul@boddie.org.uk>
Message-ID: <20050113214449.GX8569@redhat.com>

On Thu, Jan 13, 2005 at 09:35:18PM +0100, Fredrik Lundh wrote:
> As someone just pointed in private mail, libxml2 may be on par on the parsing
> side, but since cElementTree creates *Python* objects, it has a major
> advantage over libxml2 once you start digging into the tree from Python.
> cElementTree doesn't have to create any proxy objects; everything you can
> reach is already a Python object.

I'm following from a distance, I don't care about a speed war so I won't comment on any benchmark :-). If you can build a C layer dedicated to Python you should be able to get better performances than a generic engine with autogenerated python bindings, and yes the proxy objects are a bit nasty, I'm not claiming the bindings are perfect (I would be toasted very quickly here I'm sure :-). It's also refreshing to see someone from the Python side caring about performances (now I'm sure I will get some fan mail ;-)

Seriously, with respect to performances one of the trouble I have seen when doing a bit of profiling is that interning strings, i.e. the process of taking string coming from C and turning them into Python string objects, to be extremely costly, I don't know if it's the hash function or the way the string hash works but it was one of the biggest cost when I tried (with python 2.3 or 2.2 I can't remember precisely when it was).
Daniel -- Daniel Veillard | Red Hat Desktop team http://redhat.com/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/ From ping at pingyeh.net Fri Jan 14 03:34:02 2005 From: ping at pingyeh.net (Ping Yeh) Date: Fri Jan 14 03:34:20 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search In-Reply-To: <7645741383DBD38784EE7B0B@adsl-64-166-133-243.dsl.snfc21.pacbell.net> References: <41E57A52.1010502@pingyeh.net> <41E6245D.5070500@pingyeh.net> <7645741383DBD38784EE7B0B@adsl-64-166-133-243.dsl.snfc21.pacbell.net> Message-ID: <41E72F9A.2000308@pingyeh.net> That's true. I'm willing to trade some speed for language neutrality. But after some study it seems the speed is really toooooooo slow... One big reason is that with DOM and SAX the whole data tree has to be built before I can do anything with the data. Pull DOM is my hope of using smaller memory footprint to reduce memory allocation overheads in python. I may be forced to use binary format if I can't find better performance. I'm reluctant to do that because the type and amount of data my experiment will produce is not yet known exactly and adding flexibility to binary format is painful. Ping Walter Underwood wrote: > There are some XML speed issues that won't go away with a better parser. > Sending floating point numbers as formatted ASCII is never going to be > really fast. > > wunder > > --On January 13, 2005 3:33:49 PM +0800 Ping Yeh wrote: > > >>Thanks a lot for the reference pointers! I'm now studying >>pull DOM, and will go through other modules later. I'll make >>performance comparisons available just in case they might be useful. >> >>cheers, >>Ping >> >>Fredrik Lundh wrote: >> >>>>>But I haven't found any. I'm not sure this is possible with current >>>>>architecture of parsers. Any advise is highly appreciated. 
>>>>
>>>>http://online.effbot.org/2004_12_01_archive.htm#element-generator
>>>>http://online.effbot.org/2004_12_01_archive.htm#element-generator-2
>>>
>>>
>>>also:
>>>
>>>http://www-106.ibm.com/developerworks/xml/library/x-tipulldom.html
>>>http://cvs.sourceforge.net/viewcvs.py/splice/kid/pulltree.py?view=markup
>>>
>>>
>>>
>>>
>>>
>>>_______________________________________________
>>>XML-SIG maillist - XML-SIG@python.org
>>>http://mail.python.org/mailman/listinfo/xml-sig
>>
>>_______________________________________________
>>XML-SIG maillist - XML-SIG@python.org
>>http://mail.python.org/mailman/listinfo/xml-sig
>
>
>
>
> --
> Walter Underwood
> Principal Architect, Verity

From tve at vormig.net Fri Jan 14 11:39:29 2005
From: tve at vormig.net (Tim van Erven)
Date: Fri Jan 14 11:39:30 2005
Subject: [XML-SIG] Problem using schema instead of DTD
Message-ID: <20050114103929.GC8335@mould.vormig.net>

Dear all,

I'm using pyxml to read an xml file from stdin, modify it a bit, and write it out again to stdout. In the process a doctype is added automatically. In the output a line is added. I'm using a schema to validate my output, but it won't validate when the !DOCTYPE is present, so I think I'd like to get rid of that.

I'm doing something like:

    import sys
    from xml.dom.ext.reader import Sax2
    from xml.dom.ext import PrettyPrint

    reader = Sax2.Reader()
    doc = reader.fromStream(sys.stdin)
    # modify doc
    PrettyPrint(doc)

The xml files I'm using look like this:

    This is some sample text.

And the output of my program looks like this, which won't validate:

    This is some modified sample text.

Does anyone know how to get rid of the !DOCTYPE line? Is that the right thing to do anyway?

I'm rather new to xml, so any conceptual corrections would be much appreciated as well.
Regards, Tim -- Tim van Erven From fredrik at pythonware.com Fri Jan 14 11:43:04 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Jan 14 11:43:11 2005 Subject: [XML-SIG] Re: Re: Re: cElementTree 0.8 (january 11, 2005) References: <200501131737.59247.paul@boddie.org.uk> <20050113214449.GX8569@redhat.com> Message-ID: Daniel Veillard wrote: > Seriously, with respect to performances one of the trouble I have seen when > doing a bit of profiling is that interning strings, i.e. the process of > taking string coming from C and turning them into Python string objects, > to be extremely costly, I don't know if it's the hash function or the way > the string hash works but it was one of the biggest cost when I tried > (with python 2.3 or 2.2 I can't remember precisely when it was). in python, conversion and interning and hash calculations are three different things, so I'm not sure what your problem really was. but I'm curious. can you elaborate? From veillard at redhat.com Fri Jan 14 12:00:41 2005 From: veillard at redhat.com (Daniel Veillard) Date: Fri Jan 14 12:00:54 2005 Subject: [XML-SIG] Re: Re: Re: cElementTree 0.8 (january 11, 2005) In-Reply-To: References: <20050113214449.GX8569@redhat.com> Message-ID: <20050114110041.GB8569@redhat.com> On Fri, Jan 14, 2005 at 11:43:04AM +0100, Fredrik Lundh wrote: > Daniel Veillard wrote: > > > Seriously, with respect to performances one of the trouble I have seen when > > doing a bit of profiling is that interning strings, i.e. the process of > > taking string coming from C and turning them into Python string objects, > > to be extremely costly, I don't know if it's the hash function or the way > > the string hash works but it was one of the biggest cost when I tried > > (with python 2.3 or 2.2 I can't remember precisely when it was). > > in python, conversion and interning and hash calculations are three different > things, so I'm not sure what your problem really was. but I'm curious. can > you elaborate? 
You have a python function calling a native function. That function returns a string. That C string is translated to a Python string by the wrapper using PyString_FromString(). That operation seems to be extremely expensive. Daniel -- Daniel Veillard | Red Hat Desktop team http://redhat.com/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/ From tve at vormig.net Fri Jan 14 14:18:05 2005 From: tve at vormig.net (Tim van Erven) Date: Fri Jan 14 14:18:04 2005 Subject: [XML-SIG] Problem using schema instead of DTD In-Reply-To: <20050114103929.GC8335@mould.vormig.net> References: <20050114103929.GC8335@mould.vormig.net> Message-ID: <20050114131805.GA27682@mould.vormig.net> On Fri, 14/01/2005 11:39 +0100, Tim van Erven wrote: > I'm using pyxml to read an xml file from stdin, modify it a bit, and > write it out again to stdout. In the process a doctype is added > automatically. In the output a line is added. I'm using a > schema to validate my output, but it won't validate when the !DOCTYPE is > present, so I think I'd like to get rid of that. Correction: it turns out I'm using the built-in XML libraries, not pyxml. I'm using python 2.2, by the way. Regards, Tim -- Tim van Erven From mike at skew.org Fri Jan 14 18:19:38 2005 From: mike at skew.org (Mike Brown) Date: Fri Jan 14 18:19:41 2005 Subject: [XML-SIG] Problem using schema instead of DTD In-Reply-To: <20050114103929.GC8335@mould.vormig.net> Message-ID: <200501141719.j0EHJcjc041954@chilled.skew.org> Tim van Erven wrote: > And the output of my program looks like this, which won't validate: > > > > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" > xsi:schemaLocation="http://www.example.org/info someschema.xsd"> > > This is some modified sample text. > > > Does anyone know how to get rid of the !DOCTYPE line? Is that the right > thing to do anyway? 
> > I'm rather new to xml, so any conceptual corrections would be much > appreciated as well. What are you parsing your generated XML with, and what is the error you are getting? It could be happening for a couple of reasons. Whatever you are using to do the parsing and validation of that XML may be assuming (incorrectly) that if it sees a DOCTYPE, it has to do DTD-based validation. Whether or not validation is being done, and what kind of validation, is up to the user of the tool doing the parsing. (Note: Pointing to or directly providing schematic info is just one of several capabilities of a the document type declaration; it is also a source of entity and default attribute declarations, and is normally checked for those regardless of whether validation is being done.) Another possibility is that the tool you are using does not recognize the minimal DOCTYPE declaration that is being added to your document. Such a declaration serves no real purpose, but is permitted to exist by the XML spec. It is an often-overlooked 'feature' of XML though; some parsers may reject it out of ignorance. -Mike From faassen at infrae.com Fri Jan 14 18:23:41 2005 From: faassen at infrae.com (Martijn Faassen) Date: Fri Jan 14 18:22:39 2005 Subject: [XML-SIG] Re: Re: Re: cElementTree 0.8 (january 11, 2005) In-Reply-To: <20050114110041.GB8569@redhat.com> References: <20050113214449.GX8569@redhat.com> <20050114110041.GB8569@redhat.com> Message-ID: <41E8001D.9040607@infrae.com> Daniel Veillard wrote: > On Fri, Jan 14, 2005 at 11:43:04AM +0100, Fredrik Lundh wrote: > >>Daniel Veillard wrote: >> >>> Seriously, with respect to performances one of the trouble I have seen when >>>doing a bit of profiling is that interning strings, i.e. 
the process of
>>>taking string coming from C and turning them into Python string objects,
>>>to be extremely costly, I don't know if it's the hash function or the way
>>>the string hash works but it was one of the biggest cost when I tried
>>>(with python 2.3 or 2.2 I can't remember precisely when it was).
>>
>>in python, conversion and interning and hash calculations are three different
>>things, so I'm not sure what your problem really was. but I'm curious. can
>>you elaborate?
>
> You have a python function calling a native function. That function returns
> a string. That C string is translated to a Python string by the wrapper
> using PyString_FromString(). That operation seems to be extremely expensive.

That's nothing. It's even worse if you have to transform the UTF-8 strings that libxml2 delivers into Python unicode strings. :)

By the way, I'm at least one of the persons Fredrik has been mailing with concerning the speed comparisons, as I've been implementing the ElementTree API on top of libxml2. This now works, without having to clean up your memory after yourself, and with unicode strings, etc. You can also do xpath and XSLT a lot more easily with lxml.etree, though especially XSLT support is still coming together.

lxml.etree is likely to be a lot slower than a more low-level binding at various operations, but it's a ton more convenient (aka "Pythonic"). You can do things like this:

    >>> from lxml import etree
    >>> tree = etree.parse('ot.xml')
    >>> tree.xpath('(//v)[5]/text()')
    [u'And God called the light Day, and the darkness he called Night. And the evening and the morning were the first day.\n']

or, even this:

    >>> result = tree.xpath('(//v)[5]')
    >>> result[0].text = 'The day and night verse.'
    >>> tree.xpath('(//v)[5]/text()')
    [u'The day and night verse.']

i.e. the results of xpath queries are ElementTree style objects and the whole XML tree is navigable using the ElementTree API.
Regards, Martijn From wunder at verity.com Fri Jan 14 19:02:35 2005 From: wunder at verity.com (Walter Underwood) Date: Fri Jan 14 18:46:15 2005 Subject: [XML-SIG] Re: XML for scientific data storage and search In-Reply-To: <41E72F9A.2000308@pingyeh.net> References: <41E57A52.1010502@pingyeh.net> <41E6245D.5070500@pingyeh.net> <7645741383DBD38784EE7B0B@adsl-64-166-133-243.dsl.snfc21.pacbell.net> <41E72F9A.2000308@pingyeh.net> Message-ID: Instead of inventing something from scratch, I'd recommend an existing, standard, flexible, free format, like NetCDF. It even has Python bindings. http://my.unidata.ucar.edu/content/software/netcdf/ If you must invent it yourself, use Python marshal format over HTTP. Works fine. XML is wonderful stuff, but it is really unsuitable for large amounts of data or fast transfer. SAX should not require reading in the whole document, though. It is almost always the right choice if you are extracting the data from a document instead of manipulating it. wunder --On Friday, January 14, 2005 10:34:02 AM +0800 Ping Yeh wrote: > That's true. I'm willing to trade some speed for language neutrality. > But after some study it seems the speed is really toooooooo slow... > One big reason is that with DOM and SAX the whole data tree has to > be built before I can do anything with the data. Pull DOM is my > hope of using smaller memory footprint to reduce memory allocation > overheads in python. > > I may be forced to use binary format if I can't find better performance. > I'm reluctant to do that because the type and amount of data my experiment > will produce is not yet known exactly and adding flexibility to binary > format is painful. > > Ping > > Walter Underwood wrote: >> There are some XML speed issues that won't go away with a better parser. >> Sending floating point numbers as formatted ASCII is never going to be >> really fast. 
>> >> wunder >> >> --On January 13, 2005 3:33:49 PM +0800 Ping Yeh wrote: >> >> >>> Thanks a lot for the reference pointers! I'm now studying >>> pull DOM, and will go through other modules later. I'll make >>> performance comparisons available just in case they might be useful. >>> >>> cheers, >>> Ping >>> >>> Fredrik Lundh wrote: >>> >>>>>> But I haven't found any. I'm not sure this is possible with current >>>>>> architecture of parsers. Any advise is highly appreciated. >>>>> >>>>> http://online.effbot.org/2004_12_01_archive.htm#element-generator >>>>> http://online.effbot.org/2004_12_01_archive.htm#element-generator-2 >>>> >>>> >>>> also: >>>> >>>> http://www-106.ibm.com/developerworks/xml/library/x-tipulldom.html >>>> http://cvs.sourceforge.net/viewcvs.py/splice/kid/pulltree.py?view=markup >>>> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> XML-SIG maillist - XML-SIG@python.org >>>> http://mail.python.org/mailman/listinfo/xml-sig >>> >>> _______________________________________________ >>> XML-SIG maillist - XML-SIG@python.org >>> http://mail.python.org/mailman/listinfo/xml-sig >> >> >> >> >> -- >> Walter Underwood >> Principal Architect, Verity > -- Walter Underwood Principal Architect Verity Ultraseek From and-xml at doxdesk.com Fri Jan 14 20:53:46 2005 From: and-xml at doxdesk.com (Andrew Clover) Date: Fri Jan 14 20:53:36 2005 Subject: [XML-SIG] Problem using schema instead of DTD In-Reply-To: <20050114103929.GC8335@mould.vormig.net> References: <20050114103929.GC8335@mould.vormig.net> Message-ID: <41E8234A.4080107@doxdesk.com> Tim van Erven wrote: > In the process a doctype is added automatically. Yeah, 4DOM will do that. A null doctype is added at parse-time, for some reason. You can throw the doctype node out after parsing or before printing: if document.doctype is not None: document.removeChild(document.doctype) or use one of the other DOM libraries that does not exhibit this behaviour. 
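The doctype removal suggested above can be tried out with the standard library's xml.dom.minidom. What follows is a minimal sketch, not the 4DOM code path from the original question: the sample document and the strip_doctype helper are made up for illustration, and the helper looks for DOCUMENT_TYPE_NODE children directly instead of relying on the document.doctype attribute.

```python
from xml.dom import minidom

# Hypothetical input: a document carrying a minimal DOCTYPE declaration.
SAMPLE = '<!DOCTYPE info><info>This is some sample text.</info>'


def strip_doctype(document):
    """Remove any document type node from a minidom Document, in place."""
    # Iterate over a copy, since removeChild modifies childNodes.
    for node in list(document.childNodes):
        if node.nodeType == node.DOCUMENT_TYPE_NODE:
            document.removeChild(node)
    return document


if __name__ == "__main__":
    doc = minidom.parseString(SAMPLE)
    strip_doctype(doc)
    print(doc.toxml())
```

Serializing after the call should give output without the DOCTYPE line, which is the state the schema validator in this thread expected.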
--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From fredrik at pythonware.com Sat Jan 15 12:22:34 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Jan 15 12:22:34 2005
Subject: [XML-SIG] ANN: cElementTree 0.9.2 (january 15, 2005)
Message-ID:

Time for a new release. The 0.9.2 release is 10-20% faster than 0.8 on my benchmarks, and uses 5-15% less memory. You can consider this to be the first 1.0 release candidate.

Here are some benchmark results, using a number of popular XML toolkits to parse a 3405k source file on my development machine:

    library                         memory      time
    ------------------------------------------------------------
    minidom (python 2.1)            80000k      6.5s
    minidom (python 2.4)            53000k      1.4s
    ElementTree 1.3                 14500k      1.1s
    cElementTree 0.8                5700k       0.058s
    cElementTree 0.9                4900k       0.047s
    ------------------------------------------------------------
    readlines (read as utf-8)       8850k       0.093s
    readlines (read as ascii)       5050k       0.032s
    ------------------------------------------------------------

For more information on this library, including download instructions, additional benchmark figures, and more, see:

    http://effbot.org/zone/celementtree.htm

enjoy /F

From fredrik at pythonware.com Sat Jan 15 15:09:10 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Jan 15 15:09:04 2005
Subject: [XML-SIG] Re: Re: Re: Re: cElementTree 0.8 (january 11, 2005)
References: <20050113214449.GX8569@redhat.com> <20050114110041.GB8569@redhat.com>
Message-ID:

Daniel Veillard wrote:

> You have a python function calling a native function. That function returns
> a string. That C string is translated to a Python string by the wrapper
> using PyString_FromString(). That operation seems to be extremely expensive.
PyString_FromString basically boils down to:

    determine the length of the string
    call fast allocator
    copy string to area allocated by fast allocator

for UTF-8 data, the steps are:

    determine maximum possible length of the string
    call fast allocator
    copy string to area allocated by fast allocator, character
        by character. handle UTF-8 code sequences.
    adjust size of allocated area, if necessary

cElementTree has to do all this for all strings in the document, of course, and the time it takes is included in my parsing benchmark. and I guess libxml2 is doing something very similar, but using your own allocator and object layout.

but parsing is one thing, using the data from Python code is another. to return data to Python, all cElementTree has to do (in the normal case) is to return the string object it created during the parse. that's a pointer copy, not a buffer copy. libxml2, in contrast, has to copy the strings once again, using Python's allocator and Python's string object layout. and if you don't cache stuff, you end up doing this every time someone accesses a node...

From fredrik at pythonware.com Mon Jan 17 21:01:15 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Jan 17 21:01:13 2005
Subject: [XML-SIG] Re: XML for scientific data storage and search
References: <41E57A52.1010502@pingyeh.net>
Message-ID:

I wrote:

> http://online.effbot.org/2004_12_01_archive.htm#element-generator
> http://online.effbot.org/2004_12_01_archive.htm#element-generator-2

here's an update for cElementTree 0.9.2, btw:

    http://online.effbot.org/2005_01_01_archive.htm#celementtree-xmlfile

on my machine, looping over all elements on a 3400k file takes about twice as long as it takes to build a tree, but on the other hand, xmlfile needs no more than 200k memory to process that file.

and "twice as long to build a tree" still means that it's faster than anything, except libxml2...
From pf_moore at yahoo.co.uk Mon Jan 17 20:00:42 2005 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Tue Jan 18 14:02:56 2005 Subject: [XML-SIG] Re: ANN: cElementTree 0.9.2 (january 15, 2005) References: Message-ID: "Fredrik Lundh" writes: > Time for a new release. The 0.9.2 release is 10-20% faster than 0.8 on > my benchmarks, and uses 5-15% less memory. You can consider this to > be the first 1.0 release candidate. > > Here are some benchmark results, using a number of popular XML tool- > kits to parse a 3405k source file on my development machine: I'd be interested in the benchmark results for Uche Ogbuji's Amara XML toolkit, which seems to aim to target a similar audience as cElementTree ("Pythonic" XML processing). Is the benchmark code/data available anywhere? Paul. -- Advice is what we ask for when we already know the answer but wish we didn't. -- Erica Jong From tve at vormig.net Tue Jan 18 16:10:48 2005 From: tve at vormig.net (Tim van Erven) Date: Tue Jan 18 16:10:47 2005 Subject: [XML-SIG] Problem using schema instead of DTD In-Reply-To: <200501141719.j0EHJcjc041954@chilled.skew.org> References: <20050114103929.GC8335@mould.vormig.net> <200501141719.j0EHJcjc041954@chilled.skew.org> Message-ID: <20050118151048.GC4915@mould.vormig.net> On Fri, 14/01/2005 10:19 -0700, Mike Brown wrote: > Tim van Erven wrote: >> And the output of my program looks like this, which won't validate: >> >> >> >> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >> xsi:schemaLocation="http://www.example.org/info someschema.xsd"> >> >> This is some modified sample text. >> >> >> Does anyone know how to get rid of the !DOCTYPE line? Is that the right >> thing to do anyway? > Another possibility is that the tool you are using does not recognize the > minimal DOCTYPE declaration that is being added to your document. Such a > declaration serves no real purpose, but is permitted to exist by the XML spec. 
> It is an often-overlooked 'feature' of XML though; some parsers may reject it > out of ignorance. I'm using the validator at: http://apps.gotdotnet.com/xmltools/xsdvalidator/ It stopped complaining after removing the DOCTYPE declaration - thanks Andrew Clover!: if document.doctype is not None: document.removeChild(document.doctype) Thanks for the explanation. This solves my problem. Regards, Tim -- Tim van Erven From fredrik at pythonware.com Tue Jan 18 16:30:52 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Jan 18 16:30:53 2005 Subject: [XML-SIG] Re: ANN: cElementTree 0.9.2 (january 15, 2005) References: Message-ID: Paul Moore wrote: > I'd be interested in the benchmark results for Uche Ogbuji's Amara > XML toolkit, which seems to aim to target a similar audience as > cElementTree ("Pythonic" XML processing). Is the benchmark code/data > available anywhere? Amara is based on cDomlette, right? if so, you'll find benchmark figures here: http://effbot.org/zone/celementtree.htm#benchmarks From pf_moore at yahoo.co.uk Thu Jan 20 20:07:05 2005 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Thu Jan 20 20:07:19 2005 Subject: [XML-SIG] Re: ANN: cElementTree 0.9.2 (january 15, 2005) References: Message-ID: "Fredrik Lundh" writes: > Amara is based on cDomlette, right? if so, you'll find benchmark > figures here: > > http://effbot.org/zone/celementtree.htm#benchmarks Thanks, I hadn't realised that. Paul. -- The brain is a wonderful organ.
It starts working the moment you get up in the morning and does not
stop until you get into the office. -- Robert Frost

From fredrik at pythonware.com Thu Jan 20 22:46:15 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Jan 20 22:46:20 2005
Subject: [XML-SIG] Re: ANN: cElementTree 0.9.2 (january 15, 2005)
References:
Message-ID:

Paul Moore wrote:

>> Amara is based on cDomlette, right? if so, you'll find benchmark
>> figures here:
>>
>> http://effbot.org/zone/celementtree.htm#benchmarks
>
> Thanks, I hadn't realised that.

I was wrong. same benchmark as above, on slightly slower hardware.

    amara 0.9.2 bind_file:    8.85 seconds (~20000k)
    amara 0.9.2 pushdom:      5.62 seconds (~12000k)
    amara 0.9.2 pushbind:   160.5  seconds (~6500k)
    cdomlette:                0.66 seconds (~20000k)
    celementtree parse:       0.06 seconds (4900k)
    celementtree iterparse:   0.12 seconds (~200k)

From fredrik at pythonware.com Thu Jan 20 23:21:24 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Jan 20 23:21:30 2005
Subject: [XML-SIG] ANN: cElementTree 0.9.3 (january 18, 2005)
Message-ID:

BTW, I forgot to announce my latest cET release.

The 0.9.3 release fixes a bug in getchildren() (introduced in 0.9.2),
and adds a new interface, "iterparse", which is discussed here:

    http://online.effbot.org/2005_01_01_archive.htm#treebuilder-observer
    http://online.effbot.org/2005_01_01_archive.htm#celementtree-iterparse
    http://online.effbot.org/2005_01_01_archive.htm#celementtree-xmlrpc

Note that "iterparse" is experimental, and the interface may change
somewhat before the final release.
For more information on this library, including download instructions,
additional benchmark figures, and more, see:

    http://effbot.org/zone/celementtree.htm

enjoy /F

From fredrik at pythonware.com Sat Jan 22 14:19:17 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Jan 22 14:19:25 2005
Subject: [XML-SIG] Re: ANN: cElementTree 0.9.2 (january 15, 2005)
References:
Message-ID:

Fredrik Lundh wrote:

> same benchmark as above, on slightly slower hardware.

I've posted updated benchmarks for a variety of event-driven/incremental
parsers here:

    http://online.effbot.org/2005_01_01_archive.htm#20050122

Amara is surprisingly slow; I haven't looked at the code, but it must
surely be possible to process more than 25k per second on a 3GHz PC
with that interface...

From uche.ogbuji at fourthought.com Sat Jan 22 20:57:54 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Sat Jan 22 20:58:41 2005
Subject: [XML-SIG] ANN: Amara XML Toolkit 0.9.3
Message-ID: <1106423874.4598.246.camel@borgia>

http://uche.ogbuji.net/tech/4Suite/amara
ftp://ftp.4suite.org/pub/Amara/

Changes in this release:

* Removed some cruft code, apparently leading to a huge speedup
  in bindery and pushbind
* binderytools.pushdom now returns elements rather than subtree
  root nodes
* Added ns:* form of match to patterns support
* Added many docstrings
* More demos and tests
* Bug fixes

The code sample from the last announcement is now simplified. The
following is complete Amara 0.9.3 code for iterating through address
labels in an XML document, generally not using more memory to process
10,000 labels than 100:

    from amara import binderytools
    for label in binderytools.pushbind('/labels/label', source='labels.xml'):
        print label.name, 'of', label.address.city

Amara XML Toolkit is a collection of Python tools for XML processing--
not just tools that happen to be written in Python, but tools built from
the ground up to use Python idioms and take advantage of the many
advantages of Python.
Amara builds on 4Suite [http://4Suite.org], but whereas 4Suite focuses more on literal implementation of XML standards in Python, Amara focuses on Pythonic idiom. It provides tools you can trust to conform with XML standards without losing the familiar Python feel. The components of Amara are: * Bindery: data binding tool (fancy way of saying: a very Pythonic XML API) * Scimitar: implementation of the ISO Schematron schema language for XML; converts Schematron files to Python scripts * domtools: set of tools to augment Python DOMs * saxtools: set of tools to make SAX easier to use in Python * Flextyper: user-defined datatypes in Python for XML processing There's a lot in Amara, but here are highlights: Amara Bindery: XML as easy as py -------------------------------- Based on the retired project Anobind, but updated to use SAX rather than DOM to create bindings. Bindery reads an XML document and returns a data structure of Python objects corresponding to the vocabulary used in the XML document, for maximum clarity. Bindery turns the document What do you mean "bleh" But I was looking for argument Into a set of objects such that you can write binding.monty.python.spam In order to get the value "eggs" or binding.monty.python[1] In order to get the value "But I was looking for argument". There are other such tools for Python, and what makes Anobind unique is that it's driven by a very declarative rules-based system for binding XML to the Python data. You can register rules that are triggered by XPattern expressions specialized binding behavior. It includes XPath support and supports mutation. Bindery is very efficient, using SAX to generate bindings. Scimitar: exceptional schema language for an exceptional programming language ----------------------------------------------------------------------------- Merged in from a separate project, Scimitar is an implementation of ISO Schematron that compiles a Schematron schema into a Python validator script. 
You typically use scimitar in two phases. Say you have a schematron
schema schema1.stron and you want to validate multiple XML files
against it, instance1.xml, instance2.xml, instance3.xml.

First you run schema1.stron through the scimitar compiler script,
scimitar.py:

    scimitar.py schema1.stron

A file, schema1.py, is generated and can be used to validate XML
instances:

    python schema1.py instance1.xml

Which emits a validation report.

Amara DOM Tools: giving DOM a more Pythonic face
------------------------------------------------

DOM came from the Java world, hardly the most Pythonic API possible.
Some DOM-like implementations such as 4Suite's Domlettes mix in some
Pythonic idiom. Amara DOM Tools goes even further.

Amara DOM Tools features pushdom, similar to xml.dom.pulldom, but
easier to use. It also includes Python generator-based tools for DOM
processing, and a function to return an XPath location for any DOM
node.

Amara SAX Tools: SAX without the brain explosion
------------------------------------------------

Tenorsax (amara.saxtools.tenorsax) is a framework for "linearizing" SAX
logic so that it flows more naturally, and needs a lot less state
machine wizardry.

License
-------

Amara is open source, provided under the 4Suite variant of the Apache
license. See the file COPYING for details.

Installation
------------

Amara requires Python 2.3 or more recent and 4Suite 1.0a4 or more
recent. Make sure these are installed, unpack Amara to a convenient
location and run

    python setup.py install

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286 UBL 1.0 - http://www-106.ibm.com/developerworks/xml/library/x-think28.html Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html From fredrik at pythonware.com Sat Jan 22 21:05:36 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Jan 22 21:05:32 2005 Subject: [XML-SIG] Re: Amara XML Toolkit 0.9.3 References: <1106423874.4598.246.camel@borgia> Message-ID: Uche Ogbuji wrote: > * Removed some cruft code, apparently leading to a huge speedup > in bindery and pushbind any figures? (as I've pointed out elsewhere, pushbind in 0.9.2 wasn't just slow, it was amazingly slow). From fredrik at pythonware.com Sat Jan 22 21:22:25 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Jan 22 21:22:24 2005 Subject: [XML-SIG] Re: Amara XML Toolkit 0.9.3 References: <1106423874.4598.246.camel@borgia> Message-ID: I wrote: >> * Removed some cruft code, apparently leading to a huge speedup >> in bindery and pushbind > > any figures? (as I've pointed out elsewhere, pushbind in 0.9.2 wasn't > just slow, it was amazingly slow). here's what I get on my machine: bind_file: 7.5 => 6.4 seconds pushdom: 4.4 => 5.3 seconds (?) pushbind: 128 => 10.5 seconds (!) 
you're getting closer, but you still have some work to do if you want to play with the fast guys ;-) (sax needs 0.3 seconds, cET <0.1 seconds) From Uche.Ogbuji at fourthought.com Sat Jan 22 21:22:30 2005 From: Uche.Ogbuji at fourthought.com (Uche Ogbuji) Date: Sat Jan 22 21:22:55 2005 Subject: [XML-SIG] Re: ANN: cElementTree 0.9.2 (january 15, 2005) In-Reply-To: References: Message-ID: <1106425351.4598.264.camel@borgia> On Mon, 2005-01-17 at 19:00 +0000, Paul Moore wrote: > "Fredrik Lundh" writes: > > > Time for a new release. The 0.9.2 release is 10-20% faster than 0.8 on > > my benchmarks, and uses 5-15% less memory. You can consider this to > > be the first 1.0 release candidate. > > > > Here are some benchmark results, using a number of popular XML tool- > > kits to parse a 3405k source file on my development machine: > > I'd be interested in the benchmark results for Uche Ogbuji's Amara > XML toolkit, which seems to aim to target a similar audience as > cElementTree ("Pythonic" XML processing). Is the benchmark code/data > available anywhere? A few notes: 1) Amara 0.9.2 had some crufty code that (unbeknownst to me) was killing performance. I removed it for 0.9.3 and saw a near 15X. Fair benchmarks should really start with 0.9.3. 2) Amara will never compete with cElementTree in raw performance until I somehow conjure up the time to write it in C. I wouldn't hold my breath waiting for that. Winning the raw benchmark race is not my intention with Amara. My goal is rather combining maximum Python idiom with maximum declarative power. I'll get another dramatic speedup in Amara (3X-4X in my sandbox) once the next 4Suite release is out and I switch to Jeremy Kloth's super-fast low-level C/Domlette/SAX implementation, but even then I'll expect cElementTree to be faster. Frankly, the speed of cElementTree amazes me. -- Uche Ogbuji Fourthought, Inc. 
http://uche.ogbuji.net http://4Suite.org http://fourthought.com Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286 UBL 1.0 - http://www-106.ibm.com/developerworks/xml/library/x-think28.html Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html From fredrik at pythonware.com Sun Jan 23 10:44:06 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Jan 23 10:43:59 2005 Subject: [XML-SIG] the pyxml topic guide Message-ID: at http://pyxml.sourceforge.net/topics/ looks a bit aged to me... especially pages like http://pyxml.sourceforge.net/topics/software.html http://pyxml.sourceforge.net/topics/docs.html etc could need some updating, link fixing, etc. or maybe they should all be moved into the python.org wiki at http://www.python.org/moin/PythonXml which currently only says "Its all very confusing" which might be true, and "4Suite has the best api." which is debatable. any maintainers out there? From fredrik at pythonware.com Sun Jan 23 16:19:02 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Jan 23 16:18:54 2005 Subject: [XML-SIG] ANN: cElementTree 0.9.8 (january 23, 2005) Message-ID: Time for a new release. The 0.9.8 release is pretty much identical to 0.9.3, except for a revised "iterparse" mechanism: for event, elem in iterparse(source): ... By default, iterparse now only returns "end" events (issued when an element has been completed, and all child elements are available). This speeds things up a bit, and simplifies the event-handling code. 
An example:

    for event, elem in iterparse(source):
        if elem.tag == "title":
            print "document title is", repr(elem.text)
            break

To request other events, including extended information about
namespaces, use the "events" option (see the CHANGES document for
details).

Like the rest of cElementTree, the iterparse mechanism is fast. On my
test machine, it's over four times faster than xml.sax, 2.5 times faster
than pyexpat, and even a bit faster than my own sgmlop.

For more information on this library, including download instructions,
detailed benchmark figures, and more, see:

    http://effbot.org/zone/celementtree.htm

enjoy /F

From noreply at sourceforge.net Mon Jan 24 17:11:22 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon Jan 24 17:11:25 2005
Subject: [XML-SIG] [ pyxml-Bugs-1108441 ] WDDXMarshaller fails on boolean in Python 2.3
Message-ID:

Bugs item #1108441, was opened at 2005-01-24 16:11
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1108441&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: David C.
Fox (dcfox)
Assigned to: Nobody/Anonymous (nobody)
Summary: WDDXMarshaller fails on boolean in Python 2.3

Initial Comment:
With Python 2.3, pyxml 0.8.4, if I attempt to call
WDDXMarshaller.dumps with a dictionary containing a boolean value,
the dumps method fails because the marshaller has no m_bool method:

    str_mess = self.marshaller.dumps(mess_argvals)
  File "C:\Python23\Lib\site-packages\_xmlplus\marshal\generic.py", line 59, in dumps
    L = [self.PROLOGUE + self.DTD] + self.m_root(value, dict)
  File "C:\Python23\Lib\site-packages\_xmlplus\marshal\wddx.py", line 111, in m_root
    L = L + self._marshal(value, dict)
  File "C:\Python23\Lib\site-packages\_xmlplus\marshal\generic.py", line 92, in _marshal
    return getattr(self, meth)(value, dict)
  File "C:\Python23\Lib\site-packages\_xmlplus\marshal\generic.py", line 191, in m_dict
    return self.m_dictionary(value, dict)
  File "C:\Python23\Lib\site-packages\_xmlplus\marshal\wddx.py", line 188, in m_dictionary
    L = L + self._marshal(v, dict)
  File "C:\Python23\Lib\site-packages\_xmlplus\marshal\generic.py", line 92, in _marshal
    return getattr(self, meth)(value, dict)
AttributeError: WDDXMarshaller instance has no attribute 'm_bool'

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1108441&group_id=6473

From martin at v.loewis.de Tue Jan 25 00:33:56 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue Jan 25 00:33:55 2005
Subject: [XML-SIG] the pyxml topic guide
In-Reply-To:
References:
Message-ID: <41F585E4.2070801@v.loewis.de>

Fredrik Lundh wrote:

> any maintainers out there?

If you want to, you can update the pages yourself. Just check out
and edit pyxml/www.
Regards, Martin From jimmy at retzlaff.com Tue Jan 25 05:58:14 2005 From: jimmy at retzlaff.com (Jimmy Retzlaff) Date: Tue Jan 25 05:58:19 2005 Subject: [XML-SIG] cElementTree.iterparse missing text in some start events Message-ID: I'm using cElementTree.iterparse to iterate over an XML file. I think iterparse is a wonderful idea - I've found it to be much more convenient than SAX for iterative processing. I have come across a problem though... For the majority of my elements, both the start and end events contain the text of the element (i.e., element.text). For a handful of the elements, the text is only in the end event (i.e., element.text is None in the start event but it is not None in the end event). The text is found without any problem when using cElementTree.parse on the file instead. A small test to reproduce this behavior is at the end of this note and an 80KB sample xml file is at http://www.averdevelopment.com/python/test.xml. The test file is whittled down from a much larger file which had the problem with several more elements (but only a very small percentage of the total). I couldn't seem to delete any elements before the element in question without changing the behavior. Am I misunderstanding something or is this perhaps a bug? 
I'm using:

    http://effbot.org/downloads/cElementTree-0.9.8-20050123.win32-py2.3.exe
    http://effbot.org/downloads/elementtree-1.2.4-20041228.win32.exe
    http://python.org/ftp/python/2.3.4/Python-2.3.4.exe
    Windows XP SP2

Thanks,
Jimmy

####################################################

import sets

from cElementTree import dump, iterparse, parse

values = dict(start=sets.Set(), end=sets.Set())

i = 0
for event, element in iterparse('test.xml', ('start', 'end')):
    if element.tag.endswith('}ele') and element.text:
        values[event].add(element.text)
    if element.tag.endswith('}ele') and element.text is None:
        print i, event + ' '
        dump(element)
    if element.text == '297.257582':
        print i, event + ' '
        dump(element)
    i += 1

print 'In start but not end:', values['start'] - values['end']
print 'In end but not start:', values['end'] - values['start']
print

# Finding the same text with ElementTree is no problem
gpx = parse('test.xml').getroot()
trk = gpx.findall('{http://www.topografix.com/GPX/1/1}trk')[-1]
trkseg = trk.findall('{http://www.topografix.com/GPX/1/1}trkseg')[-1]
trkpt = trkseg.findall('{http://www.topografix.com/GPX/1/1}trkpt')[-2]
ele = trkpt.findall('{http://www.topografix.com/GPX/1/1}ele')[0]
print ele.text

####################################################

Output:

3622 start
3623 end
297.257582
In start but not end: Set([])
In end but not start: Set(['297.257582'])
297.257582

From fredrik at pythonware.com Tue Jan 25 09:09:16 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue Jan 25 09:09:11 2005
Subject: [XML-SIG] Re: cElementTree.iterparse missing text in some start events
References:
Message-ID:

Jimmy Retzlaff wrote:

> I'm using cElementTree.iterparse to iterate over an XML file. I think
> iterparse is a wonderful idea - I've found it to be much more convenient
> than SAX for iterative processing. I have come across a problem
> though...
> > For the majority of my elements, both the start and end events contain > the text of the element (i.e., element.text). For a handful of the > elements, the text is only in the end event (i.e., element.text is None > in the start event but it is not None in the end event). The text is > found without any problem when using cElementTree.parse on the file > instead. > Am I misunderstanding something or is this perhaps a bug? it needs more documentation ;-) here's what the comment in the CHANGES document says: The elem object is the current element; for "start" events, the element itself has been created (including attributes), but its contents may not be complete; for "end" events, all child elements has been processed as well. You can use "start" tags to count elements, check attributes, and check if certain tags are present in a tree. For all other purposes, use "end" handlers instead. in that text, "may not" really means "may or may not". that is, the contents may be complete, but that's nothing you can or should rely on. the reason for this is that events don't fire in perfect lockstep with the build process; in the current version, the parser may be up to 16k further ahead. this means that when you get a "start" event, the parser has often processed everything inside the event (especially if it's small enough), but you cannot rely on that. or in other words, for a start event, the following attributes are valid: elem.tag elem.attrib tags and attributes for parent elements (use a stack if you need to track them) (not elem.text) (not elem.tail) (not elem[:]) you may modify the tag and attrib attributes you may stop parsing and for an end event, the following applies: elem.tag elem.attrib elem.text elem[:] (i.e. the children) complete contents for all children (including the tail) (not elem.tail) (but all child tails) you may modify all attributes, except elem.tail you may reorder/update children you may remove children (e.g. 
calling elem.clear() to mark that you're done with this level) you may stop parsing clearer? I think I need to draw a couple of diagrams... From jimmy at retzlaff.com Tue Jan 25 09:58:59 2005 From: jimmy at retzlaff.com (Jimmy Retzlaff) Date: Tue Jan 25 09:59:02 2005 Subject: [XML-SIG] Re: cElementTree.iterparse missing text in some startevents Message-ID: Fredrik Lundh wrote: > > Jimmy Retzlaff wrote: > > > I'm using cElementTree.iterparse to iterate over an XML file. I think > > iterparse is a wonderful idea - I've found it to be much more convenient > > than SAX for iterative processing. I have come across a problem > > though... > > > > For the majority of my elements, both the start and end events contain > > the text of the element (i.e., element.text). For a handful of the > > elements, the text is only in the end event (i.e., element.text is None > > in the start event but it is not None in the end event). The text is > > found without any problem when using cElementTree.parse on the file > > instead. > > > Am I misunderstanding something or is this perhaps a bug? > > it needs more documentation ;-) > > here's what the comment in the CHANGES document says: > > The elem object is the current element; for "start" events, > the element itself has been created (including attributes), but its > contents may not be complete; for "end" events, all child elements > has been processed as well. You can use "start" tags to count > elements, check attributes, and check if certain tags are present > in a tree. For all other purposes, use "end" handlers instead. > > in that text, "may not" really means "may or may not". that is, the > contents may be complete, but that's nothing you can or should rely on. > > the reason for this is that events don't fire in perfect lockstep with the > build process; in the current version, the parser may be up to 16k further > ahead. ... > clearer? Yes, thanks! Just a thought... 
would it be better to artificially hide the attributes that can't be counted on in a start event or are the tradeoffs in doing so too ugly? With small elements like mine and a buffer as large as 16KB then things will almost always be available in the start event. That'll lead learn-by-trail-and-error folks (i.e., those of us who don't read :) to miss the distinction altogether. I was lucky enough to have a unit test that noticed I had ~10 or so empty values out of many thousands, but otherwise I wouldn't have known about the problem (especially if empty values were occasionally expected). Thanks for all the wonderful libraries. Jimmy From fredrik at pythonware.com Tue Jan 25 18:33:29 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Jan 25 18:33:32 2005 Subject: [XML-SIG] Re: cElementTree.iterparse missing text in some startevents References: Message-ID: Jimmy Retzlaff wrote: > > clearer? > > Yes, thanks! Just a thought... would it be better to artificially hide > the attributes that can't be counted on in a start event or are the > tradeoffs in doing so too ugly? my guess is it'll either be incredibly ugly and be very inefficient, or require major surgery and still be very inefficient (maybe a bit less "very", but a lot slower than the current design). > With small elements like mine and a buffer as large as 16KB then things > will almost always be available in the start event. That'll lead learn-by-trail- > and-error folks (i.e.,those of us who don't read :) to miss the distinction > altogether. well, you have to read something to find out about the "start" event (the default is to issue "end" events only, which is almost always what you want anyway). but of course, if lots of people end up being bitten by this, I may have to come up with something better, but I don't have any good ideas at the moment, so things will have to be this way in 1.0... 
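The rules listed earlier in this thread (tag, attrib and a parent stack on "start"; text and children only on "end") can be sketched as below. This is a minimal sketch that assumes the standard library's xml.etree.ElementTree.iterparse rather than the 2005 cElementTree build under discussion; collect_elevations and the sample data are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely shaped like the GPX track data in the thread.
SAMPLE = (
    b"<trk>"
    b"<trkpt lat='1' lon='2'><ele>297.257582</ele></trkpt>"
    b"<trkpt lat='3' lon='4'><ele>300.5</ele></trkpt>"
    b"</trk>"
)


def collect_elevations(source):
    """Return (path, parent lat, text) for each <ele> element."""
    stack = []    # currently open elements, maintained from "start" events
    results = []
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Only elem.tag and elem.attrib are relied on here.
            stack.append(elem)
        else:
            # "end": elem.text and all children are complete.
            if elem.tag == "ele":
                parent = stack[-2]
                path = "/".join(e.tag for e in stack)
                results.append((path, parent.attrib["lat"], elem.text))
            stack.pop()
    return results


elevations = collect_elevations(io.BytesIO(SAMPLE))
print(elevations)
```

Reading elem.text in the "start" branch would often appear to work on small inputs like this one, which is exactly the trap described above.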
From jedp at ilm.com Wed Jan 26 02:23:19 2005
From: jedp at ilm.com (Jed Parsons)
Date: Wed Jan 26 02:23:26 2005
Subject: [XML-SIG] Locating my document locator
Message-ID: <20050125172319.E11196@ilm.com>

I'm having trouble locating my document locator.

For me, the following asks, "Dude, where's my Locator?"

    import xml.sax

    class FooHandler(xml.sax.ContentHandler):

        def __init__(self):
            self.locator = None

        def setDocumentLocator(self, locator):
            # called by the parser before startDocument, if it has a locator
            self.locator = locator

        def startDocument(self):
            if self.locator is None:
                raise RuntimeError("Dude, where's my Locator?")

        def characters(self, text):
            line = self.locator.getLineNumber()
            col = self.locator.getColumnNumber()
            print line,',',col, ':', text

    parser = xml.sax.make_parser()
    handler = FooHandler()
    parser.setContentHandler(handler)
    parser.parse('/path/to/some/file.xml')

That snippet is based on http://www.xml.com/pub/a/2004/11/24/py-xml.html,
which taunts me with the assertion that, "Every SAX driver I know of that
comes with Python or on PyXML supports locators."

I know there's a locator there somewhere, because when my xml is broken,
the parser is quick to let me know where :)

Any suggestions much appreciated!

Cheers,
Jed

--
Jed Parsons Industrial Light + Magic (415) 448-2974
grep(do{for(ord){(!$_&&print"$s\n")||(($O+=(($_-1)%6+1)and
grep(vec($s,$O++,1)=1,1..int(($_-6*6-1)/6))))}},(split(//,
"++,++2-27,280,481=1-7.1++2,800+++2,8310/1+4131+1++2,80\0. What!?")));

From fredrik at pythonware.com Wed Jan 26 15:31:51 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Jan 26 15:32:08 2005
Subject: [XML-SIG] Re: Locating my document locator
References: <20050125172319.E11196@ilm.com>
Message-ID:

Jed Parsons wrote:

> I'm having trouble locating my document locator.
>
> For me, the following asks, "Dude, where's my Locator?"
> > /snip/ > > That snippet based on http://www.xml.com/pub/a/2004/11/24/py-xml.html, which > taunts me with the assertion that, "Every SAX driver I know of that comes with > Python or on PyXML supports locators." > > I know there's a locator there somewhere, because when my xml is broken, the > parser is quick to let me know where :) I added print statements to your sample, and ran under Python 2.4: setDocumentLocator: startDocument characters: 3 , 573 : Heading 1 ... what From fredrik at pythonware.com Wed Jan 26 15:36:00 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Jan 26 15:41:23 2005 Subject: [XML-SIG] Re: Locating my document locator References: <20050125172319.E11196@ilm.com> Message-ID: > what sorry, my 2-year old assistant thought that the message was complete. here's what I meant to say: what do you get if you add a "print parser" statement after the make_parser() call? on my stock Python, I get: under at least Python 2.1, 2.3, and 2.4. maybe you're getting some less capable parser? (the locator type varies slightly, but it's there in all three versions). From fredrik at pythonware.com Wed Jan 26 23:26:49 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Jan 26 23:26:43 2005 Subject: [XML-SIG] ANN: cElementTree 1.0 (january 26, 2005) Message-ID: effbot.org proudly presents version 1.0 final of the cElementTree library, a fast and very efficient implementation of the ElementTree API, for Python 2.1 and later. On typical documents, cElementTree is 15-20 times faster than the Python version of ElementTree, and uses 2-5 times less memory. The combination of low memory use and high parsing/building speed means that you can easily work with documents in the 50-200 MB range in memory, on modern hardware. And for the cases when that's not good enough, the library provides a new, iterator-based API that lets you inspect, modify, and trim the tree while it is being built. 
For more information on this library, including download instructions, detailed benchmark figures, and more, see: http://effbot.org/zone/celementtree.htm enjoy /F PS. If you're using ElementTree, but would like more support for document validation and transformations, you might be interested in Martijn Faassens lxml.etree implementation, which is based on Daniel Veillard's libxml2/libxslt. For more information, see Martijn's blog: http://faassen.n--tree.net/blog/ From noreply at sourceforge.net Thu Jan 27 04:23:50 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Thu Jan 27 04:23:53 2005 Subject: [XML-SIG] [ pyxml-Bugs-1110409 ] xmlns attribute not output for docs from impl.createdocument Message-ID: Bugs item #1110409, was opened at 2005-01-27 11:23 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1110409&group_id=6473 Category: DOM Group: None Status: Open Resolution: None Priority: 5 Submitted By: Greg Wogan-Browne (wogan) Assigned to: Nobody/Anonymous (nobody) Summary: xmlns attribute not output for docs from impl.createdocument Initial Comment: I am having some trouble figuring out what is going on here - is this a bug, or correct behaviour? Basically, when I create an XML document with a namespace using xml.dom.minidom.parse() or parseString(), the namespace exists as an xmlns attribute in the DOM (fair enough, as it's in the original source document). However, if I use the DOM implementation to create an identical document with a namespace, the xmlns attribute is not present. This mainly affects me when I go to print out the document again using Document.toxml(), as the xmlns attribute is not printed for documents I create dynamically, and therefore XSLT does not kick in (I'm using an external processor). 
Python 2.3.3 (#1, May 7 2004, 10:31:40) [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import xml.dom.minidom >>> raw = '' >>> doc = xml.dom.minidom.parseString(raw) >>> print doc.documentElement.namespaceURI http://example.com/namespace >>> print doc.documentElement.getAttribute('xmlns') http://example.com/namespace >>> impl = xml.dom.minidom.getDOMImplementation() >>> doc2 = impl.createDocument('http://example.com/namespace','test',None) >>> print doc2.documentElement.namespaceURI http://example.com/namespace >>> print doc2.documentElement.getAttribute('xmlns') >>> ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1110409&group_id=6473 From postmaster at python.org Thu Jan 27 13:21:26 2005 From: postmaster at python.org (Mail Delivery Subsystem) Date: Thu Jan 27 13:21:00 2005 Subject: [XML-SIG] Message could not be delivered Message-ID: <200501271213.j0RCDpaW050446@smtp2.ste.net.sy> Welcome to STE ISP, this attachment was rejected by AntiSpam, If you sure it is safe and you want to recieve it, Please ask the sender to attach it again like a zip file.. Thanks for your cooperation. An attachment named INSTRUCTION.SCR was removed from this document as it constituted a security hazard. If you require this document, please contact the sender and arrange an alternate means of receiving it. -------------- next part -------------- The message was not delivered due to the following reason(s): Your message could not be delivered because the destination computer was not reachable within the allowed queue period. The amount of time a message is queued before it is returned depends on local configura- tion parameters. Most likely there is a network problem that prevented delivery, but it is also possible that the computer is turned off, or does not have a mail system running right now. 
Your message could not be delivered within 7 days:
Server 7.221.54.96 is not responding.

The following recipients did not receive this message:

Please reply to postmaster@python.org if you feel this message to be in error.
-------------- next part --------------
This mail is probably spam. The original message has been attached along with this report, so you can recognize or block similar unwanted mail in future. See http://spamassassin.org/tag/ for more details.

Content preview: The message was not delivered due to the following reason(s): Your message could not be delivered because the destination computer was not reachable within the allowed queue period. The amount of time a message is queued before it is returned depends on local configuration parameters. [...]

Content analysis details: (6.90 points, 5 required)
MSGID_NO_HOST (2.8 points) Message-Id has no hostname
MICROSOFT_EXECUTABLE (0.1 points) RAW: Message includes Microsoft executable program
FORGED_MUA_OUTLOOK (3.7 points) Forged mail pretending to be from MS Outlook
INVALID_MSGID (0.3 points) Message-Id is not valid, according to RFC 2822

From kaouther.meddeb at cni.tn Wed Jan 26 14:52:19 2005
From: kaouther.meddeb at cni.tn (kaouther)
Date: Thu Jan 27 15:33:17 2005
Subject: [XML-SIG] PyXML
Message-ID: <000e01c3fc66$01de8590$773210ac@KAOUTHAR>

Greetings,
I installed Zope-2.6.2-win32-x86.exe
which ships Python 2.1, win32-x86
and I wanted to set up Silva,
I downloaded Silva-1.0rc4-all.tgz
I found a note saying that I have to install PyXML
BUT NO VERSION SEEMS TO WORK ON MY WINDOWS PLATFORM
I DID TRY PyXML-0.7.1.win32-py2.1.exe,
PyXML-0.8.1.win32-py2.1.exe, PyXML-0.8.4.win32-py2.2.exe
can you help me and tell me which version of PyXML suits me
I look forward to your reply
thank you
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20050126/03b9cb78/attachment.html

From Sylvain.Thenault at logilab.fr Thu Jan 27 15:39:46 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Thu Jan 27 15:39:49 2005
Subject: [XML-SIG] PyXML
In-Reply-To: <000e01c3fc66$01de8590$773210ac@KAOUTHAR>
References: <000e01c3fc66$01de8590$773210ac@KAOUTHAR>
Message-ID: <20050127143946.GA4621@logilab.fr>

On Thursday 26 February at 13:42, kaouther wrote:
> Greetings,

Hi

> I installed Zope-2.6.2-win32-x86.exe
> which ships Python 2.1, win32-x86
> and I wanted to set up Silva,
> I downloaded Silva-1.0rc4-all.tgz
> I found a note saying that I have to install PyXML
> BUT NO VERSION SEEMS TO WORK ON MY WINDOWS PLATFORM
> I DID TRY PyXML-0.7.1.win32-py2.1.exe,
> PyXML-0.8.1.win32-py2.1.exe, PyXML-0.8.4.win32-py2.2.exe
> can you help me and tell me which version of PyXML suits me

[Note: this mailing list is in English]

Google seems to indicate that PyXML-0.8.3.win32-py2.1.exe does the job, and that it is available at http://sourceforge.net/project/showfiles.php?group_id=6473

--
Sylvain Thénault
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org

From Uche.Ogbuji at fourthought.com Fri Jan 28 07:38:43 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 07:38:47 2005
Subject: [XML-SIG] XBEL / Konqueror
In-Reply-To: <41C00C6B.9040807@gmail.com>
References: <41C00C6B.9040807@gmail.com>
Message-ID: <1106894323.8243.30.camel@borgia>

On Wed, 2004-12-15 at 11:05 +0100, David Vevar wrote:
> Since you mention Konqueror on your page
> (http://pyxml.sourceforge.net/topics/xbel/) I was wondering what was
> your official position about Konqueror's bookmark format. It's supposed
> to be XBEL but from what I see there's an icon argument in a bookmark
> element. Since I can't find it in a DTD I thought that such data should
> be put into metadata (in info).
> In your pdf document describing XBEL
> there's no mention of icons (and how to handle such data) either. Is
> there an official position regarding that? I'd be most grateful for your
> insight.

When it's come up before, I've argued that icons shouldn't be a matter for the main XBEL DTD, but that implementations should be allowed to add extension data for icons.

XBEL 1.1 does have

But NMTOKEN doesn't really make sense for an icon (ENTITY, anyone? ).

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html
Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
UBL 1.0 - http://www-106.ibm.com/developerworks/xml/library/x-think28.html
Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html
Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html
Packaging XSLT lookup tables as EXSLT functions - http://www.ibm.com/developerworks/xml/library/x-tiplook2.html

From Uche.Ogbuji at fourthought.com Fri Jan 28 07:54:40 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 07:54:53 2005
Subject: [XML-SIG] XBEL / Call for extension
In-Reply-To: <41C072E5.7020305@noviforum.si>
References: <41C072E5.7020305@noviforum.si>
Message-ID: <1106895280.8243.39.camel@borgia>

On Wed, 2004-12-15 at 18:22 +0100, David Vevar wrote:
> I contacted you not long ago about an "icon issue" regarding XBEL in
> general and Konqueror in particular.
>
> I also contacted David Faure of KDE who implemented XBEL (well, almost
> ;-)) for Konqueror bookmarks. After some exchanged messages (having in
> mind that most of today's browsers tend to be fairly colourful ;-)) we
> came to a conclusion that it would be a good thing to change (or extend)
> XBEL a bit.
> We propose an icon repository at the end of XBEL document in
> form of an element, something like this
>
>
> base64-encoded data.
> ...
>
>
> To be more specific, s could point to external locations (web
> URLs or local-machine repositories) and/or keep icon data inlined as,
> say, base64-encoded binary (Mozilla(s) already have that), containing
> images in various formats (jpegs, gifs, pngs), ico files, maybe even
> something more exotic (in that case we'd also need to put a content-type
> in there somewhere).
>
> Bookmarks could then refer to these icons (through their ids) with some
> referrer attribute.
>
> The point of all this is that having such a format could result in a
> compact (one XBEL file) bookmark repository.
>
> I'd be glad to leave the details of specification to you. What I need to
> know is whether you'd even consider it, and if so, how soon can we
> expect the actual specification?

This sounds well enough, but I think it should be a matter for extension. Just for starters, a Mozilla-like browser might prefer to use a data scheme URL. Another might prefer to use entities, etc. I think XBEL should be as simple as possible, and icons feel too much like crossing the line.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From Uche.Ogbuji at fourthought.com Fri Jan 28 08:41:01 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 08:41:04 2005
Subject: [XML-SIG] XBEL xslt stylesheet
In-Reply-To: <200501101358.29131.junkc@fh-trier.de>
References: <41D49DEA.7020806@machina.no> <200501031641.00210.junkc@fh-trier.de> <41DDA3B0.4080103@machina.no> <200501101358.29131.junkc@fh-trier.de>
Message-ID: <1106898061.8243.41.camel@borgia>

On Mon, 2005-01-10 at 13:58 +0100, Christian Junk wrote:
> On Thursday, 6 January 2005 21:46, Narve Saetre wrote:
> > >Yes, this URL is no longer valid, but Joris sent me his XSLT stylesheets a
> > >month ago. If you like, I can upload them to 'secure' webspace. I'm able
> > > to provide webspace for other stylesheets, too. So if you're interested,
> > > we can collect good styles and offer them under a unique url?
> >
> > Good idea -- having nice, working stylesheets available is always good
> > if you have xml files and you want to display them quickly. Since your
> > web site seems to be the main page for XBEL related work, it would be
> > the natural place to host XBEL stylesheets.
> > And of course, broken links
> > are evil, so it is better to host the stylesheets yourself:)
>
> I arranged webspace under the subdomain
>
> http://xbel.webinternals.de/
>
> and uploaded the stylesheets. If you have other stylesheets please send them
> to me.

This site asks for a login.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From Uche.Ogbuji at fourthought.com Fri Jan 28 08:43:35 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 08:43:38 2005
Subject: [XML-SIG] XBEL resource page updates
Message-ID: <1106898215.8243.44.camel@borgia>

I cleared up the backlog of XBEL resource page requests for the year. IIRC, the new updates should be apparent once our cron task runs on SourceForge.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From prasad_st at beceem.com Fri Jan 28 09:42:42 2005
From: prasad_st at beceem.com (Prasad PS)
Date: Fri Jan 28 09:43:34 2005
Subject: [XML-SIG] Could somebody help me?
Message-ID:

Hi,
I have an xml file. Now I want to add a node to the existing xml file.
How can I do this using Python?
I would appreciate it if someone could help me in doing this.
With regards,
Prasad.p.s.

From morillas at posta.unizar.es Fri Jan 28 10:06:51 2005
From: morillas at posta.unizar.es (Luis Miguel Morillas)
Date: Fri Jan 28 10:07:01 2005
Subject: [XML-SIG] Could somebody help me?
In-Reply-To:
References:
Message-ID: <1106903211.41fa00ab1ca3f@webmail.unizar.es>

Quoting Prasad PS:

> Hi,
> I have an xml file. Now I want to add a node to the existing xml file.
> How can I do this using Python?
> I would appreciate if someone could help me in doing this.
> With regards,
> Prasad.p.s.
>

It's easy.
You can use the Python XML DOM [1] or, in a more Pythonic way, Amara [2], which I prefer.

Greetings,

[1] http://pyxml.sourceforge.net/topics/howto/node21.html
[2] http://uche.ogbuji.net/uche.ogbuji.net/tech/4Suite/amara/manual.html

-- Luis Miguel

From prasad_st at beceem.com Fri Jan 28 10:12:32 2005
From: prasad_st at beceem.com (Prasad PS)
Date: Fri Jan 28 10:13:53 2005
Subject: [XML-SIG] Could somebody help me?
Message-ID:

Hi Luis,
I too have followed the second choice but what happened was, when I add the root document to the xml file, I find the previous content and the combination of the previous and the new content in the file and moreover xml declarator is appearing twice.
Thanks for your prompt reply.
Prasad.p.s.

-----Original Message-----
From: Luis Miguel Morillas [mailto:morillas@posta.unizar.es]
Sent: Friday, January 28, 2005 2:37 PM
To: Prasad PS
Cc: XML-SIG
Subject: Re: [XML-SIG] Could somebody help me?

Quoting Prasad PS:

> Hi,
> I have an xml file. Now I want to add a node to the existing xml file.
> How can I do this using Python?
> I would appreciate if someone could help me in doing this.
> With regards,
> Prasad.p.s.
>

It's easy. You can use the Python XML DOM [1] or, in a more Pythonic way, Amara [2], which I prefer.

Greetings,

[1] http://pyxml.sourceforge.net/topics/howto/node21.html
[2] http://uche.ogbuji.net/uche.ogbuji.net/tech/4Suite/amara/manual.html

-- Luis Miguel

From fredrik at pythonware.com Fri Jan 28 11:04:29 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Jan 28 11:04:36 2005
Subject: [XML-SIG] Re: Could somebody help me?
References:
Message-ID:

Prasad PS wrote:

> I too have followed the second choice but what happened was, when I add
> the root document to the xml file, I find the previous content and the
> combination of the previous and the new content in the file and moreover
> xml declarator is appearing twice.

can you perhaps post a short snippet that illustrates the problem?
is using a traditional DOM API an absolute requirement, btw?

From faure at kde.org Fri Jan 28 11:09:28 2005
From: faure at kde.org (David Faure)
Date: Fri Jan 28 11:09:54 2005
Subject: [XML-SIG] XBEL / Call for extension
In-Reply-To: <1106895280.8243.39.camel@borgia>
References: <41C072E5.7020305@noviforum.si> <1106895280.8243.39.camel@borgia>
Message-ID: <200501281109.28815.faure@kde.org>

On Friday 28 January 2005 07:54, Uche Ogbuji wrote:
> Just for starters, a Mozilla-like browser might prefer to use a data
> scheme URL.

A data scheme URL is fine with KDE as well. Using a URL even allows referring to other URLs (e.g. a favicon http url) instead of "embedding" the icon. (-> more flexibility, but more difficult interoperability for the implementations which don't have easy means of downloading remote URLs). Either way is fine, as long as we standardize on one.

> Another might prefer to use entities, etc.

Well that's the whole point of a standard: to make everyone use the same way in order to achieve interoperability. If everyone wants his own thing, then we wouldn't even be using the same XML description in the first place.

> I think XBEL should be as simple as possible

Yes, but at the same time it should be useable as an interoperable format. So icons need to be in there - in a simple way, yes. Icons are really an integral part of bookmarks nowadays.

--
David Faure, faure@kde.org, sponsored by Trolltech to work on KDE, Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org).

From prasad_st at beceem.com Fri Jan 28 11:18:09 2005
From: prasad_st at beceem.com (Prasad PS)
Date: Fri Jan 28 11:19:17 2005
Subject: [XML-SIG] Re: Could somebody help me?
Message-ID:

Sure, here is the code

In the code below, what I am doing is - I am opening an xml file and appending a node to the root document.
Then I add this root document to the xml file

    fp = open (string.strip(self.cnfDtls.GetLogFilePath()), 'w')
    xml.dom.ext.PrettyPrint(doc, self.xmlFile)
    self.xmlFile.write("\n")
    fp.close().

So for each appending of node, the file is opened and the new root document with the appended node is re-written to the file. This is taking a lot of time.
Is there any way to append a node to the xml file without re-writing the file?
Prasad.p.s.

    fp = open( string.strip(self.cnfDtls.GetLogFilePath()), 'r')
    doc = FromXmlStream(fp)
    ## else:
    ##     print "File not available"
    ## print "The Doc is ", doc
    #### self.testcases.appendChild(doc.createTextNode("\n "))
    ## print **************************************************************", doc
    top_nodeList = doc.getElementsByTagName("Logger")
    ## print "Hi", top_nodeList
    tc = doc.createElement("LogDetails")
    ## top_nodeList[0].appendChild(tc)
    tc.appendChild(doc.createTextNode("\n "))
    i=0
    for elements in self.firstPart:
        ln = doc.createElement(elements)
        if elements == "Steps":
            stepParts = self.parts[i]
            ## print "The Step Part is ", stepParts
            for index in range(len(stepParts)):
                step = stepParts[index]
                msgSent = step[0]
                msgRecd = step[1]
                timeout = step[2]
                stateMsg = step[3]
                stepNode = doc.createElement("Step")
                mvalueNode1 = doc.createElement("MessageSent")
                for msgIndex in range(len(msgSent)):
                    ## stepNode.appendChild(mvalueNode1)
                    keyL = msgSent[msgIndex].keys()
                    for m in range(len(keyL)):
                        keyNode1 = doc.createElement(keyL[m])
                        valL = msgSent[msgIndex][keyL[m]]
                        if type(valL) == str:
                            ## attNode0 = doc.createElement()
                            keyNode1.appendChild(doc.createTextNode(valL))
                            ## mvalueNode1.appendChild(keyNode1)
                        else:
                            keysL = valL.keys()
                            for n in range(len(keysL)):
                                attNode1 = doc.createElement(keysL[n])
                                attNode1.appendChild(doc.createTextNode(valL[keysL[n]]))
                                keyNode1.appendChild(attNode1)
                        mvalueNode1.appendChild(keyNode1)
                stepNode.appendChild(mvalueNode1)
                mvalueNode2 = doc.createElement("MessageReceived")
                for msgIndex in range(len(msgRecd)):
                    keyL = msgRecd[msgIndex].keys()
                    for m in range(len(keyL)):
                        keyNode2 = doc.createElement(keyL[m])
                        valL2 = msgRecd[msgIndex][keyL[m]]
                        if type(valL2) == str:
                            keyNode2.appendChild(doc.createTextNode(valL2))
                            ## mvalueNode2.appendChild(keyNode1)
                        else:
                            keysL = valL2.keys()
                            for n in range(len(keysL)):
                                attNode2 = doc.createElement(keysL[n])
                                attNode2.appendChild(doc.createTextNode(valL2[keysL[n]]))
                                keyNode2.appendChild(attNode2)
                        mvalueNode2.appendChild(keyNode2)
                stepNode.appendChild(mvalueNode2)
                timeoutNode = doc.createElement("TimeOut")
                timeoutNode.appendChild(doc.createTextNode(timeout))
                stepNode.appendChild(timeoutNode)
                stateMsgNode = doc.createElement("StateMessage")
                stateMsgNode.appendChild(doc.createTextNode(stateMsg))
                stepNode.appendChild(stateMsgNode)
                ln.appendChild(stepNode)
        else:
            ln.appendChild(doc.createTextNode(self.parts[i]))
        tc.appendChild(ln)
        tc.appendChild(doc.createTextNode("\n "))
        i = i+1
    t = doc.createTextNode("\n")
    top_nodeList[0].appendChild(tc)
    top_nodeList[0].appendChild(t)
    fp.close()
    fp = open (string.strip(self.cnfDtls.GetLogFilePath()), 'w')
    xml.dom.ext.PrettyPrint(doc, self.xmlFile)
    self.xmlFile.write("\n")
    fp.close()

-----Original Message-----
From: xml-sig-bounces@python.org [mailto:xml-sig-bounces@python.org] On Behalf Of Fredrik Lundh
Sent: Friday, January 28, 2005 3:34 PM
To: xml-sig@python.org
Subject: [XML-SIG] Re: Could somebody help me?

Prasad PS wrote:

> I too have followed the second choice but what happened was, when I add
> the root document to the xml file, I find the previous content and the
> combination of the previous and the new content in the file and moreover
> xml declarator is appearing twice.

can you perhaps post a short snippet that illustrates the problem?

is using a traditional DOM API an absolute requirement, btw?
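
A minimal sketch of the load, append, save cycle from the code above, assuming the standard library's xml.dom.minidom in place of PyXML's FromXmlStream and PrettyPrint, and written in current Python 3 syntax. The element names Logger, LogDetails, TimeOut and StateMessage are taken from the posted code; the helper name append_log_details and the sample data are hypothetical and only there for illustration.

```python
from xml.dom import minidom


def append_log_details(xml_text, fields):
    """Parse xml_text, append one <LogDetails> element to the first
    <Logger> element, and return the whole document serialized again.

    append_log_details is a made-up helper name, not part of any library.
    """
    doc = minidom.parseString(xml_text)
    logger = doc.getElementsByTagName("Logger")[0]
    details = doc.createElement("LogDetails")
    for name, value in fields:
        child = doc.createElement(name)
        child.appendChild(doc.createTextNode(value))
        details.appendChild(child)
    logger.appendChild(details)
    # toxml() serializes the complete document, XML declaration included.
    return doc.toxml()


# Made-up sample document standing in for the log file's content.
original = "<Logger><LogDetails><TimeOut>5</TimeOut></LogDetails></Logger>"
updated = append_log_details(
    original, [("TimeOut", "10"), ("StateMessage", "ok")]
)
print(updated)
```

When saving, the returned string would be written to a file opened in 'w' mode, so that the old content is replaced rather than extended. Writing through a different, already open handle is one possible way to end up with both the old and the new content plus a second XML declaration in the file, though the posted code does not show enough to be sure that this is what happens here.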
_______________________________________________
XML-SIG maillist - XML-SIG@python.org
http://mail.python.org/mailman/listinfo/xml-sig

From fredrik at pythonware.com Fri Jan 28 11:59:21 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Jan 28 11:59:28 2005
Subject: [XML-SIG] Re: Re: Could somebody help me?
References:
Message-ID:

"Prasad PS" wrote:

> In the code below, what I am doing is - I am opening an xml file and
> appending a node to the root document. Then I add this root document to
> the xml file
> fp = open (string.strip(self.cnfDtls.GetLogFilePath()), 'w')
> xml.dom.ext.PrettyPrint(doc, self.xmlFile)
> self.xmlFile.write("\n")
> fp.close().
>
> So for each appending of node, the file is opened and the new root
> document with the appended node is re-written to the file. This is
> taking a lot of time.
> Is there any way to append a node to the xml file without re-writing the
> file?

do you have to reload the file every time? (do you even have to save it every time?)

a little profiling might also help; what part of the update process is taking most of the time (loading, updating, or saving).

I would do

    load file
    while processing
        ...
        add node
        if enough time has elapsed (say, a second or two):
            save file

if that's not efficient enough, you might have to use a custom XML writer (you cannot just append to an XML file, since the end tag must be last in the file. you could open the file, read backwards until you find the start of the end tag, overwrite it with another node, and put the end tag back again.
or you could use a custom writer that keeps track of where the end tag is, and overwrites it every time a new node is added)

From junkc at fh-trier.de Fri Jan 28 13:50:58 2005
From: junkc at fh-trier.de (Christian Junk)
Date: Fri Jan 28 13:50:56 2005
Subject: [XML-SIG] XBEL xslt stylesheet
In-Reply-To: <1106898061.8243.41.camel@borgia>
References: <41D49DEA.7020806@machina.no> <200501101358.29131.junkc@fh-trier.de> <1106898061.8243.41.camel@borgia>
Message-ID: <200501281350.58711.junkc@fh-trier.de>

Sorry, the page wasn't available yesterday because of a server update. Now everything should work again ;)

Regards,
Christian

On Friday, 28 January 2005 08:41, Uche Ogbuji wrote:
> On Mon, 2005-01-10 at 13:58 +0100, Christian Junk wrote:
> > On Thursday, 6 January 2005 21:46, Narve Saetre wrote:
> > > >Yes, this URL is no longer valid, but Joris sent me his XSLT
> > > > stylesheets a month ago. If you like, I can upload them to 'secure'
> > > > webspace. I'm able to provide webspace for other stylesheets, too. So
> > > > if you're interested, we can collect good styles and offer them under
> > > > a unique url?
> > >
> > > Good idea -- having nice, working stylesheets available is always good
> > > if you have xml files and you want to display them quickly. Since your
> > > web site seems to be the main page for XBEL related work, it would be
> > > the natural place to host XBEL stylesheets. And of course, broken links
> > > are evil, so it is better to host the stylesheets yourself:)
> >
> > I arranged webspace under the subdomain
> >
> > http://xbel.webinternals.de/
> >
> > and uploaded the stylesheets. If you have other stylesheets please send
> > them to me.
>
> This site asks for a login.
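
A rough sketch of the end-tag overwrite idea Fredrik Lundh describes earlier in the "Could somebody help me?" thread: seek to the closing tag of the root element, write the new node over it, and write the closing tag again. It is only an illustration under simplifying assumptions: the file is UTF-8, the end tag sits within the last 1024 bytes, and nothing but whitespace follows it. The function name append_before_end_tag and the sample file content are hypothetical.

```python
import os
import tempfile


def append_before_end_tag(path, fragment, end_tag):
    """Insert fragment just before the last end_tag in the file at path,
    without parsing or rewriting anything that precedes it."""
    end_bytes = end_tag.encode("utf-8")
    with open(path, "r+b") as fp:
        fp.seek(0, os.SEEK_END)
        size = fp.tell()
        # Read only the tail of the file; the end tag is assumed to be in it.
        tail_len = min(size, 1024)
        fp.seek(size - tail_len)
        tail = fp.read(tail_len)
        pos = tail.rfind(end_bytes)
        if pos < 0:
            raise ValueError("end tag not found near the end of the file")
        # Overwrite from the start of the end tag: new node first,
        # then the end tag again.
        fp.seek(size - tail_len + pos)
        fp.write(fragment.encode("utf-8") + b"\n" + end_bytes + b"\n")
        fp.truncate()


# Demonstration on a throwaway file with made-up content.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "log.xml")
    with open(log_path, "wb") as fp:
        fp.write(b"<Logger>\n<LogDetails>first</LogDetails>\n</Logger>\n")
    append_before_end_tag(
        log_path, "<LogDetails>second</LogDetails>", "</Logger>"
    )
    with open(log_path, "rb") as fp:
        result = fp.read().decode("utf-8")

print(result)
```

The cost of each append then depends on the size of the new node rather than on the size of the whole log, at the price of giving up DOM-level checks on what gets written.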
From Alexandre.Fayolle at logilab.fr Fri Jan 28 15:20:09 2005
From: Alexandre.Fayolle at logilab.fr (Alexandre)
Date: Fri Jan 28 15:20:23 2005
Subject: [XML-SIG] PyXML
In-Reply-To: <000e01c3fc66$01de8590$773210ac@KAOUTHAR>
References: <000e01c3fc66$01de8590$773210ac@KAOUTHAR>
Message-ID: <20050128142009.GP22369@crater.logilab.fr>

On Thu, Feb 26, 2004 at 01:42:32PM +0100, kaouther wrote:
> Greetings,
> I installed Zope-2.6.2-win32-x86.exe
> which ships Python 2.1, win32-x86
> and I wanted to set up Silva,
> I downloaded Silva-1.0rc4-all.tgz
> I found a note saying that I have to install PyXML
> BUT NO VERSION SEEMS TO WORK ON MY WINDOWS PLATFORM
> I DID TRY PyXML-0.7.1.win32-py2.1.exe,
> PyXML-0.8.1.win32-py2.1.exe, PyXML-0.8.4.win32-py2.2.exe
> can you help me and tell me which version of PyXML suits me

I think this is a Zope problem caused by the Zope binary installation not registering the Python installation it does in the Windows registry. The problem is that the installers for Python extension modules then cannot find the Python implementation.

One easy solution is to install Python 2.1 from the binary installer, install PyXML on that Python installation, and then copy the _xmlplus directory into the corresponding directory of Zope's Python installation.

Asking your question on a Zope dedicated mailing list would be the right thing to do. Be sure to google for it first, since this is a FAQ.

--
Alexandre Fayolle
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050128/ff56bf36/attachment.pgp

From fredrik at pythonware.com Fri Jan 28 15:34:18 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri Jan 28 15:34:26 2005
Subject: [XML-SIG] Re: PyXML
References: <000e01c3fc66$01de8590$773210ac@KAOUTHAR> <20050128142009.GP22369@crater.logilab.fr>
Message-ID:

"Alexandre" wrote:

> I think this is a Zope problem caused by the Zope binary installation not
> registering the Python installation it does in the Windows registry. The
> problem is that the installers for Python extension modules then cannot
> find the Python implementation.

http://effbot.org/zone/python-register.htm might help.

From uche.ogbuji at fourthought.com Fri Jan 28 15:50:01 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 15:50:23 2005
Subject: [XML-SIG] XBEL / Call for extension
In-Reply-To: <200501281109.28815.faure@kde.org>
References: <41C072E5.7020305@noviforum.si> <1106895280.8243.39.camel@borgia> <200501281109.28815.faure@kde.org>
Message-ID: <1106923801.8243.54.camel@borgia>

On Fri, 2005-01-28 at 11:09 +0100, David Faure wrote:
> Yes, but at the same time it should be useable as an interoperable format.
> So icons need to be in there - in a simple way, yes.
> Icons are really an integral part of bookmarks nowadays.

OK. I think we have to agree to disagree on this one, but I'm just one person. What do others think? If everyone does seem to feel there should be one standard way to have icons, I'll defer on that point.

Meanwhile, am I correct that XBEL 1.1 is a dead letter? It's bundled up into PyXML, but it's not linked from the Web site, or anything. It's also not really ready for prime time (the icons/NMTOKEN head-scratcher, for example). Last time XBEL evolution came up, Martin mentioned 1.2 as the natural next version.
Should we just say on the XBEL page that there was an experimental 1.1, but that it is withdrawn (sorta like XSLT 1.1) and that we're working on 1.2? Then we could start hashing out issues such as icons for 1.2, and get it out, already.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From Uche.Ogbuji at fourthought.com Fri Jan 28 16:00:15 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 16:00:18 2005
Subject: [XML-SIG] Could somebody help me?
In-Reply-To:
References:
Message-ID: <1106924415.8243.57.camel@borgia>

On Fri, 2005-01-28 at 14:42 +0530, Prasad PS wrote:
> Hi Luis,
> I too have followed the second choice but what happened was, when I add
> the root document to the xml file, I find the previous content and the
> combination of the previous and the new content in the file and moreover
> xml declarator is appearing twice.

You don't give enough information about your problem for anyone to diagnose. Here is an example of what you described in your original message:

$ python
Python 2.3.2 (#1, Dec 8 2003, 07:49:35)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> XML = "<a/>"
>>> from amara import binderytools
>>> doc = binderytools.bind_string(XML)
>>> doc.a.xml_append(doc.xml_element(None, u'b')) #None is the namespace
>>> doc.xml()
'<?xml version="1.0" encoding="UTF-8"?>\n<a><b/></a>'
>>> print doc.xml()
<?xml version="1.0" encoding="UTF-8"?>
<a><b/></a>
>>>

Works fine. If you tried something similar and it didn't work, let us know the details of what you tried and what went wrong.

Thanks.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From Uche.Ogbuji at fourthought.com Fri Jan 28 16:13:02 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 16:13:06 2005
Subject: [XML-SIG] Re: Could somebody help me?
In-Reply-To:
References:
Message-ID: <1106925182.8243.68.camel@borgia>

On Fri, 2005-01-28 at 15:48 +0530, Prasad PS wrote:
> Sure, here is the code
>
> In the code below, what I am doing is - I am opening an xml file and
> appending a node to the root document. Then I add this root document to
> the xml file
> fp = open (string.strip(self.cnfDtls.GetLogFilePath()), 'w')
> xml.dom.ext.PrettyPrint(doc, self.xmlFile)
> self.xmlFile.write("\n")
> fp.close().

So you tried the first choice (PyXML) rather than the second (Amara). OK. You were not clear on that.
Your first problem is that you're using xml.dom.ext.reader.FromXmlStream rather than

    from xml.dom import minidom
    doc = minidom.parse(string.strip(self.cnfDtls.GetLogFilePath()))
    ...
    doc.toprettyxml()

(rather than xml.dom.ext.PrettyPrint)

That's the fault of the PyXML docs, which should really be updated.

Side question: you mean you're appending a node to the document element, right? Not the root document. The latter would result in an invalid XML document entity. In the code you posted, it looks as if you only append to subsidiary nodes, so that should be OK.

Even using 4DOM, your general approach should work, and I've used it oftentimes before (in the far-off past), with no problem, so I wonder: Are you sure self.xmlFile is "empty" at the point of the xml.dom.ext.PrettyPrint?

If so, I suggest you whittle down a test case that reveals the apparent bug, and post data and complete, runnable code (preferably after switching to minidom). If it seems a clear bug, you can use the PyXML bug tracker.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com

From Uche.Ogbuji at fourthought.com Fri Jan 28 16:20:36 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jan 28 16:20:51 2005
Subject: [XML-SIG] Re: Could somebody help me?
In-Reply-To: References: Message-ID: <1106925636.8243.77.camel@borgia> On Fri, 2005-01-28 at 15:48 +0530, Prasad PS wrote: > So for each appending of node, the file is opened and the new root > document with the appended node is re-written to the file. This is > taking a lot of time. > Is there any way to append a node to the xml file without re-writing the > file? You should be able to manipulate the DOM all you want, before re-writing the finished update. But just to throw out another suggestion, 4Suite [1] supports XUpdate [2]. See also [3]. Here's an XUpdate for the "" -> "" transformation I just showed in Amara: Think of it like a patch (if you're a UNIX head). You could build up your XUpdate file bit by bit while running through your code, and then run the resulting XUpdate wholesale against the original file. This is a technique that's worked very well for us in similar cases. [1] http://4suite.org [2] http://www.xmldatabases.org/projects/XUpdate-UseCases/ [3] http://uche.ogbuji.net/tech/akara/nodes/2004-09-30/xupdate -- Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com From uche.ogbuji at fourthought.com Fri Jan 28 16:44:04 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 16:44:08 2005 Subject: [XML-SIG] XBEL xslt stylesheet In-Reply-To: <200501281350.58711.junkc@fh-trier.de> References: <41D49DEA.7020806@machina.no> <200501101358.29131.junkc@fh-trier.de> <1106898061.8243.41.camel@borgia> <200501281350.58711.junkc@fh-trier.de> Message-ID: <1106927045.8243.86.camel@borgia> On Fri, 2005-01-28 at 13:50 +0100, Christian Junk wrote: > Sorry, the page wasn't available yesterday because of a server update. Now > everything should work again ;) OK. I'll add that resource. Thanks. But all that may not matter, since the doupdate script seems not to be working. According to the notes, after I commit a change to the www module in CVS, the Web site should be updated the next hour, ten minutes after the hour. But I don't see the changes I committed last night. Martin? Ideas? Thanks. -- Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com From junkc at fh-trier.de Fri Jan 28 16:44:46 2005 From: junkc at fh-trier.de (Christian Junk) Date: Fri Jan 28 16:44:44 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106923801.8243.54.camel@borgia> References: <41C072E5.7020305@noviforum.si> <200501281109.28815.faure@kde.org> <1106923801.8243.54.camel@borgia> Message-ID: <200501281644.46978.junkc@fh-trier.de> On Friday, 28 January 2005 at 15:50, Uche Ogbuji wrote: > On Fri, 2005-01-28 at 11:09 +0100, David Faure wrote: > > Yes, but at the same time it should be useable as an interoperable > > format. So icons need to be in there - in a simple way, yes. > > Icons are really an integral part of bookmarks nowadays. > > OK. I think we have to agree to disagree on this one, but I'm just one > person. What do others think? If everyone does seem to feel there > should be one standard way to have icons, I'll defer on that point. I think icons should be included in a new XBEL version, although they are one of the less necessary features; they are no more than eye candy. Including the icons as base64-encoded data is one of the easiest ways, but if your bookmark collection is large enough, then the icon data will blow up the XML file. > Meanwhile, am I correct that XBEL 1.1 is a dead letter?
It's bundled up > into PyXML, but it's not linked from the Web site, or anything. I think 1.1 is a dead letter, too. Perhaps this is the only official link: http://pyxml.sourceforge.net/topics/dtds/xbel-1.1.dtd > It's > also not really ready for prime time (the icons/NMTOKEN head-scratcher, > for example). Last time XBEL evolution came up, Martin mentioned 1.2 as > the natural next version. > Should we just say on the XBEL page that there was an experimental 1.1, > but that it is withdrawn (sorta like XSLT 1.1) and that we're working on > 1.2? > > Then we could start hashing out issues such as icons for 1.2, and get it > out, already. One of the biggest gaps is the definition of the METADATA section. So far, there is no solution for this problem. Please correct me if I'm wrong! Perhaps we should think of using XML Schema instead of DTD? Regards, Christian -- Christian Junk FH Trier, University of Applied Sciences Faculty of Design and Applied Computer Science http://christianjunk.webinternals.de http://xbel.webinternals.de From uche.ogbuji at fourthought.com Fri Jan 28 16:46:12 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 16:46:16 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501281644.46978.junkc@fh-trier.de> References: <41C072E5.7020305@noviforum.si> <200501281109.28815.faure@kde.org> <1106923801.8243.54.camel@borgia> <200501281644.46978.junkc@fh-trier.de> Message-ID: <1106927172.8243.88.camel@borgia> On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > Perhaps we should think of using XML Schema instead of DTD? RELAX NG, please. -- Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com From junkc at fh-trier.de Fri Jan 28 16:57:55 2005 From: junkc at fh-trier.de (Christian Junk) Date: Fri Jan 28 16:57:50 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106927172.8243.88.camel@borgia> References: <41C072E5.7020305@noviforum.si> <200501281644.46978.junkc@fh-trier.de> <1106927172.8243.88.camel@borgia> Message-ID: <200501281657.56075.junkc@fh-trier.de> On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > Perhaps we should think of using XML Schema instead of DTD? > > RELAX NG, please. So, don't you like XML Schema? -- Christian Junk FH Trier, University of Applied Sciences Faculty of Design and Applied Computer Science http://christianjunk.webinternals.de http://xbel.webinternals.de From frans.englich at telia.com Fri Jan 28 17:21:33 2005 From: frans.englich at telia.com (Frans Englich) Date: Fri Jan 28 17:13:52 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501281657.56075.junkc@fh-trier.de> References: <41C072E5.7020305@noviforum.si> <1106927172.8243.88.camel@borgia> <200501281657.56075.junkc@fh-trier.de> Message-ID: <200501281621.33340.frans.englich@telia.com> On Friday 28 January 2005 15:57, Christian Junk wrote: > On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > > Perhaps we should think of using XML Schema instead of DTD? > > > > RELAX NG, please. > > So, don't you like XML Schema? FWIW, I think Uche is right: RELAX NG is better, at least because it allows more fine-grained specification. You can convert RNG to WXS with trang, so if the specification is expressed in RNG it doesn't exclude the latter. But XBEL would still be namespace-less? Or what are people's thoughts on that? Cheers, Frans From uche.ogbuji at fourthought.com Fri Jan 28 17:26:00 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 17:26:11 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501281657.56075.junkc@fh-trier.de> References: <41C072E5.7020305@noviforum.si> <200501281644.46978.junkc@fh-trier.de> <1106927172.8243.88.camel@borgia> <200501281657.56075.junkc@fh-trier.de> Message-ID: <1106929561.8243.93.camel@borgia> On Fri, 2005-01-28 at 16:57 +0100, Christian Junk wrote: > On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > > Perhaps we should think of using XML Schema instead of DTD? > > > > RELAX NG, please. > > So, don't you like XML Schema? No. I don't like W3C XML Schema. I think it's far too quirky and complex. But no need for schema language wars. We can start with one language and use tools such as trang to generate others. I do think that RELAX NG has the most expressive power (except for Schematron), so I still think it's the best starting place. -- Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com From Uche.Ogbuji at fourthought.com Fri Jan 28 17:34:29 2005 From: Uche.Ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 17:34:33 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501281621.33340.frans.englich@telia.com> References: <41C072E5.7020305@noviforum.si> <1106927172.8243.88.camel@borgia> <200501281657.56075.junkc@fh-trier.de> <200501281621.33340.frans.englich@telia.com> Message-ID: <1106930069.8243.98.camel@borgia> On Fri, 2005-01-28 at 16:21 +0000, Frans Englich wrote: > On Friday 28 January 2005 15:57, Christian Junk wrote: > > On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > > > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > > > Perhaps we should think of using XML Schema instead of DTD? > > > > > > RELAX NG, please. > > > > So, don't you like XML Schema? > > FWIW, I think Uche is right: RELAX NG is better, at least because it allows > more fine-grained specification. You can convert RNG to WXS with trang, so if > the specification is expressed in RNG it doesn't exclude the latter. > > But XBEL would still be namespace-less? Or what are people's thoughts on that? Well, my inclination would be to keep it namespace-free. The nice thing about keeping it namespace-free is that it helps keep processing simple.
Namespaces are a simple idea that injects a ridiculous amount of complexity in practice. I'm always happier when I can process XML without namespaces. If anyone does call for XBEL to define a namespace, what is your specific use case that compels it? -- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://4Suite.org http://fourthought.com From junkc at fh-trier.de Fri Jan 28 17:42:03 2005 From: junkc at fh-trier.de (Christian Junk) Date: Fri Jan 28 17:42:02 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106929561.8243.93.camel@borgia> References: <41C072E5.7020305@noviforum.si> <200501281657.56075.junkc@fh-trier.de> <1106929561.8243.93.camel@borgia> Message-ID: <200501281742.03915.junkc@fh-trier.de> On Friday, 28 January 2005 at 17:26, Uche Ogbuji wrote: > On Fri, 2005-01-28 at 16:57 +0100, Christian Junk wrote: > > On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > > > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > > > Perhaps we should think of using XML Schema instead of DTD? > > > > > > RELAX NG, please. > > > > So, don't you like XML Schema? > > No. I don't like W3C XML Schema. I think it's far too quirky and > complex. > > But no need for schema language wars. We can start with one language > and use tools such as trang to generate others.
I do think that RELAX > NG has the most expressive power (except for Schematron), so I still > think it's the best starting place. You're right! If required, we can still translate an RNC syntax definition to an XML Schema automatically ;) But whatever schema language we use, do you all agree that DTD is no alternative for the future of XBEL? -- Christian Junk FH Trier, University of Applied Sciences Faculty of Design and Applied Computer Science http://christianjunk.webinternals.de http://xbel.webinternals.de From frans.englich at telia.com Fri Jan 28 17:50:27 2005 From: frans.englich at telia.com (Frans Englich) Date: Fri Jan 28 17:42:45 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106930069.8243.98.camel@borgia> References: <41C072E5.7020305@noviforum.si> <200501281621.33340.frans.englich@telia.com> <1106930069.8243.98.camel@borgia> Message-ID: <200501281650.27873.frans.englich@telia.com> On Friday 28 January 2005 16:34, Uche Ogbuji wrote: > On Fri, 2005-01-28 at 16:21 +0000, Frans Englich wrote: > > On Friday 28 January 2005 15:57, Christian Junk wrote: > > > On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > > > > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > > > > Perhaps we should think of using XML Schema instead of DTD? > > > > > > > > RELAX NG, please. > > > > > > So, don't you like XML Schema? > > > > FWIW, I think Uche is right: RELAX NG is better, at least because it > > allows more fine-grained specification. You can convert RNG to WXS with > > trang, so if the specification is expressed in RNG it doesn't exclude > > the latter. > > > > But XBEL would still be namespace-less? Or what are people's thoughts on > > that? > > Well, my inclination would be to keep it namespace-free. The nice thing > about keeping it namespace-free is that it helps keep processing simple. > Namespaces are a simple idea that injects a ridiculous amount of > complexity in practice.
I'm always happier when I can process XML > without namespaces. > > If anyone does call for XBEL to define a namespace, what is your > specific use case that compels it? This is my opinion. If I were to design XBEL from the ground up I would have put it in a namespace. Doing it at this point would be done in the name of some kind of "XML-correctness". While I itch to suggest it, I don't think it justifies all the compatibility havoc it creates. Cheers, Frans From faure at kde.org Fri Jan 28 17:55:58 2005 From: faure at kde.org (David Faure) Date: Fri Jan 28 17:56:04 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106929561.8243.93.camel@borgia> References: <41C072E5.7020305@noviforum.si> <200501281657.56075.junkc@fh-trier.de> <1106929561.8243.93.camel@borgia> Message-ID: <200501281755.58997.faure@kde.org> On Friday 28 January 2005 17:26, Uche Ogbuji wrote: > On Fri, 2005-01-28 at 16:57 +0100, Christian Junk wrote: > > On Friday, 28 January 2005 at 16:46, Uche Ogbuji wrote: > > > On Fri, 2005-01-28 at 16:44 +0100, Christian Junk wrote: > > > > Perhaps we should think of using XML Schema instead of DTD? > > > > > > RELAX NG, please. > > > > So, don't you like XML Schema? > > No. I don't like W3C XML Schema. I think it's far too quirky and > complex. > > But no need for schema language wars. We can start with one language > and use tools such as trang to generate others. I do think that RELAX > NG has the most expressive power (except for Schematron), so I still > think it's the best starting place. Yep. I'm 100% for RELAX NG too. Best expressive power and validation tools (http://www.koffice.org/developer/fileformat/validate.php) On Friday 28 January 2005 16:44, Christian Junk wrote: > Including the icons as base64-encoded data is one of the easiest ways, but if > your bookmark collection is large enough, then the icon data will blow up the > XML file. Hence the idea of using URLs instead, to point to local or remote files. "Exporters" can always allow people to "inline" all the icons as base64 data, without this being the default mode of operation. -- David Faure, faure@kde.org, sponsored by Trolltech to work on KDE, Konqueror (http://www.konqueror.org), and KOffice (http://www.koffice.org). From Uche.Ogbuji at fourthought.com Fri Jan 28 21:09:06 2005 From: Uche.Ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 21:09:11 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501281650.27873.frans.englich@telia.com> References: <41C072E5.7020305@noviforum.si> <200501281621.33340.frans.englich@telia.com> <1106930069.8243.98.camel@borgia> <200501281650.27873.frans.englich@telia.com> Message-ID: <1106942947.8243.108.camel@borgia> On Fri, 2005-01-28 at 16:50 +0000, Frans Englich wrote: > This is my opinion. > > If I were to design XBEL from the ground up I would have put it in a > namespace. Doing it at this point would be done in the name of some kind of > "XML-correctness". While I itch to suggest it, I don't think it justifies > all the compatibility havoc it creates. Namespaces don't really have anything to do with XML correctness. I think this is a popular misconception. Namespaces are meant to solve a particular problem in XML. Many argue it makes a hash of the solution (I'm somewhat in the middle), but I think everyone would agree that if you don't have the problem, there is no need to complicate things by using namespaces. Knee-jerk use of namespaces is as bad as knee-jerk use of rat traps. If you don't have a rodent infestation, there is no point putting rat traps around the house and thus risking a swollen toe :-) I believe that if we were creating XBEL the very first time now, I would still argue against a namespace. -- Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286 Querying WordNet as XML - http://www.ibm.com/developerworks/xml/library/x-think29.html Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html Packaging XSLT lookup tables as EXSLT functions - http://www.ibm.com/developerworks/xml/library/x-tiplook2.html From rsalz at datapower.com Fri Jan 28 21:10:16 2005 From: rsalz at datapower.com (Rich Salz) Date: Fri Jan 28 21:10:22 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501281755.58997.faure@kde.org> Message-ID: Since RELAX can be converted into W3C XSD, and since RELAX can use the W3C XSD type system, there seems no reason to not use RELAX. /r$ -- Rich Salz Chief Security Architect DataPower Technology http://www.datapower.com XS40 XML Security Gateway http://www.datapower.com/products/xs40.html XML Security Overview http://www.datapower.com/xmldev/xmlsecurity.html From frans.englich at telia.com Fri Jan 28 21:26:59 2005 From: frans.englich at telia.com (Frans Englich) Date: Fri Jan 28 21:19:17 2005 Subject: OT Re: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106942947.8243.108.camel@borgia> References: <41C072E5.7020305@noviforum.si> <200501281650.27873.frans.englich@telia.com> <1106942947.8243.108.camel@borgia> Message-ID: <200501282026.59807.frans.englich@telia.com> On Friday 28 January 2005 20:09, Uche Ogbuji wrote: > On Fri, 2005-01-28 at 16:50 +0000, Frans Englich wrote: > > This is my opinion. > > > > If I were to design XBEL from the ground up I would have put it in a > > namespace. 
Doing it at this point would be done in the name of some kind > > of "XML-correctness". While I itch to suggest it, I don't think it > > justifies all the compatibility havoc it creates. > > Namespaces don't really have anything to do with XML correctness. What do you mean by "XML correctness"? :) > I > think this is a popular misconception. Namespaces are meant to solve a > particular problem in XML. Many argue it makes a hash of the solution > (I'm somewhat in the middle), but I think everyone would agree that if > you don't have the problem, there is no need to complicate things by > using namespaces. I stay neutral, but have a question: in what situation should namespaces then be used? E.g., why is XHTML in a namespace? Because it may be combined with other vocabularies? Any other reason? Cheers, Frans From uche.ogbuji at fourthought.com Fri Jan 28 21:37:16 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 21:37:20 2005 Subject: OT Re: [XML-SIG] XBEL / Call for extension In-Reply-To: <200501282026.59807.frans.englich@telia.com> References: <41C072E5.7020305@noviforum.si> <200501281650.27873.frans.englich@telia.com> <1106942947.8243.108.camel@borgia> <200501282026.59807.frans.englich@telia.com> Message-ID: <1106944636.8243.124.camel@borgia> On Fri, 2005-01-28 at 20:26 +0000, Frans Englich wrote: > On Friday 28 January 2005 20:09, Uche Ogbuji wrote: > > On Fri, 2005-01-28 at 16:50 +0000, Frans Englich wrote: > > > This is my opinion. > > > > > > If I were to design XBEL from the ground up I would have put it in a > > > namespace. Doing it at this point would be done in the name of some kind > > > of "XML-correctness". While I itch to suggest it, I don't think it > > > justifies all the compatibility havoc it creates. > > > > Namespaces don't really have anything to do with XML correctness. > > What do you mean by "XML correctness"? :) That's easy. Strict conformance to the XML 1.0 specification, Third Edition in particular.
The specification defines conformance criteria. > > I > > think this is a popular misconception. Namespaces are meant to solve a > > particular problem in XML. Many argue it makes a hash of the solution > > (I'm somewhat in the middle), but I think everyone would agree that if > > you don't have the problem, there is no need to complicate things by > > using namespaces. > > I stay neutral, but have a question: in what situation should namespaces then > be used? E.g., why is XHTML in a namespace? Because it may be combined with > other vocabularies? Any other reason? That is the *only* reason. Believe me. I've been in all the wars. And even that reasoning has been widely debated. I've never seen a single application of XBEL embedded in another vocabulary. Even if I had, I would advocate namespace-free for an XBEL-only document, with a namespace only to be used in embedded cases. -- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://4Suite.org http://fourthought.com From uche.ogbuji at fourthought.com Fri Jan 28 23:06:43 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Fri Jan 28 23:06:46 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: References: Message-ID: <1106950003.8243.160.camel@borgia> On Fri, 2005-01-28 at 15:10 -0500, Rich Salz wrote: > Since RELAX can be converted into W3C
XSD, and since RELAX can use the W3C > XSD type system, there seems no reason to not use RELAX. I, um, guess. Shall we try to minimize the data typing mojo as much as possible, though? -- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://4Suite.org http://fourthought.com From noreply at sourceforge.net Sat Jan 29 15:26:06 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Sat Jan 29 15:26:08 2005 Subject: [XML-SIG] [ pyxml-Bugs-1112052 ] XMLFilterBase broken when used with entities Message-ID: Bugs item #1112052, was opened at 2005-01-29 14:26 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1112052&group_id=6473 Category: SAX Group: None Status: Open Resolution: None Priority: 5 Submitted By: Uche Ogbuji (uche) Assigned to: Uche Ogbuji (uche) Summary: XMLFilterBase broken when used with entities Initial Comment: We were missing a "return" statement for resolveEntity, which would cause very odd breakage when you tried to use filters with DTDs or entities. I fixed this already, but I'm logging it here for user awareness (affects up to PyXML 0.8.4), and so that we don't forget to upstream it to Python 2.4.x and 2.5.
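For readers who want to see what a missing "return" in resolveEntity looks like, here is a minimal sketch. It is not the actual PyXML patch: BrokenFilter, FixedFilter, RecordingResolver and resolve_through are hypothetical names invented for illustration, and the sketch assumes the standard library's xml.sax.saxutils.XMLFilterBase, which stores the resolver passed to setEntityResolver in self._ent_handler.

```python
from xml.sax.handler import EntityResolver
from xml.sax.saxutils import XMLFilterBase


class BrokenFilter(XMLFilterBase):
    """Hypothetical filter reproducing the reported bug."""

    def resolveEntity(self, publicId, systemId):
        # The resolver is called, but its result is dropped, so the
        # caller receives None instead of a system id or input source.
        self._ent_handler.resolveEntity(publicId, systemId)


class FixedFilter(XMLFilterBase):
    """Hypothetical filter with the 'return' restored."""

    def resolveEntity(self, publicId, systemId):
        return self._ent_handler.resolveEntity(publicId, systemId)


class RecordingResolver(EntityResolver):
    """Illustrative resolver that tags whatever system id it is given."""

    def resolveEntity(self, publicId, systemId):
        return "resolved:" + systemId


def resolve_through(filter_class, system_id):
    """Ask a filter of the given class to resolve system_id."""
    flt = filter_class()
    flt.setEntityResolver(RecordingResolver())
    return flt.resolveEntity(None, system_id)
```

With the broken variant, a parser sitting behind the filter gets None back where it expects something it can open, which would be consistent with the "very odd breakage" described above when DTDs or external entities are involved.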
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1112052&group_id=6473 From rsalz at datapower.com Sat Jan 29 18:41:47 2005 From: rsalz at datapower.com (Rich Salz) Date: Sat Jan 29 18:41:54 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: <1106950003.8243.160.camel@borgia> Message-ID: > Shall we try to minimize the data typing mojo as much as possible, > though? Oh sure. I'm quite happy if there's none. I was just pointing out that there is no reason *not* to use RelaxNG, since it's almost a direct superset (via trang) of XSD /r$ -- Rich Salz Chief Security Architect DataPower Technology http://www.datapower.com XS40 XML Security Gateway http://www.datapower.com/products/xs40.html XML Security Overview http://www.datapower.com/xmldev/xmlsecurity.html From uche.ogbuji at fourthought.com Sat Jan 29 19:17:10 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Sat Jan 29 19:17:14 2005 Subject: [XML-SIG] XBEL / Call for extension In-Reply-To: References: Message-ID: <1107022630.8243.165.camel@borgia> On Sat, 2005-01-29 at 12:41 -0500, Rich Salz wrote: > > Shall we try to minimize the data typing mojo as much as possible, > > though? > > Oh sure. I'm quite happy if there's none. I was just pointing out that > there is no reason *not* to use RelaxNG, since it's almost a direct > superset (via trang) of XSD True. True. I was just a bit dashed that my evil plan had been foiled: My scheme was built on the hope that if no one happened to mention data typing, perhaps no one would be tempted to use it ;-) -- Uche Ogbuji Fourthought, Inc. 
http://uche.ogbuji.net http://4Suite.org http://fourthought.com From martin at v.loewis.de Sun Jan 30 10:25:33 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun Jan 30 10:25:21 2005 Subject: [XML-SIG] XBEL resource page updates In-Reply-To: <1106898215.8243.44.camel@borgia> References: <1106898215.8243.44.camel@borgia> Message-ID: <41FCA80D.4050709@v.loewis.de> Uche Ogbuji wrote: > I cleared up the backlog of XBEL resource page requests for the year. > IIRC, the new updates should be apparent once our cron task runs on > SourceForge. Thanks! Unfortunately, SF has disabled cron on shell1, so this needs to be run manually for the moment (which I did). The command to run on shell1 is /home/groups/p/py/pyxml/doupdate Regards, Martin From uche.ogbuji at fourthought.com Sun Jan 30 15:42:10 2005 From: uche.ogbuji at fourthought.com (Uche Ogbuji) Date: Sun Jan 30 15:42:13 2005 Subject: [XML-SIG] XBEL resource page updates In-Reply-To: <41FCA80D.4050709@v.loewis.de> References: <1106898215.8243.44.camel@borgia> <41FCA80D.4050709@v.loewis.de> Message-ID: <1107096130.8243.172.camel@borgia> On Sun, 2005-01-30 at 10:25 +0100, "Martin v. Löwis" wrote: > Uche Ogbuji wrote: > > I cleared up the backlog of XBEL resource page requests for the year. > > IIRC, the new updates should be apparent once our cron task runs on > > SourceForge. > > Thanks! Unfortunately, SF has disabled cron on shell1, so this needs > to be run manually for the moment (which I did). The command to run on > shell1 is > > /home/groups/p/py/pyxml/doupdate Thanks. I'll update the README. Should I also add a note to the XBEL page that discussions on XBEL 1.2 are under way, and that interested parties should follow XML-SIG in order to keep up with likely changes? XBEL 1.2 is still a ways away right now, but I think the earlier we post such a note, the more people will have fair warning of the evolution. -- Uche Ogbuji Fourthought, Inc. http://uche.ogbuji.net http://4Suite.org http://fourthought.com From martin at v.loewis.de Sun Jan 30 19:12:59 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun Jan 30 19:12:46 2005 Subject: [XML-SIG] XBEL resource page updates In-Reply-To: <1107096130.8243.172.camel@borgia> References: <1106898215.8243.44.camel@borgia> <41FCA80D.4050709@v.loewis.de> <1107096130.8243.172.camel@borgia> Message-ID: <41FD23AB.1020302@v.loewis.de> Uche Ogbuji wrote: > XBEL 1.2 is still a ways away > right now, but I think the earlier we post such a note, the more people > will have fair warning of the evolution. Certainly.
It would be good if somebody was really in charge of XBEL
maintenance. I would try to do this if nobody else wants to (i.e. in
my role as PyXML maintainer), but now that the XBEL interest is beyond
Python, it might be reasonable to detach XBEL from PyXML.

OTOH, I would not like to give that out of my hands unless I'm certain
that whoever takes over shows long-term commitment, and values both
stability and progress equally.

Regards,
Martin

From fredrik at pythonware.com Sun Jan 30 20:02:16 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun Jan 30 20:02:14 2005
Subject: [XML-SIG] ANN: cElementTree 1.0.1 (january 30, 2005)
References:
Message-ID:

effbot.org proudly presents version 1.0.1 of the cElementTree library,
a fast and very efficient implementation of the ElementTree API, for
Python 2.1 and later.

This release is a maintenance release. It adds the little-used
'remove' method, which was missing from earlier releases.

For more information on this library, including download instructions,
detailed benchmark figures, and more, see:

http://effbot.org/zone/celementtree.htm

enjoy /F

From sales at fsf.org Mon Jan 31 11:56:45 2005
From: sales at fsf.org (sales@fsf.org)
Date: Mon Jan 31 11:57:07 2005
Subject: [XML-SIG] Delivery reports about your e-mail
Message-ID: <20050131105706.383BD1E4004@bag.python.org>

Your message was not delivered due to the following reason(s):

Your message could not be delivered because the destination computer was
unreachable within the allowed queue period. The amount of time
a message is queued before it is returned depends on local configuration
parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message was not delivered within 2 days:
Host 149.187.12.77 is not responding.
The following recipients could not receive this message:

Please reply to postmaster@python.org
if you feel this message to be in error.

From uche.ogbuji at fourthought.com Mon Jan 31 15:39:35 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Mon Jan 31 15:39:40 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <41FD23AB.1020302@v.loewis.de>
References: <1106898215.8243.44.camel@borgia> <41FCA80D.4050709@v.loewis.de> <1107096130.8243.172.camel@borgia> <41FD23AB.1020302@v.loewis.de>
Message-ID: <1107182375.8243.194.camel@borgia>

On Sun, 2005-01-30 at 19:12 +0100, "Martin v. Löwis" wrote:
> Uche Ogbuji wrote:
> > XBEL 1.2 is still a ways away
> > right now, but I think the earlier we post such a note, the more people
> > will have fair warning of the evolution.
>
> Certainly. It would be good if somebody was really in charge of XBEL
> maintenance. I would try to do this if nobody else wants to (i.e. in
> my role as PyXML maintainer), but now that the XBEL interest is beyond
> Python, it might be reasonable to detach XBEL from PyXML.
> OTOH, I would not like to give that out of my hands unless I'm certain
> that whoever takes over shows long-term commitment, and values both
> stability and progress equally.

Well, I've done the last few Web page updates, anyway, and I'm already
set up as a developer. Besides the 1.2 discussion, it's light enough
work that I'm willing to take responsibility as XBEL maintainer.
--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From erik at cq2.nl Mon Jan 31 16:27:14 2005
From: erik at cq2.nl (Erik J. Groeneveld)
Date: Mon Jan 31 16:27:33 2005
Subject: [XML-SIG] SOAPpy streaming base64
Message-ID: <200501311627.14973.erik@cq2.nl>

Hi,

I am new to this list. I am developing a web site that harvests OAI
repositories using the OAI-PMH protocol, and uploads the records to an
indexing service using SOAPpy.

It seems that SOAPpy does not support streaming (large) base64 encoded
content into a request. I saw from the code that it allocates string
buffers and calls base64.encodestring() on them.

Since we have to deal with potentially large files (more than 64 MB),
we are looking for a way to perform SOAP requests with base64 merged
into the stream, without having to keep the whole document in memory.

Could any of you give us hints or pointers? Any help would be
appreciated.

Erik

--
Erik J.
Groeneveld
Seek You Too softwarestudio
www.cq2.nl, +31 318 555 488

From martin at v.loewis.de Mon Jan 31 20:09:21 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon Jan 31 20:09:05 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <1107182375.8243.194.camel@borgia>
References: <1106898215.8243.44.camel@borgia> <41FCA80D.4050709@v.loewis.de> <1107096130.8243.172.camel@borgia> <41FD23AB.1020302@v.loewis.de> <1107182375.8243.194.camel@borgia>
Message-ID: <41FE8261.4020705@v.loewis.de>

Uche Ogbuji wrote:
> Well, I've done the last few Web page updates, anyway, and I'm already
> set up as a developer. Besides the 1.2 discussion, it's light enough
> work that I'm willing to take responsibility as XBEL maintainer.

Very good! If I can help with more infrastructure (mailing lists on
SF or python.org, etc) please let me know.

Regards,
Martin
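
[Editorial note on the "SOAPpy streaming base64" question above.] The
general technique Erik asks about can be sketched independently of
SOAPpy: base64 maps every 3 input bytes to 4 output characters, so a
large payload can be encoded chunk by chunk as long as each chunk
(except the last) is a multiple of 3 bytes long. The sketch below is
written for current Python 3 rather than the Python 2 of the thread, and
the names iter_base64 and write_base64 are hypothetical helpers, not
part of SOAPpy or the standard library. Whether SOAPpy can be made to
accept such a stream is not addressed here; the sink is simply any
object with a write() method, such as a socket file or an HTTP request
body.

```python
import base64
import io

CHUNK_SIZE = 3 * 16 * 1024  # must be a multiple of 3 (assumed default)


def iter_base64(stream, chunk_size=CHUNK_SIZE):
    """Yield base64 text for a binary stream, one bounded piece at a time."""
    if chunk_size <= 0 or chunk_size % 3:
        raise ValueError("chunk_size must be a positive multiple of 3")
    pending = b""
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        pending += data
        # Encode only whole 3-byte groups so no padding appears mid-stream;
        # short reads (pipes, sockets) leave a 1-2 byte remainder behind.
        usable = len(pending) - len(pending) % 3
        if usable:
            yield base64.b64encode(pending[:usable])
            pending = pending[usable:]
    if pending:
        # Final 1-2 bytes: this is the only piece that may carry '=' padding.
        yield base64.b64encode(pending)


def write_base64(stream, sink, chunk_size=CHUNK_SIZE):
    """Copy stream into sink as base64; return the number of bytes written."""
    total = 0
    for piece in iter_base64(stream, chunk_size):
        sink.write(piece)
        total += len(piece)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"example record payload" * 1000)
    target = io.BytesIO()
    write_base64(source, target)
    # The streamed result matches encoding the whole payload in one call.
    assert target.getvalue() == base64.b64encode(source.getvalue())
```

Memory use stays bounded by chunk_size plus at most two carried-over
bytes, regardless of how large the source document is. The surrounding
SOAP envelope would still have to be written to the same sink before and
after the encoded content.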