From ht@cogsci.ed.ac.uk Tue Apr 1 09:06:44 2003
From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
Date: 01 Apr 2003 10:06:44 +0100
Subject: [XML-SIG] Re: how to use XSV in python to validate an xml file against an xsd schema file
In-Reply-To: <200303311152.h2VBq9q27850@mailgate5.cinetic.de>
References: <200303311152.h2VBq9q27850@mailgate5.cinetic.de>
Message-ID:

"Nico Grubert" writes:

> Unfortunately, I do not know exactly what I have to do with the line
> " res[0].printme(sys.stdout) " you mentioned in your posting.

You seem to have missed my original posting, which that is a correction to. Here's the whole thing, corrected.

As its name suggests, runitAndShow does the output itself, but it goes to stderr, so you may have just never seen it. Also note the second argument is a list, so you need to say

>>> runitAndShow(xmlfile,[schemafile])

For your purposes, you want to do

"""
# validateTestXML.py
from XSV.driver import runit
import sys

xmlfile = "myxmlfile.xml"
schemafile = "myxsdfile.xsd"
res = runit( xmlfile, [schemafile] )
res[0].printme(sys.stdout)
"""

You can inspect res to see how your validation went.

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

From philippe.dubreuil@inrialpes.fr Tue Apr 1 15:06:36 2003
From: philippe.dubreuil@inrialpes.fr (Philippe Dubreuil)
Date: Tue, 01 Apr 2003 17:06:36 +0200
Subject: [XML-SIG] re : XSL Performance / Processors
Message-ID: <3E89AAFC.9090208@inrialpes.fr>

Hello,

I'm currently working on XSL processors, particularly Saxon and Xalan. I would like to show that builtin XSLT indexing mechanisms (i.e. tag) are very useful, and then find a way to propagate them through several transformations.
I have done some tests: Saxon is much faster than Xalan. There are two different tree implementations in Saxon. Using tinytree seems to decrease performance, whereas results are really "good" with the tree implementation. The tinytree should win because it is faster to build and uses less memory; in principle the full tree should be faster if you keep it in memory and use it often enough to pay for the extra build cost. The key implementation is identical in the two cases. But nobody can explain my results:

"Performance can depend on many apparently unrelated factors (including the fluttering of the wings of a butterfly...)" M.Kay

Example
----------
For my example I use a typical bibliography document, i.e.

(...)
</reference>

size ~5 MB

Here are the parts of the stylesheets that I use:

withIndex:

  key declaration
  <xsl:key name="clef2" match="//reference" use="authors/p/@last"/>

  <xsl:for-each select='key("clef2","zorro")'>
    <li><xsl:value-of select="title"/></li>
  </xsl:for-each>

noIndex:

  <xsl:for-each select="//reference/authors/p[@last='zorro']">
    <li><xsl:value-of select="../../title"/></li>
  </xsl:for-each>

Some of my results (execution time; RedHat 6.2, i686, 256 MB RAM, PIII 550 MHz). I'm using the same structure (index or not) three times in this example. If I use it 10 times, I get the same results.

                        withIndex   noIndex
Xalan2.5                8780ms      9108ms
Saxon6.5.2 tree         2800ms      2900ms
    "      tinytree     3021ms      1757ms   !!!!

---------------------------------------------
I'm really interested in finding similar results.

thanks

Phil. Dubreuil
INRIA Rhone Alpes
FRANCE

From veillard@redhat.com Tue Apr 1 15:33:33 2003
From: veillard@redhat.com (Daniel Veillard)
Date: Tue, 1 Apr 2003 10:33:33 -0500
Subject: [XML-SIG] re : XSL Performance / Processors
In-Reply-To: <3E89AAFC.9090208@inrialpes.fr>; from philippe.dubreuil@inrialpes.fr on Tue, Apr 01, 2003 at 05:06:36PM +0200
References: <3E89AAFC.9090208@inrialpes.fr>
Message-ID: <20030401103333.K31240@redhat.com>

On Tue, Apr 01, 2003 at 05:06:36PM +0200, Philippe Dubreuil wrote:
> But nobody can explain my results :
> "Performance can depend on many apparently unrelated factors (including
> the fluttering of the wings of a butterfly...)" M.Kay
[...]
> Some of my results (execution Time) (RedHat 6.2, i686, 256Mo RAM, PIII
> 550Mhz)
> I'm using the same structure (index or not) three times in this example.
> If I use 10 times, i get the same results.
>
>
>                       withIndex   noIndex
> Xalan2.5              8780ms      9108ms
> Saxon6.5.2 tree       2800ms      2900ms
>     "      tinytree   3021ms      1757ms   !!!!
>

If you want to do performance timing, using a Java based implementation is the best way to introduce irrationality in your results.
The effects of the HotSpot algorithm detection and of the asynchronous garbage collection simply make any test run fewer than 50 times a joke, because such runs don't represent the output of the real working set nor show GC effects. In general the more you pile up complex layers, the harder it is to get predictable results for the whole stack, have fun!

Daniel

--
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

From stephane.bidoul@softwareag.com Tue Apr 1 19:46:59 2003
From: stephane.bidoul@softwareag.com (Stephane Bidoul)
Date: Tue, 1 Apr 2003 21:46:59 +0200
Subject: [XML-SIG] Advice on how to deal with locking problems using python-wrapped C libs in MT frameworks like Zope/Twisted?
In-Reply-To: <20030331050920.W31240@redhat.com>
Message-ID: <00e301c2f887$760e5780$ca14200a@acsesbi>

> > Here's an example of how to release the GIL in C:
> >
> > Py_BEGIN_ALLOW_THREADS;
> > /* Call into some C lib */
> > Py_END_ALLOW_THREADS;
>
> Hum, of course I assume it's an expensive operation. As a result
> I can't generate it for all calls to libxml2 or libxslt, and which
> call need it or not need to be manually selected, it's not something
> which can be automated. As a result it's a request for enhancement
> of my Python bindings but since it requires going over the full
> set of entry point manually, and an update of the bindings generator
> don't expect it to be done quickly.

Also, if I remember correctly, you must be sure to reacquire the interpreter lock before calling back into Python (in the error handler callback, for instance).

-sbi

From uche.ogbuji@fourthought.com Tue Apr 1 21:12:50 2003
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 01 Apr 2003 14:12:50 -0700
Subject: [XML-SIG] re : XSL Performance / Processors
In-Reply-To: Message from Daniel Veillard <veillard@redhat.com> of "Tue, 01 Apr 2003 10:33:33 EST." <20030401103333.K31240@redhat.com>
Message-ID: <E190T3q-0003Le-00@borgia.local>

> On Tue, Apr 01, 2003 at 05:06:36PM +0200, Philippe Dubreuil wrote:
> > But nobody can explain my results :
> > "Performance can depend on many apparently unrelated factors (including
> > the fluttering of the wings of a butterfly...)" M.Kay
> [...]
> > Some of my results (execution Time) (RedHat 6.2, i686, 256Mo RAM, PIII
> > 550Mhz)
> > I'm using the same structure (index or not) three times in this example.
> > If I use 10 times, i get the same results.
> >
> >
> >                       withIndex   noIndex
> > Xalan2.5              8780ms      9108ms
> > Saxon6.5.2 tree       2800ms      2900ms
> >     "      tinytree   3021ms      1757ms   !!!!
> >
>
> If you want to do performance timing, using a Java based implementation
> is the best way to introduce irrationality in your results. The effects
> of the HotSpot algorithm detection and of the asynchronous garbage collection
> simply makes any test run less than 50 times a joke, because they don't
> represent the output of the real working set nor show GC effects.
> In general the more you pile up complex layers, the hardest it is
> to get predictable results for the whole stack,

Agreed with this comment.

I'm also wondering what the original question has to do with the Python XML-SIG.

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use internal references in XML vocabularies - http://www-106.ibm.com/developerworks/xml/library/x-tipvocab.html
Universal Business Language (UBL) - http://www-106.ibm.com/developerworks/xml/library/x-think16.html
EXSLT by example - http://www-106.ibm.com/developerworks/library/x-exslt.html
The worry about program wizards - http://www.adtmag.com/article.asp?id=7238
Use rdf:about and rdf:ID effectively in RDF/XML - http://www-106.ibm.com/developerworks/xml/library/x-tiprdfai.html
Keep context straight in XSLT - http://www-106.ibm.com/developerworks/xml/library/x-tipcurrent.html
Using SAX for Proper XML Output - http://www.xml.com/pub/a/2003/03/12/py-xml.html
SAX filters for flexible processing - http://www-106.ibm.com/developerworks/xml/library/x-tipsaxflex.html

From mark@easymailings.com Wed Apr 2 18:22:01 2003
From: mark@easymailings.com (Mark Bucciarelli)
Date: Wed, 2 Apr 2003 13:22:01 -0500
Subject: [XML-SIG] Uninstalling pyxml Fixes Crash
In-Reply-To: <m38yuvt9jn.fsf@mira.informatik.hu-berlin.de>
References: <200303271610.41642.mark@easymailings.com> <200303311641.21403.mark@easymailings.com> <m38yuvt9jn.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200304021322.01612.mark@easymailings.com>

On Monday 31 March 2003 6:12 pm, Martin v. Löwis wrote:
> Mark Bucciarelli <mark@easymailings.com> writes:
> > Anything else that might be helpful?
>
> Yes, an exact citation of the XML and DTD, perhaps in form of a
> minimal example.

See attached.

Mark

[Attachment: saxerror.tgz (application/x-tgz, base64-encoded; binary payload omitted)]

From tlo@aw.sgi.com Wed Apr 2 22:52:34 2003
From: tlo@aw.sgi.com (Terence Lo)
Date: Wed, 2 Apr 2003 17:52:34 -0500
Subject: [XML-SIG] smtpLib and xml_instance?
Message-ID: <026c01c2f96a$8d9f6b80$ca411dc6@ms.aliaswavefront.com>

Hi there,

Just wondering if someone could help me out.
For some odd reason when I run the following code snippet,

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
import sys, smtplib, string, os
from gnosis.xml.objectify import XML_Objectify, pyobj_printer

xml_obj = XML_Objectify('config.xml')
config = xml_obj.make_instance()

fromaddr = 'bob@asdf.com'
toaddrs = config.mailer.email.PCDATA

print toaddrs

bodytext = 'this is the bodytext'

smtp = smtplib.SMTP('mail.tor.aw.sgi.com')
smtp.sendmail(fromaddr, toaddrs, bodytext)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
OUTPUT:

  File "C:\Python21\lib\smtplib.py", line 494, in sendmail
    (code,resp) = self.data(msg)
  File "C:\Python21\lib\smtplib.py", line 384, in data
    raise SMTPDataError(code,repl)
smtplib.SMTPDataError: (503, 'Need RCPT (recipient)')

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

I get a smtplib.SMTPDataError: (503, 'Need RCPT (recipient)') error. I am fairly certain that the XML file I am using is correct, because when I print out the toaddrs variable, it displays the correct email address.

Now when I hardcode:

toaddrs = "blah@blah.com"

*instead of*
toaddrs = config.mailer.email.PCDATA

the smtplib.SMTPDataError doesn't occur. If someone could help this newbie out, I would greatly appreciate it.

Thanks in advance!

Terence.

From jasj@miller.cs.uwm.edu Thu Apr 3 01:52:09 2003
From: jasj@miller.cs.uwm.edu (Jason Michael Jurkowski)
Date: Wed, 2 Apr 2003 19:52:09 -0600 (CST)
Subject: [XML-SIG] smtpLib and xml_instance?
In-Reply-To: <026c01c2f96a$8d9f6b80$ca411dc6@ms.aliaswavefront.com>
Message-ID: <Pine.GSO.4.33.0304021943430.5999-100000@miller.cs.uwm.edu>

The print statement invokes the str() method of the object provided to it. The str() method is providing the character data for that object. I'm not familiar with the gnosis code, but if PCDATA is a 'CDATASection Object' you want to assign PCDATA.data to toaddrs and pass that into the smtplib code.

From nicogrubert@web.de Thu Apr 3 08:45:23 2003
From: nicogrubert@web.de (Nico Grubert)
Date: Thu, 3 Apr 2003 10:45:23 +0200
Subject: [XML-SIG] Running XSV on Python 2.1.3 ?
Message-ID: <200304030845.h338jNq01971@mailgate5.cinetic.de>

hello,

after Henry helped me to get a result when validating an XML file against an XSD schema file, I have a question about XSV and Python 2.1.

I have installed Python 2.1 and Python 2.2 on my machine running Debian Linux. Furthermore, I have installed XSV 1.6 for both Python versions, so XSV is installed in:
- /usr/lib/python2.1/site-packages/XSV/
- /usr/lib/python2.2/site-packages/XSV/

When running my short script to validate an XML file against an XSD schema file with Python 2.2, everything works fine and I get a result. The XSV developers did a great job! The parser is absolutely great for validating an XML file against an XSD schema file.

When running the same script with Python 2.1, nothing happens, and the Python expression "res = runit( xmlfile, [schemafile] )" seems to crash: after that line of code, Python does not do anything. I can write the line "raise 'myError', 'My Error'" after the line "res = runit( xmlfile, [schemafile] )", but the raise statement is never reached.

As I said: with Python 2.2 everything works fine. The problem I have is that I am using the web application server "Zope 2.6.1", which runs on Python 2.1.3 only, and I can't get my script to work.

So, my question is: Is it somehow possible to run XSV on Python 2.1.3 ?

thanks in advance,
-nico

From ht@cogsci.ed.ac.uk Thu Apr 3 09:11:22 2003
From: ht@cogsci.ed.ac.uk (Henry S. Thompson)
Date: 03 Apr 2003 10:11:22 +0100
Subject: [XML-SIG] Running XSV on Python 2.1.3 ?
In-Reply-To: <200304030845.h338jNq01971@mailgate5.cinetic.de>
References: <200304030845.h338jNq01971@mailgate5.cinetic.de>
Message-ID: <f5bn0j80wud.fsf@erasmus.inf.ed.ac.uk>

"Nico Grubert" <nicogrubert@web.de> writes:

> So, my question is: Is it somehow possible to run XSV on Python 2.1.3 ?

Deep within XSV is a fast validating XML parser written in C, and integrated into Python via its native code interface. Python 2.1 and 2.2 are incompatible at that level, alas.

It's perfectly possible to build the necessary library (PyLTXML) for 2.1, using the distutils packaging. Just download ftp://ftp.cogsci.ed.ac.uk/PyLTXML-1.3.tar.gz, unpack it and follow the installation instructions, being sure to use python2.1 for the build and installation.

The instructions will tell you how to get and install the C parser as well -- there are binary installers for most common architectures, so that should be straightforward.

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
Half-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

From tlo@aw.sgi.com Thu Apr 3 16:16:49 2003
From: tlo@aw.sgi.com (Terence Lo)
Date: Thu, 3 Apr 2003 11:16:49 -0500
Subject: [XML-SIG] smtpLib and xml_instance?
In-Reply-To: <Pine.GSO.4.33.0304021943430.5999-100000@miller.cs.uwm.edu>
Message-ID: <027c01c2f9fc$6f0a0e40$ca411dc6@ms.aliaswavefront.com>

Thanks for the help Jason. In case anyone runs into this problem, I've discovered a quick fix. As it turns out, wrapping the PCDATA in a str() call and assigning the result to toaddrs seems to work quite nicely.

i.e.:
toaddrs = str(config.mailer.email.PCDATA)

Terence.
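
A minimal sketch of why the explicit str() call can matter, written for a current Python 3 interpreter rather than the Python 2.1 used in the thread. The thread does not say what type gnosis returns for PCDATA, so the TextLike class below is a hypothetical stand-in and an assumption: it prints like a string without being one, which is the situation Jason describes.

```python
class TextLike:
    """Hypothetical stand-in for a parsed-XML text value (not a gnosis class)."""

    def __init__(self, data):
        self.data = data

    def __str__(self):
        # print() uses this, so the object *looks* like a plain string.
        return self.data


value = TextLike("blah@blah.com")
print(value)                          # displays the address, as in the thread

# A library that checks for a real string will not recognise the object.
looks_like_str = isinstance(value, str)

# Converting explicitly yields a genuine string that can be passed on safely.
toaddrs = str(value)
is_real_str = isinstance(toaddrs, str)
```

Whether this is exactly what went wrong inside smtplib is not confirmed by the thread; the sketch only shows that "prints correctly" and "is a string" are different properties.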

From mark@easymailings.com Wed Apr 9 17:18:32 2003
From: mark@easymailings.com (Mark Bucciarelli)
Date: Wed, 9 Apr 2003 12:18:32 -0400
Subject: [XML-SIG] [BUG] xml.dom.minidom on Windows.
Message-ID: <200304091218.32306.mark@easymailings.com>

The attached code snippet produces peculiar results on Windows. Namely, it truncates text node data when there is only one grandchild tag.

This error occurs on Windows XP, with Python 2.2.2 (#37, Oct 14, 2002, 17:02:34) [MSC 32 bit (Intel)] but not on Linux with Python 2.2.1 (#1, Aug 30 2002, 12:15:30) [GCC 3.2 20020822 (Red Hat Linux Rawhide 3.2-4)]

The output I get on Windows is shown below. The test script includes a very similar text string that works correctly. What's kind of fun is that if you start to reduce the length of the second grandchild tag, from "abcdef" to "abcde" and then to "abcd", at some point the "works" example will fail as well. The shorter you make it, the more truncated the nodeValue of the text node becomes.

I'll log this at sourceforge. In the meantime, can anyone suggest a work around?

XML: <REQUEST><TYPE>AUTHORIZATION</TYPE><abcdef>123</abcdef></REQUEST>
 Node: <TYPE>AUTHORIZATION</TYPE>
 xml : AUTHORIZATION
 data: "AUTHORIZATION"
 Node: <abcdef>123</abcdef>
 xml : 123
 data: "123"
XML: <REQUEST><TYPE>AUTHORIZATION</TYPE></REQUEST>
 Node: <TYPE>AUTHORIZATION</TYPE>
 xml : AUTHORI
 data: "AUTHORI"
Traceback (most recent call last):
  File "testparse.py", line 40, in ?
    assert does_not_work['TYPE'] == 'AUTHORIZATION', does_not_work['TYPE']
AssertionError: AUTHORI

[Attachment: testparse.py]

import xml.dom.minidom

def parseTwoDeep(strXml):
    retval = {}
    document = xml.dom.minidom.parseString(strXml)
    rootnode = document.firstChild
    for n in rootnode.childNodes:
        if n.nodeType == xml.dom.Node.ELEMENT_NODE:
            retval = parseOneDeep(n.toxml())
    document.unlink()
    return retval

def parseOneDeep(strXml):
    retval = {}
    document = xml.dom.minidom.parseString(strXml)
    rootnode = document.firstChild
    print 'XML:', strXml
    for n in rootnode.childNodes:
        if n.nodeType == xml.dom.Node.ELEMENT_NODE:
            print ' Node:', n.toxml()
            if n.hasChildNodes() and n.firstChild.nodeType == n.TEXT_NODE:
                print ' xml :', n.firstChild.toxml()
                print ' data: "%s"' % n.firstChild.nodeValue
                retval[n.localName] = n.firstChild.nodeValue.strip()
            else:
                retval[n.localName] = ''
    document.unlink()
    return retval

works = parseTwoDeep('<ROOT><REQUEST><TYPE>AUTHORIZATION</TYPE><abcdef>123</abcdef></REQUEST></ROOT>')
assert works['TYPE'] == 'AUTHORIZATION', works['TYPE']

does_not_work = parseTwoDeep('<ROOT><REQUEST><TYPE>AUTHORIZATION</TYPE></REQUEST></ROOT>')
assert does_not_work['TYPE'] == 'AUTHORIZATION', does_not_work['TYPE']

From mark@easymailings.com Wed Apr 9 23:03:57 2003
From: mark@easymailings.com (Mark Bucciarelli)
Date: Wed, 9 Apr 2003 18:03:57 -0400
Subject: [XML-SIG] [BUG] xml.dom.minidom on Windows.
In-Reply-To: <200304091218.32306.mark@easymailings.com>
References: <200304091218.32306.mark@easymailings.com>
Message-ID: <200304091803.58054.mark@easymailings.com>

On Wednesday 09 April 2003 12:18 pm, Mark Bucciarelli wrote:

> In the meantime, can anyone suggest a work around?

Never mind, I got one--use sax instead of minidom.

Mark

From tpassin@comcast.net Wed Apr 9 23:53:50 2003
From: tpassin@comcast.net (Thomas B. Passin)
Date: Wed, 09 Apr 2003 18:53:50 -0400
Subject: [XML-SIG] [BUG] xml.dom.minidom on Windows.
References: <200304091218.32306.mark@easymailings.com>
Message-ID: <001001c2feea$e3627c80$6401a8c0@tbp1>

[Mark Bucciarelli]
> The attached code snippet produces peculiar results on Windows.
> Namely, it truncates text node data when there is only one grandchild
> tag.
>
> This error occurs on Windows XP, with Python 2.2.2 (#37, Oct 14, 2002,
> 17:02:34) [MSC 32 bit (Intel)] but not on Linux with Python 2.2.1
> (#1, Aug 30 2002, 12:15:30) [GCC 3.2 20020822 (Red Hat Linux Rawhide
> 3.2-4)]
>

Your example worked fine on my Win2000/Py2.2 0.8.2 system.

> The output I get on Windows is shown below. The test script includes
> a very similar text string that works correctly. What's kind of fun
> is that if you start to reduce the length of the second grandchild
> tag, from "abcdef" to "abcde" and then to "abcd" at some point the
> "works" example will fail as well. The shorter you make it the more
> truncated the nodeValue of the text node becomes.
>

It still worked for me even truncated all the way down to "a"...

Cheers,

Tom P

From fredrik@pythonware.com Thu Apr 10 11:13:00 2003
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 10 Apr 2003 12:13:00 +0200
Subject: [XML-SIG] Re: [BUG] xml.dom.minidom on Windows.
References: <200304091218.32306.mark@easymailings.com>
Message-ID: <b73g1l$v7b$1@main.gmane.org>

Mark Bucciarelli wrote:

> The attached code snippet produces peculiar results on Windows.
> Namely, it truncates text node data when there is only one grandchild
> tag.

your snippet is broken: a DOM tree builder can use any number of text nodes (including empty nodes) to represent a single text string in the original XML file. (either due to buffering, or because the XML file contains plain text mixed with entities and CDATA sections).
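
The point about multiple text nodes can be sketched with minidom itself. This is Python 3 syntax, not the Python 2.2 of the thread, and element_text is a hypothetical helper rather than part of the library. A CDATA section is used here only because it reliably makes minidom store one logical string as several child nodes; the buffering case reported above is harder to reproduce on demand.

```python
import xml.dom.minidom


def element_text(element):
    """Join the data of every direct text or CDATA child of element.

    Hypothetical helper: it never assumes the text arrives in one node.
    """
    wanted = (element.TEXT_NODE, element.CDATA_SECTION_NODE)
    return "".join(child.data for child in element.childNodes
                   if child.nodeType in wanted)


# One logical string, held as several child nodes (text, CDATA, text).
doc = xml.dom.minidom.parseString(
    "<REQUEST><TYPE>AUTHOR<![CDATA[IZ]]>ATION</TYPE></REQUEST>")
type_node = doc.documentElement.firstChild

first_only = type_node.firstChild.data   # just the first fragment
whole = element_text(type_node)          # every fragment, joined

# Adjacent plain text nodes can instead be merged in place with the
# standard DOM normalize() method.
doc2 = xml.dom.minidom.parseString("<TYPE/>")
elem = doc2.documentElement
elem.appendChild(doc2.createTextNode("AUTHORI"))
elem.appendChild(doc2.createTextNode("ZATION"))
count_before = len(elem.childNodes)
elem.normalize()
merged = elem.firstChild.data

print(first_only, whole, count_before, merged)
```

Reading only firstChild, as testparse.py does, picks up whatever the builder happened to put in the first node; joining the children or normalizing first removes that dependency.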
either change your code to deal with this, or call the "normalize" method before processing the node.

</F>

From mark@easymailings.com Thu Apr 10 13:20:21 2003
From: mark@easymailings.com (Mark Bucciarelli)
Date: Thu, 10 Apr 2003 08:20:21 -0400
Subject: [XML-SIG] Re: [BUG] xml.dom.minidom on Windows.
In-Reply-To: <b73g1l$v7b$1@main.gmane.org>
References: <200304091218.32306.mark@easymailings.com> <b73g1l$v7b$1@main.gmane.org>
Message-ID: <200304100820.21388.mark@easymailings.com>

On Thursday 10 April 2003 6:13 am, Fredrik Lundh wrote:

> your snippet is broken:

> call the "normalize" method before processing the node.

yes indeed, thank you. sorry for the ruckus.

From uche.ogbuji@fourthought.com Thu Apr 10 14:51:24 2003
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 10 Apr 2003 07:51:24 -0600
Subject: Article: Gems From the [XML-SIG] Archives
Message-ID: <3E9576DC.9060903@fourthought.com>

http://www.xml.com/pub/a/2003/04/09/py-xml.html

"In this and in subsequent articles I will mine the richness of the XML-SIG mailing list for some of its choicest bits of code. I start in this article with a couple of very handy snippets from 1998 and 1999. Where necessary, I have updated code to use current APIs, style, and conventions in order to make it immediately useful to readers. All code in this article was tested using Python 2.2.1 and PyXML 0.8.2."

--
Uche Ogbuji
Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Introducing N-Triples - http://www-106.ibm.com/developerworks/xml/library/x-think17/index.html
Use internal references in XML vocabularies - http://www-106.ibm.com/developerworks/xml/library/x-tipvocab.html
EXSLT by example - http://www-106.ibm.com/developerworks/library/x-exslt.html
The worry about program wizards - http://www.adtmag.com/article.asp?id=7238
Use rdf:about and rdf:ID effectively in RDF/XML - http://www-106.ibm.com/developerworks/xml/library/x-tiprdfai.html
Keep context straight in XSLT - http://www-106.ibm.com/developerworks/xml/library/x-tipcurrent.html
Using SAX for Proper XML Output - http://www.xml.com/pub/a/2003/03/12/py-xml.html

From cstrong@arielpartners.com Thu Apr 10 15:16:19 2003
From: cstrong@arielpartners.com (Craeg K Strong)
Date: Thu, 10 Apr 2003 10:16:19 -0400
Subject: [4suite] Article: Gems From the [XML-SIG] Archives
In-Reply-To: <3E9576DC.9060903@fourthought.com>
References: <3E9576DC.9060903@fourthought.com>
Message-ID: <3E957CB3.3030102@arielpartners.com>

> If you are interested in a bit of a mind-bending experimentation in
> XML output using Python idioms, see this message
> <http://mail.python.org/pipermail/xml-sig/1998-October/000423.html> by
> Greg Stein. As an example of the very interesting perspective it
> provides, the following snippet should generate a bit of XHTML:
>
> f = Factory()
> body = f.body(bgcolor='#ffffff').p.a(href='l.html').img(src='l.gif')
> html = f.html[f.head.title('title'), body]
>
> I think you'll agree this is delightfully twisted.
>

Good catch! The above seems to me a quite natural and Pythonic way to produce XML.
It reminds me of Jakarta's Element Construction Set ( http://jakarta.apache.org/ecs ), which we have been using to produce XML from Java for some time:

Html html = new Html()
    .addElement(new Head()
        .addElement(new Title("Demo")))
    .addElement(new Body()
        .addElement(new H1("Demo Header"))
        .addElement(new H3("Sub Header:"))
        .addElement(new Font().setSize("+1")
            .setColor(HtmlColor.WHITE)
            .setFace("Times")
            .addElement("The big dog & the little cat chased each other.")));
out.println(html.toString());

Of course, Python's built-in introspection capabilities make it much more terse. Yay, Python.

--Craeg

Uche Ogbuji wrote:

> http://www.xml.com/pub/a/2003/04/09/py-xml.html
>
> "In this and in subsequent articles I will mine the richness of the
> XML-SIG mailing list for some of its choicest bits of code. I start in
> this article with a couple of very handy snippets from 1998 and 1999.
> Where necessary, I have updated code to use current APIs, style, and
> conventions in order to make it immediately useful to readers. All
> code in this article was tested using Python 2.2.1 and PyXML 0.8.2."
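
The attribute-chaining idiom discussed in this thread can be sketched in a few lines of Python 3. This is not Greg Stein's Factory, and it is not the JaXML or ECS API either; the Node class and its top() and render() methods are hypothetical names chosen for illustration, and the sketch covers only nested elements, attributes and text.

```python
from xml.sax.saxutils import escape, quoteattr


class Node:
    """Hypothetical element builder driven by attribute access."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.attrs = {}
        self.children = []      # Node instances or plain strings

    def __getattr__(self, tag):
        # Called only when normal lookup fails, so node.p.a builds
        # <p><a/></p>. Underscore names are refused so that protocol
        # probes (copy, pickle) do not silently create elements.
        if tag.startswith("_"):
            raise AttributeError(tag)
        child = Node(tag, parent=self)
        self.children.append(child)
        return child

    def __call__(self, text=None, **attrs):
        # node(text, key=value) records attributes and optional text.
        self.attrs.update(attrs)
        if text is not None:
            self.children.append(str(text))
        return self

    def top(self):
        # Walk back up to the outermost element of the chain.
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def render(self):
        attrs = "".join(" %s=%s" % (key, quoteattr(str(value)))
                        for key, value in self.attrs.items())
        if not self.children:
            return "<%s%s/>" % (self.name, attrs)
        inner = "".join(child.render() if isinstance(child, Node)
                        else escape(child)
                        for child in self.children)
        return "<%s%s>%s</%s>" % (self.name, attrs, inner, self.name)


img = Node("body")(bgcolor="#ffffff").p.a(href="l.html").img(src="l.gif")
markup = img.top().render()
print(markup)
```

One consequence of the design: element names that collide with the class's own attributes (name, parent, attrs, children, top, render) cannot be created through attribute access, which is the usual price of this kind of __getattr__ trick.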

From jean@upfrontsystems.co.za Fri Apr 11 12:44:32 2003
From: jean@upfrontsystems.co.za (Jean Jordaan)
Date: Fri, 11 Apr 2003 13:44:32 +0200
Subject: [4suite] Article: Gems From the [XML-SIG] Archives
In-Reply-To: <3E957CB3.3030102@arielpartners.com>
References: <3E9576DC.9060903@fourthought.com> <3E957CB3.3030102@arielpartners.com>
Message-ID: <3E96AAA0.7050705@upfrontsystems.co.za>

>> body = f.body(bgcolor='#ffffff').p.a(href='l.html').img(src='l.gif')
>> html = f.html[f.head.title('title'), body]

This looks to me very close to what JaXML does
http://www.librelogiciel.com/software/

# --- Classical Version
fp = open("sample.xml", "w")
fp.write('<?xml version="1.0" encoding="iso-8859-1"?>\n')
fp.write('<sometag someattr="1">\n')
fp.write(' <anothertag jaxml="Nice">\n')
fp.write(' <thirdone>Yo</thirdone>\n')
fp.write(' </anothertag>\n')
fp.write('</sometag>\n')
fp.close()

# an equivalent version using JAXML
import jaxml
doc = jaxml.XML_document()
doc.sometag(someattr=1).anothertag(jaxml="Nice")
doc.thirdone("Yo")
doc._output("sample.xml")

--
Jean Jordaan
http://www.upfrontsystems.co.za

From noreplyplease@lizards.ws Fri Apr 11 21:33:50 2003
From: noreplyplease@lizards.ws (ixszjmohme@hotmail.com)
Date: Fri, 11 Apr 2003 22:33:50 +0200
Subject: [XML-SIG] the message
Message-ID: <courier.3E9726AE.000038CF@qwerty.net>

Here is the Message: http://www.sacredlook.org

You won't receive messages anymore.

From thomas.reimann@outertech.com Sun Apr 13 00:11:09 2003
From: thomas.reimann@outertech.com (Thomas Reimann)
Date: Sun, 13 Apr 2003 01:11:09 +0200
Subject: [XML-SIG] XBEL Support
Message-ID: <20030413010854.D52A.THOMAS.REIMANN@outertech.com>

Hi!

Your page (http://pyxml.sourceforge.net/topics/xbel/) lists software that supports the XBEL format. Our URL manager Linkman includes an XBEL export template.
--
Best regards,
Thomas Reimann
outertech.com

From cstrong@arielpartners.com Sun Apr 13 05:04:02 2003
From: cstrong@arielpartners.com (Craeg K Strong)
Date: Sun, 13 Apr 2003 00:04:02 -0400
Subject: [XML-SIG] Redhat RPM for PyXML 0.8.2?
In-Reply-To: <m37kapcls7.fsf@mira.informatik.hu-berlin.de>
References: <3E7E94EE.6030706@arielpartners.com> <m37kapcls7.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3E98E1B2.9010803@arielpartners.com>

Martin v. Löwis wrote:

>Craeg K Strong <cstrong@arielpartners.com> writes:
>
>>My Red Hat 8.0 system comes with PyXML 0.7.1 installed as an RPM.
>>there is also a bunch of printconf stuff that depends on PyXML. Can
>>anyone give me advice on how to upgrade to PyXML 0.8.2? - Is there a
>>redhat RPM available anywhere?
>>
>Not that I know of.
>
>>- Is there someone I can beg and plead to make one for me :-)
>>
>You can make it yourself. Unpack the source distribution, and invoke
>"python setup.py bdist_rpm".
>
Uh oh. This doesn't look good:

$ python setup.py bdist_rpm

> building RPMs
> rpm -ba --define _topdir /opt/PyXML-0.8.2/build/bdist.linux-i686/rpm
> --clean build/bdist.linux-i686/rpm/SPECS/PyXML.spec
> -ba: unknown option
> error: command 'rpm' failed with exit status 1
> $ rpm --version
> RPM version 4.1

This happens on a stock RedHat Linux 8.0

Do you have any suggestions?

> Just build an RPM and install it.
>
>Regards,
>Martin
>
Thanks,

--Craeg

From martin@v.loewis.de Sun Apr 13 08:44:23 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 13 Apr 2003 09:44:23 +0200
Subject: [XML-SIG] Redhat RPM for PyXML 0.8.2?
In-Reply-To: <3E98E1B2.9010803@arielpartners.com>
References: <3E7E94EE.6030706@arielpartners.com> <m37kapcls7.fsf@mira.informatik.hu-berlin.de> <3E98E1B2.9010803@arielpartners.com>
Message-ID: <m3r886x2nc.fsf@mira.informatik.hu-berlin.de>

--=-=-=

Craeg K Strong <cstrong@arielpartners.com> writes:

> $ python setup.py bdist_rpm
>
> > building RPMs
> > rpm -ba --define _topdir /opt/PyXML-0.8.2/build/bdist.linux-i686/rpm
> > --clean build/bdist.linux-i686/rpm/SPECS/PyXML.spec
> > -ba: unknown option
> > error: command 'rpm' failed with exit status 1
> > $ rpm --version
> > RPM version 4.1
>
> This happens on a stock RedHat Linux 8.0
>
> Do you have any suggestions?

Ah, f*cking RPM. Redhat decided to kill "rpm -ba", and asks everybody
to use "rpmbuild" instead. This is fixed in recent distutils; please
replace bdist_rpm with the file attached.

Regards,
Martin

--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=bdist_rpm.py

"""distutils.command.bdist_rpm

Implements the Distutils 'bdist_rpm' command (create RPM source and binary
distributions)."""

# created 2000/04/25, by Harry Henry Gebel

__revision__ = "$Id: bdist_rpm.py,v 1.27.6.3 2002/11/04 13:33:47 akuchling Exp $"

import sys, os, string
import glob
from types import *
from distutils.core import Command, DEBUG
from distutils.util import get_platform
from distutils.file_util import write_file
from distutils.errors import *

class bdist_rpm (Command):

    description = "create an RPM distribution"

    user_options = [
        ('bdist-base=', None,
         "base directory for creating built distributions"),
        ('rpm-base=', None,
         "base directory for creating RPMs (defaults to \"rpm\" under "
         "--bdist-base; must be specified for RPM 2)"),
        ('dist-dir=', 'd',
         "directory to put final RPM files in "
         "(and .spec files if --spec-only)"),
        ('python=', None,
         "path to Python interpreter to hard-code in the .spec file "
         "(default: \"python\")"),
        ('fix-python', None,
         "hard-code the exact path to the current Python interpreter in "
         "the .spec file"),
        ('spec-only', None,
         "only regenerate spec file"),
        ('source-only', None,
         "only generate source RPM"),
        ('binary-only', None,
         "only generate binary RPM"),
        ('use-bzip2', None,
         "use bzip2 instead of gzip to create source distribution"),

        # More meta-data: too RPM-specific to put in the setup script,
        # but needs to go in the .spec file -- so we make these options
        # to "bdist_rpm". The idea is that packagers would put this
        # info in setup.cfg, although they are of course free to
        # supply it on the command line.
        ('distribution-name=', None,
         "name of the (Linux) distribution to which this "
         "RPM applies (*not* the name of the module distribution!)"),
        ('group=', None,
         "package classification [default: \"Development/Libraries\"]"),
        ('release=', None,
         "RPM release number"),
        ('serial=', None,
         "RPM serial number"),
        ('vendor=', None,
         "RPM \"vendor\" (eg. \"Joe Blow <joe@example.com>\") "
         "[default: maintainer or author from setup script]"),
        ('packager=', None,
         "RPM packager (eg. \"Jane Doe <jane@example.net>\")"
         "[default: vendor]"),
        ('doc-files=', None,
         "list of documentation files (space or comma-separated)"),
        ('changelog=', None,
         "RPM changelog"),
        ('icon=', None,
         "name of icon file"),
        ('provides=', None,
         "capabilities provided by this package"),
        ('requires=', None,
         "capabilities required by this package"),
        ('conflicts=', None,
         "capabilities which conflict with this package"),
        ('build-requires=', None,
         "capabilities required to build this package"),
        ('obsoletes=', None,
         "capabilities made obsolete by this package"),

        # Actions to take when building RPM
        ('keep-temp', 'k',
         "don't clean up RPM build directory"),
        ('no-keep-temp', None,
         "clean up RPM build directory [default]"),
        ('use-rpm-opt-flags', None,
         "compile with RPM_OPT_FLAGS when building from source RPM"),
        ('no-rpm-opt-flags', None,
         "do not pass any RPM CFLAGS to compiler"),
        ('rpm3-mode', None,
         "RPM 3 compatibility mode (default)"),
        ('rpm2-mode', None,
         "RPM 2 compatibility mode"),
       ]

    boolean_options = ['keep-temp', 'use-rpm-opt-flags', 'rpm3-mode']

    negative_opt = {'no-keep-temp': 'keep-temp',
                    'no-rpm-opt-flags': 'use-rpm-opt-flags',
                    'rpm2-mode': 'rpm3-mode'}


    def initialize_options (self):
        self.bdist_base = None
        self.rpm_base = None
        self.dist_dir = None
        self.python = None
        self.fix_python = None
        self.spec_only = None
        self.binary_only = None
        self.source_only = None
        self.use_bzip2 = None

        self.distribution_name = None
        self.group = None
        self.release = None
        self.serial = None
        self.vendor = None
        self.packager = None
        self.doc_files = None
        self.changelog = None
        self.icon = None

        self.prep_script = None
        self.build_script = None
        self.install_script = None
        self.clean_script = None
        self.pre_install = None
        self.post_install = None
        self.pre_uninstall = None
        self.post_uninstall = None
        self.prep = None
        self.provides = None
        self.requires = None
        self.conflicts = None
        self.build_requires = None
        self.obsoletes = None

        self.keep_temp = 0
        self.use_rpm_opt_flags = 1
        self.rpm3_mode = 1

    # initialize_options()


    def finalize_options (self):
        self.set_undefined_options('bdist', ('bdist_base', 'bdist_base'))
        if self.rpm_base is None:
            if not self.rpm3_mode:
                raise DistutilsOptionError, \
                      "you must specify --rpm-base in RPM 2 mode"
            self.rpm_base = os.path.join(self.bdist_base, "rpm")

        if self.python is None:
            if self.fix_python:
                self.python = sys.executable
            else:
                self.python = "python"
        elif self.fix_python:
            raise DistutilsOptionError, \
                  "--python and --fix-python are mutually exclusive options"

        if os.name != 'posix':
            raise DistutilsPlatformError, \
                  ("don't know how to create RPM "
                   "distributions on platform %s" % os.name)
        if self.binary_only and self.source_only:
            raise DistutilsOptionError, \
                  "cannot supply both '--source-only' and '--binary-only'"

        # don't pass CFLAGS to pure python distributions
        if not self.distribution.has_ext_modules():
            self.use_rpm_opt_flags = 0

        self.set_undefined_options('bdist', ('dist_dir', 'dist_dir'))
        self.finalize_package_data()

    # finalize_options()

    def finalize_package_data (self):
        self.ensure_string('group', "Development/Libraries")
        self.ensure_string('vendor',
                           "%s <%s>" % (self.distribution.get_contact(),
                                        self.distribution.get_contact_email()))
        self.ensure_string('packager')
        self.ensure_string_list('doc_files')
        if type(self.doc_files) is ListType:
            for readme in ('README', 'README.txt'):
                if os.path.exists(readme) and readme not in self.doc_files:
                    self.doc_files.append(readme)

        self.ensure_string('release', "1")
        self.ensure_string('serial')   # should it be an int?

        self.ensure_string('distribution_name')

        self.ensure_string('changelog')
        # Format changelog correctly
        self.changelog = self._format_changelog(self.changelog)

        self.ensure_filename('icon')

        self.ensure_filename('prep_script')
        self.ensure_filename('build_script')
        self.ensure_filename('install_script')
        self.ensure_filename('clean_script')
        self.ensure_filename('pre_install')
        self.ensure_filename('post_install')
        self.ensure_filename('pre_uninstall')
        self.ensure_filename('post_uninstall')

        # XXX don't forget we punted on summaries and descriptions -- they
        # should be handled here eventually!

        # Now *this* is some meta-data that belongs in the setup script...
        self.ensure_string_list('provides')
        self.ensure_string_list('requires')
        self.ensure_string_list('conflicts')
        self.ensure_string_list('build_requires')
        self.ensure_string_list('obsoletes')

    # finalize_package_data ()


    def run (self):

        if DEBUG:
            print "before _get_package_data():"
            print "vendor =", self.vendor
            print "packager =", self.packager
            print "doc_files =", self.doc_files
            print "changelog =", self.changelog

        # make directories
        if self.spec_only:
            spec_dir = self.dist_dir
            self.mkpath(spec_dir)
        else:
            rpm_dir = {}
            for d in ('SOURCES', 'SPECS', 'BUILD', 'RPMS', 'SRPMS'):
                rpm_dir[d] = os.path.join(self.rpm_base, d)
                self.mkpath(rpm_dir[d])
            spec_dir = rpm_dir['SPECS']

        # Spec file goes into 'dist_dir' if '--spec-only specified',
        # build/rpm.<plat> otherwise.
        spec_path = os.path.join(spec_dir,
                                 "%s.spec" % self.distribution.get_name())
        self.execute(write_file,
                     (spec_path,
                      self._make_spec_file()),
                     "writing '%s'" % spec_path)

        if self.spec_only: # stop if requested
            return

        # Make a source distribution and copy to SOURCES directory with
        # optional icon.
        sdist = self.reinitialize_command('sdist')
        if self.use_bzip2:
            sdist.formats = ['bztar']
        else:
            sdist.formats = ['gztar']
        self.run_command('sdist')

        source = sdist.get_archive_files()[0]
        source_dir = rpm_dir['SOURCES']
        self.copy_file(source, source_dir)

        if self.icon:
            if os.path.exists(self.icon):
                self.copy_file(self.icon, source_dir)
            else:
                raise DistutilsFileError, \
                      "icon file '%s' does not exist" % self.icon


        # build package
        self.announce('building RPMs')
        rpm_cmd = ['rpm']
        if os.path.exists('/usr/bin/rpmbuild') or \
           os.path.exists('/bin/rpmbuild'):
            rpm_cmd = ['rpmbuild']
        if self.source_only: # what kind of RPMs?
            rpm_cmd.append('-bs')
        elif self.binary_only:
            rpm_cmd.append('-bb')
        else:
            rpm_cmd.append('-ba')
        if self.rpm3_mode:
            rpm_cmd.extend(['--define',
                            '_topdir %s/%s' % (os.getcwd(), self.rpm_base),])
        if not self.keep_temp:
            rpm_cmd.append('--clean')
        rpm_cmd.append(spec_path)
        self.spawn(rpm_cmd)

        # XXX this is a nasty hack -- we really should have a proper way to
        # find out the names of the RPM files created; also, this assumes
        # that RPM creates exactly one source and one binary RPM.
        if not self.dry_run:
            if not self.binary_only:
                srpms = glob.glob(os.path.join(rpm_dir['SRPMS'], "*.rpm"))
                assert len(srpms) == 1, \
                       "unexpected number of SRPM files found: %s" % srpms
                self.move_file(srpms[0], self.dist_dir)

            if not self.source_only:
                rpms = glob.glob(os.path.join(rpm_dir['RPMS'], "*/*.rpm"))
                assert len(rpms) == 1, \
                       "unexpected number of RPM files found: %s" % rpms
                self.move_file(rpms[0], self.dist_dir)

    # run()


    def _make_spec_file(self):
        """Generate the text of an RPM spec file and return it as a
        list of strings (one per line).
        """
        # definitions and headers
        spec_file = [
            '%define name ' + self.distribution.get_name(),
            '%define version ' + self.distribution.get_version(),
            '%define release ' + self.release,
            '',
            'Summary: ' + self.distribution.get_description(),
            ]

        # put locale summaries into spec file
        # XXX not supported for now (hard to put a dictionary
        # in a config file -- arg!)
        #for locale in self.summaries.keys():
        #    spec_file.append('Summary(%s): %s' % (locale,
        #                                          self.summaries[locale]))

        spec_file.extend([
            'Name: %{name}',
            'Version: %{version}',
            'Release: %{release}',])

        # XXX yuck! this filename is available from the "sdist" command,
        # but only after it has run: and we create the spec file before
        # running "sdist", in case of --spec-only.
        if self.use_bzip2:
            spec_file.append('Source0: %{name}-%{version}.tar.bz2')
        else:
            spec_file.append('Source0: %{name}-%{version}.tar.gz')

        spec_file.extend([
            'Copyright: ' + self.distribution.get_license(),
            'Group: ' + self.group,
            'BuildRoot: %{_tmppath}/%{name}-buildroot',
            'Prefix: %{_prefix}', ])

        # noarch if no extension modules
        if not self.distribution.has_ext_modules():
            spec_file.append('BuildArchitectures: noarch')

        for field in ('Vendor',
                      'Packager',
                      'Provides',
                      'Requires',
                      'Conflicts',
                      'Obsoletes',
                      ):
            val = getattr(self, string.lower(field))
            if type(val) is ListType:
                spec_file.append('%s: %s' % (field, string.join(val)))
            elif val is not None:
                spec_file.append('%s: %s' % (field, val))


        if self.distribution.get_url() != 'UNKNOWN':
            spec_file.append('Url: ' + self.distribution.get_url())

        if self.distribution_name:
            spec_file.append('Distribution: ' + self.distribution_name)

        if self.build_requires:
            spec_file.append('BuildRequires: ' +
                             string.join(self.build_requires))

        if self.icon:
            spec_file.append('Icon: ' + os.path.basename(self.icon))

        spec_file.extend([
            '',
            '%description',
            self.distribution.get_long_description()
            ])

        # put locale descriptions into spec file
        # XXX again, suppressed because config file syntax doesn't
        # easily support this ;-(
        #for locale in self.descriptions.keys():
        #    spec_file.extend([
        #        '',
        #        '%description -l ' + locale,
        #        self.descriptions[locale],
        #        ])

        # rpm scripts
        # figure out default build script
        def_build = "%s setup.py build" % self.python
        if self.use_rpm_opt_flags:
            def_build = 'env CFLAGS="$RPM_OPT_FLAGS" ' + def_build

        # insert contents of files

        # XXX this is kind of misleading: user-supplied options are files
        # that we open and interpolate into the spec file, but the defaults
        # are just text that we drop in as-is. Hmmm.

        script_options = [
            ('prep', 'prep_script', "%setup"),
            ('build', 'build_script', def_build),
            ('install', 'install_script',
             ("%s setup.py install "
              "--root=$RPM_BUILD_ROOT "
              "--record=INSTALLED_FILES") % self.python),
            ('clean', 'clean_script', "rm -rf $RPM_BUILD_ROOT"),
            ('pre', 'pre_install', None),
            ('post', 'post_install', None),
            ('preun', 'pre_uninstall', None),
            ('postun', 'post_uninstall', None),
        ]

        for (rpm_opt, attr, default) in script_options:
            # Insert contents of file referred to, if no file is refered to
            # use 'default' as contents of script
            val = getattr(self, attr)
            if val or default:
                spec_file.extend([
                    '',
                    '%' + rpm_opt,])
                if val:
                    spec_file.extend(string.split(open(val, 'r').read(), '\n'))
                else:
                    spec_file.append(default)


        # files section
        spec_file.extend([
            '',
            '%files -f INSTALLED_FILES',
            '%defattr(-,root,root)',
            ])

        if self.doc_files:
            spec_file.append('%doc ' + string.join(self.doc_files))

        if self.changelog:
            spec_file.extend([
                '',
                '%changelog',])
            spec_file.extend(self.changelog)

        return spec_file

    # _make_spec_file ()

    def _format_changelog(self, changelog):
        """Format the changelog correctly and convert it to a list of strings
        """
        if not changelog:
            return changelog
        new_changelog = []
        for line in string.split(string.strip(changelog), '\n'):
            line = string.strip(line)
            if line[0] == '*':
                new_changelog.extend(['', line])
            elif line[0] == '-':
                new_changelog.append(line)
            else:
                new_changelog.append(' ' + line)

        # strip trailing newline inserted by first changelog entry
        if not new_changelog[0]:
            del new_changelog[0]

        return new_changelog

    # _format_changelog()

# class bdist_rpm

--=-=-=--

From ryanwilcox@mac.com Sun Apr 13 23:09:40 2003
From: ryanwilcox@mac.com (Ryan Wilcox)
Date: Sun, 13 Apr 2003 18:09:40 -0400
Subject: [XML-SIG] Can't Build 0.8.2 on Jaguar?
Message-ID: <9F99A43A-6DFC-11D7-91F0-000502BD4C9B@mac.com>

Hi folks. I just joined the list, so if this is a common question I'm
sorry.

I was trying to build PyXML 0.8.2 today on my OS X (Jaguar) machine and
got an error:

% python setup.py help
Traceback (most recent call last):
  File "setup.py", line 58, in ?
    if sys.platform[:6] == "darwin" and \
NameError: name 'distutils' is not defined

I tried this with the Apple supplied Python, and Python 2.2.2 with the
same result.

Can you shed some light on this error?

Thank you _very_ much,
-Ryan Wilcox

From cstrong@arielpartners.com Mon Apr 14 01:51:36 2003
From: cstrong@arielpartners.com (Craeg K Strong)
Date: Sun, 13 Apr 2003 20:51:36 -0400
Subject: [XML-SIG] Redhat RPM for PyXML 0.8.2?
In-Reply-To: <m3r886x2nc.fsf@mira.informatik.hu-berlin.de>
References: <3E7E94EE.6030706@arielpartners.com> <m37kapcls7.fsf@mira.informatik.hu-berlin.de> <3E98E1B2.9010803@arielpartners.com> <m3r886x2nc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3E9A0618.9050201@arielpartners.com>

Martin v. Löwis wrote:

>Craeg K Strong <cstrong@arielpartners.com> writes:
>
>>$ python setup.py bdist_rpm
>>
>>>building RPMs
>>>rpm -ba --define _topdir /opt/PyXML-0.8.2/build/bdist.linux-i686/rpm
>>>--clean build/bdist.linux-i686/rpm/SPECS/PyXML.spec
>>>-ba: unknown option
>>>error: command 'rpm' failed with exit status 1
>>>$ rpm --version
>>>RPM version 4.1
>>>
>Redhat decided to kill "rpm -ba", and asks everybody
>to use "rpmbuild" instead. This is fixed in recent distutils; please
>replace bdist_rpm with the file attached.
>

I did as you said, and using Python 2.1.3 got the following error:

python setup.py bdist_rpm
running bdist_rpm
Traceback (most recent call last):
  File "setup.py", line 227, in ?
    scripts = ['scripts/xmlproc_parse', 'scripts/xmlproc_val']
  File "/opt/Zope-2.6.1-linux2-x86/lib/python2.1/distutils/core.py", line 138, in setup
    dist.run_commands()
  File "/opt/Zope-2.6.1-linux2-x86/lib/python2.1/distutils/dist.py", line 899, in run_commands
    self.run_command(cmd)
  File "/opt/Zope-2.6.1-linux2-x86/lib/python2.1/distutils/dist.py", line 919, in run_command
    cmd_obj.run()
  File "/opt/zope/lib/python2.1/distutils/command/bdist_rpm.py", line 252, in run
    (spec_path,
  File "/opt/zope/lib/python2.1/distutils/command/bdist_rpm.py", line 352, in _make_spec_file
    spec_file.extend([
AttributeError: Distribution instance has no attribute 'get_license'

I went ahead and changed line 353 in bdist_rpm.py from

    'Copyright: ' + self.distribution.get_license(),

to

    'Copyright: n/a',

and it worked. It looked like the older version of bdist_rpm.py would
have had the same problem, however, as it also calls get_license().
Any ideas?

--Craeg

From jasj@miller.cs.uwm.edu Mon Apr 14 02:07:02 2003
From: jasj@miller.cs.uwm.edu (Jason Michael Jurkowski)
Date: Sun, 13 Apr 2003 20:07:02 -0500 (CDT)
Subject: [XML-SIG] Can't Build 0.8.2 on Jaguar?
In-Reply-To: <9F99A43A-6DFC-11D7-91F0-000502BD4C9B@mac.com>
Message-ID: <Pine.GSO.4.33.0304132005530.2142-100000@miller.cs.uwm.edu>

If you go to the URL below, you'll find my proposed patch for this
problem.

http://mail.python.org/pipermail/xml-sig/2003-March/009181.html

On Sun, 13 Apr 2003, Ryan Wilcox wrote:

> Hi folks. I just joined the list, so if this is a common question I'm
> sorry.
>
> I was trying to build PyXML 0.8.2 today on my OS X (Jaguar) machine and
> got an error:
>
> % python setup.py help
> Traceback (most recent call last):
>   File "setup.py", line 58, in ?
>     if sys.platform[:6] == "darwin" and \
> NameError: name 'distutils' is not defined
>
> I tried this with the Apple supplied Python, and Python 2.2.2 with the
> same result.
>
> Can you shed some light on this error?
>
> Thank you _very_ much,
> -Ryan Wilcox
>
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig
>

From martin@v.loewis.de Mon Apr 14 06:29:31 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 14 Apr 2003 07:29:31 +0200
Subject: [XML-SIG] Redhat RPM for PyXML 0.8.2?
In-Reply-To: <3E9A0618.9050201@arielpartners.com>
References: <3E7E94EE.6030706@arielpartners.com> <m37kapcls7.fsf@mira.informatik.hu-berlin.de> <3E98E1B2.9010803@arielpartners.com> <m3r886x2nc.fsf@mira.informatik.hu-berlin.de> <3E9A0618.9050201@arielpartners.com>
Message-ID: <m3fzold4uc.fsf@mira.informatik.hu-berlin.de>

Craeg K Strong <cstrong@arielpartners.com> writes:

> AttributeError: Distribution instance has no attribute 'get_license'
>
> I went ahead and changed line 353 in bdist_rpm.py from
>
>     'Copyright: ' + self.distribution.get_license(),
>
> to
>
>     'Copyright: n/a',
>
> and it worked. It looked like the older version of bdist_rpm.py would
> have had the same problem, however, as it also calls get_license().
> Any ideas?

It's surprising you get this error. In distutils/dist.py, I see the
function

    def get_license(self):
        return self.license or "UNKNOWN"
    get_licence = get_license

which always gives a value. Do you have no function get_license in
dist.py?

Regards,
Martin

From noreply@sourceforge.net Mon Apr 14 16:16:33 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 14 Apr 2003 08:16:33 -0700
Subject: [XML-SIG] [ pyxml-Patches-721163 ] xml.xpath type equivalence issue
Message-ID: <E1955hB-0000Oe-00@sc8-sf-web2.sourceforge.net>

Patches item #721163, was opened at 2003-04-14 17:16
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=721163&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Paul Boddie (pboddie)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.xpath type equivalence issue

Initial Comment:
In ParsedRelativeLocationPath (in PyXML 0.8.2), there is a
type equivalence test which seems to be outdated. The
supplied "unified diff" patch provides a workaround rather
than a definitive solution, since I don't know exactly what
kind of test should be used.

This issue arose when I used an XPath expression of the
form "//element/subelement". Apparently, the
representation of node lists has changed enough to break
this code.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=721163&group_id=6473

From cstrong@arielpartners.com Mon Apr 14 19:07:21 2003
From: cstrong@arielpartners.com (Craeg Strong)
Date: Mon, 14 Apr 2003 14:07:21 -0400
Subject: [XML-SIG] Re: Re: [XML-SIG] Redhat RPM for PyXML 0.8.2?
Message-ID: <200304141816.h3EIG1Yt003676@conversent.net>

> Craeg K Strong <cstrong@arielpartners.com> writes:
>
> > AttributeError: Distribution instance has no attribute 'get_license'

> In distutils/dist.py, I see the
> function
>
> def get_license(self):
> return self.license or "UNKNOWN"
> get_licence = get_license
>
> which always gives a value. Do you have no function get_license in
> dist.py?
Yes, this function does not exist in Python 2.1.3 (win2K) __revision__ = "$Id: dist.py,v 1.47 2001/03/31 02:41:01 akuchling Exp $" I am sure that upgrading to Python 2.2.2 would fix this problem, and I will certainly do this as soon as Zope 2.7 is released... --Craeg > > Regards, > Martin > From martin@v.loewis.de Mon Apr 14 21:02:43 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 14 Apr 2003 22:02:43 +0200 Subject: [XML-SIG] Re: Re: [XML-SIG] Redhat RPM for PyXML 0.8.2? In-Reply-To: <200304141816.h3EIG1Yt003676@conversent.net> References: <200304141816.h3EIG1Yt003676@conversent.net> Message-ID: <m3ptnovod8.fsf@mira.informatik.hu-berlin.de> Craeg Strong <cstrong@arielpartners.com> writes: > Yes, this function does not exist in Python 2.1.3 (win2K) > __revision__ > I am sure that upgrading to Python 2.2.2 would fix this > problem, and I will certainly do this as soon as Zope > 2.7 is released... Ah, ok. So I guess you are more-or-less out of luck. We simply cannot support building RPMs on Python 2.1, and Redhat 8: You either need an older RPM package (where rpm -ba still works), or a newer Python (where rpmbuild is used). Replacing distutils wholesale should probably work, as well, but no distutils release is forthcoming. Regards, Martin From fss@forensicstrategy.com Mon Apr 14 22:16:59 2003 From: fss@forensicstrategy.com (Forensic Strategy Newsletter) Date: Mon, 14 Apr 2003 17:16:59 -0400 Subject: [XML-SIG] Forensic Strategy Data Recovery Newsletter: Vol 1 Issue 1 Message-ID: <1050355019.993@forensicstrategy.com> *********************************************************************** Forensic Strategy Data Recovery Newsletter Vol. 1, Issue 1 *********************************************************************** --------- EDITOR'S NOTE ----------------------------------------------- The intent of this newsletter is to educate and inform attorneys about basic computer forensics for cases that involve personal computers or computer evidence. 
Utilizing the services of a computer forensics specialist can eliminate problems that often occur when forensics is of significant importance to a case: timing, the handling of the data and the possibility of evidence being destroyed. -------- IN THIS ISSUE: ----------------------------------------------- 1. COMMENTARY - Computer Forensics 101: What is Computer Forensics? 2. SPONSOR - Varidev Technology Solutions 3. UPCOMING NEWSLETTER ISSUES - Items you can look forward to in future issues! 4. CONTACT US - For more information on Forensic Strategy Services. ----------------------------------------------------------------------- 1. ==== COMMENTARY ==== * COMPUTER FORENSICS 101: What is Computer Forensics? By: Scott Moulton, Computer Forensic Specialist mailto:scott@forensicfirm.com Forensics, as it relates to computers and data, is the collection and preservation of data to investigate or establish facts for any type of legal purpose. For each case, computer forensics can contain many different types of material and can be gathered from dozens of sources. Information can be limited to what exists on a hard drive and may even include data from the Internet, tapes, CDs, disks or printouts made by a specific computer. Computer forensics is an emerging specialty that has no defined criteria. This makes it difficult to find a person with the knowledge, experience and skills needed to be an expert in this area. Colleges are beginning to recognize this as a growing field and are adding degrees and certification programs to their curriculum. With the speed at which the computer industry changes, it is often a struggle for the legal profession to keep up with all of the new laws established to convict criminals who use technology as a weapon. It is equally challenging to locate a knowledgeable computer specialist that has the interest, expertise and skills in fields other than computer science. 
Consequently, a computer forensic specialist who has skills in other disciplines such as accounting and/or law, will deliver better results meaning more useful and credible evidence for you. Methodologies are a set of processes that can be applied to any situation. While the tools or items used to lay the groundwork for the discovery phase may vary, the methodology remains the same. Some of these methods are still being developed in the area of computer forensics. Changes are frequent because of new laws that require the way processes are completed. Other changes are due to an ever-evolving technology and the ability to completely remove two or three processes with new software or hardware. Qualified computer forensic specialists will spend considerable time staying in front of the new technology curve. It takes an extreme amount of work to keep up with the changes in the computing industry, as well as, issues involving the law. This is the type of expertise you should seek for assistance with cases requiring computer forensics. Most lawyers have little knowledge about computers and will need guidance as a case develops. They will continually need to discuss the case with a computer forensic specialist and review new material even when it seems unnecessary. When dealing with computers and data, the process of understanding what is achievable and what isn't requires an advanced understanding of technology generally not found outside the professional computer security community. Not only must the computer forensic specialist assist the attorney with what can be done but they must also stand as a credible witness under the pressure and scrutiny of cross examination. During the discovery phase of a case, being a forensic computer specialist can be compared to being a Private Investigator, only the subject matter is mainly dealing with computers and electronic data. Discovery often involves several passes at the data. 
As new facts are revealed about the case, the old data will need to be reviewed to see what has been discovered and how it is applicable to the case. In some cases, knowing what happened is more important than the actual data itself. Example #1: In a divorce case, a court order was given to the husband with instructions not to delete or destroy any data. The computer was to be picked up by a forensic investigator and reviewed for evidence per the court order. The husband promptly went home and deleted everything on the computer he thought would be incriminating. After examining the computer, it was proven that he purposely deleted data after the court order. Since he violated the court order, this case could have easily escalated into more than just a divorce case for the husband. When the opposing attorney confronted the husband with this fact, the husband quickly decided to settle out of court and agreed to his soon to be ex-wife's demands. Example #2: The majority of work is often discovering how to look at the information and display it so that it makes sense to laymen. This also includes educating the attorney about the technical details so they can decide how to approach the case. It is of no value if the information is so complex that it can not be explained clearly. In a recent case, a CD was stolen from a company. During the discovery period of the case, the defendant was ordered to make an EXACT copy of the original CD and deliver it to the plaintiff the same day. It was noted that one of the files had been changed on the CD. On the CD there were several files that amounted to 500 megabytes. This brand of CD was only able to hold 650 megabytes. The specific file in question was a 200 megabyte file. The defendants claim was that the CD was a CDRW (ReWritable CD) and that the file changed while viewing the CD. In this instance the changed file could not overwrite the existing file, but would be appended to the CD. 
As there was only 150 megabytes left, there was not enough space to append a 200 megabyte file. The defendant would have needed another 50 megabytes in order to make a change to the file on the same CD. Therefore, this was not an exact copy of the same CD that was taken. Only a computer specialist with experience with a ReWritable CD would have realized this was not possible. The opposing attorney initially accepted the explanation; however, the computer specialist on the team revealed that evidence had been tampered with. More examples and experiences will be discussed in future issues. If you are interested and would like to continue to receive our newsletter, please see our website to sign up for a FREE subscription at: http://www.forensicstrategy.com/contacts.asp ----------------------------------------------------------------------- -------- Sponsored by Varidev Technology Solutions -------------------- Varidev Technology Solutions can develop solutions to help your business operate more efficiently. Varidev is your complete business technology resource for front-end and back-end database development using Microsoft .NET Technology. Varidev has made operations much more efficient for companies like Six Flags and Georgia Pacific, and they can do it for you. Check out amazing demos at http://www.varidev.com ----------------------------------------------------------------------- 3. ==== UPCOMING NEWSLETTER ISSUES ==== * What items are usually found in data recovered * Equipment used for Forensic Storage of Data * Details of Forensic Data Gathering 4. ==== CONTACT US ==== * TECHNICAL QUESTIONS: mailto:info@forensicstrategy.com * COMMENTS OR QUESTIONS ABOUT THIS NEWSLETTER: To suggest a topic for a future issue or to send a comment to the editor email: mailto:comments@forensicstrategy.com * WEBSITE: http://www.forensicstrategy.com * MAILING ADDRESS/PHONE/FAX: Forensic Strategy Services, LLC. 
601B Industrial Court Woodstock, Georgia 30189 ph: 770.926.5588 fax: 770.926.7089 * WOULD YOUR COMPANY LIKE TO SPONSOR A FORENSIC STRATEGY DATA RECOVERY NEWSLETTER? Send us an email at mailto:sponsor@forensicstrategy.com ----------------------------------------------------------------------- To receive the latest information about forensic computer technology and news SUBSCRIBE to our FREE email newsletter: http://www.forensicstrategy.com/contacts.asp Thank you for reading The Forensic Strategy Data Recovery Newsletter. __________________________________________________________ Forensic Strategy Services, LLC. 2003 From noreply@sourceforge.net Tue Apr 15 21:43:05 2003 From: noreply@sourceforge.net (SourceForge.net) Date: Tue, 15 Apr 2003 13:43:05 -0700 Subject: [XML-SIG] [ pyxml-Bugs-722097 ] newlines in attribute values not escaped when writing Message-ID: <E195XGj-0001Yw-00@sc8-sf-web1.sourceforge.net> Bugs item #722097, was opened at 2003-04-15 12:43 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=722097&group_id=6473 Category: SAX Group: None Status: Open Resolution: None Priority: 5 Submitted By: Greg Chapman (glchapman) Assigned to: Nobody/Anonymous (nobody) Summary: newlines in attribute values not escaped when writing Initial Comment: Neither xml.sax.writer.XmlWriter (nor PrettyPrinter) nor sax.saxutils.XmlGenerator escapes newlines embedded in attribute values (the values in the attrs passed to startElement). I am no expert on XML, but I believe they should be escaped (see, for example, dom.ext.Printer which apparently does escape them in TranslateCdataAttr). In general, I wonder if there should be some way to specify additional entities which can be passed to the calls to sax.saxutils.escape made by the different sax outputters? 
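On the closing question: in the standard library, xml.sax.saxutils.escape accepts an optional mapping of extra replacements as its second argument, applied after the usual &, < and > escaping. A minimal sketch under Python 3 follows. The helper name escape_attr_value and the choice of replacements are assumptions made for illustration, and whether the SAX writers should do this by default is exactly what this report leaves open.

```python
from xml.sax.saxutils import escape

# Extra replacements applied after the standard &, <, > escaping.
# Character references stop a parser from normalizing the whitespace
# to a plain space when the attribute value is read back.
ATTR_EXTRA = {"\n": "&#10;", "\r": "&#13;", "\t": "&#9;", '"': "&quot;"}


def escape_attr_value(value):
    """Hypothetical helper: escape text for a double-quoted attribute value."""
    return escape(value, ATTR_EXTRA)


if __name__ == "__main__":
    print('<note text="%s"/>' % escape_attr_value('line one\nline "two" & more'))
```

The ampersand is escaped first, so the character references added by the mapping are not escaped a second time.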
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=722097&group_id=6473

From Willems.luc@pandora.be Thu Apr 17 18:51:56 2003
From: Willems.luc@pandora.be (luc willems)
Date: Thu, 17 Apr 2003 19:51:56 +0200
Subject: [XML-SIG] Empty dom class returned after parsing a valid XML file
Message-ID: <200304171951.57335.Willems.luc@pandora.be>

Hello all,

I have a problem with the pyexpat and Sax2 readers of the PyXML 0.8.2 package.

I developed a small script that manipulates some XML files on my development system (SuSE 8.1, Python 2.2.1, PyXML 0.8.2). This script, together with the test XML files, worked fine there. But when I moved the script to a Solaris 2.8 machine with a custom-built Python 2.2.2 and PyXML 0.8.2, it stopped working.

To be precise, the following code returns 'None' instead of a DOM object for the "dom" variable:

---------------------------------------------------------------------------------
# open input file or use STDIN
if opts.has_key("--in-file"):
    try:
        fi = open(opts["--in-file"], 'rb')
    except:
        # print help information and exit:
        (exctype,value,trace)=sys.exc_info()
        showerror("FILE",value)
        sys.exit(2)
else:
    fi = sys.stdin

# Try parsing the input file, exit if problems arise
try:
    reader = PyExpat.Reader()
    dom = reader.fromStream(fi)    # <---- returns 'None' on Solaris !!!!
except:
    (exctype,value,trace)=sys.exc_info()
    showerror("XML",value)
    sys.exit(3)

if dom == None:
    showerror("EXPAT","No DOM class after parsing")
    sys.exit(3)

- xml file ------------------------------------
<?xml version='1.0' encoding='UTF-8'?>
<config>
  <master port='12345' host='master.ikke.be'/>
  <node type='slave'/>
  <snmp secret='you-dont-know-it' host='mieke.be'/>
  <statepath>/tmp</statepath>
  <test><![CDATA[function matchwo(a,b)]]></test>
</config>
-----------------------------------------------

When I run this code with a simple XML file, it returns 'None' on Solaris and a parsed DOM on my Linux system. I tried both the PyExpat and the Sax2 reader; both give the same problem.

Note: Python and PyXML are both compiled with gcc-3.2.2 on Solaris.

luc

From Juhapekka Tolvanen <juhtolv@iki.fi> Sun Apr 20 14:48:06 2003
From: Juhapekka Tolvanen <juhtolv@iki.fi> (Juhapekka Tolvanen)
Date: Sun, 20 Apr 2003 16:48:06 +0300
Subject: [XML-SIG] I use XBEL!!!
Message-ID: <20030420134806.GA12345@verso.st.jyu.fi>

Here is my link page. It is partially based on XBEL-technology.

http://iki.fi/links/

And it is free as a bird! I would like you to mention my link collection here:

http://pyxml.sourceforge.net/topics/xbel/

* * *

I have some software ideas that would exploit XBEL-technology:

HTML2XBEL: Take any HTML file, pick all links from it and then write them out as an XBEL file.

XBEL-verifier: Take an XBEL file. Check the validity of its URLs. If some URL leads to a WWW page with a redirection, then update that URL in the original XBEL file (except when explicitly forbidden in a configuration file (see this redirection service: http://iki.fi/)). If some link does not work, record it in a separate database. If some link seems nonfunctional ten times, remove it completely.

Would you please implement these ideas as free software?
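The HTML2XBEL idea above can be sketched with the Python 3 standard library alone. This is only an illustration under stated assumptions: the names LinkCollector and html_to_xbel are made up here, the output is a flat list of bookmark elements with no folders, and the XBEL DOCTYPE declaration is omitted.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace; fall back to the URL as title.
            title = " ".join("".join(self._text).split())
            self.links.append((self._href, title or self._href))
            self._href = None


def html_to_xbel(html_text):
    """Return an XBEL document (as a string) listing the links in html_text."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    root = ET.Element("xbel", version="1.0")
    for href, title in collector.links:
        bookmark = ET.SubElement(root, "bookmark", href=href)
        ET.SubElement(bookmark, "title").text = title
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_xbel('<p><a href="http://iki.fi/juhtolv/links/">My links</a></p>'))
```

Anchors without an href attribute are skipped. The XBEL-verifier idea would need network access and persistent state, so it is not sketched here.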
--
Juhapekka "naula" Tolvanen * * http colon slash slash iki dot fi slash juhtolv
"Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem." Cicero

From Juhapekka Tolvanen <juhtolv@iki.fi> Sun Apr 20 14:51:13 2003
From: Juhapekka Tolvanen <juhtolv@iki.fi> (Juhapekka Tolvanen)
Date: Sun, 20 Apr 2003 16:51:13 +0300
Subject: [XML-SIG] Re: I use XBEL!!!
In-Reply-To: <20030420134806.GA12345@verso.st.jyu.fi>
References: <20030420134806.GA12345@verso.st.jyu.fi>
Message-ID: <20030420135113.GA12664@verso.st.jyu.fi>

On Sun, 20 Apr 2003, +16:49:47 EEST (UTC +0300), Juhapekka Tolvanen <juhtolv@cc.jyu.fi> pressed some keys:

> Here is my link page. It is partially based on XBEL-technology.
>
> http://iki.fi/links/

Argh! This is the right URL:

http://iki.fi/juhtolv/links/

--
Juhapekka "naula" Tolvanen * * http colon slash slash iki dot fi slash juhtolv
"Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem." Cicero

From jh@web.de Sun Apr 20 14:54:05 2003
From: jh@web.de (Juergen Hermann)
Date: Sun, 20 Apr 2003 15:54:05 +0200
Subject: [XML-SIG] I use XBEL!!!
In-Reply-To: <20030420134806.GA12345@verso.st.jyu.fi>
Message-ID: <E197FGH-00048z-00@smtp.web.de>

>Take XBEL-file. Check validity of its URLs.

See http://twistedmatrix.com/users/jh.twistd/python/moin.cgi/J_fcrgenHermann

Ciao, Jürgen

From noreply@sourceforge.net Mon Apr 21 15:16:33 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 21 Apr 2003 07:16:33 -0700
Subject: [XML-SIG] [ pyxml-Bugs-725010 ] Script text not contained in script element
Message-ID: <E197c5x-0000Nn-00@sc8-sf-web2.sourceforge.net>

Bugs item #725010, was opened at 2003-04-21 10:16
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=725010&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: vincent marchetti (vincemarch)
Assigned to: Nobody/Anonymous (nobody)
Summary: Script text not contained in script element

Initial Comment:
This appears in PyXML 0.8.2, the xml.dom.html module.

Because the list under the key 'script' in the dictionary xml.dom.html.HTML_DTD is empty (as defined in the module), when an html file is parsed into a DOM tree the contents of a script element appear as a sibling of the script element, not as a child.

The following fixes this behavior at run-time:

(before parsing file)
...
xml.dom.html.HTML_DTD['script'].append('#PCDATA')
...

My own reading of the HTML 4.01 DTD (http://www.w3.org/TR/REC-html40/interact/scripts.html#edef-SCRIPT) indicates this is the correct behavior.
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=725010&group_id=6473

From reagle@mit.edu Mon Apr 21 16:36:06 2003
From: reagle@mit.edu (Joseph Reagle)
Date: Mon, 21 Apr 2003 11:36:06 -0400
Subject: [XML-SIG] pyxml minidom: I can remove and append, but not replace
Message-ID: <200304211135.59621.reagle@mit.edu>

--Boundary-00=_m/Ap+NmbP9tRrN5
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

I have a snippet that replaces a title element with a modified one (based on an Amazon query), if that title element already exists. However, I can't do an insertBefore or replaceChild for some odd reason:

    for property in book_dtd: # step through the DTD
        feature = _get_feature(property,book)
        if property == "title":
            r_title = doc.createElementNS(book_ns,"title")
            r_title.appendChild(doc.createTextNode(title))
            if feature:
                book.appendChild(r_title)
                book.removeChild(feature)
                # book.replaceChild(feature,r_title)
            else:
                book.appendChild(r_title)

If I comment out the "book.{append,remove}Child" calls and uncomment the book.replaceChild call, I get the following error:

    Traceback (most recent call last):
      File "./pybook.py", line 163, in ?
        bookAugment(doc)
      File "./pybook.py", line 100, in bookAugment
        book.replaceChild(feature,r_title)
      File "/usr/lib/python2.2/site-packages/_xmlplus/dom/minidom.py", line 145, in replaceChild
        raise xml.dom.NotFoundErr()
    xml.dom.NotFoundErr: Node does not exist in this context

The node is using the same namespace, and I've used replaceChild in other cases (where I'm actually using a node from a different XML instance!)
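For reference, the DOM signature is replaceChild(newChild, oldChild), with the replacement node first. A minimal sketch using the standard-library xml.dom.minidom under Python 3 shows both argument orders. The one-book document and the helper build() are made up for illustration and are not taken from pybook.py.

```python
import xml.dom
from xml.dom import minidom


def build():
    """Return (book, old_title, new_title) for a fresh one-book document."""
    doc = minidom.parseString("<book><title>Old Title</title></book>")
    book = doc.documentElement
    old_title = book.getElementsByTagName("title")[0]
    new_title = doc.createElement("title")
    new_title.appendChild(doc.createTextNode("New Title"))
    return book, old_title, new_title


# Correct order: the new node first, then the child being replaced.
book, old_title, new_title = build()
book.replaceChild(new_title, old_title)
replaced_xml = book.toxml()

# Swapped order: new_title is not a child of book, so minidom
# cannot find the node it was asked to replace.
book, old_title, new_title = build()
try:
    book.replaceChild(old_title, new_title)
    swapped_raised = False
except xml.dom.NotFoundErr:
    swapped_raised = True

if __name__ == "__main__":
    print(replaced_xml)
    print("swapped order raised NotFoundErr:", swapped_raised)
```

A fresh document is built for each call because, at least in current CPython minidom, the failed call may already have detached its first argument from the parent before raising.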
.py and test file attached --Boundary-00=_m/Ap+NmbP9tRrN5 Content-Type: text/x-python; charset="us-ascii"; name="amazon.py" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="amazon.py" """Python wrapper for Amazon web APIs This module allows you to access Amazon's web APIs, to do things like search Amazon and get the results programmatically. Described here: http://www.amazon.com/webservices You need a Amazon-provided license key to use these services. Follow the link above to get one. These functions will look in several places (in this order) for the license key: - the "license_key" argument of each function - the module-level LICENSE_KEY variable (call setLicense once to set it) - an environment variable called AMAZON_LICENSE_KEY - a file called ".amazonkey" in the current directory - a file called "amazonkey.txt" in the current directory - a file called ".amazonkey" in your home directory - a file called "amazonkey.txt" in your home directory - a file called ".amazonkey" in the same directory as amazon.py - a file called "amazonkey.txt" in the same directory as amazon.py Sample usage: >>> import amazon >>> amazon.setLicense('...') # must get your own key! >>> pythonBooks = amazon.searchByKeyword('Python') >>> pythonBooks[0].ProductName u'Learning Python (Help for Programmers)' >>> pythonBooks[0].URL ... >>> pythonBooks[0].OurPrice ... 
Other available functions: - browseBestSellers - searchByASIN - searchByUPC - searchByAuthor - searchByArtist - searchByActor - searchByDirector - searchByManufacturer - searchByListMania - searchSimilar - searchByWishlist Other usage notes: - Most functions can take product_line as well, see source for possible values - All functions can take type="lite" to get less detail in results - All functions can take page=N to get second, third, fourth page of results - All functions can take license_key="XYZ", instead of setting it globally - All functions can take http_proxy="http://x/y/z" which overrides your system setting """ __author__ = "Mark Pilgrim (f8dy@diveintomark.org)" __version__ = "0.5" __cvsversion__ = "$Revision: 1.6 $"[11:-2] __date__ = "$Date: 2003/04/04 03:08:34 $"[7:-2] __copyright__ = "Copyright (c) 2002 Mark Pilgrim" __license__ = "Python" # Powersearch and return object type fix by Joseph Reagle <geek@goatee.net> from xml.dom import minidom import os, sys, getopt, cgi, urllib try: import timeoutsocket # http://www.timo-tasi.org/python/timeoutsocket.py timeoutsocket.setDefaultSocketTimeout(10) except ImportError: pass LICENSE_KEY = None HTTP_PROXY = None # don't touch the rest of these constants class AmazonError(Exception): pass class NoLicenseKey(Exception): pass _amazonfile1 = ".amazonkey" _amazonfile2 = "amazonkey.txt" _licenseLocations = ( (lambda key: key, 'passed to the function in license_key variable'), (lambda key: LICENSE_KEY, 'module-level LICENSE_KEY variable (call setLicense to set it)'), (lambda key: os.environ.get('AMAZON_LICENSE_KEY', None), 'an environment variable called AMAZON_LICENSE_KEY'), (lambda key: _contentsOf(os.getcwd(), _amazonfile1), '%s in the current directory' % _amazonfile1), (lambda key: _contentsOf(os.getcwd(), _amazonfile2), '%s in the current directory' % _amazonfile2), (lambda key: _contentsOf(os.environ.get('HOME', ''), _amazonfile1), '%s in your home directory' % _amazonfile1), (lambda key: 
_contentsOf(os.environ.get('HOME', ''), _amazonfile2), '%s in your home directory' % _amazonfile2), (lambda key: _contentsOf(_getScriptDir(), _amazonfile1), '%s in the amazon.py directory' % _amazonfile1), (lambda key: _contentsOf(_getScriptDir(), _amazonfile2), '%s in the amazon.py directory' % _amazonfile2) ) ## administrative functions def version(): print """PyAmazon %(__version__)s %(__copyright__)s released %(__date__)s """ % globals() ## utility functions def setLicense(license_key): """set license key""" global LICENSE_KEY LICENSE_KEY = license_key def getLicense(license_key = None): """get license key license key can come from any number of locations; see module docs for search order""" for get, location in _licenseLocations: rc = get(license_key) if rc: return rc raise NoLicenseKey, 'get a license key at http://www.amazon.com/webservices' def setProxy(http_proxy): """set HTTP proxy""" global HTTP_PROXY HTTP_PROXY = http_proxy def getProxy(http_proxy = None): """get HTTP proxy""" return http_proxy or HTTP_PROXY def getProxies(http_proxy = None): http_proxy = getProxy(http_proxy) if http_proxy: proxies = {"http": http_proxy} else: proxies = None return proxies def _contentsOf(dirname, filename): filename = os.path.join(dirname, filename) if not os.path.exists(filename): return None fsock = open(filename) contents = fsock.read() fsock.close() return contents def _getScriptDir(): if __name__ == '__main__': return os.path.abspath(os.path.dirname(sys.argv[0])) else: return os.path.abspath(os.path.dirname(sys.modules[__name__].__file__)) class Bag: pass def unmarshal(element): rc = Bag() if isinstance(element, minidom.Element) and (element.tagName == 'Details'): rc.URL = element.attributes["url"].value childElements = [e for e in element.childNodes if isinstance(e, minidom.Element)] if childElements: for child in childElements: key = child.tagName if hasattr(rc, key): if type(getattr(rc, key)) <> type([]): setattr(rc, key, [getattr(rc, key)]) setattr(rc, key, 
getattr(rc, key) + [unmarshal(child)]) elif child.tagName in ['Details']: # make the first Details element a list setattr(rc,key,[unmarshal(child)]) #dbg: because otherwise 'hasattr' only tests #dbg: on the second occurence: if there's a #dbg: single return to a query, it's not a #dbg: list. This module should always #dbg: return a list of Details objects. else: setattr(rc, key, unmarshal(child)) else: rc = "".join([e.data for e in element.childNodes if isinstance(e, minidom.Text)]) if element.tagName == 'SalesRank': rc = int(rc.replace(',', '')) return rc def buildURL(search_type, keyword, product_line, type, page, license_key): url = "http://xml.amazon.com/onca/xml?v=1.0&f=xml&t=webservices-20" url += "&dev-t=%s" % license_key.strip() url += "&type=%s" % type if page: url += "&page=%s" % page if product_line: url += "&mode=%s" % product_line url += "&%s=%s" % (search_type, urllib.quote(keyword)) #dbg: title searches with a ":" yields no results print url return url ## main functions def search(search_type, keyword, product_line, type="heavy", page=None, license_key = None, http_proxy = None, return_xml = 0): """search Amazon You need a license key to call this function; see http://www.amazon.com/webservices to get one. Then you can either pass it to this function every time, or set it globally; see the module docs for details. Parameters: keyword - keyword to search search_type - in (KeywordSearch, BrowseNodeSearch, AsinSearch, UpcSearch, AuthorSearch, ArtistSearch, ActorSearch, DirectorSearch, ManufacturerSearch, ListManiaSearch, SimilaritySearch) product_line - type of product to search for. 
restrictions based on search_type UpcSearch - in (music, classical) AuthorSearch - must be "books" ArtistSearch - in (music, classical) ActorSearch - in (dvd, vhs, video) DirectorSearch - in (dvd, vhs, video) ManufacturerSearch - in (electronics, kitchen, videogames, software, photo, pc-hardware) http_proxy (optional) - address of HTTP proxy to use for sending and receiving SOAP messages Returns: list of Bags, each Bag may contain the following attributes: Asin - Amazon ID ("ASIN" number) of this item Authors - list of authors Availability - "available", etc. BrowseList - list of related categories Catalog - catalog type ("Book", etc) CollectiblePrice - ?, format "$34.95" ImageUrlLarge - URL of large image of this item ImageUrlMedium - URL of medium image of this item ImageUrlSmall - URL of small image of this item Isbn - ISBN number ListPrice - list price, format "$34.95" Lists - list of ListMania lists that include this item Manufacturer - manufacturer Media - media ("Paperback", "Audio CD", etc) NumMedia - number of different media types in which this item is available OurPrice - Amazon price, format "$24.47" ProductName - name of this item ReleaseDate - release date, format "09 April, 1999" Reviews - reviews (AvgCustomerRating, plus list of CustomerReview with Rating, Summary, Content) SalesRank - sales rank (integer) SimilarProducts - list of Product, which is ASIN number ThirdPartyNewPrice - ?, format "$34.95" URL - URL of this item """ license_key = getLicense(license_key) url = buildURL(search_type, keyword, product_line, type, page, license_key) proxies = getProxies(http_proxy) u = urllib.FancyURLopener(proxies) usock = u.open(url) xmldoc = minidom.parse(usock) # from xml.dom.ext import PrettyPrint # PrettyPrint(xmldoc) usock.close() if return_xml: return xmldoc else: data = unmarshal(xmldoc).ProductInfo if hasattr(data, 'ErrorMsg'): raise AmazonError, data.ErrorMsg else: return data.Details def searchByKeyword(keyword, product_line="books", type="heavy", 
page=1, license_key=None, http_proxy=None): return search("KeywordSearch", keyword, product_line, type, page, license_key, http_proxy) def browseBestSellers(browse_node, product_line="books", type="heavy", page=1, license_key=None, http_proxy=None): return search("BrowseNodeSearch", browse_node, product_line, type, page, license_key, http_proxy) def searchByASIN(ASIN, type="heavy", license_key=None, http_proxy=None): return search("AsinSearch", ASIN, None, type, None, license_key, http_proxy) def searchByUPC(UPC, type="heavy", license_key=None, http_proxy=None): return search("UpcSearch", UPC, None, type, None, license_key, http_proxy) def searchByAuthor(author, type="heavy", page=1, license_key=None, http_proxy=None): return search("AuthorSearch", author, "books", type, page, license_key, http_proxy) def searchByArtist(artist, product_line="music", type="heavy", page=1, license_key=None, http_proxy=None): if product_line not in ("music", "classical"): raise AmazonError, "product_line must be in ('music', 'classical')" return search("ArtistSearch", artist, product_line, type, page, license_key, http_proxy) def searchByActor(actor, product_line="dvd", type="heavy", page=1, license_key=None, http_proxy=None): if product_line not in ("dvd", "vhs", "video"): raise AmazonError, "product_line must be in ('dvd', 'vhs', 'video')" return search("ActorSearch", actor, product_line, type, page, license_key, http_proxy) def searchByDirector(director, product_line="dvd", type="heavy", page=1, license_key=None, http_proxy=None): if product_line not in ("dvd", "vhs", "video"): raise AmazonError, "product_line must be in ('dvd', 'vhs', 'video')" return search("DirectorSearch", director, product_line, type, page, license_key, http_proxy) def searchByManufacturer(manufacturer, product_line="pc-hardware", type="heavy", page=1, license_key=None, http_proxy=None): if product_line not in ("electronics", "kitchen", "videogames", "software", "photo", "pc-hardware"): raise AmazonError, 
"product_line must be in ('electronics', 'kitchen', 'videogames', 'software', 'photo', 'pc-hardware')" return search("ManufacturerSearch", manufacturer, product_line, type, page, license_key, http_proxy) def searchByListMania(listManiaID, type="heavy", page=1, license_key=None, http_proxy=None): return search("ListManiaSearch", listManiaID, None, type, page, license_key, http_proxy) def searchSimilar(ASIN, type="heavy", page=1, license_key=None, http_proxy=None): return search("SimilaritySearch", ASIN, None, type, page, license_key, http_proxy) def searchByWishlist(wishlistID, type="heavy", page=1, license_key=None, http_proxy=None): return search("WishlistSearch", wishlistID, None, type, page, license_key, http_proxy) def searchByPower(keyword, product_line="books", type="heavy", page=1, license_key=None, http_proxy=None, return_xml = 0): return search("PowerSearch", keyword, product_line, type, page, license_key, http_proxy, return_xml) # >>> RecentKing = amazon.searchByPower('author:Stephen King and pubdate:2003') # >>> SnowCrash = amazon.searchByPower('title:Snow Crash') --Boundary-00=_m/Ap+NmbP9tRrN5 Content-Type: text/x-python; charset="us-ascii"; name="pybook.py" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="pybook.py" #!/usr/bin/env python # http://www.python.org/doc/current/lib/dom-node-objects.html def bookAugment(doc): import amazon, time def getText(node): node.normalize() rc = node.firstChild.data return _normalizeWhitespace(rc) def replaceText(feature,value): for node in feature.childNodes: if node.nodeType == node.TEXT_NODE: feature.removeChild(node) print "***value is ", type(value), value feature.appendChild(doc.createTextNode(value)) def buildQuery(query, property, value): if value != "": if query != "": query += " and " query += "%s: %s" %(property, value) return query def _get_feature(property,book): """_get_feature returns a book childNode corresponding to the property. 
This is extremely clumsy iterating over the books children for every property""" for feature in _childrenElements(book): print "testing '%s' against '%s'" %(property,feature.localName) if feature.localName == property: return feature def _splitTitle(title): try: title,subtitle = getText(ProductName).split(": ",1) except ValueError: title = getText(ProductName) subtitle = "" print "title = '%s', subtitle = '%s'" %(title, subtitle) return title, subtitle _childrenElements = lambda node: [n for n in node.childNodes if n.nodeType == n.ELEMENT_NODE] # is node element _normalizeWhitespace = lambda text: ' '.join(text.split()) _normalizeIsbn = lambda chars: chars.replace('-','') query = "" # the query to pass to Amazon searchByPower bookcase = doc.getElementsByTagName("bookcase").pop() collection = doc.getElementsByTagName("collection").pop() for book in _childrenElements(collection): # Build the query from existing title, isbn, and author for feature in _childrenElements(book): if feature.localName == "title": title = getText(feature) query = buildQuery(query, "title", title) if feature.localName == "authors": for author in _childrenElements(feature): author = getText(author) query = buildQuery(query, "author", author) if feature.localName == "isbn": isbn = _normalizeIsbn(getText(feature)) query = buildQuery(query, "isbn", isbn) # Perform the query print "*** query = ", query.encode('utf-8') try: #results = amazon.searchByPower('author:Stephenson and title:Snow Crash') results = amazon.searchByPower(query,return_xml=1) except amazon.AmazonError, e: print "*** ERRRRRROR", e query = "" # Reset query time.sleep(.7) # Amazon only permits one query per second # Augment Book with results of query by iterating over # the bookcase DTD and replacing/inserting elements # dbg: I know this algorithm sucks. 
Details = results.getElementsByTagName("Details") if len(Details) == 1: book_ns = "http://periapsis.org/bookcase/" book_dtd = ("title", "subtitle", "authors", "isbn") # book_dtd = ("title", "subtitle", "authors", "binding", "pur_date", # "pur_price", "publisher", "edition", "cr_years", "pub_year", # "isbn", "lccn", "pages", "languages", "genres", "keywords", # "series", "series_num", "condition", "signed", "read", "gift", # "loaned", "rating", "comments") ProductName = results.getElementsByTagName("ProductName")[0] print type(ProductName) title, subtitle = _splitTitle(getText(ProductName).split(": ",1)) for property in book_dtd: # step through the DTD feature = _get_feature(property,book) if property == "title": r_title = doc.createElementNS(book_ns,"title") r_title.appendChild(doc.createTextNode(title)) if feature: book.appendChild(r_title) book.removeChild(feature) # book.replaceChild(feature,r_title) else: book.appendChild(r_title) elif property == "subtitle" and subtitle != "": r_subtitle = doc.createElementNS(book_ns,"subtitle") r_subtitle.appendChild(doc.createTextNode(subtitle)) if feature: book.appendChild(r_subtitle) book.removeChild(feature) else: book.appendChild(r_subtitle) elif property == "authors": # remove my children for author in _childrenElements(feature): feature.removeChild(author) # add the children from the amazon result for r_author in results.getElementsByTagName("Author"): # lowercase the results to the bookcase convention r_author.tagName = "author" feature.appendChild(r_author) elif property == "isbn": r_isbn = results.getElementsByTagName("Isbn")[0] r_isbn.tagName = "isbn" if feature: book.replaceChild(feature,r_isbn) else: book.appendChild(r_isbn) PrettyPrint(bookcase) def print_usage(): print "pybookcase infile.xml outfile.xml" print "pybookcase will augment a bookcase XML file with other information from " print "the python interface" if __name__ == "__main__": import getopt, sys mode = 'xml' try: (options,files) = getopt.getopt 
(sys.argv[1:],"h") except getopt.error: print_usage() for (option,value) in options: pass if option == '-h': print_usage() try: infd = open(files[0]) except IndexError: infd = sys.stdin try: outfd = open(files[1], 'w') except IndexError: outfd = sys.stdout from xml.dom import minidom from xml.dom.ext import PrettyPrint doc = minidom.parse(infd) bookAugment(doc) infd.close() outfd.close() --Boundary-00=_m/Ap+NmbP9tRrN5 Content-Type: application/x-bookcase; name="test.bc" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="test.bc" <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE bookcase SYSTEM "bookcase.dtd"> <bookcase xmlns="http://periapsis.org/bookcase/" syntaxVersion="2" > <collection unitTitle="Books" title="Converted Books" unit="book" > <book> <title>The Diamond Age A Young Lady's Illustrated Primer Neal Stephenson SciFi Vampire Book J. Gordon Melton History Road Warriors Dreams and Nightmares Along the Information Highway Burstein Kline History --Boundary-00=_m/Ap+NmbP9tRrN5-- From jhooton@asi-ez.com Tue Apr 22 10:26:52 2003 From: jhooton@asi-ez.com (info@asi-ez.com) Date: Tue, 22 Apr 2003 09:26:52 Subject: [XML-SIG] ASI Expands Automation Component Offering Message-ID: PM20009:26:52 AM This is an HTML email message. If you see this, your mail client does not support HTML messages. ------=_NextPart_HQHBWPODIH Content-Type: text/html;charset="iso-8859-1" Content-Transfer-Encoding: 7bit ASI - Automation Component Offering

"ASI Expands Automation Component Offering"

When you go to www.asi-ez.com, you will be surprised to see the many new product additions that have been made recently. Automation Systems Interconnect, Inc. (ASI), is a manufacturer and supplier of high quality, innovative components for industrial and process control and automation projects. You will also find that we have an excellent offering of products for power distribution applications.

The newly updated ASI website includes IEC terminal blocks, compact din-rail mounted power supplies, interface modules, sensors, and sensor accessories, circuit breakers, industrial control relays, signal conditioners, marking systems, wire markers, labels, tools, and ferrules.

For over 4 years, ASI has made it easy for engineers and buyers to locate, specify and purchase high quality components at very competitive prices. In fact, we now have over 10,000 customers visiting our website each month. The feedback from our customers is that not only is the website easy to use, but it saves the customers 20% to 50% when compared to obtaining similar products.

Now, when you go to www.asi-ez.com, you will see several new additions to these product lines. Thank you for continued interest in our products and services.

Simply click on the above products for specifications, availability, pricing and ordering. If you need to speak to application engineering, call us toll free at 1-877-650-5160.

Automation Systems Interconnect, Inc. / P.O. Box 1230 / Carlisle, PA 17013
717-249-5581 / 877-650-5160 - toll free / 717-249-5542 - fax

www.asi-ez.com

sales: dhall@asi-ez.com / customer service: info@asi-ez.com / engineering: mmanning@asi-ez.com

ASI respects your time and your privacy. If you feel you have received this email by mistake or if you no longer want to receive product updates, please click here. However, to ensure that the unsubscribe process has been completed successfully, please allow 2 weeks. We do apologize for any interim emails that are received while we are updating our records.

------=_NextPart_HQHBWPODIH--

From and-xml@doxdesk.com Tue Apr 22 16:22:55 2003
From: and-xml@doxdesk.com (Andrew Clover)
Date: Tue, 22 Apr 2003 15:22:55 +0000
Subject: [XML-SIG] pyxml minidom: I can remove and append, but not replace
In-Reply-To: <200304211135.59621.reagle@mit.edu>
References: <200304211135.59621.reagle@mit.edu>
Message-ID: <20030422152255.GA3638@doxdesk.com>

Joseph Reagle wrote:

> if feature:
>     book.replaceChild(feature,r_title)
> else:
>     book.appendChild(r_title)

Whoops! Very common mistake, this - you've got the order of the parameters to replaceChild the wrong way round. For reasons known only to W3C, the newChild parameter goes first.

> xml.dom.NotFoundErr: Node does not exist in this context

Because it's looking for 'r_title' to be a child node of 'book' in order to remove it, but it's not there.

--
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/

From dkuhlman@cutter.rexx.com Wed Apr 23 01:06:13 2003
From: dkuhlman@cutter.rexx.com (Dave Kuhlman)
Date: Tue, 22 Apr 2003 17:06:13 -0700
Subject: [XML-SIG] Python support for Amazon Web services
Message-ID: <20030422170613.A8214@cutter.rexx.com>

I've developed a small amount of Python support for the XML-over-HTTP mode of Amazon Web services. Amazon describes this as their REST mode. The support I've developed consists mostly of Python code that helps to parse and process the Amazon WS XML documents.

Also, to produce this support I used generateDS.py (see: http://www.rexx.com/~dkuhlman/#generateDS). Therefore, I needed XML Schema documents. So, being a Calvinist [1], I decided to implement generateXsd.py, which extracts an XML Schema from an XML instance document. It's included in the distribution.
There is a document describing this support at:

    http://www.rexx.com/~dkuhlman/amazon_ws_support.html

And the distribution file is at:

    http://www.rexx.com/~dkuhlman/amazon_ws_support-1.0.tar.gz

The Amazon WS developer's kit is available at:

    http://www.amazon.com/webservices

- Dave

[1] Calvin and Hobbes, the comic strip -- Once, when Calvin's Mom tried to force him to clean up his room, Calvin decided to invent a robot to do the work. After much struggle, Hobbes asked him: "Wouldn't it be less work to just clean your room yourself?" Calvin replied, without looking up: "It's only work if they make you do it."

-- 
Dave Kuhlman
dkuhlman@rexx.com
http://www.rexx.com/~dkuhlman

From tongucyumruk@interaktif.gen.tr Wed Apr 23 12:40:57 2003
From: tongucyumruk@interaktif.gen.tr (Tonguç Yumruk)
Date: Wed, 23 Apr 2003 14:40:57 +0300
Subject: [XML-SIG] Removing Whitespace
Message-ID: <20030423114057.GA770@Serafettin.fazlamesai.net>

Hi,

I'm new to Python & XML processing, and I'm in trouble with whitespace. I use xml.dom to process xml and Sax2.Reader() from xml.dom.ext.reader. I don't want the whitespace interpreted as a text node. Although the Reader() class has some option like keepAllWs I don't think it really does what I need.

When I browsed the archives the only whitespace problem I could find is from 1999, and there is a strip_whitespace function that solved it. The problem is: there is no strip_whitespace in Python 2.1.3 (the default in Debian). I know that I can write a function to do this but I want to use the standard approach for it (if possible).

Thanks

-- 
Love Respect Linux
################################################################################
If I allowed "next $label" then I'd also have to allow "goto $label",
and I don't think you really want that... :-)
 -- Larry Wall in <1991Mar11.230002.27271@jpl-devvax.jpl.nasa.gov>
################################################################################
Tonguç Yumruk

From Alexandre.Fayolle@logilab.fr Wed Apr 23 12:55:18 2003
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 23 Apr 2003 13:55:18 +0200
Subject: [XML-SIG] Removing Whitespace
In-Reply-To: <20030423114057.GA770@Serafettin.fazlamesai.net>
References: <20030423114057.GA770@Serafettin.fazlamesai.net>
Message-ID: <20030423115518.GA10523@calvin>

On Wed, Apr 23, 2003 at 02:40:57PM +0300, Tonguç Yumruk wrote:
> Hi,
>
> I'm new to Python & XML processing, and I'm in trouble with whitespace. I
> use xml.dom to process xml and Sax2.Reader() from xml.dom.ext.reader. I
> don't want the whitespace interpreted as a text node. Although the
> Reader() class has some option like keepAllWs I don't think it really
> does what I need.

In order to have Sax handle whitespace for you, you need:
 * a validating parser
 * a DTD for your document

You need to install python-xml on Debian to have a validating parser (there are no validating parsers in python-xmlbase).

With this package, you can get a validating parser with the following code:

    from xml.sax.sax2exts import XMLValParserFactory
    parser = XMLValParserFactory.make_parser()

Then attach your handlers as usual. Ignorable whitespace should be reported to the ContentHandler.ignorableWhitespace() method if you provided it, and not to the characters() method.

-- 
Alexandre Fayolle
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Développement logiciel avancé - Intelligence Artificielle - Formations

From tongucyumruk@interaktif.gen.tr Wed Apr 23 14:37:36 2003
From: tongucyumruk@interaktif.gen.tr (Tonguç Yumruk)
Date: Wed, 23 Apr 2003 16:37:36 +0300
Subject: [XML-SIG] Removing Whitespace
In-Reply-To: <20030423115518.GA10523@calvin>
References: <20030423114057.GA770@Serafettin.fazlamesai.net> <20030423115518.GA10523@calvin>
Message-ID: <20030423133736.GA1185@Serafettin.fazlamesai.net>

Thanks, but I don't use Sax, I just use the Sax2 reader from the xml.dom.ext package to build a DOM tree. I'm looking for a function that will remove whitespace nodes from my DOM tree.

On Wed, Apr 23, 2003 at 01:55:18PM +0200, Alexandre Fayolle wrote:
> On Wed, Apr 23, 2003 at 02:40:57PM +0300, Tonguç Yumruk wrote:
> > Hi,
> >
> > I'm new to Python & XML processing, and I'm in trouble with whitespace. I
> > use xml.dom to process xml and Sax2.Reader() from xml.dom.ext.reader. I
> > don't want the whitespace interpreted as a text node. Although the
> > Reader() class has some option like keepAllWs I don't think it really
> > does what I need.
>
> In order to have Sax handle whitespace for you, you need:
>  * a validating parser
>  * a DTD for your document
>
> You need to install python-xml on Debian to have a validating parser
> (there are no validating parsers in python-xmlbase).
>
> With this package, you can get a validating parser with the following
> code:
>
>     from xml.sax.sax2exts import XMLValParserFactory
>     parser = XMLValParserFactory.make_parser()
>
> Then attach your handlers as usual. Ignorable whitespace should be
> reported to the ContentHandler.ignorableWhitespace() method if you
> provided it, and not to the characters() method.
>
> -- 
> Alexandre Fayolle
> LOGILAB, Paris (France).
> http://www.logilab.com http://www.logilab.fr http://www.logilab.org
> Développement logiciel avancé - Intelligence Artificielle - Formations
>
> _______________________________________________
> XML-SIG maillist - XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig

-- 
Love Respect Linux
################################################################################
BOFH excuse #84:

Someone is standing on the ethernet cable, causing a kink in the cable
################################################################################
Tonguç Yumruk

From Alexandre.Fayolle@logilab.fr Wed Apr 23 14:47:41 2003
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 23 Apr 2003 15:47:41 +0200
Subject: [XML-SIG] Removing Whitespace
In-Reply-To: <20030423133736.GA1185@Serafettin.fazlamesai.net>
References: <20030423114057.GA770@Serafettin.fazlamesai.net> <20030423115518.GA10523@calvin> <20030423133736.GA1185@Serafettin.fazlamesai.net>
Message-ID: <20030423134741.GB11877@calvin>

On Wed, Apr 23, 2003 at 04:37:36PM +0300, Tonguç Yumruk wrote:
> Thanks, but I don't use Sax, I just use the Sax2 reader from xml.dom.ext
> package to build a dom tree. I'm looking for a function that will remove
> whitespace nodes from my dom tree.

Sorry for the too-quick-reading.

Have you tried to use the xml.dom.ext.StripXml function on your document? It should do exactly what you want.

-- 
Alexandre Fayolle
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Développement logiciel avancé - Intelligence Artificielle - Formations

From tongucyumruk@interaktif.gen.tr Wed Apr 23 15:15:06 2003
From: tongucyumruk@interaktif.gen.tr (Tonguç Yumruk)
Date: Wed, 23 Apr 2003 17:15:06 +0300
Subject: [XML-SIG] Removing Whitespace
In-Reply-To: <20030423134741.GB11877@calvin>
References: <20030423114057.GA770@Serafettin.fazlamesai.net> <20030423115518.GA10523@calvin> <20030423133736.GA1185@Serafettin.fazlamesai.net> <20030423134741.GB11877@calvin>
Message-ID: <20030423141506.GA1341@Serafettin.fazlamesai.net>

Oh, it's the cure... Thanks a lot...

On Wed, Apr 23, 2003 at 03:47:41PM +0200, Alexandre Fayolle wrote:
> On Wed, Apr 23, 2003 at 04:37:36PM +0300, Tonguç Yumruk wrote:
> > Thanks, but I don't use Sax, I just use the Sax2 reader from xml.dom.ext
> > package to build a dom tree. I'm looking for a function that will remove
> > whitespace nodes from my dom tree.
>
> Sorry for the too-quick-reading.
>
> Have you tried to use the xml.dom.ext.StripXml function on your
> document? It should do exactly what you want.
>
> -- 
> Alexandre Fayolle
> LOGILAB, Paris (France).
> http://www.logilab.com http://www.logilab.fr http://www.logilab.org
> Développement logiciel avancé - Intelligence Artificielle - Formations

-- 
Love Respect Linux
################################################################################
The whole history of computers is rampant with cheerleading at best and
bigotry at worst.
 -- Larry Wall in <199702111730.JAA28598@wall.org>
################################################################################
Tonguç Yumruk

From nick@isilon.com Thu Apr 24 20:17:40 2003
From: nick@isilon.com (Nicholas M. Kirsch)
Date: Thu, 24 Apr 2003 12:17:40 -0700 (PDT)
Subject: [XML-SIG] Subclassing xml.dom.minidom
Message-ID: <20030424121313.R7584@fireblade.isilon.com>

I have looked for information (er.. I've googled) and have not found any examples of anyone subclassing minidom.

I want to modify the behavior of some of the methods in Node, etc.

Naturally, I want xml.dom.minidom.parse to return objects which are all subclassed as well.

Is this possible? What is the best way to do this?

Thanks.
Nick

From fdrake@acm.org Thu Apr 24 20:31:45 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 24 Apr 2003 15:31:45 -0400
Subject: [XML-SIG] Subclassing xml.dom.minidom
In-Reply-To: <20030424121313.R7584@fireblade.isilon.com>
References: <20030424121313.R7584@fireblade.isilon.com>
Message-ID: <16040.15265.307632.75018@grendel.zope.com>

Nicholas M. Kirsch writes:
 > I have looked for information (er.. I've googled) and have not found any
 > examples of anyone subclassing minidom.
 >
 > I want to modify the behavior of some of the methods in Node, etc.
 >
 > Naturally, I want xml.dom.minidom.parse to return objects which are all
 > subclassed as well.

There is some information in the Python/XML Reference Guide, included with the PyXML sources. I don't know that there's a formatted version of the document available online, unfortunately.

Depending on just which node types you want to affect, this may be fairly easy, or it may be more painful.
Changing Text nodes will be most difficult if you intend to use xml.dom.xmlbuilder without further subclassing.

At the very least, you can expect to subclass the DOMImplementation and Document classes (to control the factory functions) and the specific node types you want to affect.

I'll be glad to try and answer further questions, but you'll need to be more specific.

  -Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

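As a rough illustration of the approach Fred Drake outlines in the Subclassing xml.dom.minidom thread (subclass Document to control the factory functions, plus the node types you want to affect), here is a minimal sketch against the standard-library xml.dom.minidom. The names NoisyDocument, NoisyElement and the log attribute are invented for this example and are not part of any library. The sketch only covers nodes created through the document's own createElement(); as far as I can tell, xml.dom.minidom.parse would still hand back the stock classes unless the builder is subclassed as well.

```python
from xml.dom import minidom


class NoisyElement(minidom.Element):
    """Hypothetical Element subclass that changes one Node method."""

    def appendChild(self, node):
        # Record every append, then delegate to the stock behaviour.
        self.ownerDocument.log.append((self.tagName, node.nodeName))
        return minidom.Element.appendChild(self, node)


class NoisyDocument(minidom.Document):
    """Hypothetical Document subclass whose factory returns NoisyElement."""

    def __init__(self):
        minidom.Document.__init__(self)
        self.log = []  # illustrative attribute, not a minidom feature

    def createElement(self, tagName):
        # Mirror what the stock factory does, but with our own class.
        element = NoisyElement(tagName)
        element.ownerDocument = self
        return element


doc = NoisyDocument()
book = doc.createElement("book")
doc.appendChild(book)

title = doc.createElement("title")
title.appendChild(doc.createTextNode("Python & XML"))
book.appendChild(title)

print(doc.log)
print(doc.documentElement.toxml())
```

Other node types (Text, Attr and so on) would need their own subclass and a matching factory override in the same style; Text is the awkward case Fred mentions.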
From andrew.ittner@usa.net Sat Apr 26 16:53:52 2003
From: andrew.ittner@usa.net (Andrew Ittner)
Date: Sat, 26 Apr 2003 08:53:52 -0700
Subject: [XML-SIG] Round-tripping HTML fragment to XML node
Message-ID: <000b01c30c0c$08c3dae0$7b7ba8c0@attbi.com>

I have an HTML fragment:

<p>this is<br>
a paragraph</p>

I want to convert it to XHTML:

<p>this is<br />
a paragraph</p>

And store it as a Node in an XML document.

Then, I want to pull the Node back out and convert back to an HTML fragment.

I want to do this automatically (not using regexp, etc.) because:
-each HTML fragment is a separate weblog entry (for Yet Another Weblog Maker (c))
-I store it in XML to publish using XSL
-even though I'm probably not going to use any other singletons besides <br> & <img>, I want the parser to handle conversion to well-formed XML automagically
-my HTML viewer (courtesy wxPython) needs HTML and cannot understand XHTML

I tried converting the fragment to a full XHTML document (works OK), pulling the body element's content nodes out (can't), and copying them to the XML doc (nope). And the reverse is failing on converting XHTML back to HTML.

Since I've only used PyXML's xml.dom.minidom for XML work, I haven't yet figured out how to do this. Any ideas?

Andrew Ittner
http://rhymingpanda.com/

From tpassin@comcast.net Sat Apr 26 18:20:55 2003
From: tpassin@comcast.net (Thomas B. Passin)
Date: Sat, 26 Apr 2003 13:20:55 -0400
Subject: [XML-SIG] Round-tripping HTML fragment to XML node
References: <000b01c30c0c$08c3dae0$7b7ba8c0@attbi.com>
Message-ID: <003501c30c18$31f20980$6401a8c0@tbp1>

[Andrew Ittner]
> I have an HTML fragment:

> <p>this is<br>
> a paragraph</p>
>
> I want to convert it to XHTML:
>
> <p>this is<br />
> a paragraph</p>

> And store it as a Node in an XML document.
>
> Then, I want to pull the Node back out and convert back to an HTML fragment.
>
> I want to do this automatically (not using regexp, etc.) because:
> -each HTML fragment is a separate weblog entry (for Yet Another Weblog Maker
> (c))
> -I store it in XML to publish using XSL

If you are going to use XSLT to produce the results, you do not have to do anything different except use the html output method in your stylesheet. That will output the HTML that you want.

Otherwise, I would try adding a single space to those normally empty nodes - e.g.

Strictly speaking, the img and br elements are supposed to be empty, but most browsers will accept a space and I bet the wxWindows (which is wrapped by wxPython) viewer will too. That would be a lot easier than fussing around or writing a custom serializer.

> -even though I'm probably not going to use any other singletons besides <br>
> & <img>, I want the parser to handle conversion to well-formed XML
> automagically
> -my HTML viewer (courtesy wxPython) needs HTML and cannot understand XHTML
>

Cheers,

Tom P

From noreply@sourceforge.net Mon Apr 28 11:08:45 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Mon, 28 Apr 2003 03:08:45 -0700
Subject: [XML-SIG] [ pyxml-Bugs-728810 ] Forgotten print statement
Message-ID:

Bugs item #728810, was opened at 2003-04-28 12:08
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=728810&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Andy-Kim M�ller (malamute)
Assigned to: Nobody/Anonymous (nobody)
Summary: Forgotten print statement

Initial Comment:
Nothing big, but there is a forgotten print statement in the wddx.py file. It looks like a leftover from debugging the script. If your data contains special characters, the library prints something to stdout every time.

----------------------------------------------------------------------