From brian@sweetapp.com Fri May 2 19:54:03 2003 From: brian@sweetapp.com (Brian Quinlan) Date: Fri, 02 May 2003 11:54:03 -0700 Subject: [XML-SIG] ANN: Pyana 0.8.0 Released In-Reply-To: <5het8b.tqs.ln@jbmail.com> Message-ID: <004401c310dc$33926eb0$21795418@dell1700> ANN: Pyana 0.8.0 Released Download it from: http://sourceforge.net/project/showfiles.php?group_id=28142 Changes: - Node-sets can now be passed as XPath extension arguments - A few more DOM methods are exposed - Experimental support for transformation to DOM - Support for Python wide Unicode builds - Support for external schema validation - Bug fixes - Mac OS X build support - Updated for Xalan 1.5/Xerces 2.2 - Python2.3b1 support (including support for the new bool type in XPath expressions) - The getEncoding(), getPublicID() and getSystemID() methods of InputSource objects are now honored - More detailed (and Python customizable) information presented when DOM nodes are printed What is Pyana? Pyana is a Python interface to the Xalan C XSLT processor. Some usage examples are provided here: http://pyana.sourceforge.net/examples/ From noreply@sourceforge.net Mon May 5 04:01:43 2003 From: noreply@sourceforge.net (SourceForge.net) Date: Sun, 04 May 2003 20:01:43 -0700 Subject: [XML-SIG] [ pyxml-Bugs-732458 ] Source RPM Message-ID: Bugs item #732458, was opened at 2003-05-04 22:01 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=732458&group_id=6473 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Christopher Blunck (blunck2) Assigned to: Nobody/Anonymous (nobody) Summary: Source RPM Initial Comment: The i386 RPMs are linked against a Python interpreter that supports UCS2 encoding. The latest RedHat uses Python 2.2.2, which supports UCS4. 
It would be helpful to provide a source RPM that we RedHat folks could use so that we can build our own PyXML that works with our Python.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=732458&group_id=6473

From Ian.Sparks@etrials.com Mon May 5 21:45:48 2003
From: Ian.Sparks@etrials.com (Ian Sparks)
Date: Mon, 5 May 2003 16:45:48 -0400
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
Message-ID: <41A1CBC76FDECC42B67946519C6677A9A20B28@pippin.int.etrials.com>

I build an XML document from data pulled from a database. Sometimes the database contains "bad" characters, how can I filter out the bad and properly encode the good?

Here's my example program...I'm sure I'm missing something fundamental.

from xml.dom.minidom import parseString

bad_string = chr(133) + chr(6) + chr(180)
doc1 = parseString('')
docNode = doc1.childNodes[0]
docNode.setAttributeNS(None,'a',unicode(bad_string,'iso-8859-1'))
source = doc1.toxml('iso-8859-1') #result is badly formed xml

From martin@v.loewis.de Mon May 5 23:09:40 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 May 2003 00:09:40 +0200
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
In-Reply-To: <41A1CBC76FDECC42B67946519C6677A9A20B28@pippin.int.etrials.com>
References: <41A1CBC76FDECC42B67946519C6677A9A20B28@pippin.int.etrials.com>
Message-ID:

"Ian Sparks" writes:

> I build an XML document from data pulled from a database. Sometimes
> the database contains "bad" characters, how can I filter out the bad
> and properly encode the good?

If you want to completely discard the bad characters, I recommend you use string.replace.
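A minimal sketch of a rule-based filter for this question, assuming a modern Python 3 interpreter rather than the Python 2.2 used in this thread. It drops every character outside the XML 1.0 Char production; the names _ILLEGAL_XML_CHARS and strip_illegal_xml_chars are hypothetical.

```python
import re

# Everything NOT matched by the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def strip_illegal_xml_chars(text):
    """Return text with all characters that XML 1.0 forbids removed."""
    return _ILLEGAL_XML_CHARS.sub("", text)

# The example string from the question, decoded as iso-8859-1.
bad_string = bytes([133, 6, 180]).decode("iso-8859-1")
cleaned = strip_illegal_xml_chars(bad_string)
```

Note that chr(133), decoded as iso-8859-1, is U+0085, which XML 1.0 allows, so this filter keeps it; only chr(6) is removed from the example string.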
Regards,
Martin

From Ian.Sparks@etrials.com Tue May 6 13:17:15 2003
From: Ian.Sparks@etrials.com (Ian Sparks)
Date: Tue, 6 May 2003 08:17:15 -0400
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
Message-ID: <41A1CBC76FDECC42B67946519C6677A95AE4F8@pippin.int.etrials.com>

Hmm...as I feared. As I discover new XML-chokers I'm building up a library like :

#Remove ACK's (I've seen it!)
w = w.replace(chr(6),'')
#Remove ... characters (again, I've seen it)
w = w.replace(chr(133),'')

I was hoping to find some way of identifying everything that will choke my XML, some rule to auto-filter out the nastiness..

-----Original Message-----
From: Martin v. Löwis [mailto:martin@v.loewis.de]
Sent: Monday, May 05, 2003 6:10 PM
To: Ian Sparks
Cc: Xml-Sig (E-mail)
Subject: Re: [XML-SIG] Newbie : Identifying characters that will choke XML parser

"Ian Sparks" writes:

> I build an XML document from data pulled from a database. Sometimes
> the database contains "bad" characters, how can I filter out the bad
> and properly encode the good?

If you want to completely discard the bad characters, I recommend you use string.replace.

Regards,
Martin

From tug@wilson.co.uk Tue May 6 13:32:47 2003
From: tug@wilson.co.uk (John Wilson)
Date: Tue, 6 May 2003 13:32:47 +0100
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
References: <41A1CBC76FDECC42B67946519C6677A95AE4F8@pippin.int.etrials.com>
Message-ID: <000b01c313cb$99f3dad0$640a0b0a@handel>

Ian,

If the character is in the following ranges it's illegal:

c < 0X0009
c > 0X000A and c < 0X000D
c > 0X000D and c < 0X0020
c > 0XD7FF and c < 0XE000
c > 0XFFFD

John Wilson
The Wilson Partnership
http://www.wilson.co.uk

----- Original Message -----
From: "Ian Sparks"
To: "Martin v. Löwis"
Cc: "Xml-Sig (E-mail)"
Sent: Tuesday, May 06, 2003 1:17 PM
Subject: RE: [XML-SIG] Newbie : Identifying characters that will choke XML parser

Hmm...as I feared.
As I discover new XML-chokers I'm building up a library like :

#Remove ACK's (I've seen it!)
w = w.replace(chr(6),'')
#Remove ... characters (again, I've seen it)
w = w.replace(chr(133),'')

I was hoping to find some way of identifying everything that will choke my XML, some rule to auto-filter out the nastiness..

From joakley@solutioninc.com Tue May 6 14:40:32 2003
From: joakley@solutioninc.com (James Oakley)
Date: Tue, 6 May 2003 10:40:32 -0300
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
In-Reply-To: <41A1CBC76FDECC42B67946519C6677A95AE4F8@pippin.int.etrials.com>
References: <41A1CBC76FDECC42B67946519C6677A95AE4F8@pippin.int.etrials.com>
Message-ID: <200305061040.36857.joakley@solutioninc.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday 06 May 2003 09:17 am, Ian Sparks wrote:
> Hmm...as I feared. As I discover new XML-chokers I'm building up a library
> like :
>
> #Remove ACK's (I've seen it!)
> w = w.replace(chr(6),'')
> #Remove ... characters (again, I've seen it)
> w = w.replace(chr(133),'')
>
> I was hoping to find some way of identifying everything that will choke my
> XML, some rule to auto-filter out the nastiness..

I had the same trouble with a xmlrpclib.loads(). Here's what I did:

# These characters are forbidden in XML. We'll just drop them
badchars = (0, 1, 2, 3, 4, 5, 6, 7, 8, 11, 12, 14, 15, 16, 17, 18, 19, 20, \
            21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31)

# Remove invalid characters
data = filter(lambda x: ord(x) not in badchars, data)

# Convert 8-bit characters to numeric entities
data = re.sub(r'[\x7F-\xFF]', lambda m: '&#%d;' % ord(m.group(0)), data)

Hope that helps,

- -- 
James Oakley
Engineering - SolutionInc Ltd.
joakley@solutioninc.com
http://www.solutioninc.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+t7tR+FOexA3koIgRAoXmAKCSPJhyTIX/s3jWewvJf1n1l4dn8gCghr1X
kZ6QtVmIR33bJAtXaceH/L0=
=E+Xs
-----END PGP SIGNATURE-----

From Ian.Sparks@etrials.com Tue May 6 17:45:33 2003
From: Ian.Sparks@etrials.com (Ian Sparks)
Date: Tue, 6 May 2003 12:45:33 -0400
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
Message-ID: <41A1CBC76FDECC42B67946519C6677A9A20B29@pippin.int.etrials.com>

Thank you James & John your solutions allow me to filter out what should be marked as "bad" characters.

However, I'm having real problems with character conversions. I'm building an xml document using minidom and setAttributeNS()

I want to be able to do something like :

from xml.dom.minidom import parseString

doc1 = parseString('')
docNode = doc1.childNodes[0]
docNode.setAttributeNS(None,'a',chr(180))
source = doc1.toxml('iso-8859-1')

and have source contain :

without getting UnicodeErrors from codecs.py on toxml() and without ending up with :

Either this is really hard or, more likely, I'm really ignorant.

From tug@wilson.co.uk Tue May 6 21:55:40 2003
From: tug@wilson.co.uk (John Wilson)
Date: Tue, 6 May 2003 21:55:40 +0100
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
References: <41A1CBC76FDECC42B67946519C6677A9A20B29@pippin.int.etrials.com>
Message-ID: <098f01c31411$daa26060$640a0b0a@handel>

I've done some poking around with minidom (I have never used it before). It would appear that it does (correctly) replace & with &amp; but it does not like characters with values > 127 and does not replace them with numeric character entities.

I would suggest that you try using the full DOM implementation.
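For comparison, a small sketch of the same round trip on a modern Python 3 interpreter (an assumption; the thread uses Python 2.2), where every string is already Unicode. The '<doc/>' root element and the attribute values are hypothetical stand-ins, not the poster's data.

```python
from xml.dom.minidom import parseString

# Hypothetical one-element document standing in for the poster's input.
doc = parseString("<doc/>")
root = doc.documentElement

# Attribute values are text (Unicode) strings, not byte strings.
root.setAttribute("a", "\u00b4")   # U+00B4 ACUTE ACCENT, i.e. chr(180)
root.setAttribute("b", "x & y")    # the ampersand is escaped on output

# With an encoding argument, toxml() returns bytes in that encoding.
source = doc.toxml("iso-8859-1")
print(source)
```

Here the acute accent is written as the single byte 0xB4 and the ampersand as &amp;, provided the value was handed to minidom as text rather than as raw 8-bit data.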
John Wilson The Wilson Partnership http://www.wilson.co.uk ----- Original Message ----- From: "Ian Sparks" To: "James Oakley" ; ; "John Wilson" Sent: Tuesday, May 06, 2003 5:45 PM Subject: RE: [XML-SIG] Newbie : Identifying characters that will choke XML parser Thank you James & John your solutions allow me to filter out what should be marked as "bad" characters. However, I'm having real problems with character conversions. I'm building an xml document using minidom and setAttributeNS() I want to be able to do something like : from xml.dom.minidom import parseString doc1 = parseString('') docNode = doc1.childNodes[0] docNode.setAttributeNS(None,'a',chr(180)) source = doc1.toxml('iso-8859-1') and have source contain : without getting UnicodeErrors from codecs.py on toxml() and without ending up with : Either this is really hard or, more likely, I'm really ignorant. From dholub@intersight.com Wed May 7 00:06:18 2003 From: dholub@intersight.com (Deirdre Holub) Date: Tue, 6 May 2003 16:06:18 -0700 Subject: [XML-SIG] broken setup.py for pyxml-0.8.2 on OSX Message-ID: <5804E20A-8017-11D7-BCC4-000A957665F6@intersight.com> I had to modify the setup.py by adding: import distutils to get setup.py to run on my OSX box. From mark@easymailings.com Wed May 7 01:13:09 2003 From: mark@easymailings.com (Mark Bucciarelli) Date: Tue, 6 May 2003 20:13:09 -0400 Subject: [XML-SIG] Is this a memory leak? 
Message-ID: <200305062013.10358.mark@easymailings.com>

#!/usr/bin/python

import xml.sax
import StringIO
import gc

gc.set_debug(gc.DEBUG_SAVEALL)

class TestHandler(xml.sax.handler.ContentHandler):
    all_things = {}
    def __init__(self):
        TestHandler.all_things[id(self)] = 1
    def __del__(self):
        del TestHandler.all_things[id(self)]

for i in range(400):
    parser = xml.sax.make_parser()
    t = TestHandler()
    parser.setContentHandler(t)
    input = xml.sax.xmlreader.InputSource()
    input.setByteStream(StringIO.StringIO('bad soap request'))
    try:
        parser.parse(input)
    except:
        pass

gc.collect()
print 'garbage', len(gc.garbage) # 11970
print len(TestHandler.all_things) # 400

Moving the parser = xml.sax.make_parser() out of the loop changes the print statement output to 0 and 1.

Mark

P.S. Occurs on both:
  python 2.2.1, Linux, has .../site-packages/_xmlplus
  python 2.2.2, XP, doesn't have .../site-packages/_xmlplus

P.P.S While working on this problem, I found a great blog entry by Jeremy Hylton on debugging memory leaks at:
http://www.python.org/~jeremy/weblog/030410.html
Looks like the next release of ZODB (and Zope) will have some leaks plugged.

From tpassin@comcast.net Wed May 7 07:10:51 2003
From: tpassin@comcast.net (Thomas B. Passin)
Date: Wed, 07 May 2003 02:10:51 -0400
Subject: [XML-SIG] Newbie : Identifying characters that will choke XML parser
References: <41A1CBC76FDECC42B67946519C6677A9A20B29@pippin.int.etrials.com>
Message-ID: <003001c3145f$695b36e0$6401a8c0@tbp1>

Ian,

You need to put a unicode character in there to begin with -

docNode.setAttributeNS(None,'a',unicode('\xb4','iso-8859-1'))

chr(xxx) does not do this for you.

Cheers,

Tom P

[Ian Sparks]
Thank you James & John your solutions allow me to filter out what should be marked as "bad" characters.

However, I'm having real problems with character conversions.
I'm building an xml document using minidom and setAttributeNS() I want to be able to do something like : from xml.dom.minidom import parseString doc1 = parseString('') docNode = doc1.childNodes[0] docNode.setAttributeNS(None,'a',chr(180)) source = doc1.toxml('iso-8859-1') and have source contain : without getting UnicodeErrors from codecs.py on toxml() and without ending up with : From mark@easymailings.com Wed May 7 14:26:02 2003 From: mark@easymailings.com (Mark Bucciarelli) Date: Wed, 7 May 2003 09:26:02 -0400 Subject: [XML-SIG] Is this a memory leak? In-Reply-To: <200305062013.10358.mark@easymailings.com> References: <200305062013.10358.mark@easymailings.com> Message-ID: <200305070926.02347.mark@easymailings.com> More info ... According to a comment on SOAPpy bug #585837, the reset() method in expatreader.py creates a bound function loop. I'm going to try and figure out a patch and then submit something to the pyxml sourceforge site. Since the same behavior occurs whether or not I have _xmlplus installed, should I submit one to the python sourceforge site as well? Mark From noreply@sourceforge.net Wed May 7 15:06:32 2003 From: noreply@sourceforge.net (SourceForge.net) Date: Wed, 07 May 2003 07:06:32 -0700 Subject: [XML-SIG] [ pyxml-Bugs-733890 ] Memory Leak in expatparser.py Message-ID: Bugs item #733890, was opened at 2003-05-07 10:06 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=106473&aid=733890&group_id=6473 Category: SAX Group: None Status: Open Resolution: None Priority: 5 Submitted By: Mark Bucciarelli (mbucc) Assigned to: Nobody/Anonymous (nobody) Summary: Memory Leak in expatparser.py Initial Comment: Looks like there is a bound-function loop created in ExpatReader.reset() method. I've attached a script that demonstrates the leak as well as a work around. Here are some comments from pywebsvcs bug #585837 on sourceforge: ... 
printing out the contents of gc.garbage shows >, >,

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=733890&group_id=6473

From phthenry@earthlink.net Wed May 7 17:04:18 2003
From: phthenry@earthlink.net (Paul Tremblay)
Date: Wed, 7 May 2003 12:04:18 -0400
Subject: [XML-SIG] checking a string for well-formedness
Message-ID: <20030507120418.Q13810@localhost.localdomain>

I need to check a string for well-formedness. I stumbled across the fact that you can use expat directly, so I devised this code, which works, so long as unicode and entities aren't used:

import xml.parsers.expat
parser = xml.parsers.expat.ParserCreate()
import sys

def validate(data):
    parser.Parse(data)
    try:
        parser.Parse(data)
        return 0
    except xml.parsers.expat.ExpatError:
        sys.stderr.write('tagging text will result in invalid XML\n')
        return 1

data = 'texttext,'
validate(data)

The function validate returns 0 in this case.

However, if I try this:

data = u'texttext\u201c'

I get the following error:

Traceback (most recent call last):
  File "/home/paul/lib/python/paul/xml/expat.py", line 50, in ?
    parser.Parse(data)
UnicodeError: ASCII encoding error: ordinal not in range(128)

Any idea what is going on here?

I have re-written the function so that it writes the string to a file, and then I use SAX to parse the file. If SAX fails, I know I have ill-formed XML. However, this second solution is a kludge. I would like to be able to test the string directly.

Thanks

Paul

-- 

************************
*Paul Tremblay         *
*phthenry@earthlink.net*
************************

From mal@lemburg.com Wed May 7 17:57:07 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 07 May 2003 18:57:07 +0200
Subject: [XML-SIG] Is this a memory leak?
In-Reply-To: <200305070926.02347.mark@easymailings.com> References: <200305062013.10358.mark@easymailings.com> <200305070926.02347.mark@easymailings.com> Message-ID: <3EB93AE3.8010400@lemburg.com> Mark Bucciarelli wrote: > More info ... > > According to a comment on SOAPpy bug #585837, the reset() method in > expatreader.py creates a bound function loop. > > I'm going to try and figure out a patch and then submit something to > the pyxml sourceforge site. > > Since the same behavior occurs whether or not I have _xmlplus > installed, should I submit one to the python sourceforge site as > well? I remember having seen reports of memory leaks in pyexpat. No idea whether they've been fixed, though. You may want to try this against Python CVS. -- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, May 07 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ EuroPython 2003, Charleroi, Belgium: 48 days left From mark@easymailings.com Wed May 7 21:46:31 2003 From: mark@easymailings.com (Mark Bucciarelli) Date: Wed, 7 May 2003 16:46:31 -0400 Subject: [XML-SIG] Two SimpleXMLRPCServer questions In-Reply-To: <003001c2e732$091a3a10$21795418@dell1700> References: <003001c2e732$091a3a10$21795418@dell1700> Message-ID: <200305071646.31073.mark@easymailings.com> On Monday 10 March 2003 1:22 pm, Brian Quinlan wrote: > > The other question I have is how to make this server stop (and > > perhaps finish it's work) when someone presses a control-C in the > > console. > > Hmmmm....the fact that this doesn't work better already is probably > a bug. Just noticed that this behavior is different on Linux than on Windows. On Linux (2.2.1), ctrl-c stops the server. On Windows (2.2.2), it does not. 
Mark From jedp@ilm.com Wed May 7 23:34:31 2003 From: jedp@ilm.com (Jed Parsons) Date: Wed, 7 May 2003 15:34:31 -0700 (PDT) Subject: [XML-SIG] endElement handler for XMLValParser? Message-ID: <200305072234.PAA15537@gleason.lucasdigital.com> Hola - I'm writing my first handler using the validating parser from xml.sax.sax2exts.XMLValParserFactory.make_parser(). I can't figure out how to handle end tags for elements. I've set up a content handler that extends xml.sax.handler.ContentHandler. I use the parser's setContentHandler method and parse. My startElement, characters, and ignorableWhitespace methods all get called, but my endElement is sadly neglected. A peek at _xmlplus/sax/drivers/drv_xmlproc_val.py finds no end element handlers for this parser. (CVS info: drv_xmlproc_val.py,v 1.9 2001/12/30 12:13:45 loewis) What to do? Thanks for any help, Jed -- Jed Parsons Industrial Light + Magic (415) 448-2974 grep(do{for(ord){(!$_&&print"$s\n")||(($O+=(($_-1)%6+1)and grep(vec($s,$O++,1)=1,1..int(($_-6*6-1)/6))))}},(split(//, "++,++2-27,280,481=1-7.1++2,800+++2,8310/1+4131+1++2,80\0. What!?"))); From fredrik@pythonware.com Thu May 8 10:54:57 2003 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 8 May 2003 11:54:57 +0200 Subject: [XML-SIG] Re: checking a string for well-formedness References: <20030507120418.Q13810@localhost.localdomain> Message-ID: Paul Tremblay wrote: > import xml.parsers.expat > parser = xml.parsers.expat.ParserCreate() > import sys > > def validate(data): > parser.Parse(data) > try: > parser.Parse(data) > return 0 > except xml.parsers.expat.ExpatError: > sys.stderr.write('tagging text will result in invalid XML\n') > return 1 > > data = 'texttext,' > validate(data) > > The function validate returns 0 in this case. or raise an exception, if you don't remove the first call to parser.Parse(data). 
unfortunately, even if you remove that line, the function may still return 0 for invalid XML snippets, e.g: > data = 'texttext,' to fix this, you have to tell the parser that you won't call it again with more data: parser.Parse(data, 1) > However, if I try this: > > data = u'texttext\u201c' > > I get the following error: > > Traceback (most recent call last): > File "/home/paul/lib/python/paul/xml/expat.py", line 50, in ? > parser.Parse(data) > UnicodeError: ASCII encoding error: ordinal not in range(128) > > Any idea what is going on here? the parse function requires an 8-bit string, and Python defaults to ASCII when converting Unicode to 8-bit data. the simplest way to work around this is to convert the string to the XML default encoding (utf-8) on the way in: def validate(data): try: if isinstance(data, type(u"")): data = data.encode("utf-8") parser.Parse(data, 1) return 0 except xml.parsers.expat.ExpatError: sys.stderr.write('tagging text will result in invalid XML\n') return 1 From nick@isilon.com Thu May 8 17:10:27 2003 From: nick@isilon.com (Nicholas M. Kirsch) Date: Thu, 8 May 2003 09:10:27 -0700 (PDT) Subject: [XML-SIG] Subclassing xml.dom.minidom In-Reply-To: <16040.15265.307632.75018@grendel.zope.com> References: <20030424121313.R7584@fireblade.isilon.com> <16040.15265.307632.75018@grendel.zope.com> Message-ID: <20030508090222.N582@fireblade.isilon.com> > Depending on just which node types you want to affect, this may be > fairly easy, or it may be more painful. Changing Text nodes will be > most difficult if you intend to use xml.dom.xmlbuilder without further > subclassing. > > At the very least, you can expect to subclass the DOMImplementation > and Document classes (to control the factory functions) and the > specific node types you want to affect. > > I'll be glad to try and answer further questions, but you'll need to > be more specific. 
My desire is to modify the behavior of getAttribute and add inheritance functionality such that an element inherits its parent's attributes (if not defined) and other wackiness.

I have subclassed DOMImplementation and Document, but using xml.dom.minidom things still don't work properly.

I modified xml.dom.expatbuilder.theDOMImplementation to be an instance of my DOMImplementation. When I use xml.dom.minidom.parse, I successfully retrieve an instance of my Document. However, the Elements are not subclasses, despite having overridden the Document.createElement method.

Looking at expatbuilder.py, I have found quite a few cases in which it instantiates an Element by using minidom.Element() instead of theDOMImplementation.createElement. The same behavior is repeated for Attributes and Text. This seems like a bug to me?! Am I going about this incorrectly?

Thanks.
Nick

From jedp@ilm.com Thu May 8 19:02:31 2003
From: jedp@ilm.com (Jed Parsons)
Date: Thu, 8 May 2003 11:02:31 -0700 (PDT)
Subject: [XML-SIG] end element handling with XMLValidator?
In-Reply-To: <4528886@toto.iv>
Message-ID: <200305081802.LAA19259@gleason.lucasdigital.com>

I'm unable to get hold of 'end element' events from the validating parser. Can anyone explain to me what I'm doing wrong?

Here is a test to show you what's happening:

1. I extend the xmlproc.Application class, which is what xmlval.ValidatingApp does:

from xml.parsers.xmlproc import xmlproc

class PleaseOhPleaseWork(xmlproc.Application):
    def handle_data(self, data, start, end):
        print "%d chars of good data: [%s]" % (end-start, data[start:end])
    def handle_ignorable_data(self, data, start, end):
        print "%d chars of trash: [%s]" % (end-start, data[start:end])
    def handle_start_tag(self, name, attrs):
        print "start", name
    def handle_end_tag(self, name):
        print "end", name

2. I use this test xml file:

Glug is the son of Zeus

3. I can run two tests at the python console. Using xmlproc.XMLProcessor, I get my start and end handlers.
Using xmlval.XMLValidator, I only get the start handler: > python Python 2.1.3 (#1, Apr 22 2002, 18:24:35) [GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-98)] on linux2 Type "copyright", "credits" or "license" for more information. >>> from Test import PleaseOhPleaseWork >>> from xml.parsers.xmlproc import xmlproc >>> from xml.parsers.xmlproc import xmlval >>> p_noval = xmlproc.XMLProcessor() >>> p_noval.set_application(PleaseOhPleaseWork()) >>> p_noval.parse_resource('test.xml') start foo 2 chars of good data: [ ] start para 28 chars of good data: [ Glug is the son of Zeus ] end para 1 chars of good data: [ ] end foo >>> p_val = xmlval.XMLValidator() >>> p_val.set_application(PleaseOhPleaseWork()) >>> p_val.parse_resource('test.xml') start foo 2 chars of trash: [ ] start para 28 chars of good data: [ Glug is the son of Zeus ] 1 chars of trash: [ ] >>> This is starting to drive me a little bit insane. Can somebody explain to me what is going on here? Thanks, Jed I wrote yesterday (taking a different approach): > > Hola - > > I'm writing my first handler using the validating parser from > xml.sax.sax2exts.XMLValParserFactory.make_parser(). I can't figure > out how to handle end tags for elements. > > I've set up a content handler that extends > xml.sax.handler.ContentHandler. > > I use the parser's setContentHandler method and parse. > > My startElement, characters, and ignorableWhitespace methods all get > called, but my endElement is sadly neglected. > > A peek at _xmlplus/sax/drivers/drv_xmlproc_val.py finds no end element > handlers for this parser. > > (CVS info: drv_xmlproc_val.py,v 1.9 2001/12/30 12:13:45 loewis) > > What to do? > > Thanks for any help, > > Jed -- Jed Parsons Industrial Light + Magic (415) 448-2974 grep(do{for(ord){(!$_&&print"$s\n")||(($O+=(($_-1)%6+1)and grep(vec($s,$O++,1)=1,1..int(($_-6*6-1)/6))))}},(split(//, "++,++2-27,280,481=1-7.1++2,800+++2,8310/1+4131+1++2,80\0. 
What!?")));

From phthenry@earthlink.net Thu May 8 23:08:38 2003
From: phthenry@earthlink.net (Paul Tremblay)
Date: Thu, 8 May 2003 18:08:38 -0400
Subject: [XML-SIG] Re: checking a string for well-formedness
In-Reply-To:
References: <20030507120418.Q13810@localhost.localdomain>
Message-ID: <20030508180838.D29954@localhost.localdomain>

> the parse function requires an 8-bit string, and Python defaults
> to ASCII when converting Unicode to 8-bit data.

I must be dense when it comes to unicode. So Python converts unicode to a 7-bit (ASCII) string?

Your solution worked, but then I immediately came up with a new problem when I tried to test the speed of this function:

# assume the same exact function from below, which I cut and pasted
for j in range(10):
    data = u'text\u201cthext,'
    validate(data)

The first time the string is tested, it comes out as valid. But every single instance afterwards comes out as ill-formed XML.

Thanks

Paul

On Thu, May 08, 2003 at 11:54:57AM +0200, Fredrik Lundh wrote:
> To: xml-sig@python.org
> From: "Fredrik Lundh"
> Subject: [XML-SIG] Re: checking a string for well-formedness
> Date: Thu, 8 May 2003 11:54:57 +0200
>
> Paul Tremblay wrote:
>
> > import xml.parsers.expat
> > parser = xml.parsers.expat.ParserCreate()
> > import sys
> >
> > def validate(data):
> >     parser.Parse(data)
> >     try:
> >         parser.Parse(data)
> >         return 0
> >     except xml.parsers.expat.ExpatError:
> >         sys.stderr.write('tagging text will result in invalid XML\n')
> >         return 1
> >
> > data = 'texttext,'
> > validate(data)
> >
> > The function validate returns 0 in this case.
>
> or raise an exception, if you don't remove the first call to
> parser.Parse(data).
> > unfortunately, even if you remove that line, the function may > still return 0 for invalid XML snippets, e.g: > > > data = 'texttext,' > > to fix this, you have to tell the parser that you won't call > it again with more data: > > parser.Parse(data, 1) > > > However, if I try this: > > > > data = u'texttext\u201c' > > > > I get the following error: > > > > Traceback (most recent call last): > > File "/home/paul/lib/python/paul/xml/expat.py", line 50, in ? > > parser.Parse(data) > > UnicodeError: ASCII encoding error: ordinal not in range(128) > > > > Any idea what is going on here? > > the parse function requires an 8-bit string, and Python defaults > to ASCII when converting Unicode to 8-bit data. > > the simplest way to work around this is to convert the string to > the XML default encoding (utf-8) on the way in: > > def validate(data): > try: > if isinstance(data, type(u"")): > data = data.encode("utf-8") > parser.Parse(data, 1) > return 0 > except xml.parsers.expat.ExpatError: > sys.stderr.write('tagging text will result in invalid XML\n') > return 1 > > > > > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig -- ************************ *Paul Tremblay * *phthenry@earthlink.net* ************************ From fredrik@pythonware.com Thu May 8 23:16:26 2003 From: fredrik@pythonware.com (Fredrik Lundh) Date: Fri, 9 May 2003 00:16:26 +0200 Subject: [XML-SIG] Re: Re: checking a string for well-formedness References: <20030507120418.Q13810@localhost.localdomain> <20030508180838.D29954@localhost.localdomain> Message-ID: (please don't top-post) Paul Tremblay wrote: > > the parse function requires an 8-bit string, and Python defaults > > to ASCII when converting Unicode to 8-bit data. > > I must be dense when it comes to unicode. So Python converts unicode > to a 7-bit (ASCII) string? 
if you're using a Unicode string where Python expects an 8-bit string, Python refuses to guess, and raises an exception if the Unicode string contains anything that's not plain ASCII. > You solution worked, but then I immediately ame up ith a new problem > when I tried to test the speed of this funciton: > > # assume the same exact funtion from below, which I cut and pasted > for j in range(10): > data = u'text\u201cthext,' > validate(data) > > The first time the string is tested, it comes out as valid. But every > single instance afterwards comes out all ill-formed XML. You have to create a new parser for each run (my mistake; I'd already fixed two bugs in your code, and missed the third one ;-) > > def validate(data): > > try: > > if isinstance(data, type(u"")): > > data = data.encode("utf-8") + + parser = xml.parsers.expat.ParserCreate() > > parser.Parse(data, 1) > > return 0 > > except xml.parsers.expat.ExpatError: > > sys.stderr.write('tagging text will result in invalid XML\n') > > return 1 From phthenry@earthlink.net Fri May 9 02:51:52 2003 From: phthenry@earthlink.net (Paul Tremblay) Date: Thu, 8 May 2003 21:51:52 -0400 Subject: [XML-SIG] Re: Re: checking a string for well-formedness In-Reply-To: References: <20030507120418.Q13810@localhost.localdomain> <20030508180838.D29954@localhost.localdomain> Message-ID: <20030508215151.F29954@localhost.localdomain> On Fri, May 09, 2003 at 12:16:26AM +0200, Fredrik Lundh wrote: > > (please don't top-post) > > Paul Tremblay wrote: > > > > the parse function requires an 8-bit string, and Python defaults > > > to ASCII when converting Unicode to 8-bit data. > > > > I must be dense when it comes to unicode. So Python converts unicode > > to a 7-bit (ASCII) string? > > if you're using a Unicode string where Python expects an 8-bit > string, Python refuses to guess, and raises an exception if the > Unicode string contains anything that's not plain ASCII. > This makes a bit more sense. I'll have to read up on encoding. 
> > You solution worked, but then I immediately ame up ith a new problem > > when I tried to test the speed of this funciton: > > > > # assume the same exact funtion from below, which I cut and pasted > > for j in range(10): > > data = u'text\u201cthext,' > > validate(data) > > > > The first time the string is tested, it comes out as valid. But every > > single instance afterwards comes out all ill-formed XML. > > You have to create a new parser for each run (my mistake; I'd already > fixed two bugs in your code, and missed the third one ;-) > Thanks! I thought that it would take a lot of time to create a new instance each time (don't know why). It takes only one second on my 100 mhz machine to test my string 1,000 times. This method is much faster than my regular expression hack. Paul > > > def validate(data): > > > try: > > > if isinstance(data, type(u"")): > > > data = data.encode("utf-8") > > + + parser = xml.parsers.expat.ParserCreate() > > > > parser.Parse(data, 1) > > > return 0 > > > except xml.parsers.expat.ExpatError: > > > sys.stderr.write('tagging text will result in invalid XML\n') > > > return 1 > > > > > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig -- ************************ *Paul Tremblay * *phthenry@earthlink.net* ************************ From Alexandre.Fayolle@logilab.fr Fri May 9 07:34:50 2003 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Fri, 9 May 2003 08:34:50 +0200 Subject: [XML-SIG] end element handling with XMLValidator? In-Reply-To: <200305081802.LAA19259@gleason.lucasdigital.com> References: <200305081802.LAA19259@gleason.lucasdigital.com> Message-ID: <20030509063450.GH20935@calvin> On Thu, May 08, 2003 at 11:02:31AM -0700, Jed Parsons wrote: > > I'm unable to get a hold of 'end element' events from the validating > parser. Can anyone explain to me what I'm doing wrong? 
I'm not able to reproduce your problem using either python2.1 or python2.2 and pyxml 0.8.2, on a Linux machine. There's been similar reports in the past, see bug report #658932 on sourceforge, available at the following URL: http://sourceforge.net/tracker/index.php?func=detail&aid=658932&group_id=6473&atid=106473 -- Alexandre Fayolle LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Développement logiciel avancé - Intelligence Artificielle - Formations From martin@v.loewis.de Fri May 9 13:57:40 2003 From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 09 May 2003 14:57:40 +0200 Subject: [XML-SIG] Re: checking a string for well-formedness In-Reply-To: <20030508180838.D29954@localhost.localdomain> References: <20030507120418.Q13810@localhost.localdomain> <20030508180838.D29954@localhost.localdomain> Message-ID: <3EBBA5C4.6070406@v.loewis.de> Paul Tremblay wrote: > I must be dense when it comes to unicode. So Python converts unicode > to a 7-bit (ASCII) string? In some cases, yes. If you use an API function that requires a byte string, such as file.write, it converts to byte strings using the system default encoding, which is ASCII. The resulting strings are still 8-bit strings (i.e. byte strings), since your computer cannot represent 7-bit quantities. However, for each byte, the MSB will be 0. > The first time the string is tested, it comes out as valid. But every > single instance afterwards comes out all ill-formed XML. The parser maintains internal state, to remember where inside the document it is. When parse completes, the state says "at the end of the document". It is an error to provide more markup at this point. You either need to throw away the parser object and create a new one, or reset the parser object that you already have. 
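The parser-state point above can be illustrated with a short sketch. It uses xml.parsers.expat as in the thread, but in Python 3 syntax; the function name is_well_formed and the sample document are assumptions made for this example, not code from the original posts. The first half creates a fresh parser per call, the second half shows what happens when a finished parser is reused.

```python
import xml.parsers.expat


def is_well_formed(data):
    """Return True if data parses as one complete XML document."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    # A new parser for every call, so no state is carried over.
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


doc = "<p>text\u201cthext,</p>"
results = [is_well_formed(doc) for _ in range(10)]

# Reusing one parser: the first document is accepted, but the parser is
# then in its "finished" state and rejects any further input.
shared = xml.parsers.expat.ParserCreate()
shared.Parse(b"<p/>", True)
try:
    shared.Parse(b"<p/>", True)
    reuse_failed = False
except xml.parsers.expat.ExpatError:
    reuse_failed = True

print(results)
print(reuse_failed)
```

Creating the parser inside the function keeps each check independent, which is the fix suggested earlier in the thread.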
Regards, Martin From steve@bizcardscd.com Fri May 9 15:24:20 2003 From: steve@bizcardscd.com (Steve Scott (BizCardsCD.com)) Date: Fri, 09 May 2003 10:24:20 -0400 Subject: [XML-SIG] Books on CD or DVD Message-ID: <023401c31636$af1e00d0$651cd642@StevenLeonScott> Hello, I just visited your website and wanted to take a minute to introduce myself and Disc Media Manufacturing. We offer a variety of products and services your currently using. Disc Media Manufacturing is a full service CD-rom and DVD replication facility that offers in house mastering, CD-ROM, DVD-5, DVD-9 and DVD-10 replication, custom package design and printing, fully automated and hand packaging, as well as fulfillment and direct mail services. Disc Media Manufacturing currently has capacity to manufacture over 13,000,000 CD’s a month. We have also added additional DVD-5, DVD-9 and DVD-10 capacity to meet your needs. We have the experience and ability to replicate your CD/DVD title with the highest commitment to quality, security, and on time delivery! The bottom line, Disc Media offers the most completive rates, on time delivery and the best customer service in the business! I hope to learn more about your CD/DVD replication needs. I would like the opportunity to earn your business! Call or reply to this e-mail for a complete price list and or additional information! Thank you for your time and interest in Disc Media. Sincerely, Steve Scott Disc Media Manufacturing 3820 E. 5th Street Long Beach, CA 90814 PH # 562-787-0527 From phthenry@earthlink.net Fri May 9 20:53:36 2003 From: phthenry@earthlink.net (Paul Tremblay) Date: Fri, 9 May 2003 15:53:36 -0400 Subject: [XML-SIG] state of py-xml? Message-ID: <20030509155336.G29954@localhost.localdomain> What is the state of the py.xml package? I downloaded the most recent version about a year ago, and upgraded my python to 2.2. However, I got frustrated with bugs and was told I needed to use python 2.1 in order to get SAX to work.
Is it recommended to update both my version of python and the XML tools at this point? Paul -- ************************ *Paul Tremblay * *phthenry@earthlink.net* ************************ From carpet@firework.org Sun May 11 06:49:30 2003 From: carpet@firework.org (Danny Clark) Date: Sun, 11 May 2003 01:49:30 -0400 Subject: [XML-SIG] Link Sharing Message-ID: <005c01c31781$e2572d90$6601a8c0@danny> I did a web search on interior design and your site was one of those listed. I think the content of our website is similar enough to yours that our visitors would benefit from us sharing links. Therefore, I would like to make the proposal that we each put a link on our website to the other’s site. Hopefully, this will increase the traffic of both sites and provide interest to our readers. If you are interested, send me the exact URL that you would like me to use and a 1-5 word description. I intend to add it to the links page at http://www.rugs-direct.com/links which has a link from our homepage at http://www.rugs-direct.com Below is the information for my website: Link Info: http://www.rugs-direct.com Description: Area Rugs For your convenience, here is the HTML Code: Area Rugs Thank you for your time. Karen Clark Rugs Direct From jensj@fysik.dtu.dk Mon May 12 15:23:12 2003 From: jensj@fysik.dtu.dk (Jens Jorgen Mortensen) Date: Mon, 12 May 2003 16:23:12 +0200 Subject: [XML-SIG] Floating exception on alpha machine Message-ID: <200305121623.12111.jensj@bose.fysik.dtu.dk> Hi, I am trying to install PyXML-0.8.2 on an alpha. After doing a build and an install, I try to import xml.xpath. The result is: Python 2.2.1 (#1, Jun 4 2002, 15:33:18) [C] on osf1V4 Type "help", "copyright", "credits" or "license" for more information. >>> import xml.xpath Floating exception (core dumped) When I installed PyXML on my linux machine, there were no problems.
I don't know if it is important, but during build on the alpha, I got these two warnings: building '_xmlplus.parsers.pyexpat' extension creating build creating build/temp.osf1-V4.0-alpha-2.2 cc -DNDEBUG -O -Olimit 1500 -DXML_NS=1 -DXML_DTD=1 -DBYTEORDER=1234 -DXML_CONTEXT_BYTES=1024 -Iextensions/expat/lib -I/usr/local/include/python2.2 -c extensions/pyexpat.c -o build/temp.osf1-V4.0-alpha-2.2/pyexpat.o cc: Warning: extensions/expat/lib/expat.h, line 656: In this declaration, the enum "XML_Status" is not defined. (undefenum) XMLPARSEAPI(enum XML_Status) ^ cc -DNDEBUG -O -Olimit 1500 -DXML_NS=1 -DXML_DTD=1 -DBYTEORDER=1234 -DXML_CONTEXT_BYTES=1024 -Iextensions/expat/lib -I/usr/local/include/python2.2 -c extensions/expat/lib/xmlparse.c -o build/temp.osf1-V4.0-alpha-2.2/xmlparse.o cc: Warning: extensions/expat/lib/expat.h, line 656: In this declaration, the enum "XML_Status" is not defined. (undefenum) XMLPARSEAPI(enum XML_Status) ^ Does anybody know what could be wrong? Jens J. Mortensen From martin@v.loewis.de Mon May 12 22:05:05 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 12 May 2003 23:05:05 +0200 Subject: [XML-SIG] Floating exception on alpha machine In-Reply-To: <200305121623.12111.jensj@bose.fysik.dtu.dk> References: <200305121623.12111.jensj@bose.fysik.dtu.dk> Message-ID: Jens Jorgen Mortensen writes: > Does anybody know what could be wrong? No. Can you report a debugger backtrace? Regards, Martin From jberry@sandia.gov Mon May 12 23:04:03 2003 From: jberry@sandia.gov (Jon Berry) Date: Mon, 12 May 2003 16:04:03 -0600 Subject: [XML-SIG] best current way to access OO XML Schema features Message-ID: <3EC01A53.7050006@sandia.gov> I am interested in accessing the object-oriented features of XML Schema (from SOX, melded into XSDL as I understand it) from Python. I read that the Xerces parser has the required support, so I tried hooking up to that. The process had me compile Xerces-c, then try to hook up to it with Pirxx. 
For reasons involving versioning, I haven't been able to do it yet. On another tack, I looked at XSV, but all I could find was a status page. It claims to conform to XML Schema Part 1: Structures, but makes no claim about Part 2, so that gave me pause. Some of my limited reading suggests that I might still be all right (still having access to OO features), but I'm not sure since Part 1 seems to reference Part 2. Hopefully I am overlooking an easy solution. I've got the most recent PyXML and 4Suite installed. What's the easiest way to get access from Python to the most complete XML Schema parser? -Jon Berry From jedp@ilm.com Mon May 12 23:31:18 2003 From: jedp@ilm.com (Jed Parsons) Date: Mon, 12 May 2003 15:31:18 -0700 (PDT) Subject: [XML-SIG] end element handling with XMLValidator? In-Reply-To: <20030509063450.GH20935@calvin> References: <200305081802.LAA19259@gleason.lucasdigital.com> <20030509063450.GH20935@calvin> Message-ID: <200305122231.PAA41347@gleason.lucasdigital.com> Thanks - updating my PyXML to 0.8.2 did the trick. j Alexandre Fayolle writes: > On Thu, May 08, 2003 at 11:02:31AM -0700, Jed Parsons wrote: > > > > I'm unable to get a hold of 'end element' events from the validating > > parser. Can anyone explain to me what I'm doing wrong? > > I'm not able to reproduce your problem using either python2.1 or > python2.2 and pyxml 0.8.2, on a Linux machine. > > There's been similar reports in the past, see bug report #658932 on > sourceforge, available at the following URL: > > http://sourceforge.net/tracker/index.php?func=detail&aid=658932&group_id=6473&atid=106473 > > > -- > Alexandre Fayolle > LOGILAB, Paris (France).
> http://www.logilab.com http://www.logilab.fr http://www.logilab.org > Développement logiciel avancé - Intelligence Artificielle - Formations -- Jed Parsons Industrial Light + Magic (415) 448-2974 grep(do{for(ord){(!$_&&print"$s\n")||(($O+=(($_-1)%6+1)and grep(vec($s,$O++,1)=1,1..int(($_-6*6-1)/6))))}},(split(//, "++,++2-27,280,481=1-7.1++2,800+++2,8310/1+4131+1++2,80\0. What!?"))); From jberry@sandia.gov Mon May 12 23:33:58 2003 From: jberry@sandia.gov (Jon Berry) Date: Mon, 12 May 2003 16:33:58 -0600 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors Message-ID: <3EC02156.7040304@sandia.gov> The XSLT sorting articles I've seen on the web don't address the issue of sorting citations by "authors," where the latter is a group of authors (not the single authors typical of web examples). For example:
"An Article"
... I'd like to sort by the last name, alphabetically by first differing author. So the algorithm would be: * within each article, sort authors by lastname * to compare Article A with Article B: * Look at last names of first author (if different, comparison done) * else if first authors are the same, look at second authors, * etc. * if not distinguished, go on to next sorting key (say 'title') Noting of course that we might be comparing articles with different numbers of authors. In initial searches, it looked like the xsl:for-each-group and/or xsl:function constructs might help, but they don't seem to be supported by the current PyXML/4Suite implementations. So, with the constraint that I'm trying to avoid buying a book for now, is this doable using templates, easy, and currently implementable with free software? thanks, Jon Berry From tpassin@comcast.net Tue May 13 02:42:19 2003 From: tpassin@comcast.net (Thomas B. Passin) Date: Mon, 12 May 2003 21:42:19 -0400 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors References: <3EC02156.7040304@sandia.gov> Message-ID: <001e01c318f0$e6ed60c0$6401a8c0@tbp1> [Jon Berry] >... > In initial searches, it looked like the xsl:for-each-group > and/or xsl:function constructs might help, but they don't seem > to be supported by the current PyXML/4Suite implementations. > > So, with the constraint that I'm trying to avoid buying a book for now, > is this doable using templates, easy, and currently implementable > with free software? > Best to go to the Mulberry xslt list with xslt questions! The two instructions you mention are not in xslt1.0 but rather in the draft xslt2.0. It seems unlikely that the 4Thought people will ever have much interest in getting 4xslt conformant with xslt2.0, so you need another approach. 
Your problem is a grouping problem, and most grouping problems can be solved with xslt1.0 - you have to learn a few patterns that people have discovered, and sometimes be a little creative, but you can generally solve them. Check the FAQs, including Jeni Tennison's web site. If all else fails, you can accomplish your task by using a 2-pass process. Use one stylesheet to do an intermediate transform, then a final one to do the rest (which may at that point just be the sorting). The latest version of Mike Kay's Saxon supports good parts of the draft 2.0 Rec. You could use it from Python by means of a system call. Bear in mind, though, that xslt2.0 is not completely stable yet, and also there are some subtle (and some not-so-subtle, but those are less likely to cause problems) differences between similar constructions in 1.0 and 2.0. Cheers, Tom P From phthenry@earthlink.net Tue May 13 06:37:51 2003 From: phthenry@earthlink.net (Paul Tremblay) Date: Tue, 13 May 2003 01:37:51 -0400 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors In-Reply-To: <001e01c318f0$e6ed60c0$6401a8c0@tbp1> References: <3EC02156.7040304@sandia.gov> <001e01c318f0$e6ed60c0$6401a8c0@tbp1> Message-ID: <20030513013751.I29954@localhost.localdomain> On Mon, May 12, 2003 at 09:42:19PM -0400, Thomas B. Passin wrote: > > [Jon Berry] > >... > > In initial searches, it looked like the xsl:for-each-group > > and/or xsl:function constructs might help, but they don't seem > > to be supported by the current PyXML/4Suite implementations. > > > > So, with the constraint that I'm trying to avoid buying a book for now, > > is this doable using templates, easy, and currently implementable > > with free software? > > > > Best to go to the Mulberry xslt list with xslt questions! The two > instructions you mention are not in xslt1.0 but rather in the draft xslt2.0. 
> It seems unlikely that the 4Thought people will ever have much interest in > getting 4xslt conformant with xslt2.0, so you need another approach. Why is this? Paul > > Your problem is a grouping problem, and most grouping problems can be solved > with xslt1.0 - you have to learn a few patterns that people have discovered, > and sometimes be a little creative, but you can generally solve them. Check > the FAQs, including Jeni Tennison's web site. > > If all else fails, you can accomplish your task by using a 2-pass process. > Use one stylesheet to do an intermediate transform, then a final one to do > the rest (which may at that point just be the sorting). > > The latest version of Mike Kay's Saxon supports good parts of the draft 2.0 > Rec. You could use it from Python by means of a system call. Bear in mind, > though, that xslt2.0 is not completely stable yet, and also there are some > subtle (and some not-so-subtle, but those are less likely to cause problems) > differences between similar constructions in 1.0 and 2.0. > > Cheers, > > Tom P > > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig -- ************************ *Paul Tremblay * *phthenry@earthlink.net* ************************ From jensj@fysik.dtu.dk Tue May 13 13:04:11 2003 From: jensj@fysik.dtu.dk (Jens Jorgen Mortensen) Date: Tue, 13 May 2003 14:04:11 +0200 Subject: [XML-SIG] Floating exception on alpha machine In-Reply-To: References: <200305121623.12111.jensj@bose.fysik.dtu.dk> Message-ID: <200305131404.11795.jensj@bose.fysik.dtu.dk> Martin v. Löwis: > Jens Jorgen Mortensen writes: > > Does anybody know what could be wrong? > > No. Can you report a debugger backtrace? I have included my best shot at a backtrace to the end of this mail. I am afraid it is not very useful - the debugger reads the symbols before I import xml.xpath! - how should this be done?
On an alpha, it is my experience that the "floating point overflow" error often indicates that a floating point operation was performed on some uninitialized data. Jens Jørgen Backtrace: Welcome to the Ladebug Debugger Version 4.0-49 ------------------ object file name: /usr/local/bin/python2 Reading symbolic information ...done (ladebug) run Python 2.2.1 (#1, Jun 4 2002, 15:33:18) [C] on osf1V4 Type "help", "copyright", "credits" or "license" for more information. >>> import xml.xpath Thread received signal FPE stopped at [ UnknownProcedure16FromFile31(...) 0x1200710c8] Information: An type was presented during execution of the previous command. For complete type information on this symbol, recompilation of the program will be necessary. Consult the compiler man pages for details on producing full symbol table information using the -g (and -gall for cxx) flags. (ladebug) where >0 0x1200710c8 in UnknownProcedure16FromFile31(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #1 0x1200a4170 in UnknownProcedure14FromFile45(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #2 0x1200a43c8 in UnknownProcedure15FromFile45(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #3 0x1200a4bd4 in PyNumber_Multiply(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #4 0x120074fa8 in UnknownProcedure12FromFile33(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #5 0x12007c234 in PyEval_EvalCodeEx(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #6 0x120074408 in PyEval_EvalCode(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #7 0x12002f420 in PyImport_ExecCodeModuleEx(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #8 0x12002fca4 in
UnknownProcedure14FromFile11(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #9 0x1200307b8 in UnknownProcedure18FromFile11(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #10 0x12002ff20 in UnknownProcedure15FromFile11(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #11 0x120030854 in UnknownProcedure18FromFile11(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #12 0x120031da4 in UnknownProcedure27FromFile11(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #13 0x120031660 in UnknownProcedure25FromFile11(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #14 0x120031064 in PyImport_ImportModuleEx(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #15 0x120068e24 in UnknownProcedure0FromFile28(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #16 0x12006f790 in PyCFunction_Call(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #17 0x1200a7da4 in PyObject_Call(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #18 0x12007ce10 in PyEval_CallObjectWithKeywords(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #19 0x120078da0 in UnknownProcedure12FromFile33(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #20 0x12007c234 in PyEval_EvalCodeEx(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #21 0x120074408 in PyEval_EvalCode(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #22 0x120017320 in UnknownProcedure36FromFile3(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #23 0x120015694 in PyRun_InteractiveOneFlags(0x1400172e0,
0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #24 0x120015330 in PyRun_InteractiveLoopFlags(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #25 0x120015108 in PyRun_AnyFileExFlags(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #26 0x120014080 in Py_Main(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #27 0x120013570 in main(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 #28 0x1200134f8 in __start(0x1400172e0, 0x1400172e0, 0x10, 0x140038090, 0x1, 0x30) in /usr/local/bin/python2 From tpassin@comcast.net Tue May 13 14:33:37 2003 From: tpassin@comcast.net (Thomas B. Passin) Date: Tue, 13 May 2003 09:33:37 -0400 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors References: <3EC02156.7040304@sandia.gov> <001e01c318f0$e6ed60c0$6401a8c0@tbp1> <20030513013751.I29954@localhost.localdomain> Message-ID: <001201c31954$42501e40$6401a8c0@tbp1> [Paul Tremblay] > > > > Best to go to the Mulberry xslt list with xslt questions! The two > > instructions you mention are not in xslt1.0 but rather in the draft xslt2.0. > > It seems unlikely that the 4Thought people will ever have much interest in > > getting 4xslt conformant with xslt2.0, so you need another approach. > > Why is this? > Read the archives of xml-dev for the last few months, or of the xslt list for the last 6 months, and you will see a lot of anguish over the complexity of combining XSL Schema types with xslt and xpath (sometimes the discussion is more focused on XQuery, but there is a close relationship because XQuery's needs have strongly influenced xpath2). Uche has said many times that he thinks that xpath2/xslt2 are complex abominations (my words, trying to capture what I think he expressed) and that he has no interest in implementing them.
He is interested in an alternative approach to a new xpath/xslt and there is a small list where it is being discussed. Cheers, Tom P From martin@v.loewis.de Tue May 13 16:13:39 2003 From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: 13 May 2003 17:13:39 +0200 Subject: [XML-SIG] Floating exception on alpha machine In-Reply-To: <200305131404.11795.jensj@bose.fysik.dtu.dk> References: <200305121623.12111.jensj@bose.fysik.dtu.dk> <200305131404.11795.jensj@bose.fysik.dtu.dk> Message-ID: Jens Jorgen Mortensen writes: > I have included my best shot at a backtrace to the end of this mail. > I am afraid it is not very useful - the debugger reads the symbols > before I import xml.xpath! - how should this be done? If you can't get your debugger to load symbols for shared libraries afterwards, you could either try a different debugger (gdb comes to mind), or you could try to link all code with the executable. For this, you would have to relink the python binary, after putting initpyexpat into config.c (pyexpat is likely the extension module for which it can't find the symbols). HTH, Martin From fredrik@pythonware.com Tue May 13 17:22:28 2003 From: fredrik@pythonware.com (Fredrik Lundh) Date: Tue, 13 May 2003 18:22:28 +0200 Subject: [XML-SIG] ANN: ElementTree 1.1 Message-ID: The Element type is a simple but flexible container object, designed to store hierarchical data structures, such as simplified XML infosets, in memory. The ElementTree toolkit contains an Element implementation in Python, and code to read XML and HTML files into trees of Element objects, and write them out as XML. You can get the ElementTree toolkit from: http://effbot.org/downloads#elementtree Changes include XML literal factory, a self-contained ElementTree module, use ASCII as default encoding, and various minor speed and memory optimizations, etc. 
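To give a feel for the Element container and the XML literal factory mentioned in the announcement, here is a small sketch. It is an assumption-laden illustration: it uses xml.etree.ElementTree, the descendant of this toolkit that ships in the Python standard library, so the import path and some calls may differ from the 1.1 release announced here. The sample document is made up.

```python
import xml.etree.ElementTree as ET

# XML() turns a literal string into a tree of Element objects.
root = ET.XML(
    "<articles>"
    "<article year='2003'><title>An Article</title></article>"
    "</articles>"
)

# Elements behave like containers: children are found by tag name,
# attributes are read with get().
article = root.find("article")
title = article.findtext("title")
year = article.get("year")

# Trees can also be built up in code and written back out as XML.
extra = ET.SubElement(root, "article", year="2002")
ET.SubElement(extra, "title").text = "Another Article"
serialized = ET.tostring(root, encoding="unicode")

print(title, year)
print(serialized)
```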
See the README file for details: http://effbot.org/downloads/index.cgi/elementtree-1.1-20030511.zip/README Brief documentation and some code samples (including an XML-RPC unmarshaller in 16 lines) are available from: http://effbot.org/zone/element-index.htm For more background, see Uche Ogbuji's xml.com article: "Simple XML Processing With elementtree" http://www.xml.com/pub/a/2003/02/12/py-xml.html Report bugs to this list, and/or (preferred) directly to me. enjoy /F From jh@web.de Tue May 13 18:10:06 2003 From: jh@web.de (Juergen Hermann) Date: Tue, 13 May 2003 19:10:06 +0200 Subject: [XML-SIG] best current way to access OO XML Schema features In-Reply-To: <3EC01A53.7050006@sandia.gov> Message-ID: On Mon, 12 May 2003 16:04:03 -0600, Jon Berry wrote: >I read that the Xerces parser has the required >support, so I tried hooking up to that. The >process had me compile Xerces-c, then try >to hook up to it with Pirxx. For reasons involving >versioning, I haven't been able to do it yet. What was the exact version combination, including platform (gcc?). Ciao, Jürgen From kajiyama@grad.sccs.chukyo-u.ac.jp Tue May 13 18:14:03 2003 From: kajiyama@grad.sccs.chukyo-u.ac.jp (Tamito KAJIYAMA) Date: Wed, 14 May 2003 02:14:03 +0900 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors In-Reply-To: <3EC02156.7040304@sandia.gov> (jberry@sandia.gov) Message-ID: <200305131714.h4DHE3215527@grad.sccs.chukyo-u.ac.jp> "Jon Berry" writes: | | I'd like to sort by the last name, alphabetically by | first differing author. So the algorithm would be: | * within each article, sort authors by lastname | * to compare Article A with Article B: | * Look at last names of first author (if different, | comparison done) | * else if first authors are the same, look at second authors, | * etc.
| * if not distinguished, go on to next sorting key (say | 'title') | | Noting of course that we might be comparing articles with | different numbers of authors. | | In initial searches, it looked like the xsl:for-each-group | and/or xsl:function constructs might help, but they don't seem | to be supported by the current PyXML/4Suite implementations. | | So, with the constraint that I'm trying to avoid buying a book for now, | is this doable using templates, easy, and currently implementable | with free software? I believe that using extension functions is a good approach to cope with your problem. The primary cause of your problem is that the xsl:sort looks at only the value of the first node, instead of all the selected nodes. For example, the following xsl:sort element selects all the last names of an article's authors, but only the first last name is used as a sort key. So, a simple solution is to change this behavior of xsl:sort by defining an extension function like this: where the extension function ext:strings() can be defined, for example, as follows: def strings(context, nodeset): return "".join(map(lambda x: x.nodeValue, nodeset)) It seems very Pythonic, doesn't it? :-) See documentation for more information on extension functions in Python. Another comment: | * if not distinguished, go on to next sorting key (say 'title') XSLT allows multiple xsl:sort elements, so this should not be a problem. 
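The ordering rule quoted above (authors sorted within each article, articles compared author by author, title as the tie-breaker) can be sketched in plain Python, independent of any XSLT processor. The data layout and the names articles, sort_key and joined_key are assumptions for illustration only. The second function shows one way to collapse the author list into a single string key, as an extension function would have to; it joins with a separator, since joining with an empty string could make different author lists look identical.

```python
# Hypothetical sample data: each article has a title and author last names.
articles = [
    {"title": "B paper", "authors": ["Smith", "Adams"]},
    {"title": "A paper", "authors": ["Adams", "Smith"]},
    {"title": "C paper", "authors": ["Adams"]},
    {"title": "D paper", "authors": ["Baker", "Adams", "Young"]},
]


def sort_key(article):
    # Lists compare element by element, and a shorter list that is a
    # prefix of a longer one sorts first; the title breaks remaining ties.
    return (sorted(article["authors"]), article["title"])


def joined_key(last_names, sep="\0"):
    # Single-string variant of the same key. The NUL separator sorts
    # below every letter, so "Adams" + "Smith" stays ahead of "Adamson".
    return sep.join(sorted(last_names))


ordered = [a["title"] for a in sorted(articles, key=sort_key)]
ordered_by_string = [
    a["title"]
    for a in sorted(articles, key=lambda a: (joined_key(a["authors"]), a["title"]))
]

print(ordered)
print(ordered_by_string)
```

Both keys give the same order for this sample; the tuple form is the more direct expression of the rule, while the string form is closer to what a single sort key in a stylesheet can carry.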
Hope this helps, -- KAJIYAMA, Tamito From rsalz@datapower.com Tue May 13 16:55:38 2003 From: rsalz@datapower.com (Rich Salz) Date: Tue, 13 May 2003 11:55:38 -0400 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors In-Reply-To: <20030513013751.I29954@localhost.localdomain> References: <3EC02156.7040304@sandia.gov> <001e01c318f0$e6ed60c0$6401a8c0@tbp1> <20030513013751.I29954@localhost.localdomain> Message-ID: <3EC1157A.9080107@datapower.com> >>It seems unlikely that the 4Thought people will ever have much interest in >>getting 4xslt conformant with xslt2.0, so you need another approach. > > Why is this? If you look at the xml-dev list (hosted at lists.oasis.org, to skim archives), you can see that Uche has been very emphatic about his disdain for XSLT2.0, and his interest in seeing alternatives develop. /r$ From grobinson@transpose.com Wed May 14 02:21:26 2003 From: grobinson@transpose.com (Gary Robinson) Date: Tue, 13 May 2003 21:21:26 -0400 Subject: [XML-SIG] Installation trouble on OS X Message-ID: Hello, I'm trying to install PyXML-0.8.2 on OS X. I think I've located a bug in the installer. I'm running python setup.py install I get the error: % python setup.py install Traceback (most recent call last): File "setup.py", line 58, in ? if sys.platform[:6] == "darwin" and \ NameError: name 'distutils' is not defined When I examine setup.py, at the top I see: from distutils.sysconfig import get_config_vars but lines 58 and 59, where it is choking, say: if sys.platform[:6] == "darwin" and \ distutils.sysconfig.get_config_var("LDSHARED").find("-flat_namespace") \ == -1: The problem seems to be that distutils hasn't been imported; just get_config_vars has been. So the above should say: if sys.platform[:6] == "darwin" and \ get_config_vars("LDSHARED").find("-flat_namespace") == -1: Is that right? Also -- I'm a newbie to the process of doing this kind of install. 
Does it matter where in my directory structure I place the PyXML-0.8.2 directory which contains setup.py when I run it? Or can it be anywhere (in which case it must be smart enough to know where everything should go). Many thanks in advance for any help you can give. --Gary -- Gary Robinson CEO Transpose, LLC grobinson@transpose.com 207-942-3463 http://www.transpose.com http://radio.weblogs.com/0101454 From phthenry@earthlink.net Wed May 14 02:35:29 2003 From: phthenry@earthlink.net (Paul Tremblay) Date: Tue, 13 May 2003 21:35:29 -0400 Subject: [XML-SIG] XSLT sorting by "authors" element, with multiple authors In-Reply-To: <001201c31954$42501e40$6401a8c0@tbp1> References: <3EC02156.7040304@sandia.gov> <001e01c318f0$e6ed60c0$6401a8c0@tbp1> <20030513013751.I29954@localhost.localdomain> <001201c31954$42501e40$6401a8c0@tbp1> Message-ID: <20030513213528.K29954@localhost.localdomain> On Tue, May 13, 2003 at 09:33:37AM -0400, Thomas B. Passin wrote: > > Read the archives of xml-dev for the last few months, or of the xslt list > for the last 6 months, >and you will see a lot of anguish over the complexity > of combining XSL Schema types with xslt and xpath (sometimes the discussion > is more focused on XQuery, but there is a close relationship because > XQuery's needs have strongly influenced xpath2). I had kind of picked this up on the xslt mailing list the last few weeks, and just wanted to confirm my guess. Thanks Paul > > Uche has said many times that he thinks that xpath2/xslt2 are complex > abominations (my words, trying to capture what I think he expressed) and > that he has no interest in implementing them. > > He is interested in an alternative approach to a new xpath/xslt and there is > a small list where it is being discussed. 
> > Cheers, > > Tom P > > > > _______________________________________________ > XML-SIG maillist - XML-SIG@python.org > http://mail.python.org/mailman/listinfo/xml-sig -- ************************ *Paul Tremblay * *phthenry@earthlink.net* ************************ From ryanwilcox@mac.com Wed May 14 04:26:06 2003 From: ryanwilcox@mac.com (Ryan Wilcox) Date: Tue, 13 May 2003 23:26:06 -0400 Subject: [XML-SIG] Installation trouble on OS X In-Reply-To: Message-ID: On 5/13/03, at 9:21 PM, Gary Robinson said: >Hello, > >I'm trying to install PyXML-0.8.2 on OS X. I think I've located a bug in the >installer. > >I'm running > > python setup.py install > >I get the error: > >% python setup.py install >Traceback (most recent call last): > File "setup.py", line 58, in ? > if sys.platform[:6] == "darwin" and \ >NameError: name 'distutils' is not defined Yup, I get this error too. I believe you're correct in how to fix it. BTW: I did this work for Darwin Ports maybe a month back. Darwin Ports makes it pretty much braindead to install PyXML ('cause you also should install xpat, etc). (It's a package manager, like Fink) http://www.opendarwin.org/projects/darwinports/. This has also been discussed in the archives a few times... 
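One caveat on the change proposed for setup.py: get_config_vars(), when called with names, returns a list of values rather than a string, so calling .find() directly on its result would raise an error. get_config_var() (singular) returns the single value. Below is a hedged sketch of the check. It uses the standard-library sysconfig module as a stand-in for distutils.sysconfig, which offered functions of the same names but is not shipped with recent Python versions; lacks_flat_namespace is a hypothetical helper name, not something from PyXML.

```python
import sysconfig


def lacks_flat_namespace(ldshared):
    """True if the linker command string does not mention -flat_namespace."""
    # The value can be None on platforms that do not define LDSHARED.
    return (ldshared or "").find("-flat_namespace") == -1


# Plural form with a name: a list holding one entry per requested name.
values = sysconfig.get_config_vars("LDSHARED")

# Singular form: the value itself (a string, or None if undefined).
value = sysconfig.get_config_var("LDSHARED")

needs_patch = lacks_flat_namespace(value)

print(type(values).__name__)
print(needs_patch)
```

In the setup.py line quoted earlier, the equivalent repair would be to import get_config_var (singular) alongside get_config_vars, or to index the list returned by the plural form.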
Good luck, -Ryan Who recommends downloading Darwin Ports, even if to just get my patch file for setup.py ;) >-- > > >Gary Robinson >CEO >Transpose, LLC >grobinson@transpose.com >207-942-3463 >http://www.transpose.com >http://radio.weblogs.com/0101454 > > > >_______________________________________________ >XML-SIG maillist - XML-SIG@python.org >http://mail.python.org/mailman/listinfo/xml-sig --------------------------------------------------------------------- Wilcox Design: Understanding Data http://www.wilcoxd.com From jensj@fysik.dtu.dk Wed May 14 09:32:39 2003 From: jensj@fysik.dtu.dk (Jens Jorgen Mortensen) Date: Wed, 14 May 2003 10:32:39 +0200 Subject: [XML-SIG] Floating exception on alpha machine In-Reply-To: References: <200305121623.12111.jensj@bose.fysik.dtu.dk> <200305131404.11795.jensj@bose.fysik.dtu.dk> Message-ID: <200305141032.39323.jensj@bose.fysik.dtu.dk> Martin v. L=F6wis: > Jens Jorgen Mortensen writes: > > I have included my best shot at a backtrace to the end of this mail. > > I am afraid it is not very useful - the debugger reads the symbols > > before I import xml.xpath! - how should this be done? > > If you can't get your debugger to load symbols for shared libraries > afterwards, you could either try a different debugger (gdb comes to > mind) Here is what I get with gdb: GNU gdb 4.17 Copyright 1998 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you = are welcome to change it and/or distribute copies of it under certain conditi= ons. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for detail= s. This GDB was configured as "alphaev56-dec-osf4.0d"... (no debugging symbols found)... 
(gdb) set heuristic-fence-post 1000000
(gdb) run
Starting program: /usr/local/bin/python2
(no debugging symbols found)...(no debugging symbols found)...Python 2.2.1 (#1, Jun 4 2002, 15:33:18) [C] on osf1V4
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.xpath

Program received signal SIGFPE, Arithmetic exception.
0x1200710c8 in PyFloat_AsReprString ()
(gdb) backtrace
#0 0x1200710c8 in PyFloat_AsReprString ()
#1 0x1200a4174 in PyNumber_Check ()
#2 0x1200a4174 in PyNumber_Check ()
#3 0x1200a43cc in PyNumber_Check ()
#4 0x1200a4bd8 in PyNumber_Multiply ()
#5 0x120074fac in PyEval_EvalCode ()
#6 0x12007c238 in PyEval_EvalCodeEx ()
#7 0x12007440c in PyEval_EvalCode ()
#8 0x12002f424 in PyImport_ExecCodeModuleEx ()
#9 0x12002fca8 in PyImport_ExecCodeModuleEx ()
#10 0x1200307bc in PyImport_ExecCodeModuleEx ()
#11 0x12002ff24 in PyImport_ExecCodeModuleEx ()

>, or you could try to link all code with the executable. For
> this, you would have to relink the python binary, after putting
> initpyexpat into config.c (pyexpat is likely the extension module for
> which it can't find the symbols).

OK, hmmm... I will try to experiment with this some day when I have some
time.

Regards,
Jens Jørgen

From jensj@fysik.dtu.dk Wed May 14 10:10:26 2003
From: jensj@fysik.dtu.dk (Jens Jorgen Mortensen)
Date: Wed, 14 May 2003 11:10:26 +0200
Subject: [XML-SIG] Re: Floating exception on alpha machine
In-Reply-To: <200305121623.12111.jensj@bose.fysik.dtu.dk>
References: <200305121623.12111.jensj@bose.fysik.dtu.dk>
Message-ID: <200305141110.27285.jensj@bose.fysik.dtu.dk>

Jens Jorgen Mortensen:
> Hi,
>
> I am trying to install PyXML-0.8.2 on an alpha. After doing a build and an
> install, I try to import xml.xpath. The result is:
>
> Python 2.2.1 (#1, Jun 4 2002, 15:33:18) [C] on osf1V4
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import xml.xpath
> Floating exception (core dumped)

I have found that the core dump comes from line 20 in
PyXML-0.8.2/xml/xpath/__init__.py. It goes like this:

Python 2.2.1 (#1, Jun 4 2002, 15:33:18) [C] on osf1V4
Type "help", "copyright", "credits" or "license" for more information.
>>> Inf = Inf = 1e300 * 1e300
Floating exception (core dumped)

A very simple way to crash the python interpreter!! Does anybody know what to
do about it? How should Inf be defined?

Regards,
Jens Jørgen

From grobinson@transpose.com Wed May 14 13:02:13 2003
From: grobinson@transpose.com (Gary Robinson)
Date: Wed, 14 May 2003 08:02:13 -0400
Subject: [XML-SIG] Installation trouble on OS X
In-Reply-To:
Message-ID:

Thanks Ryan!

Will darwinports work to install pyxml for the apple-supplied version of
Python, or does it expect a darwinports-supplied python? (I notice that fink
tends to put things in its own locations, which are sometimes incompatible
with the locations non-fink packages expect...)

Which brings me also to my other question: when I run setup.py for a
package, does it automatically find the right directory to put the
files into, no matter where I ran it from?

--Gary

--
Gary Robinson
CEO
Transpose, LLC
grobinson@transpose.com
207-942-3463
http://www.transpose.com
http://radio.weblogs.com/0101454

> From: Ryan Wilcox
> Date: Tue, 13 May 2003 23:26:06 -0400
> To: Gary Robinson, xml-sig@python.org
> Subject: Re: [XML-SIG] Installation trouble on OS X
>
> On 5/13/03, at 9:21 PM, Gary Robinson said:
>
>> Hello,
>>
>> I'm trying to install PyXML-0.8.2 on OS X. I think I've located a bug in the
>> installer.
>>
>> I'm running
>>
>> python setup.py install
>>
>> I get the error:
>>
>> % python setup.py install
>> Traceback (most recent call last):
>> File "setup.py", line 58, in ?
>> if sys.platform[:6] == "darwin" and \
>> NameError: name 'distutils' is not defined
>
> Yup, I get this error too.
>
> I believe you're correct in how to fix it.
>
> BTW: I did this work for Darwin Ports maybe a month back. Darwin Ports makes it
> pretty much braindead to install PyXML ('cause you also should install expat,
> etc). (It's a package manager, like Fink)
>
> http://www.opendarwin.org/projects/darwinports/.
>
> This has also been discussed in the archives a few times...
>
> Good luck,
> -Ryan
>
> Who recommends downloading Darwin Ports, even if to just get my patch file for
> setup.py ;)

From ryanwilcox@mac.com Wed May 14 14:34:42 2003
From: ryanwilcox@mac.com (Ryan Wilcox)
Date: Wed, 14 May 2003 09:34:42 -0400
Subject: [XML-SIG] Installation trouble on OS X
In-Reply-To:
Message-ID:

On 5/14/03, at 8:02 AM, Gary Robinson said:

>Thanks Ryan!
>
>will darwinports work to install pyxml for the apple-supplied version of
>Python, or does it expect a darwinports-supplied python? (I notice that fink
>tends to put things in its own locations, which are sometimes incompatible
>with the locations non-fink packages expect...)

Darwin Ports puts stuff in its own location as well. You _should_ be able to
set the PYTHONPATH environment variable to include where DP puts stuff (in
this case: /opt/local/lib/python2.2/site-packages/).

>
>Which brings me also to my other question, when I run setup.py for a
>package, does it automatically find the right directory to put the
>files into, no matter where I ran it from?

It should, yes.
If you want to explicitly tell it, call setup.py like so:

python setup.py install --prefix /usr/something/somewhere

Hope this helps,
-Ryan Wilcox

---------------------------------------------------------------------
Wilcox Design: Understanding Data http://www.wilcoxd.com

From txagcs98@hotmail.com Wed May 14 18:58:24 2003
From: txagcs98@hotmail.com (William McLendon)
Date: Wed, 14 May 2003 11:58:24 -0600
Subject: [XML-SIG] How do I process CDATA when the characters looks like XML
Message-ID:

Hi,

I'm working with the basic python SAX parser (xml.sax.handler, etc) and am
needing to process some XML that has "<" and ">" characters going in CDATA
fields.

For example, the model I want to have is this:

X where "X" is some kind of character data.

but if X = giving me this situation:

A parser is going to think that "" is a new XML tag and it'll die saying
that I have mismatched tags. This field also may or may not have <*> type
entries in it, or it might have many blocks like that.

Is there a way to set a processing flag to specify that I want to grab
everything between and as a character buffer?

If there's not a built-in way to grab that, is there a trick I can use to get
the same effect? I can't just flip a state in the parser and 'build' the
string from startElement and endElement names because the character data
won't be well-formed XML (no style elements)

Any help is greatly appreciated!

-William

_________________________________________________________________
Tired of spam? Get advanced junk mail protection with MSN 8.
http://join.msn.com/?page=features/junkmail

From mike@skew.org Wed May 14 19:19:33 2003
From: mike@skew.org (Mike Brown)
Date: Wed, 14 May 2003 12:19:33 -0600 (MDT)
Subject: [XML-SIG] How do I process CDATA when the characters looks like XML
In-Reply-To: "from William McLendon at May 14, 2003 11:58:24 am"
Message-ID: <200305141819.h4EIJXrm049202@chilled.skew.org>

William McLendon wrote:
> I'm working with the basic python SAX parser (xml.sax.handler, etc) and am
> needing to process some XML that has "<" and ">" characters going in CDATA
> fields.
>
> For example, the model I want to have is this:
>
> X where "X" is some kind of character data.
>
> but if X = giving me this situation:
>
>

If you're wanting to put markup characters "<" or "&" in character data, you
must escape them, either using character or entity references, or the special
markup for a CDATA section:

&lt;B> (escaping the > is optional)
or
<![CDATA[<B>]]>

From txagcs98@hotmail.com Wed May 14 20:58:26 2003
From: txagcs98@hotmail.com (William McLendon)
Date: Wed, 14 May 2003 13:58:26 -0600
Subject: [XML-SIG] How do I process CDATA when the characters looks like XML
Message-ID:

Thanks! That did the trick.

>From: Mike Brown
>To: William McLendon
>CC: xml-sig@python.org
>Subject: Re: [XML-SIG] How do I process CDATA when the characters looks like XML
>Date: Wed, 14 May 2003 12:19:33 -0600 (MDT)
>
>William McLendon wrote:
> > I'm working with the basic python SAX parser (xml.sax.handler, etc) and am
> > needing to process some XML that has "<" and ">" characters going in CDATA
> > fields.
> >
> > For example, the model I want to have is this:
> >
> > X where "X" is some kind of character data.
> >
> > but if X = giving me this situation:
> >
> >
>
>If you're wanting to put markup characters "<" or "&" in character data, you
>must escape them, either using character or entity references, or the special
>markup for a CDATA section:
>
>&lt;B> (escaping the > is optional)
>or
><![CDATA[<B>]]>

From martin@v.loewis.de Wed May 14 21:38:40 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 14 May 2003 22:38:40 +0200
Subject: [XML-SIG] Floating exception on alpha machine
In-Reply-To: <200305141032.39323.jensj@bose.fysik.dtu.dk>
References: <200305121623.12111.jensj@bose.fysik.dtu.dk> <200305131404.11795.jensj@bose.fysik.dtu.dk> <200305141032.39323.jensj@bose.fysik.dtu.dk>
Message-ID:

Jens Jorgen Mortensen writes:

> Here is what I get with gdb:

Thanks. Doesn't sound too convincing: those functions don't call each
other. Did you strip the executables?

> >, or you could try to link all code with the executable. For
> > this, you would have to relink the python binary, after putting
> > initpyexpat into config.c (pyexpat is likely the extension module for
> > which it can't find the symbols).
>
> OK, hmmm... I will try to experiment with this some day when I have some
> time.

I think this might give additional insights still.

Regards,
Martin

From martin@v.loewis.de Wed May 14 21:41:20 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 14 May 2003 22:41:20 +0200
Subject: [XML-SIG] Re: Floating exception on alpha machine
In-Reply-To: <200305141110.27285.jensj@bose.fysik.dtu.dk>
References: <200305121623.12111.jensj@bose.fysik.dtu.dk> <200305141110.27285.jensj@bose.fysik.dtu.dk>
Message-ID:

Jens Jorgen Mortensen writes:

> >>> Inf = Inf = 1e300 * 1e300
> Floating exception (core dumped)
>
> A very simple way to crash the python interpreter!!

The code is fine; the interpreter should not crash. Please report a
bug at sf.net/projects/python. The interpreter should not crash. At
worst, it should raise a Python exception.

> Does anybody know what to do about it? How should Inf be defined?

Only a platform expert could know.

Regards,
Martin

From rodsenra@gpr.com.br Wed May 14 22:48:43 2003
From: rodsenra@gpr.com.br (Rodrigo Senra)
Date: Wed, 14 May 2003 18:48:43 -0300
Subject: [XML-SIG] Problems passing paths to xmlproc_catalog
Message-ID: <20030514214833.066D4128082@gunga.terra.com.br>

Hi,

First of all thanks for PyXML. It is great!

Just recently I found out something that might be a bug.
(But it could be a feature, in that case I apologize in advance ;o)

This works fine:

cat=catalog.xmlproc_catalog("catalog.soc",\
catalog.CatParserFactory())

However, that does not:

cat=catalog.xmlproc_catalog("\catalog.soc",\
catalog.CatParserFactory())

If I pass a path in the first parameter instead of the file name,
it does not work anymore.

This can be reproduced with xvcmd.py:

OK -> python xvcmd.py -c catalog.soc urls.xml
BUG -> python xvcmd.py -c /tmp/catalog.soc urls.xml

I'm using xmlproc that is shipped in PyXML 0.8.2.

I have already forwarded this message to the xmlproc maintainer,
but I thought this would interest xml-sig users also.

Thanks in advance,
Rod Senra

--
Rodrigo Senra (ICQ 114477550)
MSc Computer Engineer rodsenra@gpr.com.br
GPr Sistemas Ltda http://www.gpr.com.br

From tpassin@comcast.net Thu May 15 04:59:39 2003
From: tpassin@comcast.net (Thomas B. Passin)
Date: Wed, 14 May 2003 23:59:39 -0400
Subject: [XML-SIG] Re: Floating exception on alpha machine
References: <200305121623.12111.jensj@bose.fysik.dtu.dk> <200305141110.27285.jensj@bose.fysik.dtu.dk>
Message-ID: <002601c31a96$68b6c190$6401a8c0@tbp1>

[Martin v. Löwis]
> Jens Jorgen Mortensen writes:
>
> > >>> Inf = Inf = 1e300 * 1e300
> > Floating exception (core dumped)
> >
> > A very simple way to crash the python interpreter!!
>
> The code is fine; the interpreter should not crash. Please report a
> bug at sf.net/projects/python. The interpreter should not crash. At
> worst, it should raise a Python exception.
>
> > Does anybody know what to do about it? How should Inf be defined?
>

I only know that it does not crash on Python 2.2.1 on Windows 2000.

Cheers,

Tom P

From tpassin@comcast.net Thu May 15 05:02:17 2003
From: tpassin@comcast.net (Thomas B. Passin)
Date: Thu, 15 May 2003 00:02:17 -0400
Subject: [XML-SIG] Problems passing paths to xmlproc_catalog
References: <20030514214833.066D4128082@gunga.terra.com.br>
Message-ID: <002b01c31a96$c690c9a0$6401a8c0@tbp1>

[Rodrigo Senra]
>
> This can be reproduced with xvcmd.py:
>
> OK -> python xvcmd.py -c catalog.soc urls.xml
> BUG -> python xvcmd.py -c /tmp/catalog.soc urls.xml
>

Try using a file: url, either

file:/tmp/catalog.soc urls.xml

or

file:///tmp/catalog.soc urls.xml

(the latter is correct but some code might still be wanting the first form).

This is just a guess but it is worth trying.

Cheers,

Tom P

From fredrik@pythonware.com Thu May 15 08:24:23 2003
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 15 May 2003 09:24:23 +0200
Subject: [XML-SIG] Re: Re: Floating exception on alpha machine
References: <200305121623.12111.jensj@bose.fysik.dtu.dk><200305141110.27285.jensj@bose.fysik.dtu.dk>
Message-ID:

Martin v. Löwis wrote:

> > >>> Inf = Inf = 1e300 * 1e300
> > Floating exception (core dumped)
> >
> > A very simple way to crash the python interpreter!!
>
> The code is fine; the interpreter should not crash. Please report a
> bug at sf.net/projects/python. The interpreter should not crash. At
> worst, it should raise a Python exception.
>
> > Does anybody know what to do about it? How should Inf be defined?
>
> Only a platform expert could know.

brief summary:

ax005> cat q.c
main()
{
double value = 1e300;
printf("%f\n", value * value);
}
ax005> cc q.c
ax005> ./a.out
Floating exception (core dumped)
ax005> cc -ieee q.c
ax005> ./a.out
INF
ax005> man ieee
... more information ...

From noreply@sourceforge.net Thu May 15 17:12:43 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 15 May 2003 09:12:43 -0700
Subject: [XML-SIG] [ pyxml-Bugs-738362 ] XHtmlPrettyPrint: Bad URL in DOCTYPE
Message-ID:

Bugs item #738362, was opened at 2003-05-15 18:12
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=738362&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jacek Konieczny (jajcus)
Assigned to: Nobody/Anonymous (nobody)
Summary: XHtmlPrettyPrint: Bad URL in DOCTYPE

Initial Comment:
Following code results in invalid XHTML - the URL in
DOCTYPE is wrong.

--- cut ---
#!/usr/bin/python

from xml.dom import implementation
from xml.dom import DOMException
from xml.dom.ext import XHtmlPrettyPrint

doc=implementation.createDocument(None,None,None)
doc.appendChild(doc.createElement("html"))
XHtmlPrettyPrint(doc)
--- cut ---

Here is the output:
--- cut ---
--- cut ---

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=738362&group_id=6473

From noreply@sourceforge.net Thu May 15 17:23:54 2003
From: noreply@sourceforge.net (SourceForge.net)
Date: Thu, 15 May 2003 09:23:54 -0700
Subject: [XML-SIG] [ pyxml-Bugs-738366 ] Unneccessary comments from external DTD
Message-ID:

Bugs item #738366, was opened at 2003-05-15 18:23
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=738366&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution:
None
Priority: 5
Submitted By: Jacek Konieczny (jajcus)
Assigned to: Nobody/Anonymous (nobody)
Summary: Unneccessary comments from external DTD

Initial Comment:
When reading a document that uses an external DTD with
xml.dom.ext.reader.Sax2.Reader and then writing it with
xml.dom.ext.Print or PrettyPrint, the output includes
unnecessary comments from the DTD.

Here is a sample of code:
-- cut ---
from xml.dom import implementation
from xml.dom import DOMException
from xml.dom.ext.reader.Sax2 import Reader
from xml.dom.ext import PrettyPrint
import cStringIO

doc_str=""" """

reader=Reader()
doc=reader.fromStream(cStringIO.StringIO(doc_str))
PrettyPrint(doc)
-- cut ---

And here is a fragment of its output:
-- cut ---