From sbaush at gmail.com Fri Feb 3 11:09:47 2006
From: sbaush at gmail.com (Sbaush)
Date: Fri, 3 Feb 2006 11:09:47 +0100
Subject: [XML-SIG] Use DOM for do it
In-Reply-To:
References:
Message-ID:

Hi all.
I have this function that writes an XML string.
Is it possible to do it without ElementTree but with DOM?
Thanks.

import sys
import elementtree.ElementTree as ET

root = ET.Element("manager")
req=ET.SubElement(root,"request")
app= ET.SubElement(req,"append")
app.set("mode","INPUT")
met=ET.SubElement(app,"method")
met.set("type","GOOD")
src=ET.SubElement(app,"source")
src.set("address"," 127.0.0.1")
act=ET.SubElement(app,"action")
act.set("option","OK")
tree = ET.ElementTree(root)
tree.write(sys.stdout)
print

--
Sbaush
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20060203/36438523/attachment.htm

From bob at redivi.com Fri Feb 3 23:30:23 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 3 Feb 2006 14:30:23 -0800
Subject: [XML-SIG] PyXML 0.8.4 and expat byteorder
Message-ID: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>

Here's the PyXML patch that gets expat byteorder from pyconfig.h. I
don't know who the maintainer is nor do I have any interest in
subscribing to xml-sig (this CC will probably bounce, or get stuck in
mod queue for days/weeks/forever). If you give a damn about PyXML
please make sure to get the patch to the right person.

I've never even installed the 4Suite stuff, so I'm not going to put
together a patch for that. Such a patch should be roughly the same as
this one.

-bob

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PyXML-0.8.4-byteorder.patch
Type: application/octet-stream
Size: 1002 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20060203/ac81574d/attachment.obj
-------------- next part --------------

From noreply at sourceforge.net Fri Feb 3 23:59:45 2006
From: noreply at sourceforge.net (SourceForge.net)
Date: Fri, 03 Feb 2006 14:59:45 -0800
Subject: [XML-SIG] [ pyxml-Patches-1423775 ] expat byteorder breaks for OS X universal binary builds
Message-ID:

Patches item #1423775, was opened at 2006-02-03 17:59
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=1423775&group_id=6473

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: expat
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mike Taylor (code-bear)
Assigned to: Nobody/Anonymous (nobody)
Summary: expat byteorder breaks for OS X universal binary builds

Initial Comment:
I'm copying this patch from the pythonmac-sig mailing list where the
issue was talked about. The author of the patch is not part of the
PyXML community and I wanted to make sure the patch was noticed.
The patch mail entry:
http://mail.python.org/pipermail/pythonmac-sig/2006-February/015878.html

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=1423775&group_id=6473

From bear42 at code-bear.com Sat Feb 4 00:01:28 2006
From: bear42 at code-bear.com (bear)
Date: Fri, 03 Feb 2006 18:01:28 -0500
Subject: [XML-SIG] [Pythonmac-SIG] PyXML 0.8.4 and expat byteorder
In-Reply-To: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>
References: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>
Message-ID: <43E3E0C8.7080509@code-bear.com>

I've taken the patch and submitted it to the PyXML sourceforge project
and included a link to your mailing list archive entry for reference.

http://sourceforge.net/tracker/index.php?func=detail&aid=1423775&group_id=6473&atid=306473

Bob Ippolito wrote:
> Here's the PyXML patch that gets expat byteorder from pyconfig.h. I
> don't know who the maintainer is nor do I have any interest in
> subscribing to xml-sig (this CC will probably bounce, or get stuck in
> mod queue for days/weeks/forever). If you give a damn about PyXML
> please make sure to get the patch to the right person.

From evdo.hsdpa at gmail.com Sat Feb 4 01:38:04 2006
From: evdo.hsdpa at gmail.com (Robert Kim Wireless Internet Advisor)
Date: Fri, 3 Feb 2006 16:38:04 -0800
Subject: [XML-SIG] [Pythonmac-SIG] PyXML 0.8.4 and expat byteorder
In-Reply-To: <43E3E0C8.7080509@code-bear.com>
References: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com> <43E3E0C8.7080509@code-bear.com>
Message-ID: <1ec620e90602031638u643d37f6wc3dec8b325cdcc33@mail.gmail.com>

verrrrry cool! thanks! - bk

On 2/3/06, bear wrote:
> I've taken the patch and submitted it to the PyXML sourceforge project
> and included a link to your mailing list archive entry for reference.

--
Robert Q Kim, Wireless Internet Advisor
http://hsdpa-coverage.com
http://www.antennacoverage.com/cell-repeater.html
2611 S. Pacific Coast Highway 101
Suite 102
Cardiff by the Sea, CA 92007
206 984 0880

From sbaush at gmail.com Mon Feb 6 18:35:25 2006
From: sbaush at gmail.com (Sbaush)
Date: Mon, 6 Feb 2006 18:35:25 +0100
Subject: [XML-SIG] problem in ElementTree SubElement
Message-ID:

Hi all.
I would get this element in xml:

I have write this:

date=ET.SubElement(idsreq,"date")
date.set("month",month)
date.set("day",day)

but i get this:

The attributes are not in my order!!
how i can get the attributes in right order???
Thanks all.

--
Sbaush
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20060206/c3128ac5/attachment.htm

From radovan.chytracek at gmail.com Mon Feb 6 19:21:18 2006
From: radovan.chytracek at gmail.com (Radovan Chytracek)
Date: Mon, 6 Feb 2006 19:21:18 +0100
Subject: [XML-SIG] problem in ElementTree SubElement
In-Reply-To:
References:
Message-ID:

Hi,

   you simply can't rely on the order of attributes unless your XML
data are in canonical form which keeps attributes alphabetically
ordered. I guess this a very simple way of saying that the SAX parser
likely to be running behind ElementTree API layer does not preserve
the order of attributes. In general SAX(2) does not have to. Please
correct me if I am wrong about this.

Cheers
   Radovan

On 2/6/06, Sbaush wrote:
> Hi all.
> I would get this element in xml:
>
>
>
> I have write this:
>
> date=ET.SubElement(idsreq,"date")
> date.set("month",month)
> date.set("day",day)
>
> but i get this:
>
>
>
> The attributes are not in my order!!
> how i can get the attributes in right order???
> Thanks all.
>
> --
> Sbaush
> _______________________________________________
> XML-SIG maillist - XML-SIG at python.org
> http://mail.python.org/mailman/listinfo/xml-sig
>
>
>

--
Radovan Chytracek CERN IT PSS
mailto:Radovan.Chytracek at cern.ch
phone: +41227674578 fax: +41227669830

From sbaush at gmail.com Mon Feb 6 20:10:31 2006
From: sbaush at gmail.com (Sbaush)
Date: Mon, 6 Feb 2006 20:10:31 +0100
Subject: [XML-SIG] problem in ElementTree SubElement
In-Reply-To:
References:
Message-ID:

is possible to preserve the order building the XML tree with DOM?

2006/2/6, Radovan Chytracek :
>
> Hi,
>
> you simply can't rely on the order of attributes unless your XML
> data are in canonical form which keeps attributes alphabetically
> ordered. I guess this a very simple way of saying that the SAX parser
> likely to be running behind ElementTree API layer does not preserve
> the order of attributes. In general SAX(2) does not have to. Please
> correct me if I am wrong about this.
>
> Cheers
> Radovan
>
> On 2/6/06, Sbaush wrote:
> > Hi all.
> > I would get this element in xml:
> >
> >
> >
> > I have write this:
> >
> > date=ET.SubElement(idsreq,"date")
> > date.set("month",month)
> > date.set("day",day)
> >
> > but i get this:
> >
> >
> >
> > The attributes are not in my order!!
> > how i can get the attributes in right order???
> > Thanks all.
> >
> > --
> > Sbaush
> > _______________________________________________
> > XML-SIG maillist - XML-SIG at python.org
> > http://mail.python.org/mailman/listinfo/xml-sig
> >
> >
> >
>
>
> --
> Radovan Chytracek CERN IT PSS
> mailto:Radovan.Chytracek at cern.ch
> phone: +41227674578 fax: +41227669830
> _______________________________________________
> XML-SIG maillist - XML-SIG at python.org
> http://mail.python.org/mailman/listinfo/xml-sig
>

--
Sbaush
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20060206/6f3ae73e/attachment.htm

From fredrik at pythonware.com Mon Feb 6 20:14:28 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 6 Feb 2006 20:14:28 +0100
Subject: [XML-SIG] problem in ElementTree SubElement
References:
Message-ID:

Sbaush wrote:

> is possible to preserve the order building the XML tree with DOM?

no, because the order isn't important in XML. if you want to invent
your own file format, you shouldn't call it XML, and you shouldn't use
XML tools.

From dkgunter at lbl.gov Wed Feb 8 05:13:49 2006
From: dkgunter at lbl.gov (Dan Gunter)
Date: Tue, 07 Feb 2006 20:13:49 -0800
Subject: [XML-SIG] problem in ElementTree SubElement
In-Reply-To:
References:
Message-ID: <43E96FFD.4000105@lbl.gov>

Right, in general XML processors don't care about attribute order (I
don't know much about canonicalization but that does sound like the
obvious exception). The XML Infoset specifically says they are an
unordered set: http://www.w3.org/TR/xml-infoset/#infoitem.element ; so,
if you care about order, rather than canonicalizing everything, maybe
you should switch to using elements, e.g.
0206

-Dan

Radovan Chytracek wrote:

>Hi,
>
> you simply can't rely on the order of attributes unless your XML
>data are in canonical form which keeps attributes alphabetically
>ordered. I guess this a very simple way of saying that the SAX parser
>likely to be running behind ElementTree API layer does not preserve
>the order of attributes. In general SAX(2) does not have to. Please
>correct me if I am wrong about this.
>
>Cheers
> Radovan
>
>On 2/6/06, Sbaush wrote:
>
>
>>Hi all.
>>I would get this element in xml:
>>
>>
>>
>>I have write this:
>>
>>date=ET.SubElement(idsreq,"date")
>> date.set("month",month)
>> date.set("day",day)
>>
>>but i get this:
>>
>>
>>
>>The attributes are not in my order!!
>>how i can get the attributes in right order???
>>Thanks all.
>>
>>--
>>Sbaush
>>_______________________________________________
>>XML-SIG maillist - XML-SIG at python.org
>>http://mail.python.org/mailman/listinfo/xml-sig
>>
>>
>>
>>
>>
>
>
>--
>Radovan Chytracek CERN IT PSS
>mailto:Radovan.Chytracek at cern.ch
>phone: +41227674578 fax: +41227669830
>_______________________________________________
>XML-SIG maillist - XML-SIG at python.org
>http://mail.python.org/mailman/listinfo/xml-sig
>
>

From fredrik at pythonware.com Wed Feb 8 09:09:47 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 8 Feb 2006 09:09:47 +0100
Subject: [XML-SIG] problem in ElementTree SubElement
References: <43E96FFD.4000105@lbl.gov>
Message-ID:

Dan Gunter wrote:

> Right, in general XML processors don't care about attribute order (I
> don't know much about canonicalization but that does sound like the
> obvious exception).

http://www.w3.org/TR/xml-c14n says to sort lexicographically on
(namespace uri, local tag).

(which, of course, is exactly what ET's default writer does)

From cesar.ortiz at gmail.com Wed Feb 8 11:46:01 2006
From: cesar.ortiz at gmail.com (Cesar Ortiz)
Date: Wed, 8 Feb 2006 11:46:01 +0100
Subject: [XML-SIG] Encoding detection in the html parser from libxml2
Message-ID: <90255a70602080246xc182997s3c64229925e31133@mail.gmail.com>

Hi,

I am parsing html documents using the html parser from libxml2, and if
the encoding is included in the document it works perfectly but if it
is not, I think it does not work well (probably because I am doing
something wrong).

As it is said in http://xmlsoft.org/encoding.html the parser should
detect the encoding. So I tested it putting an utf-8 word in a file and
it does not detect it (it generates a wrong string). Example:
reducci??n --> reducci???n.

I just use the parser as a SAX parser because I do not need a tree, so
to parse the file I use the htmlParseChunk() function and I create the
context with htmlCreatePushParser().
Is it posible that the encoding detection does not work with htmlParseChunk? If it is so, what method should I use? Thanks, Cesar -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/xml-sig/attachments/20060208/2c6c0901/attachment.htm From veillard at redhat.com Wed Feb 8 12:55:31 2006 From: veillard at redhat.com (Daniel Veillard) Date: Wed, 8 Feb 2006 06:55:31 -0500 Subject: [XML-SIG] Encoding detection in the html parser from libxml2 In-Reply-To: <90255a70602080246xc182997s3c64229925e31133@mail.gmail.com> References: <90255a70602080246xc182997s3c64229925e31133@mail.gmail.com> Message-ID: <20060208115531.GF30975@redhat.com> On Wed, Feb 08, 2006 at 11:46:01AM +0100, Cesar Ortiz wrote: > Hi, > > I am parsing html documents using the html parser from libxml2, and if > the encoding is included in the document it works perfectly but if it > is not, I think it does not work well (probably because I am doing > something wrong). Well first thing wrong is that this is not libxml2 help mailing list, see http://xmlsoft.org/bugs.html > As it is said in > http://xmlsoft.org/encoding.htmlthe > parser should > detect the encoding. autodetection is done on XML based on the XMLDecl and the default values as specified by the XML specification. On HTML all bets are off if you don't have a meta tag or if you didn't indicate the encoding to the parser. > So I tested it putting an utf-8 word in a file and > it does not detect it (it generates a wrong string). Example: > reducci??n --> reducci???n. encoding is an entity property (i.e. per file) not per word. So either I don't understand your test or this just can't work. http://xmlsoft.org/html/libxml-HTMLparser.html#htmlCreatePushParserCtxt use the encoding field when creating your parser. 
For further information/help, subscribe and use the libxml2
mailing-list,

thanks,

Daniel

--
Daniel Veillard | Red Hat http://redhat.com/
veillard at redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

From clerc at uni-bremen.de Thu Feb 9 14:09:23 2006
From: clerc at uni-bremen.de (Daniel Clerc)
Date: Thu, 9 Feb 2006 14:09:23 +0100
Subject: [XML-SIG] problems with encoding and SAX
Message-ID: <82d07e750602090509p6e07a5dcwbf74f1f665d4a84d@mail.gmail.com>

Hi everybody!

I have some trouble with SAX and encodings...

When I try to parse the following XML-code:

K'R®
^^^^^^^^^^^^^^

... I get this error message.

    self._err_handler.fatalError(exc)
  File "C:\Python24\Lib\site-packages\_xmlplus\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: xml_temp.xml:3766:13: not well-formed (invalid token)

...

Here you can find the python-code I use:
http://knopaste.de/index.php?module=hilight&id=142

Maybe the encoding of the content between the xml-elements is
mismatching from the encoding specified.

As I have to parse quite a lot of log files (~1GB zipped), and there
are only a handful of such errors I would be very happy when I could
find a way to tell sax just not to worry and write the string anyway.

Parsing the xml-code with the MS-XML-DOM, or a JAVA-based parser is not
a problem, but I would prefer a solution in Python.

Thanks,
Daniel

From uche.ogbuji at fourthought.com Thu Feb 9 23:08:36 2006
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu, 09 Feb 2006 15:08:36 -0700
Subject: [XML-SIG] PyXML 0.8.4 and expat byteorder
In-Reply-To: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>
References: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>
Message-ID: <43EBBD64.9050502@fourthought.com>

Bob Ippolito wrote:
> Here's the PyXML patch that gets expat byteorder from pyconfig.h. I
> don't know who the maintainer is nor do I have any interest in
> subscribing to xml-sig (this CC will probably bounce, or get stuck in
> mod queue for days/weeks/forever). If you give a damn about PyXML
> please make sure to get the patch to the right person.
>
> I've never even installed the 4Suite stuff, so I'm not going to put
> together a patch for that. Such a patch should be roughly the same as
> this one.

Never a worry. 4Suite developers track expat *very* closely (and even
contribute back to expat itself). We came across and addressed this
issue months ago.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

From uche.ogbuji at fourthought.com Thu Feb 9 22:57:50 2006
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu, 09 Feb 2006 14:57:50 -0700
Subject: [XML-SIG] PyXML 0.8.4 and expat byteorder
In-Reply-To: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>
References: <11FE3731-82C4-4D1E-9ECF-AE50ABE314E4@redivi.com>
Message-ID: <43EBBADE.8050506@fourthought.com>

Bob Ippolito wrote:
> Here's the PyXML patch that gets expat byteorder from pyconfig.h. I
> don't know who the maintainer is nor do I have any interest in
> subscribing to xml-sig (this CC will probably bounce, or get stuck in
> mod queue for days/weeks/forever). If you give a damn about PyXML
> please make sure to get the patch to the right person.
>
> I've never even installed the 4Suite stuff, so I'm not going to put
> together a patch for that. Such a patch should be roughly the same as
> this one.

You can relax. 4Suite developers track expat *very* closely (and even
contribute back to expat itself). We came across and addressed this
issue months ago.

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

From uche.ogbuji at fourthought.com Thu Feb 9 23:30:55 2006
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu, 09 Feb 2006 15:30:55 -0700
Subject: [XML-SIG] Canonical XML and attribute order
In-Reply-To:
References: <43E96FFD.4000105@lbl.gov>
Message-ID: <43EBC29F.5050102@fourthought.com>

Fredrik Lundh wrote:
> Dan Gunter wrote:
>
>
>> Right, in general XML processors don't care about attribute order (I
>> don't know much about canonicalization but that does sound like the
>> obvious exception).
>>
>
> http://www.w3.org/TR/xml-c14n says to sort lexicographically on
> (namespace uri, local tag).
>
> (which, of course, is exactly what ET's default writer does)
>

I just want to clarify that there is a lot more to canonicalization
than that. There's surely no problem with adopting conventions from
Canonical XML, but it doesn't really make sense to treat that spec as
an authority in snippets. Either you have Canonical XML or you don't.

FYI if you do want Canonical XML, you can use PyXML's c14n module, or
you can use PyGenx to generate XML:

http://software.translucentcode.org/pygenx/

PyGenx is based on Genx, which always creates Canonical XML.

Side note: I have a c14n module I've put together for Amara, and it's
intended for the next release. It's based on 4Suite's fast SAX parser,
contrasting PyXML's, which is DOM-based (PyGenx is expat based, and
thus SAX-like).

Ob c14n reference:

http://www.ibm.com/developerworks/xml/library/x-c14n/

All that having been said, the OP is looking to address a common
problem among makers of XML authoring tools--the need to respect the
user's choice of attribute order and other such lexical details. It's
not really useful to repeat over and over that the XML spec states that
attribute order is not considered significant in determining the
conformance of a parser.
And it's very unfair to state that the OP is somehow fudging the grand
name of "XML". Just as a fun exercise in monkey-wrench throwing, if you
read carefully enough, there's the little-known fact that XML 1.0
doesn't require parsers to report child elements in any particular
order, either.

It's more useful to say that most XML parsers do choose to ignore
attribute order, because they are based on an abstract information
model of XML (such as the Infoset, the XPath data model or the like)
rather than the lexical form of the entities. For this reason most XML
editing tools rely on either specialized raw text frameworks, or a
hybrid of raw text with XML events (more usually the latter). This does
not mean that they are not XML processors, but just that they do choose
to preserve details that the XML spec does not *require* them to
preserve.

The OP's best bet is to reuse another engine that already gets this
right, although I admit that I don't know of one available for Python.
I certainly do not write such tools, but my colleague Simon St.Laurent
did have a go at such a generic tool for Java.

Ob XML and information ordering reference:

http://www-128.ibm.com/developerworks/xml/library/x-eleord.html

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

From mike at skew.org Thu Feb 9 23:26:45 2006
From: mike at skew.org (Mike Brown)
Date: Thu, 9 Feb 2006 15:26:45 -0700 (MST)
Subject: [XML-SIG] problems with encoding and SAX
In-Reply-To: <82d07e750602090509p6e07a5dcwbf74f1f665d4a84d@mail.gmail.com>
Message-ID: <200602092226.k19MQjRS084852@chilled.skew.org>

Daniel Clerc wrote:
> Hi everybody!
>
> I have some trouble with SAX and encodings...
>
> When I try to parse the following XML-code:
>
>
>
> DURATION="1001">
> K'R®
> ^^^^^^^^^^^^^^

Look where the closing tag for TRANSACTION is. Copy-paste error in your
email?
Or does the XML actually look like that?

You also seem to have a couple of illegal control characters in your
QUESTION element. My editor shows them as ^Y^N, so I guess they are
U+0019 and U+000E, respectively. Both are disallowed in XML.

From clerc at uni-bremen.de Fri Feb 10 13:20:03 2006
From: clerc at uni-bremen.de (Daniel Clerc)
Date: Fri, 10 Feb 2006 13:20:03 +0100
Subject: [XML-SIG] [SOLVED] Re: problems with encoding and SAX
In-Reply-To: <82d07e750602090509p6e07a5dcwbf74f1f665d4a84d@mail.gmail.com>
References: <82d07e750602090509p6e07a5dcwbf74f1f665d4a84d@mail.gmail.com>
Message-ID: <82d07e750602100420m30fe7cbevdb5b199b2a28379e@mail.gmail.com>

Hi!

Thanks for your help! The XML file contains illegal chars. See:
http://www.w3.org/TR/2004/REC-xml-20040204/#charsets for legal chars.

So I need to build a regexp in order to get rid of the unwanted
characters.

best regards,
Daniel

On 2/9/06, Daniel Clerc wrote:
> Hi everybody!
>
> I have some trouble with SAX and encodings...
>
> When I try to parse the following XML-code:
>
>
>
> DURATION="1001">
> K'R®

From guthrie at mum.edu Sat Feb 11 18:34:30 2006
From: guthrie at mum.edu (Gregory Guthrie)
Date: Sat, 11 Feb 2006 11:34:30 -0600
Subject: [XML-SIG] python XML install problem..
Message-ID: <6.2.5.6.2.20060211113119.01cfb4e8@mum.edu>

I am trying to use a package from:
   From Python Cookbook; http://aspn.activestate.com/ASPN/WebServices/SWSAPI/pytut

It uses XML package; so I got:
   PyXML-0.8.4

When I try to install it (on Windows..) with
   python setup.py install
I get:

D:\Temp\Python\PyXML-0.8.4>python setup.py install
running install
running build
running build_py
running build_ext
error: The .NET Framework SDK needs to be installed before building
extensions for Python.

Thanks.
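
Regarding the regexp Daniel Clerc mentions in his [SOLVED] message
above: a minimal sketch of one way to do it, assuming Python 3 and
text that has already been decoded to str. The character class is the
complement of the XML 1.0 "Char" production from the spec section he
cites; the names strip_illegal_xml_chars and _ILLEGAL_XML_CHARS are
illustrative only and do not come from the thread.

```python
import re

# Complement of the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# (illustrative name, not from the thread)
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text, replacement=""):
    """Return text with every character XML 1.0 disallows replaced."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


# The two control characters Mike Brown identified (U+0019 and U+000E)
# are removed; the registered sign and ordinary text are kept.
cleaned = strip_illegal_xml_chars("K'R\x19\x0e\u00ae")
print(repr(cleaned))
```

Running the cleanup on the raw log text before handing it to the SAX
parser keeps the parser strict while dropping only the characters the
spec forbids.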
-----------------------------------------------
Gregory Guthrie
MUM Faculty Mail - FM 1068
Fairfield, IA 52557
http://www.mum.edu/~guthrie
(641)472-7773
------------------------------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20060211/ae74afde/attachment.html

From mike at skew.org Sat Feb 11 22:25:50 2006
From: mike at skew.org (Mike Brown)
Date: Sat, 11 Feb 2006 14:25:50 -0700 (MST)
Subject: [XML-SIG] python XML install problem..
In-Reply-To: <6.2.5.6.2.20060211113119.01cfb4e8@mum.edu>
Message-ID: <200602112125.k1BLPoOO014285@chilled.skew.org>

> D:\Temp\Python\PyXML-0.8.4>python setup.py install
> running install
> running build
> running build_py
> running build_ext
> error: The .NET Framework SDK needs to be installed before building
> extensions for Python.

On Windows, you don't need to build PyXML from source. Go to
http://sourceforge.net/project/showfiles.php?group_id=6473 and download
the appropriate installer file. For Python 2.4 you just need
PyXML-0.8.4.win32-py2.4.exe.

From ken.beesley at xrce.xerox.com Sun Feb 12 14:20:49 2006
From: ken.beesley at xrce.xerox.com (Ken Beesley)
Date: Sun, 12 Feb 2006 14:20:49 +0100
Subject: [XML-SIG] Python 2.4.2, OS X, ucs4 build, unicodedata problem
In-Reply-To:
References:
Message-ID: <43EF3631.8050609@xrce.xerox.com>

Python 2.4.2, OS X, ucs4 build, unicodedata problem

I need a ucs4 build of Python to reliably handle XML files that can
contain supplemental Unicode characters (newer characters beyond the
Basic Multilingual Plane).

I recently upgraded to OS X 10.4.4 and downloaded (from
http://www.python.org/download) the sources for Python 2.4.2.
After detarring the package, I did

./configure --enable-framework --enable-unicode-ucs4
make
sudo make install

Which created and installed a 2.4.2 Python executable,

/Library/Frameworks/Python.framework/Versions/2.4/bin/python

I can run it, and I confirmed that it is a ucs4 build, e.g.

len(u'\U00010400')

returns 1, rather than the 2 returned by a ucs2 build. (Python 2.3.5,
supplied with 10.4, is a ucs2 build.)

THE PROBLEM: when I try (manually, or in a script) to import the
unicodedata package, I get the traceback below, which seems to
complain about a symbol __PyUnicodeUCS2_ToNumeric not being
found when the unicodedata module is imported.

Has anyone out there seen or dealt with this problem? Am I just doing
something wrong?

Thanks,

Ken

*********************** Traceback ****************************

% python
Python 2.4.2 (#2, Oct 24 2005, 22:26:37)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
Traceback (most recent call last):
  File "", line 1, in ?
ImportError: Failure linking new module: /Library/Frameworks/
Python.framework/Versions/2.4/lib/python2.4/lib-dynload/
unicodedata.so: Symbol not found: __PyUnicodeUCS2_ToNumeric
  Referenced from: /Library/Frameworks/Python.framework/Versions/2.4/
lib/python2.4/lib-dynload/unicodedata.so
  Expected in: dynamic lookup

From mike at skew.org Mon Feb 13 00:28:48 2006
From: mike at skew.org (Mike Brown)
Date: Sun, 12 Feb 2006 16:28:48 -0700 (MST)
Subject: [XML-SIG] Python 2.4.2, OS X, ucs4 build, unicodedata problem
In-Reply-To: <43EF3631.8050609@xrce.xerox.com>
Message-ID: <200602122328.k1CNSmPG030138@chilled.skew.org>

Ken Beesley wrote:
> % python
> Python 2.4.2 (#2, Oct 24 2005, 22:26:37)
> [GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import unicodedata
> Traceback (most recent call last):
> File "", line 1, in ?
> ImportError: Failure linking new module: /Library/Frameworks/
> Python.framework/Versions/2.4/lib/python2.4/lib-dynload/ unicodedata.so:
> Symbol not found: __PyUnicodeUCS2_ToNumeric
> Referenced from: /Library/Frameworks/Python.framework/Versions/2.4/
> lib/python2.4/lib-dynload/unicodedata.so
> Expected in: dynamic lookup

Since it doesn't have anything directly to do with XML in Python, I
suggest posting to python-list / comp.lang.python:
http://mail.python.org/mailman/listinfo/python-list

More people who can help you monitor that forum. Good luck.

From inguin at gmx.de Mon Feb 13 15:30:42 2006
From: inguin at gmx.de (Ingo van Lil)
Date: Mon, 13 Feb 2006 15:30:42 +0100
Subject: [XML-SIG] pyexpat: Comments before DOCTYPE
Message-ID: <20060213143042.GA10101@marvin.csn.tu-chemnitz.de>

Hello there,

I ran into a minor problem using the xml.dom.minidom XML parser: An XML
document having a comment before a DOCTYPE node seems to leave the DOM
data structures in an inconsistent state.

Let's say I have a little test.xml file:

Hello world

and a little Python program to parse it:

from xml.dom.minidom import parse
dom = parse("test.xml")
print "document node:", dom
print len(dom.childNodes), "children"
print "first child:", dom.firstChild
print "next sibling:", dom.firstChild.nextSibling

The output of that program is:

document node:
3 children
first child:
next sibling: None

I.e. the document node does have three children (a comment node, a
DocumentType instance and an element), but the first child's
nextSibling pointer isn't set correctly. This breaks my algorithm,
which is supposed to recursively walk the entire DOM tree, but stops
after the first node instead.

I'm not entirely sure whether this really is a bug in pyexpat or an
error in my XML file. I haven't found any hints whether an XML document
is allowed to have comment before the DOCTYPE declaration. xmllint
doesn't seem to complain about it, though.
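
A tree walk that recurses over childNodes does not depend on the
nextSibling pointers at all, so it visits all three top-level children
whether or not those pointers are set correctly. The following is only
a sketch under assumptions: it targets Python 3 and the standard
library's xml.dom.minidom, the SAMPLE document is a hypothetical
stand-in for test.xml (whose markup did not survive in the archive),
and walk() is an illustrative helper, not part of minidom.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical stand-in for test.xml: a comment placed before the DOCTYPE.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<!-- leading comment -->\n'
    '<!DOCTYPE greeting>\n'
    '<greeting>Hello world</greeting>\n'
)


def walk(node, depth=0):
    """Yield (depth, node) pairs in document order, using childNodes only."""
    yield depth, node
    for child in node.childNodes:
        yield from walk(child, depth + 1)


dom = parseString(SAMPLE)
for depth, node in walk(dom):
    print("  " * depth + node.nodeName)

# Node types of the document's direct children, in order.
top_level_types = [child.nodeType for child in dom.childNodes]
```

The recursion relies on every minidom node exposing a childNodes
sequence (empty for leaf nodes such as comments and text), which is why
no sibling links are needed.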
Cheers,
Ingo

From inguin at gmx.de Mon Feb 13 21:47:25 2006
From: inguin at gmx.de (Ingo van Lil)
Date: Mon, 13 Feb 2006 21:47:25 +0100
Subject: [XML-SIG] pyexpat: Comments before DOCTYPE
In-Reply-To: <20060213143042.GA10101@marvin.csn.tu-chemnitz.de>
References: <20060213143042.GA10101@marvin.csn.tu-chemnitz.de>
Message-ID: <20060213204725.GA6320@marvin.csn.tu-chemnitz.de>

On 13 Feb 2006, Ingo van Lil wrote:

> I ran into a minor problem using the xml.dom.minidom XML parser: An XML
> document having a comment before a DOCTYPE node seems to leave the DOM
> data structures in an inconsistent state.

Hi again.

I had a look at the source code, and the reason for the effect I
observed isn't all that hard to spot: The start_doctype_decl_handler in
expatbuilder.py:240 directly manipulates the document's childNodes
vector rather than using the _append_child function responsible for
keeping all those nextSibling/previousSibling/parentNode pointers
up-to-date.

Unless the current behaviour is for some reason intentional (I doubt
it), the appended patch (against Python 2.4.2) should fix the problem.

Cheers,
Ingo
-------------- next part --------------
--- Lib/xml/dom/expatbuilder.py.orig 2006-02-13 20:53:44.000000000 +0100
+++ Lib/xml/dom/expatbuilder.py 2006-02-13 20:55:29.000000000 +0100
@@ -242,7 +242,7 @@
         doctype = self.document.implementation.createDocumentType(
             doctypeName, publicId, systemId)
         doctype.ownerDocument = self.document
-        self.document.childNodes.append(doctype)
+        _append_child(self.document, doctype)
         self.document.doctype = doctype
         if self._filter and self._filter.acceptNode(doctype) == FILTER_REJECT:
             self.document.doctype = None

From ajay at infogridpacific.com Mon Feb 20 14:33:44 2006
From: ajay at infogridpacific.com (Ajay Abhyankar)
Date: Mon, 20 Feb 2006 19:03:44 +0530
Subject: [XML-SIG] Namespace prefix being changed while saving file.
Message-ID: <43F9C538.4050409@infogridpacific.com>

Hi,

I was trying cElementTree for reading and updating an xml file.
I am using iterparse to parse and make relevant changes to the xml as
required.
Everything works very fine till I use a valid xml namespace in xml file.
It is not giving any problems in manipulation of file content, but only
changes the namespace prefix on its own to something like "ns0" and
retains the original URL, when the file is written back after updates.
Can the namespace prefix be retained after manipulation? Am I doing
something wrong or have I missed out on something.
Please help to understand and solve the problem.

Thanks in advance.

Ajay

From vincent at hydrosoft.com.br Thu Feb 23 14:31:23 2006
From: vincent at hydrosoft.com.br (Vincent Buonomano)
Date: Thu, 23 Feb 2006 10:31:23 -0300
Subject: [XML-SIG] Constructing complex trees from relational data bases using XML schemas
Message-ID: <002001c6387d$73b5ce80$0100000a@star>

The XMLServer presents an XML view of a relational database which is
defined by an XML schema with all the necessary information contained
in the appinfo elements of the schema. It serves the constructed XML to
a browser, Java or Mathematica. You may get a record by key and the
next or previous one.

There are currently a large number of products dedicated to this end
(see Bourret). What distinguishes this product is its ability to
construct arbitrarily complex trees.

Examples

You may download it from XMLServer Beta Version 1, and freely use and
distribute it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20060223/f0ed5d55/attachment.html

From fredrik at pythonware.com Tue Feb 28 08:31:19 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 28 Feb 2006 08:31:19 +0100
Subject: [XML-SIG] Namespace prefix being changed while saving file.
References: <43F9C538.4050409@infogridpacific.com>
Message-ID:

Ajay Abhyankar wrote:

> I was trying cElementTree for reading and updating an xml file. I am
> using iterparse to parse and make relevant changes to the xml as required.
> Everything works very fine till I use a valid xml namespace in xml file.
> It is not giving any problems in manipulation of file content, but only
> changes the namespace prefix on its own to something like "ns0" and
> retains the original URL, when the file is written back after updates.
> Can the namespace prefix be retained after manipulation? Am I doing
> something wrong or have I missed out on something.
> Please help to understand and solve the problem.

the standard ET parser throws away the prefix, and the standard
serializer generates new prefixes on the fly. for many applications,
this is not a problem -- it's the namespace URL that matters in XML,
not the prefix.

if you want to preserve namespaces under stock ET, your best bet is to
use iterparse's namespace events to collect prefix information, and
either update the _namespace_map dictionary:

    from elementtree import ElementTree

    # undocumented, guaranteed to be supported in all 1.2 releases
    ElementTree._namespace_map[url] = prefix
    ElementTree._namespace_map[url] = prefix
    ...

    ... the serializer now maps {url}foo to prefix:foo, for all url/prefix
    ... pairs in the namespace map ...

or use a custom serializer (or a postprocessing step).

hope this helps!

From dieter at handshake.de Tue Feb 28 19:40:53 2006
From: dieter at handshake.de (Dieter Maurer)
Date: Tue, 28 Feb 2006 19:40:53 +0100
Subject: [XML-SIG] Python 2.4.2, OS X, ucs4 build, unicodedata problem
In-Reply-To: <43EF3631.8050609@xrce.xerox.com>
References: <43EF3631.8050609@xrce.xerox.com>
Message-ID: <17412.39221.220486.888777@gargle.gargle.HOWL>

Ken Beesley wrote at 2006-2-12 14:20 +0100:
> ...
>THE PROBLEM: when I try (manually, or in a script) to import the
>unicodedata package, I get the traceback below, which seems to
>complain about a symbol __PyUnicodeUCS2_ToNumeric not being
>found when the unicodedata module is imported.

It looks like you did not rebuild "unicodedata".

Whenever you change the generation options for a Python build,
you usually need to regenerate all extensions (such as "unicodedata")
to ensure that they use the same options.

Usually, a "make clean" should be sufficient to get rid of the old
versions.

--
Dieter
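
On the namespace-prefix thread above (Fredrik Lundh's reply to Ajay
Abhyankar), the "collect prefix information from iterparse's namespace
events" approach might look roughly like the sketch below. It rests on
assumptions the thread does not state: it uses the modern standard
library module xml.etree.ElementTree (Python 3) and its public
register_namespace() function instead of the 2006 elementtree package
and the undocumented _namespace_map; SAMPLE and
parse_keeping_prefixes() are hypothetical names for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input using a "doc" prefix we would like to keep on output.
SAMPLE = (
    b'<doc:root xmlns:doc="http://example.org/doc">'
    b'<doc:item>1</doc:item>'
    b'</doc:root>'
)


def parse_keeping_prefixes(source):
    """Parse source and register each prefix/URI pair seen along the way.

    Note: register_namespace() changes module-wide state, so the mapping
    affects every later serialization in the same process.
    """
    root = None
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            if prefix:  # skip the default namespace (empty prefix)
                ET.register_namespace(prefix, uri)
        elif root is None:
            root = payload  # the first "start" event is the root element
    return root


root = parse_keeping_prefixes(io.BytesIO(SAMPLE))
root[0].text = "2"  # some update to the tree
output = ET.tostring(root, encoding="unicode")
print(output)
```

If two documents bind the same URI to different prefixes, the last
registration wins, so a custom serializer (as suggested in the reply)
may still be the safer route for mixed inputs.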