From martin@loewis.home.cs.tu-berlin.de Sun Jul 1 19:20:59 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Sun, 1 Jul 2001 20:20:59 +0200 Subject: [XML-SIG] Re: Narval 1.0 and Python 2.1 In-Reply-To: <3B3E05A7.CA87E510@zolera.com> (message from Rich Salz on Sat, 30 Jun 2001 13:00:23 -0400) References: <200106210745.f5L7jrm01579@mira.informatik.hu-berlin.de> <3B33669D.9EB0C620@FourThought.com> <200106301634.f5UGYDQ08994@mira.informatik.hu-berlin.de> <3B3E05A7.CA87E510@zolera.com> Message-ID: <200107011820.f61IKxA01012@mira.informatik.hu-berlin.de> > > I wanted to integrate 4XSLT into PyXML, in a way that does not > > require 4Suite. > > That means 4XPATH also, right? Right. Martin From tpassin@home.com Mon Jul 2 03:41:58 2001 From: tpassin@home.com (Thomas B. Passin) Date: Sun, 1 Jul 2001 22:41:58 -0400 Subject: [XML-SIG] 4xslt bug involving key() References: <006601c0f077$eb31fc70$f803a8c0@zeus> <004101c0f21c$fa2e0560$7cac1218@reston1.va.home.com> <3B3DE70A.653002A6@FourThought.com> Message-ID: <000f01c102a0$911cef20$7cac1218@reston1.va.home.com> [Mike Olson] Thanks, Mike, I'll see if I can retrieve it and get it to work. Much appreciated. Cheers, Tom P > "Thomas B. Passin" wrote: > > > Thomas, > > I forget if someone replied to you, but this appears to be fixed in > CVS. > > Mike > > > > I've just found that a stylesheet construction that I need to use doesn't > > work right with 4xslt (the python 1.5.2 version I got from the 4suite.org > > site several weeks ago). > > > > The stylesheet takes a number of elements that have duplicated content and > > produce a list without duplicates. It's a simplified Muenchian method, > > using and key(). It works right with msxml3, saxon, and xalan, > > but not 4xslt. I need to use this in a project I'm in the middle of at > > work, so I request the 4thought people (Mike, would that be?) to take a look > > at it. 
> > > > If 4xslt doesn't implement keys (I thought it did), then at least it should > > throw an error. > > XML strategy, XML tools (http://4Suite.org), knowledge management From uche.ogbuji@fourthought.com Mon Jul 2 06:02:32 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Sun, 01 Jul 2001 23:02:32 -0600 Subject: [XML-SIG] Re: [4suite] 4xslt: bug and patch: variable import order References: <15096.25410.753829.204197@lindm.dm> Message-ID: <3B400068.BF01CAC8@fourthought.com> Dieter Maurer wrote: > > The XSLT spec specifies that definitions and template rules > in an importing stylesheet take precedence over those from > an imported stylesheet. This is essential for easy customization > of imported stylesheets. > > "4xslt" implements this feature only partially: > > Top level variables in an importing stylesheet do not > take precedence over imported ones. > > The attached patch hopefully fixes the problem. > It ensures that variables in importing style sheets > take precedence over those defined in imported style sheets > and that all style sheets use the same top level variables. Note: my fix was quite different. I hadn't applied this patch because I knew the problem was more fundamental. Thanks, though. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA XML strategy, XML tools (http://4Suite.org), knowledge management From uogbuji@fourthought.com Mon Jul 2 06:56:47 2001 From: uogbuji@fourthought.com (Uche Ogbuji) Date: Sun, 01 Jul 2001 23:56:47 -0600 Subject: [XML-SIG] Reader() newbie question Message-ID: <200107020556.f625ulp02108@localhost.local> > > p = make_parser("xml.sax.drivers.drv_xmlproc") > > reader = Sax.Reader(parser=p) > > Sorry it took so long to get back to you. This works fine for me, Thanks. 
> In my case since I want to use a validating parser I use: > > p = make_parser("xml.sax.drivers.drv_xmlproc_val") > reader = Sax.Reader(parser=p) Oops. Right. > Since I can also write: > > p = make_parser("xml.sax.drivers.drv.pyexpat") > reader = Sax.Reader(parser=p) > > what is considered the "correct" method if I want to use expat? The above > line or what I have seen more often: > > reader = PyExpat.Reader() > > I mean, sure the second method is one line shorter, but the first one is > consistent across all the parsers on my machine under 'xml.sax.drivers' > Does the second method do important init stuff (or whatever) that I am > missing? Their both right, and equivalent, since PyXML sets up "xml.sax.drivers.drv.pyexpat" as the default non-validating SAX driver. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA XML strategy, XML tools (http://4Suite.org), knowledge management From uche.ogbuji@fourthought.com Mon Jul 2 19:29:33 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Mon, 02 Jul 2001 12:29:33 -0600 Subject: [XML-SIG] Reader() newbie question In-Reply-To: Message from Uche Ogbuji of "Sun, 01 Jul 2001 23:56:47 MDT." <200107020556.f625ulp02108@localhost.local> Message-ID: <200107021829.f62ITXN04420@localhost.local> Me: > Their both right, and equivalent, since PyXML sets up... ^^^^^ Ouch. I'm more sleep-deprived than I thought. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA XML strategy, XML tools (http://4Suite.org), knowledge management From tpassin@home.com Tue Jul 3 01:20:21 2001 From: tpassin@home.com (Thomas B. 
Passin)
Date: Mon, 2 Jul 2001 20:20:21 -0400
Subject: [XML-SIG] 4xslt bug involving key()
References: <006601c0f077$eb31fc70$f803a8c0@zeus> <004101c0f21c$fa2e0560$7cac1218@reston1.va.home.com> <3B3DE70A.653002A6@FourThought.com>
Message-ID: <006b01c10355$f2eff480$7cac1218@reston1.va.home.com>

4xslt from CVS doesn't run. Here's what I did. I have pyxml 0.65/python 1.5.2 on Windows. I copied the three directories xslt, xpath, util from the CVS on SourceForge, renamed the corresponding 0.65 directories to save them, and copied the three CVS directories in their place. Then I ran my command line wrapper for 4xslt.

I get an error, of which this is the salient part:

  File "D:\PROGRA~2\PYTHON\xml\xpath\Conversions.py", line 23, in ?
    from xml.utils import boolean
ImportError: cannot import name boolean

There is no file called "boolean" in the CVS, nor does xml\util\__init__.py define boolean. What do I need to make this work?

Cheers,

Tom P

[Mike Olson]
> "Thomas B. Passin" wrote:
> >
>
> Thomas,
>
> I forget if someone replied to you, but this appears to be fixed in
> CVS.
>

From larsga@garshol.priv.no Tue Jul 3 07:44:21 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 03 Jul 2001 08:44:21 +0200
Subject: [XML-SIG] SAX event: internalEntityDecl
In-Reply-To: <3B3DD62E.22498.3C2354@localhost>
References: <3B3DD62E.22498.3C2354@localhost>
Message-ID:

* Arne Krug
|
| is there a way to distinguish between SAX entity-events:
| one Entity is declared in an external dtd and
| the other one is directly in the xml-file

It is possible to do this, though the information is not directly present in any specific event. Using other events it is possible to figure out where you are at any given time.

  startDTD(...)
    # events here come from the internal subset
  startEntity("[dtd]")
    # events here come from the external subset
  endEntity("[dtd]")
  endDTD(...)

I hope this helps.

--Lars M.
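
The event ordering described above can be turned into a small state tracker. This is only a sketch under assumptions: the class name SubsetTracker and its where() method are invented for illustration and are not part of PyXML, and whether a particular parser driver really reports startEntity("[dtd]") has to be checked against that driver. The events are fired by hand here so the example does not depend on any parser.

```python
class SubsetTracker:
    """Remember whether DTD events arrive from the internal or external subset.

    Hypothetical helper: it only mirrors the LexicalHandler event order
    startDTD / startEntity("[dtd]") / endEntity("[dtd]") / endDTD.
    """

    def __init__(self):
        self.in_dtd = False
        self.in_external_subset = False
        self.declared = []  # list of (entity name, origin) pairs

    # LexicalHandler-style events
    def startDTD(self, name, public_id, system_id):
        self.in_dtd = True

    def endDTD(self):
        self.in_dtd = False

    def startEntity(self, name):
        if name == "[dtd]":
            self.in_external_subset = True

    def endEntity(self, name):
        if name == "[dtd]":
            self.in_external_subset = False

    def where(self):
        """Return "document", "internal" or "external" for the current position."""
        if not self.in_dtd:
            return "document"
        if self.in_external_subset:
            return "external"
        return "internal"

    # DeclHandler-style event: record where each entity was declared
    def internalEntityDecl(self, name, value):
        self.declared.append((name, self.where()))


# Simulate the event sequence by hand (entity names are made up).
tracker = SubsetTracker()
tracker.startDTD("mail", None, "mail.dtd")
tracker.internalEntityDecl("henning", "Henning")   # inside the internal subset
tracker.startEntity("[dtd]")
tracker.internalEntityDecl("ingo", "Ingo")         # inside the external subset
tracker.endEntity("[dtd]")
tracker.endDTD()
print(tracker.declared)
```

In a real handler the same four lexical events would be received from the parser instead of being called directly; the bookkeeping stays the same.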
From jeremy.kloth@fourthought.com Tue Jul 3 15:44:31 2001 From: jeremy.kloth@fourthought.com (Jeremy Kloth) Date: Tue, 3 Jul 2001 08:44:31 -0600 Subject: [XML-SIG] 4xslt bug involving key() References: <006601c0f077$eb31fc70$f803a8c0@zeus> <004101c0f21c$fa2e0560$7cac1218@reston1.va.home.com> <3B3DE70A.653002A6@FourThought.com> <006b01c10355$f2eff480$7cac1218@reston1.va.home.com> Message-ID: <002901c103ce$acf26e80$703d64c0@den.xcare.net> From: "Thomas B. Passin" > 4xslt from CVS doesn't run. Here's what I did. I have pyxml 0.65/python > 1.5.2 on Windows. I copied the three directories xslt, xpath, util from the > CVS on SourceForge, renamed the corresponding 0.65 directories to save them, > and copied the three CVS directories in their place. Then I ran my command > line wrapper for 4xslt. > > I get an error, of which this is the salient part: > > File "D:\PROGRA~2\PYTHON\xml\xpath\Conversions.py", line 23, in ? > from xml.utils import boolean > ImportError: cannot import name boolean > > There is no file called "boolean" in the CVS, nor does xml\util\__init__.py > define boolean. What do I need to make this work? > The boolean module is an extension module that should have been made if 'setup.py install' was run. The source for that extension lives in (from CVS root) xml/extensions. -- Jeremy Kloth Consultant jeremy.kloth@fourthought.com +1 303 583 9900 x 105 Fourthought, Inc. http://fourthought.com 4735 East Walnut St, Suite C, Boulder, CO 80301-2537, USA XML strategy, XML tools (http://4suite.org), knowledge management From uche.ogbuji@fourthought.com Tue Jul 3 16:38:04 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Tue, 03 Jul 2001 09:38:04 -0600 Subject: [XML-SIG] 4xslt bug involving key() In-Reply-To: Message from "Thomas B. Passin" of "Mon, 02 Jul 2001 20:20:21 EDT." <006b01c10355$f2eff480$7cac1218@reston1.va.home.com> Message-ID: <200107031538.f63Fc4t09145@localhost.local> > 4xslt from CVS doesn't run. Here's what I did. 
I have pyxml 0.65/python
> 1.5.2 on Windows. I copied the three directories xslt, xpath, util from the
> CVS on SourceForge, renamed the corresponding 0.65 directories to save them,
> and copied the three CVS directories in their place. Then I ran my command
> line wrapper for 4xslt.
>
> I get an error, of which this is the salient part:
>
>   File "D:\PROGRA~2\PYTHON\xml\xpath\Conversions.py", line 23, in ?
>     from xml.utils import boolean
> ImportError: cannot import name boolean
>
> There is no file called "boolean" in the CVS, nor does xml\util\__init__.py
> define boolean. What do I need to make this work?

Weird. None of this should have changed since the beta.

xml.utils.boolean.so (or .pyd) should have been built with your PyXML build. For instance, on my machine:

/usr/local/lib/python2.1/site-packages/_xmlplus/utils/boolean.so

How did you build/install PyXML?

BTW, you'll want the most recent CVS 4Suite (from a few hours ago): important fixes.

-- 
Uche Ogbuji                    Principal Consultant
uche.ogbuji@fourthought.com    +1 303 583 9900 x 101
Fourthought, Inc.              http://Fourthought.com
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management

From law@otelnet.com Tue Jul 3 16:34:41 2001
From: law@otelnet.com (Katherina Law)
Date: Tue, 3 Jul 2001 08:34:41 -0700
Subject: [XML-SIG] build question
Message-ID: <65E7CA3B34A0D211B65300A0C9E1CF4F065E9D50@bluewhale.otelnet.com>

We have Python 2 running on Sun OS 2.6. When I tried to compile, I got the following error. Do I need to have libcurses.so.1? Where can I find it?

>python setup.py build
ld.so.1: python: fatal: libcurses.so.1: open failed: No such file or directory
Killed

Many thanks,
Katherina

From Alexandre.Fayolle@logilab.fr Tue Jul 3 18:38:14 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Tue, 3 Jul 2001 19:38:14 +0200 (CEST)
Subject: [XML-SIG] The new version of XPath
Message-ID:

Hello,

Just a quick question.
The new version of XPath is 8bit character friendly, and possibly unicode friendly, which is great news as far as I'm concerned. Is it thread safe ? Alexandre Fayolle -- http://www.logilab.com Narval is the first software agent available as free software (GPL). LOGILAB, Paris (France). From uche.ogbuji@fourthought.com Tue Jul 3 18:46:01 2001 From: uche.ogbuji@fourthought.com (Uche Ogbuji) Date: Tue, 03 Jul 2001 11:46:01 -0600 Subject: [XML-SIG] The new version of XPath In-Reply-To: Message from Alexandre Fayolle of "Tue, 03 Jul 2001 19:38:14 +0200." Message-ID: <200107031746.f63Hk1j09490@localhost.local> > Just a quick question. The new version of XPath is 8bit character > friendly, and possibly unicode friendly, which is great news as far as I'm > concerned. It's unicode friendly using UTF-8. > Is it thread safe ? It should be. If you find any problems with threading, do let us know. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA XML strategy, XML tools (http://4Suite.org), knowledge management From jeremy.kloth@fourthought.com Tue Jul 3 19:46:20 2001 From: jeremy.kloth@fourthought.com (Jeremy Kloth) Date: Tue, 3 Jul 2001 12:46:20 -0600 Subject: [XML-SIG] The new version of XPath References: Message-ID: <00da01c103f0$738c6660$703d64c0@den.xcare.net> From: "Alexandre Fayolle" > Hello, > > Just a quick question. The new version of XPath is 8bit character > friendly, and possibly unicode friendly, which is great news as far as I'm > concerned. Is it thread safe ? > Both the C and pure Python parsers are completely stateless. The concurrency issues from before went away when we removed Flex (Bison was already stateless). -- Jeremy Kloth Consultant jeremy.kloth@fourthought.com +1 303 583 9900 x 105 Fourthought, Inc. 
http://fourthought.com
4735 East Walnut St, Suite C, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4suite.org), knowledge management

From tpassin@home.com Wed Jul 4 00:20:14 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Tue, 3 Jul 2001 19:20:14 -0400
Subject: [XML-SIG] 4xslt bug involving key()
References: <200107031538.f63Fc4t09145@localhost.local>
Message-ID: <004801c10416$b6d09880$7cac1218@reston1.va.home.com>

[Uche Ogbuji]

> > There is no file called "boolean" in the CVS, nor does xml\util\__init__.py
> > define boolean. What do I need to make this work?
>
> Weird. None of this should have changed since the beta.
>
> xml.utils.boolean.so (or .pyd) should have been built with your PyXML build.
> For instance, on my machine:
>
> /usr/local/lib/python2.1/site-packages/_xmlplus/utils/boolean.so
>
> How did you build/install PyXML?
>

I installed the Pyxml 0.65 binary for Windows Python 1.5.2. I did not install a complete new installation from the CVS.
I also don't own any Microsoft C compilers and I'm not about to shell out to get one, so any "setup.py install" that wants to compile something is out of luck.

But the required file must be pretty simple, right? Do I have to get the whole CVS compiled/installed to get the latest version of 4xslt working, or what?

Cheers,

Tom P

From noreply@sourceforge.net Wed Jul 4 01:12:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 03 Jul 2001 17:12:02 -0700
Subject: [XML-SIG] [ pyxml-Bugs-438397 ] truncated content passed to characters()
Message-ID:

Bugs item #438397, was opened at 2001-07-03 17:12
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=438397&group_id=6473

Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mr. Codepage (codepage)
Assigned to: Nobody/Anonymous (nobody)
Summary: truncated content passed to characters()

Initial Comment:
Parsing a pretty simple 500k xml file. The bad output lines in question look like

c     <--- truncated, should be com.xxxxxx.ejb.domain.intfc
com/xxxxxx/ejb/domain/intfc/AdverseReactionType.java
com.xxxxxx.e     <--- truncated
com/xxxxxx/ejb/service/hsif/msgHandler/intfc/HLSevenHandler.java

This is an xml file that describes the source pool at a certain release point in time. I rewrote the small script in Java with Xerces and it is fine. The XML file does NOT contain truncated data. If I extract the portions of the datafile above that are having problems and put them in their own xml file, it works fine (with the code below). It is only this configuration of the datafile that is truncating the value of content passed to characters(). The XML file is well formed.
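
One possible explanation, offered as an assumption rather than a diagnosis of this report: SAX permits a parser to deliver the text of a single element in several characters() calls, so a handler that treats each call as the complete text can end up with fragments such as "c" or "com.xxxxxx.e". The sketch below collects the chunks and joins them in endElement(). The class name BufferedText and its attributes are invented for illustration, it uses the standard library's xml.sax rather than PyXML 0.6.5, and the callbacks are invoked by hand to simulate a parser that splits the text.

```python
from xml.sax.handler import ContentHandler


class BufferedText(ContentHandler):
    """Collect the full text of each <package> element (hypothetical handler)."""

    def __init__(self):
        ContentHandler.__init__(self)
        self._chunks = None   # list of text pieces while inside <package>
        self.packages = []    # completed package names

    def startElement(self, name, attrs):
        if name == "package":
            self._chunks = []

    def characters(self, content):
        # May be called more than once per text node, so only accumulate here.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "package" and self._chunks is not None:
            self.packages.append("".join(self._chunks))
            self._chunks = None


# Simulate a parser that splits one text node into two chunks.
handler = BufferedText()
handler.startElement("package", {})
handler.characters("com.xxxxxx.e")
handler.characters("jb.domain.intfc")
handler.endElement("package")
print(handler.packages)
```

If buffering like this makes the symptom disappear, the split delivery was the cause; if not, the report points at something else.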
class packageScan(saxutils.DefaultHandler):
    def __init__(self):
        self.showText = 0
        self.grabPath = 0
        self.Path = ""

    def startElement(self, name, attrs):
        if name == "package":
            self.showText = 1
        elif name == "path":
            self.grabPath = 1

    def characters(self, content):
        if self.showText == 1:
            if len(content) < 13:
                print content
                print self.Path
            self.showText = 0
        if self.grabPath == 1:
            self.Path = content
            self.grabPath = 0

python 2.1
pyxml 0.6.5

I would be happy to test any workarounds, patches, etc.

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=438397&group_id=6473

From tpassin@home.com Wed Jul 4 04:43:42 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Tue, 3 Jul 2001 23:43:42 -0400
Subject: [XML-SIG] 4xslt bug involving key()
References: <200107031538.f63Fc4t09145@localhost.local>
Message-ID: <000e01c1043b$85d51380$7cac1218@reston1.va.home.com>

[Uche Ogbuji]

> >
> >   File "D:\PROGRA~2\PYTHON\xml\xpath\Conversions.py", line 23, in ?
> >     from xml.utils import boolean
> > ImportError: cannot import name boolean
> >
> > There is no file called "boolean" in the CVS, nor does xml\util\__init__.py
> > define boolean. What do I need to make this work?
>
> Weird. None of this should have changed since the beta.
>
> xml.utils.boolean.so (or .pyd) should have been built with your PyXML build.
> For instance, on my machine:
>
> /usr/local/lib/python2.1/site-packages/_xmlplus/utils/boolean.so
>
> How did you build/install PyXML?
>

OK, I'm making progress. My installation on Windows has a boolean.pyd in both the ft/Lib and ft/extensions directories, both the same file. This is the 0.11 version. I copied that file to the xml/utils directory so the xpath script could find it. Apparently this file is now supposed to be in xml/utils, not extensions.

Now there is a different failure:

  File "D:\PROGRA~2\PYTHON\xml\xpath\CoreFunctions.py", line 21, in ?
    from xml.xpath import Util, Conversions
  File "D:\PROGRA~2\PYTHON\xml\xpath\Conversions.py", line 179, in ?
    _strConversions = {
AttributeError: BooleanType

I looked at the boolean.h and boolean.c files in the cvs, and they contain PyBoolean_Type, not PyBooleanType. There is no string BooleanType. Also I looked at my boolean.pyd with a hex editor, and it doesn't contain BooleanType or Boolean_Type at all.

Right now, it looks like several things are happening:

1) boolean.pyd (or .so, I guess) is expected by xpath.Conversions to be in xml\utils, but it's in extensions\ in the cvs.

2) It seems that xpath.Conversions now expects objects of type BooleanType, but boolean.pyd/.so thinks it should be called Boolean_Type.

3) It looks like Boolean_Type and BooleanType were not used in the 0.11 version of 4suite.

Perhaps these are all incorrect deductions, someone please enlighten me.

Anyway, I can't use the new versions in cvs until someone makes a 1.5.2 binary version for Windows. Would someone be willing to do that?

Cheers,

Tom P

From b.fathi@gmx.net Wed Jul 4 10:39:29 2001
From: b.fathi@gmx.net (Bijan Fathi)
Date: Wed, 4 Jul 2001 11:39:29 +0200 (MEST)
Subject: [XML-SIG] illigal character encoding bug in minidom + patch
Message-ID: <8659.994239569@www33.gmx.net>

This is a MIME encapsulated multipart message - please use a MIME-compliant e-mail program to open it.

--========GMXBoundary8659994239569
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Category: DOM/Minidom
Group: None
Status: Solved/Patch supplied
Resolution: None
Priority: 4
Submitted By: Bijan Fathi (b.fathi@gmx.net)
#Assigned to: Nobody/Anonymous (nobody)
Summary: illegal characters have not been escaped

Initial Comment:
Characters with codes above 127 were written to the XML file by Text, but minidom could not open the file afterwards because it contained illegal characters. (MS IE did not treat the file as well-formed either.)

The supplied patch escapes all characters above 127 (including Unicode) using ordinary hex character reference notation (&#xnnnn;). Of course this is only the representation in the XML file; after loading the file, the data is correctly represented as Unicode or 8-bit characters.

python 2.0
pyxml 0.6.5

I would be thankful if you would apply this patch to minidom.py

-- 
Bijan Fathi

GMX - The communication platform on the Internet.
http://www.gmx.net

GMX tip: Turn your hobby into money with our partner 1&1!
http://profiseller.de/info/index.php3?ac=OM.PS.PS003K00596T0409a --========GMXBoundary8659994239569 Content-Type: text/plain; name="charenc-bug.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="charenc-bug.patch" MjY4YTI2OSwyNzMKPiAgICAgZGF0YXRtcCA9IGRhdGEKPiAgICAgZGF0YSA9ICIiCj4gICAgIGZv ciBpIGluIGRhdGF0bXA6Cj4gICAgICAgICBpZiBvcmQoaSkgPiAxMjIgOiAgZGF0YSA9IGRhdGEg KyAiJiN4JTA0eDsiICUgb3JkKGkpCQo+ICAgICAgICAgZWxzZSA6IAkJCSAgIGRhdGEgPSBkYXRh ICsgaQo= --========GMXBoundary8659994239569-- From noreply@sourceforge.net Wed Jul 4 13:25:29 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Wed, 04 Jul 2001 05:25:29 -0700 Subject: [XML-SIG] [ pyxml-Bugs-438514 ] syntax error on xml.dom.ext.__init__ Message-ID: Bugs item #438514, was opened at 2001-07-04 05:25 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=438514&group_id=6473 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Alexandre Fayolle (afayolle) Assigned to: Nobody/Anonymous (nobody) Summary: syntax error on xml.dom.ext.__init__ Initial Comment: Using the latest version of PyXML from CVS (just did an update), I got a SyntaxError on xml.dom.ext File "/home/alf/Narval/narval/lib.py", line 32, in ? from xml.dom.ext import Print, PrettyPrint, StripXml File "/home/alf/lib/python/_xmlplus/dom/ext/__init__.py", line 285 elif attr.namespaceURI: Here's a patch which fixes this indentation problem. 
--- __init__.py~	Sat Jun 23 19:11:08 2001
+++ __init__.py	Wed Jul  4 14:25:00 2001
@@ -282,7 +282,7 @@
                         nss[''] = attr.value
                     else:
                         nss[attr.localName] = attr.value
-            elif attr.namespaceURI:
-                nss[attr.prefix] = attr.namespaceURI
+                elif attr.namespaceURI:
+                    nss[attr.prefix] = attr.namespaceURI
             SeekNss(child, nss)
     return nss

Cheers

Alexandre Fayolle
Logilab

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=438514&group_id=6473

From noreply@sourceforge.net Fri Jul 6 05:17:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 05 Jul 2001 21:17:45 -0700
Subject: [XML-SIG] [ pyxml-Bugs-438967 ] indentation error in current cvs
Message-ID:

Bugs item #438967, was opened at 2001-07-05 21:17
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=438967&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Gregory P. Smith (greg)
Assigned to: Nobody/Anonymous (nobody)
Summary: indentation error in current cvs

Initial Comment:
After python setup.py install I get a "bad syntax" on line 285 of xml/dom/ext/__init__.py in SeekNss(). Looks like an indentation error. The elif lines up with the for when it should be indented four more spaces to line up with the if.
----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=438967&group_id=6473

From noreply@sourceforge.net Fri Jul 6 13:37:49 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 06 Jul 2001 05:37:49 -0700
Subject: [XML-SIG] [ pyxml-Bugs-439031 ] startEntity/endEntity event
Message-ID:

Bugs item #439031, was opened at 2001-07-06 05:37
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=439031&group_id=6473

Category: xmlproc
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: startEntity/endEntity event

Initial Comment:
In the following example no startEntity/endEntity event (LexicalHandler) occurred:

mail.xml:
]> &henning; &ingo; Mon, 21 Apr 1997 09:27:55 +0200 XML literature

mail.dtd:

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=439031&group_id=6473

From geert@boskant.nl Sun Jul 8 16:05:50 2001
From: geert@boskant.nl (Geert Jansen)
Date: Sun, 8 Jul 2001 17:05:50 +0200
Subject: [XML-SIG] DocumentFragment bug in minidom
Message-ID:

Hi!

While playing around with minidom and DocumentFragments, I ran across a small bug in the handling of DocumentFragments. (I'm using vanilla Python 2.1)

When you add a DocumentFragment to a node with Node.appendChild(), this is supposed to add all children of the DocumentFragment to the Node. I noticed however that when the DocumentFragment has more than one node, its _last_ node is skipped.
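
A skipped node is what one would expect if the list being looped over shrinks while the loop runs. The sketch below reproduces that effect with plain Python lists only; move_all_buggy and move_all_fixed are hypothetical stand-ins written for this illustration and are not the minidom code.

```python
def move_all_buggy(source, target):
    """Move items while iterating over the same list that is being shrunk."""
    for item in source:
        source.remove(item)   # the list changes under the loop's feet
        target.append(item)


def move_all_fixed(source, target):
    """Iterate over a snapshot so removals cannot disturb the loop."""
    for item in list(source):
        source.remove(item)
        target.append(item)


src, dst = ["a", "b"], []
move_all_buggy(src, dst)
print(src, dst)      # "b" is left behind in src

src2, dst2 = ["a", "b"], []
move_all_fixed(src2, dst2)
print(src2, dst2)    # everything has moved
```

With two items, the buggy loop removes the first one, the list shrinks to length one, and the loop's internal index has already moved past the remaining element.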
Looking through the sources, the problem seems to be caused in minidom.py, lines 140-141:

    def appendChild(self, node):
        if node.nodeType == self.DOCUMENT_FRAGMENT_NODE:
            for c in node.childNodes:
                self.appendChild(c)
            ### The DOM does not clearly specify what to return in this case
            return node

The call "self.appendChild(c)" changes the list node.childNodes under our feet, because it tries to remove the child from its parent. This apparently works out in such a way that the iteration of node.childNodes skips the last element.

With the patch below, appendChild() does work as expected with DocumentFragments.

--- minidom.py.old	Sat Jul  7 15:42:51 2001
+++ minidom.py	Sat Jul  7 15:48:59 2001
@@ -137,7 +137,9 @@
 
     def appendChild(self, node):
         if node.nodeType == self.DOCUMENT_FRAGMENT_NODE:
-            for c in node.childNodes:
+            # Make a copy of childNodes as appendChild() will change it.
+            children = [ c for c in node.childNodes ]
+            for c in children:
                 self.appendChild(c)
             ### The DOM does not clearly specify what to return in this case
             return node

Can this patch be applied? Please CC me in replies, as I'm not subscribed to the list.

Greetings,
Geert Jansen

From rsalz@zolera.com Sat Jul 7 19:18:32 2001
From: rsalz@zolera.com (Rich Salz)
Date: Sat, 07 Jul 2001 14:18:32 -0400
Subject: [XML-SIG] DocumentFragment bug in minidom
References:
Message-ID: <3B475278.DC93921B@zolera.com>

> +            # Make a copy of childNodes as appendChild() will change it.
> +            children = [ c for c in node.childNodes ]
> +            for c in children:

Probably better to write it this way -- more clear, works in 1.5:

	for c in node.childNodes[:]:

	/r$
-- 
Zolera Systems, Securing web services (XML, SOAP, Signatures, Encryption)
http://www.zolera.com

From brian@sweetapp.com Sat Jul 7 20:27:13 2001
From: brian@sweetapp.com (Brian Quinlan)
Date: Sat, 7 Jul 2001 12:27:13 -0700
Subject: [XML-SIG] Pyana (a Python interface to the Xalan XSLT engine) 0.1.0 released
Message-ID: <000a01c1071a$d37922c0$445d4540@D1XYVL01>

Windows binaries (you can get the source from CVS) for Pyana 0.1.0 have been released and are available at:
http://sourceforge.net/project/showfiles.php?group_id=28142

It fixes a lot of bugs and introduces the experimental ability to extend the Xalan XPath engine with Python functions. Here is a simple example:

def sum(*args):
    """Compute the sum of all arguments"""
    s = 0
    for i in args:
        s += i
    return s

Pyana.install( 'exampleNS', sum, 'sum' )

inputExampleXSL = r''' '''

inputExampleXML = r''' ignored '''

print Pyana.transform(inputExampleXML, inputExampleXSL) # => '15'

From Alexandre.Fayolle@logilab.fr Tue Jul 10 08:15:04 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Tue, 10 Jul 2001 09:15:04 +0200 (CEST)
Subject: [XML-SIG] Semantext
Message-ID:

This has just arrived from comp.lang.python.announce:

---------------------------8<---------------------------
The 0.72 release of SemanText has just been posted at http://www.semantext.com/

Among the new features are:

* Context-based harvesting - This allows topics and associations to be
  automatically constructed from XML documents by identifying specific
  information to be harvested.
* Full topic map maintenance capability - Topics, associations,
  occurrences, and facets can be added, modified and deleted via the
  SemanText interface.
* Choice of look-and-feel - A classic web browser style of interface or
  a push-button style of interface.
* Choice of view - Users can select whether to look at the information
  from a topic map point of view (only the information contained in the
  topic map) or a knowledge base point of view (information based on
  interpreting the topic map or generated by the inference rules).
* XTM export - Topic maps can be exported in accordance with the new XTM
  specification.

About SemanText

SemanText is a prototype application developed, using Python, to demonstrate how the topic map standard (ISO/IEC 13250:2000) and XML Topic Maps (XTM) can be used to represent the knowledge contained within documents by building semantic networks. Semantic networks are a building block for artificial intelligence applications such as inference engines and expert systems.
----------------------------8<----------------------------------------------

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).
From faassen@vet.uu.nl Tue Jul 10 14:05:36 2001
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Tue, 10 Jul 2001 15:05:36 +0200
Subject: [XML-SIG] XPath and Zope's ParsedXML DOM
Message-ID: <20010710150536.A26286@vet.uu.nl>

Hi there,

I've been trying to make XPath work with Zope's DOM implementation,
ParsedXML. In the process I've discovered some incompatibilities in
XPath that I had to hack around.

The problem is as follows. ParsedXML uses DOM nodes that are
descendants of ExtensionClass. This means that
type(node) != types.InstanceType.

Conversions.py depends on this in several places, however. After hacking
around them XPath works better for me.

Here are the two places where I had to hack:

The function CoreStringValue has this line:

    result = _strConversions.get(type(object), _strUnknown)(object)

but, since instances now don't trigger the InstanceType key in
_strConversions, this fails and returns None for valid instances. I've
hacked around this by doing the following:

    if hasattr(object, 'ownerDocument'):
        result = _strInstance(object)
    else:
        result = _strConversions.get(type(object), _strUnknown)(object)

I'm not sure if this succeeds in all cases and it's a hack. I'll study
ExtensionClasses to see if there may be a better way.

The other hack is similar and involves the types.ListType entry in the
_strConversions dictionary. Again the lookup that takes place in the
value lambda fails due to ExtensionClass.

Regards,

Martijn

From jeremy.kloth@fourthought.com Tue Jul 10 20:07:41 2001
From: jeremy.kloth@fourthought.com (Jeremy Kloth)
Date: Tue, 10 Jul 2001 13:07:41 -0600
Subject: [XML-SIG] XPath and Zope's ParsedXML DOM
References: <20010710150536.A26286@vet.uu.nl>
Message-ID: <005501c10973$981fe280$703d64c0@den.xcare.net>

From: "Martijn Faassen"
> Hi there,
>
> I've been trying to make XPath work with Zope's DOM implementation,
> ParsedXML. In the process I've discovered some incompatibilities in
> XPath that I had to hack around.
>
> The problem is as follows. ParsedXML uses DOM nodes that are
> descendants of ExtensionClass. This means that
> type(node) != types.InstanceType.
>

Instead of doing the check every time, I implemented a more lazy approach to
it. Additionally, the performance hit happens only the first time through.

    def _strUnknown(object):
        # Allow for non-instance DOM node objects
        if hasattr(object, 'nodeType'):
            # Add this type to the mapping for next time through
            _strConversions[type(object)] = _strInstance
            return _strInstance(object)
        return

and change the types.ListType entry in _strConversions to:

    types.ListType : lambda x: x and _strConversions.get(type(x[0]), _strUnknown)(x[0]) or '',

--
Jeremy Kloth                             Consultant
jeremy.kloth@fourthought.com             +1 303 583 9900 x 105
Fourthought, Inc.                        http://fourthought.com
4735 East Walnut St, Boulder, CO 80301, USA
XML strategy, XML tools (http://4suite.org), knowledge management

From faassen@vet.uu.nl Tue Jul 10 22:59:31 2001
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Tue, 10 Jul 2001 23:59:31 +0200
Subject: [XML-SIG] XPath and Zope's ParsedXML DOM
In-Reply-To: <005501c10973$981fe280$703d64c0@den.xcare.net>
References: <20010710150536.A26286@vet.uu.nl> <005501c10973$981fe280$703d64c0@den.xcare.net>
Message-ID: <20010710235931.A28790@vet.uu.nl>

Jeremy Kloth wrote:
> Instead of doing the check every time, I implemented a more lazy approach to
> it. Additionally, the performance hit happens only the first time through.
[snip source]

Sweet! I'll be playing some more with XPath and ParsedXML next week, when
I'm back from a (Zope) conference.

Thanks,

Martijn

From uche.ogbuji@fourthought.com Tue Jul 10 23:05:08 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 10 Jul 2001 16:05:08 -0600
Subject: [XML-SIG] XPath and Zope's ParsedXML DOM
In-Reply-To: Message from Martijn Faassen of "Tue, 10 Jul 2001 15:05:36 +0200." <20010710150536.A26286@vet.uu.nl>
Message-ID: <200107102205.f6AM58g04686@localhost.local>

> Hi there,
>
> I've been trying to make XPath work with Zope's DOM implementation,
> ParsedXML. In the process I've discovered some incompatibilities in
> XPath that I had to hack around.
>
> The problem is as follows. ParsedXML uses DOM nodes that are
> descendants of ExtensionClass. This means that
> type(node) != types.InstanceType.
>
> Conversions.py depends on this in several places, however. After hacking
> around them XPath works better for me.
>
> Here are the two places where I had to hack:
>
> The function CoreStringValue has this line:
>
>     result = _strConversions.get(type(object), _strUnknown)(object)
>
> but, since instances now don't trigger the InstanceType key in
> _strConversions, this fails and returns None for valid instances. I've
> hacked around this by doing the following:
>
>     if hasattr(object, 'ownerDocument'):
>         result = _strInstance(object)
>     else:
>         result = _strConversions.get(type(object), _strUnknown)(object)
>
> I'm not sure if this succeeds in all cases and it's a hack. I'll study
> ExtensionClasses to see if there may be a better way.
>
> The other hack is similar and involves the types.ListType entry in the
> _strConversions dictionary. Again the lookup that takes place in the
> value lambda fails due to ExtensionClass.

Thanks. Karl Anderson has pointed out all these issues, and they are on the
docket to fix, but we haven't had a chance yet.

Thanks for the fixes you offer, but you are right that they are problematic
in the general case. If you do find features of ExtensionClass that make for
a better fix, please let us know and we'll give them a try.

Thanks.

--
Uche Ogbuji                              Principal Consultant
uche.ogbuji@fourthought.com              +1 303 583 9900 x 101
Fourthought, Inc.                        http://Fourthought.com
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management

From uche.ogbuji@fourthought.com Tue Jul 10 23:18:26 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 10 Jul 2001 16:18:26 -0600
Subject: [XML-SIG] XPath and Zope's ParsedXML DOM
In-Reply-To: Message from Uche Ogbuji of "Tue, 10 Jul 2001 16:05:08 MDT." <200107102205.f6AM58g04686@localhost.local>
Message-ID: <200107102218.f6AMIQJ04717@localhost.local>

> Thanks. Karl Anderson has pointed out all these issues, and they are on the
> docket to fix, but we haven't had a chance yet.

Never mind. Looks as if Jeremy has it sorted out.

--Uche

From noreply@sourceforge.net Wed Jul 11 14:08:04 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 11 Jul 2001 06:08:04 -0700
Subject: [XML-SIG] [ pyxml-Bugs-440396 ] 4Suite and PyXML DOMs differ.
Message-ID:

Bugs item #440396, was opened at 2001-07-11 06:08
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=440396&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Romain Slootmaekers (evilsloot)
Assigned to: Nobody/Anonymous (nobody)
Summary: 4Suite and PyXML DOMs differ.

Initial Comment:
XML Document and Domlette objects are not
interchangeable for the xml.xslt.Processor api.
(versions: 4Suite-0.11.1b2, PyXML-0.6.5 and Python 2.1)

I included a small example program (30 or so lines)
that fully demonstrates the problem.

have fun,

Sloot.
----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=440396&group_id=6473

From noreply@sourceforge.net Thu Jul 12 09:31:59 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 12 Jul 2001 01:31:59 -0700
Subject: [XML-SIG] [ pyxml-Patches-440604 ] ns_parse.py and bookmark.py patch
Message-ID:

Patches item #440604, was opened at 2001-07-12 01:31
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=440604&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: ns_parse.py and bookmark.py patch

Initial Comment:
This tiny patch fixes a lot of problems (missing
descriptions, separators, ...) I had when I tried to
generate an XBEL file from my netscape bookmarks.
It now includes all information available in the
netcrap bookmark file in the result.

I'M NOT A PYTHON HACKER, so please excuse the bad quality.

The patch is against PyXML-0.6.5.

----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=440604&group_id=6473

From nicoml@webmails.com Thu Jul 12 11:03:46 2001
From: nicoml@webmails.com (Nicolas Villetard)
Date: Thu, 12 Jul 2001 11:3:46 +0100
Subject: [XML-SIG] Xml query language for Python
Message-ID: <20010712090346.29784.qmail@webmails.com>

I have to run queries against a fairly big XML database (up to 5 MB) for
an application written in Python 2.1. I also need a reasonably fast
query language (I'd like it to do more than pattern matching).

Does anybody know which of these XML query languages are supported
(XML-QL, YATL, Lorel, XQL, XML-RPC, ...), and in which libraries?

You can also send me your suggestions about this topic.
Thanks

From hungjunglu@yahoo.com Fri Jul 13 00:23:03 2001
From: hungjunglu@yahoo.com (Hung Jung Lu)
Date: Thu, 12 Jul 2001 16:23:03 -0700 (PDT)
Subject: [XML-SIG] SAX with DTD
Message-ID: <20010712232303.61912.qmail@web12607.mail.yahoo.com>

Hi,

I am new to XML in Python. I have a few questions.

(1) I have read that Expat is non-validating. Does it
mean that it ignores DTD completely?

(2) I have a DTD that specifies default attributes
(via #FIXED) of an XML document. Is there some parser
(DOM preferred, SAX ok) in Python that can take into
account the attributes specified in DTD? I tried
xml.dom.minidom and as one would guess, it does not do
anything with DTD. What's a good XML parser in Python
that builds DOM with DTD information?

(3) If the above is not available in Python (DOM with
DTD), is there any simple downloadable example out
there of some SAX parser that uses both XML and DTD?
I've read a bit about xmlproc, xmlval, but is there
any simple example code?

thanks,

Hung Jung

From fdrake@acm.org Fri Jul 13 00:29:02 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 12 Jul 2001 19:29:02 -0400 (EDT)
Subject: [XML-SIG] SAX with DTD
In-Reply-To: <20010712232303.61912.qmail@web12607.mail.yahoo.com>
References: <20010712232303.61912.qmail@web12607.mail.yahoo.com>
Message-ID: <15182.12990.22102.223577@cj42289-a.reston1.va.home.com>

Hung Jung Lu writes:
 > (1) I have read that Expat is non-validating. Does it
 > mean that it ignores DTD completely?

  Yes, pretty much. If you use Expat 1.95+ (see
expat.sourceforge.net), then you can coerce Expat into reading the DTD
and reporting what's in the DTD, but it won't perform validation. You
certainly could use that to pick up the default values of attributes,
however.

 > (2) I have a DTD that specifies default attributes
 > (via #FIXED) of an XML document. Is there some parser
 > (DOM preferred, SAX ok) in Python that can take into
 > account the attributes specified in DTD? I tried

  I suspect xmlproc can be used to build a DOM like this. I've used
the latest versions of Expat to do this as well, but that DOM requires
the acquisition machinery in Zope to work. It shouldn't be too hard
to adapt that code to build a minidom DOM, but I've not had time to do
so. You are free to work on that if you like.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From Juergen Hermann"
Message-ID:

On Thu, 12 Jul 2001 16:23:03 -0700 (PDT), Hung Jung Lu wrote:

>(2) I have a DTD that specifies default attributes
>(via #FIXED) of an XML document. Is there some parser
>(DOM preferred, SAX ok) in Python that can take into
>account the attributes specified in DTD?
This is a full SAX example:

    import os
    import sys   # needed for sys.stderr in warning()
    import xml.sax
    import xml.sax.saxutils
    import xml.sax.handler
    import xml.sax.sax2exts

    class CopsConfigHandler(xml.sax.saxutils.DefaultHandler):

        xmlns_copscfg = u'http://www.cinetic.de/2000/COPS/Config'
        _debug = 0

        def __init__(self, configfile):
            self.configfile = 'file://' + os.path.abspath(configfile)
            self.params = {}
            self.in_parameters = 0

            # create parser
            parser = xml.sax.sax2exts.XMLValParserFactory.make_parser()
            ##print '+++ parser is', parser
            parser.setFeature(xml.sax.handler.feature_namespaces, 1)
            parser.setFeature(xml.sax.handler.feature_validation, 0)
            parser.setFeature(xml.sax.handler.feature_external_ges, 1)
            parser.setFeature(xml.sax.handler.feature_external_pes, 1)

            # set handlers
            parser.setContentHandler(self)
            parser.setDTDHandler(self)
            if not self._debug:
                # no tracebacks, print error msg only!
                parser.setErrorHandler(self)
            parser.setEntityResolver(self)

            # parse the XML into events
            parser.parse(self.configfile)

        ### error handler events

        def error(self, exception):
            raise exception

        def fatalError(self, exception):
            raise exception

        def warning(self, exception):
            sys.stderr.write("*** warning %s\n" % (str(exception),))

        ### document handler events

        def startElementNS(self, name, qname, attrs):
            if name[0] == self.xmlns_copscfg:
                ##print name, qname, attrs.items()
                if name[1] == "parameters":
                    self.in_parameters = 1
                elif self.in_parameters and name[1] == "param":
                    ##print '+++ attrs', attrs._attrs
                    ##print '+++ qnames', attrs._qnames
                    name = attrs.getValueByQName('name')
                    value = attrs.getValueByQName('value')
                    self.params[name] = value

        def endElementNS(self, name, qname):
            if name[0] == self.xmlns_copscfg:
                if name[1] == "parameters":
                    self.in_parameters = 0

    if __name__ == "__main__":
        copsconfig = CopsConfigHandler(os.path.join('conf', 'cops-config.xml'))

        keys = copsconfig.params.keys()
        keys.sort()
        klen = reduce(max, map(len, keys), 0)
        for key in keys:
            print key.ljust(klen), repr(copsconfig.params[key])

From larsga@garshol.priv.no Fri Jul 13 10:30:42 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 13 Jul 2001 11:30:42 +0200
Subject: [XML-SIG] SAX with DTD
In-Reply-To: <20010712232303.61912.qmail@web12607.mail.yahoo.com>
References: <20010712232303.61912.qmail@web12607.mail.yahoo.com>
Message-ID:

* Hung Jung Lu
|
| (2) I have a DTD that specifies default attributes (via #FIXED) of
| an XML document. Is there some parser (DOM preferred, SAX ok) in
| Python that can take into account the attributes specified in DTD?

xmlproc does this, and there is a SAX driver for it, so that you can
access it as a SAX parser. The DOM implementations use SAX to build
their DOM trees, so you can use xmlproc to build your DOMs.

[larsga@pc36 project]$ python2.1
Python 2.1 (#1, May 5 2001, 06:49:59)
[GCC 2.95.1 19990816/Linux (release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from xml.dom.ext.reader.Sax2 import Reader
>>> r = Reader(1)
>>> doc = r.fromStream(open("engine-plan.xml"))
>>> doc

The 1 argument to Reader tells it to use a validating parser, so it
will do much the same as Jürgen's example, except with the DOM rather
than SAX. It uses xmlproc at the moment, because that's the only
validating parser we have.

--Lars M.

From hungjunglu@yahoo.com Fri Jul 13 15:12:37 2001
From: hungjunglu@yahoo.com (Hung Jung Lu)
Date: Fri, 13 Jul 2001 07:12:37 -0700 (PDT)
Subject: [XML-SIG] SAX with DTD
In-Reply-To:
Message-ID: <20010713141237.50998.qmail@web12605.mail.yahoo.com>

Cool. It works! Yours is probably the shortest way of attaching
attributes from DTD to XML. The result can be seen with

    from xml.dom.ext import PrettyPrint
    PrettyPrint(doc)

I did try xmlproc directly, too. More coding for the handlers, but I
guess it's good if one wants to convert XML directly into Python objects
instead of going through DOM. Thanks everyone!

Hung Jung

--- Lars Marius Garshol wrote:
>
> * Hung Jung Lu
> |
> | (2) I have a DTD that specifies default attributes (via #FIXED) of
> | an XML document.
> | Is there some parser (DOM preferred, SAX ok) in
> | Python that can take into account the attributes specified in DTD?
>
> xmlproc does this, and there is a SAX driver for it, so that you can
> access it as a SAX parser. The DOM implementations use SAX to build
> their DOM trees, so you can use xmlproc to build your DOMs.
>
> [larsga@pc36 project]$ python2.1
> Python 2.1 (#1, May 5 2001, 06:49:59)
> [GCC 2.95.1 19990816/Linux (release)] on linux2
> Type "copyright", "credits" or "license" for more information.
> >>> from xml.dom.ext.reader.Sax2 import Reader
> >>> r = Reader(1)
> >>> doc = r.fromStream(open("engine-plan.xml"))
> >>> doc
>
> The 1 argument to Reader tells it to use a validating parser, so it
> will do much the same as Jürgen's example, except with the DOM rather
> than SAX. It uses xmlproc at the moment, because that's the only
> validating parser we have.
>
> --Lars M.

From dkuhlman@cutter.rexx.com Fri Jul 13 21:29:15 2001
From: dkuhlman@cutter.rexx.com (Dave Kuhlman)
Date: Fri, 13 Jul 2001 13:29:15 -0700
Subject: [XML-SIG] Python wrappers for libxml and libxslt
Message-ID: <20010713132914.A20340@cutter.rexx.com>

I've implemented wrappers for the parser in libxml2 and simple
wrappers for the top level functionality in libxslt.

You can learn more about libxml and libxslt at:

    http://xmlsoft.org

And you can find my Python wrappers at:

    *** Caution -- This is alpha-ware. Use at your own risk. ***

    SAX interface:
        http://www.rexx.com/~dkuhlman/libxml_saxlib.html
        http://www.rexx.com/~dkuhlman/libxml_saxlib-1.0a.tar.gz

    DOM interface:
        http://www.rexx.com/~dkuhlman/libxml_domlib.html
        http://www.rexx.com/~dkuhlman/libxml_domlib-1.0a.tar.gz

    XSL-T:
        http://www.rexx.com/~dkuhlman/libxsltmod.html
        http://www.rexx.com/~dkuhlman/libxsltmod-1.0a.tar.gz

Thanks so much to all those whose work made this possible. Thanks
to the people who did libxml and libxslt (these modules are 99.9%
their work and 0.1% mine.) Thanks for Distutils, which made it so
easy to package these modules. And, thanks to the core Python
team for a great and extensible language.

My wrappers are at a pretty low level (i.e. close to the libxml C
code). That made it a bit easier for me. But it might also help
with speed and memory use considerations for some uses.

But, it also turns out to be very easy for a Python user of the
wrappers. With libxml_saxlib, just create an instance of a class
that has methods like startDocument, endDocument, startElement,
endElement, characters, etc, then call parse_file(instance,
fileName) or parse_string(instance, string). With libxml_domlib,
call parse_file or parse_string to parse the document, then call
getRootElement, getFirstChild, getNextSibling, etc to walk the
tree. With libxslt, just call a function or two.

An additional educational part of this work -- In providing access
to the DOM tree, I needed to implement several Python extension
datatypes (as part of the Python extension module libxml_domlib).
I had never done that before, believing that the Python C
structures involved were too difficult for me to deal with. With
some help, it turned out to be not as difficult as I thought.
Here are two suggestions if you need to implement a Python
extension type yourself:

- Start by copying Objects/xxobject.c in the Python source code
  distribution. The structure and organization in this file will
  put you far ahead of where you would be if you start from
  scratch and it will save many errors, too.

- Or, use my extension datatype generator. You can find it at:

      http://www.rexx.com/~dkuhlman/dtGenerator.py

  For restricted purposes, this will save a lot of copy, paste,
  and rename work.

You may be asking, Why did you implement XML capabilities for
Python, when we already have PyXML/4Suite? PyXML is super. And
there is no way that these wrappers for libxml/libxslt can be
considered anywhere near as good as PyXML. (It's presumptuous for
me to suggest that they are comparable.) However, let me give a
couple of reasons for doing and offering this:

- Because it's there. libxml2 and libxslt are available.

- Because implementing the Python extension modules and extension
  datatypes was good training for me.

- Because I believe that having a bit more breadth of coverage of
  something as important as XML is good for Python, even if it is
  not used very much.

- Because it's easy to use. Using libxsltmod from Python is
  (almost) as easy as one function call. It won't give enough
  control for some situations. But where that control is not
  needed, calling from Python is very easy.

- Because there may be special situations where this
  implementation is useful. For example, installing it on a new
  machine may be as easy as copying a few shared libraries. For
  some purposes, that may be a benefit.

- Because I'm grateful for all that the Python community has given
  me and I'd like to try to give a little back.

If you have suggestions or find problems please let me know.

- Dave

--
Dave Kuhlman
dkuhlman@rexx.com

From Mike.Olson@fourthought.com Mon Jul 16 00:57:44 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 15 Jul 2001 17:57:44 -0600
Subject: [XML-SIG] [ANN] 4Suite and 4SuiteServer 0.11.1 beta 3 release
Message-ID: <3B522DF8.E083F256@fourthought.com>

All,

This should be our last beta release for the 0.11.1 final release.
Thanks to everyone for the help in finding all of our bugs. I think we
have fixed most of them; the rest we will be fixing this week, and we
hope to have the final release out at the end of the week.

As always, if you can download the beta and give it a try, it would be
greatly appreciated.

You can get the beta releases from the ftp site at

    ftp://ftp.fourthought.com/pub/4Suite

or from the web site at:

    http://4suite.org/download.html

Thanks

Mike

--
Mike Olson                               Principal Consultant
mike.olson@fourthought.com               +1 303 583 9900 x 102
Fourthought, Inc.                        http://Fourthought.com
4735 East Walnut St,                     http://4Suite.org
Boulder, CO 80301-2537, USA
XML strategy, XML tools, knowledge management

From Alexandre.Fayolle@logilab.fr Mon Jul 16 08:19:30 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Mon, 16 Jul 2001 09:19:30 +0200 (CEST)
Subject: [XML-SIG] 4Suite 0.11.1 and PyXML 0.6.5
In-Reply-To: <3B522DF8.E083F256@fourthought.com>
Message-ID:

On Sun, 15 Jul 2001, Mike Olson wrote:

> This should be our last beta release for the 0.11.1 final release.
> Thanks to everyone for the help in finding all of our bugs. I think we
> have fixed most of them, the rest we will be fixing this week and hope
> to have the final release out at the end of the week.

Great news. I've been quite busy these last weeks, and have not managed
to follow the various mailing lists as closely as I would have wanted.

Is there a release of PyXML 0.6.6 planned that would mainly feature the
changes in xml.dom.ext that make PyXML compatible with 4Suite-0.11.1's
pDomlette?

Yet another (dumb) question: is the new version of XPath (thread and
unicode friendly) part of 4Suite 0.11.1 or PyXML 0.7?

Thanks

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).
From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 13:57:16 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 16 Jul 2001 14:57:16 +0200 Subject: [XML-SIG] SAX with DTD In-Reply-To: <20010712232303.61912.qmail@web12607.mail.yahoo.com> (message from Hung Jung Lu on Thu, 12 Jul 2001 16:23:03 -0700 (PDT)) References: <20010712232303.61912.qmail@web12607.mail.yahoo.com> Message-ID: <200107161257.f6GCvGv02814@mira.informatik.hu-berlin.de> > (2) I have a DTD that specifies default attributes > (via #FIXED) of an XML document. Is there some parser > (DOM preferred, SAX ok) in Python that can take into > account the attributes specified in DTD?
I recommend using the functions and classes in xml.dom.ext.reader.Sax2, with validation turned on. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Mon Jul 16 14:14:45 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Mon, 16 Jul 2001 15:14:45 +0200 Subject: [XML-SIG] 4Suite 0.11.1 and PyXML 0.6.5 In-Reply-To: (message from Alexandre Fayolle on Mon, 16 Jul 2001 09:19:30 +0200 (CEST)) References: Message-ID: <200107161314.f6GDEj902846@mira.informatik.hu-berlin.de> > I've been quite busy these last weeks, and have not managed to follow the > various mailing lists as closely as I would have wanted. Is there a > release of PyXML 0.6.6 planned that would mainly feature the changes in > xml.dom.ext that make PyXML compatible with 4Suite-0.11.1's pDomlette ? The 0.6.6 branch is open for people to commit into it; I trust that anybody committing changes will follow a "bug fixes only" strategy there. Once there is actually stuff to release, I'd happily produce a release. > Yet another (dumb) question : is the new version of XPath (thread > and unicode friendly) part of 4Suite 0.11.1 or PyXML 0.7 ? The PyXPath in the current PyXML CVS is already thread- and unicode-aware; it is not based on the recent "unicode friendly" code from 4Suite (which I understand uses UTF-8 strings). I currently have no plans to integrate the 4Suite XPath parsers into PyXML, mainly because of the build complexity. Of course, if anybody would take up the challenge and put the extension modules into extensions/, that would be fine as well. Regards, Martin From emdpek@chron.com Mon Jul 16 23:27:22 2001 From: emdpek@chron.com (Philip King) Date: Mon, 16 Jul 2001 17:27:22 -0500 Subject: [XML-SIG] Parsing DTD Message-ID: <3B536A4A.9DE3859F@chron.com> I am looking for (or desiring to build) a DTD Browser tool. I am imagining a simple window, initially showing the "root" entity. Each entity can be optionally (clickably) expanded, which would reveal its child entity nodes.
In a separate window, perhaps a listing of a node's attributes. (For users of Internet Explorer, a similar utility can be found in the NITF DTD docs: http://www.nitf.org/nitf-documentation/nitf.html) Here is my dilemma: I cannot figure out the hoops one must jump through in order to have a parser (Expat, xmlproc, xmllib, etc.) read and parse a DTD. They all seem to choke with a syntax error... Any ideas? Philip From uogbuji@fourthought.com Tue Jul 17 05:36:41 2001 From: uogbuji@fourthought.com (Uche Ogbuji) Date: Mon, 16 Jul 2001 22:36:41 -0600 (MDT) Subject: [4suite] Re: [XML-SIG] 4Suite 0.11.1 and PyXML 0.6.5 In-Reply-To: <200107161314.f6GDEj902846@mira.informatik.hu-berlin.de> Message-ID: On Mon, 16 Jul 2001, Martin v. Loewis wrote: > > I've been quite busy these last weeks, and have not managed to follow the > > various mailing lists as closely as I would have wanted. Is there a > > release of PyXML 0.6.6 planned that would mainly feature the changes in > > xml.dom.ext that make PyXML compatible with 4Suite-0.11.1's pDomlette ? > > The 0.6.6 branch is open for people to commit into it; I trust that > anybody committing changes will follow a "bug fixes only" strategy > there. I missed this. I'll be sure to sync all my changes from the tip to this branch. I would indeed like to see a PyXML 0.6.6 bug-fix release to go with the 4Suite 0.11.1 release. > Once there is actually stuff to release, I'd happily produce a release. > > > Yet another (dumb) question : is the new version of XPath (thread > > and unicode friendly) part of 4Suite 0.11.1 or PyXML 0.7 ? > > The PyXPath in the current PyXML CVS is already thread- and > unicode-aware; it is not based on the recent "unicode friendly" code > from 4Suite (which I understand uses UTF-8 strings). I had been hoping Jeremy would open up a discussion about the differences between your approach and the one he chose.
I think this is an important discussion to have, since you've both done valuable work on the issue and people will be wanting to know what XPath implementation to use, and why. I confess that I don't really know the answers to this myself. -- Uche Ogbuji Principal Consultant uche.ogbuji@fourthought.com +1 303 583 9900 x 101 Fourthought, Inc. http://Fourthought.com 4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA XML strategy, XML tools (http://4Suite.org), knowledge management From Alexandre.Fayolle@logilab.fr Tue Jul 17 07:40:53 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Tue, 17 Jul 2001 08:40:53 +0200 (CEST) Subject: [XML-SIG] Parsing DTD In-Reply-To: <3B536A4A.9DE3859F@chron.com> Message-ID: On Mon, 16 Jul 2001, Philip King wrote: > Here is my dilemna: I cannot figure the hoops one must just through in > order to have a parser (either Expat, xmlproc, xmllib, etc...) to read > and parse a DTD. They all seem to choke with a syntax error... You want to use xmlproc's DTD parser. For an example, you may want to check xmltools (http://www.logilab.org/xmltools/), since xmleditor uses the DTD parser to get the valid elements. And of course, you should give a look at the full blown documentation which is available on Lars Marius Garshol's xmlproc page (http://www.garshol.priv.no/download/software/xmlproc/) Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). From dirksen_lau@yahoo.com Tue Jul 17 08:16:04 2001 From: dirksen_lau@yahoo.com (Dirksen) Date: Tue, 17 Jul 2001 00:16:04 -0700 (PDT) Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? Message-ID: <20010717071604.11011.qmail@web5105.mail.yahoo.com> Hi, I need to parse a bunch of HTML documents, yet the parser is too strict for this task. It stops at places where considered correct by HTML rules, like unquoted attributes. 
Can I make the parser more relaxed toward HTML documents? Cheers Dirksen From larsga@garshol.priv.no Tue Jul 17 08:45:23 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 17 Jul 2001 09:45:23 +0200 Subject: [XML-SIG] Parsing DTD In-Reply-To: <3B536A4A.9DE3859F@chron.com> References: <3B536A4A.9DE3859F@chron.com> Message-ID: * Philip King | | Here is my dilemma: I cannot figure out the hoops one must jump through in | order to have a parser (Expat, xmlproc, xmllib, etc.) read | and parse a DTD. They all seem to choke with a syntax error... As Alexandre says, you need to use a special DTD parser. The XML parsers will assume they are given an XML document and freak out when they find that they are chewing on something completely different. The DTD parser of xmlproc is the only way I know of doing this in Python. It should work just fine, though. --Lars M. From python-te@mcwords.com Tue Jul 17 08:54:42 2001 From: python-te@mcwords.com (Martin C Brown) Date: Tue, 17 Jul 2001 08:54:42 +0100 Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? In-Reply-To: <20010717071604.11011.qmail@web5105.mail.yahoo.com> Message-ID: > I need to parse a bunch of HTML documents, yet the parser is too > strict for this task. It stops at places that are considered correct by > HTML rules, like unquoted attributes. Can I make the parser more > relaxed toward HTML documents? You might have more luck using the HTML parser, rather than SAX, which is designed for parsing XML. The HTML parser is in htmllib and works in much the same way, and it handles unquoted attributes without any problems.
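For readers trying this today, here is a minimal sketch of the same idea: a tolerant HTML parser instead of an XML one. It assumes a current Python 3 interpreter, where htmllib is no longer available and html.parser.HTMLParser is the standard-library equivalent. The class name AttrCollector, the helper collect_start_tags, and the sample markup are made up for illustration; they are not from the thread.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.seen.append((tag, dict(attrs)))


def collect_start_tags(html_text):
    """Feed html_text to a tolerant parser and return the start tags found."""
    parser = AttrCollector()
    parser.feed(html_text)
    parser.close()
    return parser.seen


# Hypothetical input: unquoted attribute values and unclosed elements,
# both of which a strict XML parser would reject.
sample = '<table width=75% border=0><tr><td align=CENTER>hi<br></table>'
tags = collect_start_tags(sample)
for tag, attrs in tags:
    print(tag, attrs)
```

Note that the parser only reports what it sees; it does not supply the closing tags that the HTML DTD would imply.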
MC -- Martin 'MC' Brown, mc@mcwords.com http://www.mcwords.com Writer, Author, Consultant 'Life is pain, anyone who says differently is selling something' From Alexandre.Fayolle@logilab.fr Tue Jul 17 10:02:03 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Tue, 17 Jul 2001 11:02:03 +0200 (CEST) Subject: [XML-SIG] xml.dom.ext.reader.HtmlLib Message-ID: Hello, I was hunting for a bug in Narval, and ended up in xml.dom.ext.reader.HtmlLib. I would like some feedback on this to know if this is indeed a bug, a documentation issue, or just me daydreaming that all APIs should do what I'd like them to, instead of what the coder meant. When I use xml.dom.ext.reader.Sax2, if I pass an ownerDocument to the reader when reading the data, I'll get back a DocumentFragment belonging to the same document. With HtmlLib's reader, this is not the case: the owner document I'm passing is getting emptied. Cf. lines 42-46:
    if doc:
        while doc.firstChild:
            # Empty out the document
            node = doc.removeChild(doc.firstChild)
            ReleaseNode(node)
The first (minor) thing is that this supposes I'm using a 4DOM document, since it uses ReleaseNode. The second (important) thing is that I'm much annoyed that the document should be emptied, since in the case at hand it already had some contents, and I was merely passing it in order to be sure that the right DOM implementation would be used, and to avoid an expensive call to importNode. As a side note, Sgmlop.HtmlParser uses non-NS methods to build its DOM. Is this what is intended? I'll be glad to work on some patches, hopefully in time for PyXML 0.6.6, once the correct behaviour has been agreed on. Cheers, Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL).
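To make the expected behaviour concrete, here is a small sketch of the importNode route mentioned above: the new content is parsed into its own document and then imported into an existing one, whose contents are left alone. This is only an illustration under stated assumptions. It uses the standard library's xml.dom.minidom rather than 4DOM or the HtmlLib reader, and the helper name append_parsed and the sample markup are invented for the example.

```python
from xml.dom.minidom import parseString


def append_parsed(owner_doc, xml_text):
    """Parse xml_text on its own, then import its root element into owner_doc.

    The existing children of owner_doc are left untouched.
    """
    parsed = parseString(xml_text)
    # importNode makes a deep copy that belongs to owner_doc
    imported = owner_doc.importNode(parsed.documentElement, True)
    owner_doc.documentElement.appendChild(imported)
    parsed.unlink()  # the temporary document is no longer needed
    return imported


# Hypothetical documents, for illustration only
owner = parseString("<root><existing/></root>")
node = append_parsed(owner, "<p>hello</p>")
print(owner.documentElement.toxml())
```

The deep copy made by importNode is the cost that passing an owner document to the reader was meant to avoid; the sketch only shows the result a caller would expect either way.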
From larsga@garshol.priv.no Tue Jul 17 11:22:02 2001 From: larsga@garshol.priv.no (Lars Marius Garshol) Date: 17 Jul 2001 12:22:02 +0200 Subject: [XML-SIG] xml.dom.ext.reader.HtmlLib In-Reply-To: References: Message-ID: * Alexandre Fayolle | | With HtmlLib's reader, this is not the case : the owner document I'm | passing is getting emptied. Cf. line 42-46: | if doc: | while doc.firstChild: | # Empty out the document | node = doc.removeChild(doc.firstChild) | ReleaseNode(node) | | First (minor) thing is, this supposes I'm using a 4DOM document, since it | uses ReleaseNode, second (important) thing is, I'm much annoyed that the | document should be emptied, since in the case at hand, it already had some | contents, and I was merely passing it in order to be sure that the right | DOM implementation would be used, and to avoid an expensive call to | importNode. Part of the problem here is that we have a separate Reader for HTML documents. IMHO it would be much preferrable to have a SAX driver for the HTML parser instead. That could then use the SAX Reader, and behaviour would be consistent. In addition, we would get increased flexibility by having a SAX driver for this parser. | As a side note, Sgmlop.HtmlParser uses non NS methods to build it's | DOM. Is this what is intended ? Should be, shouldn't it? HTML doesn't have namespaces, only XHTML does. --Lars M. From Alexandre.Fayolle@logilab.fr Tue Jul 17 11:53:41 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Tue, 17 Jul 2001 12:53:41 +0200 (CEST) Subject: [4suite] Re: [XML-SIG] xml.dom.ext.reader.HtmlLib In-Reply-To: Message-ID: On 17 Jul 2001, Lars Marius Garshol wrote: > In addition, we would get increased flexibility by having a SAX driver > for this parser. agreed. > > | As a side note, Sgmlop.HtmlParser uses non NS methods to build it's > | DOM. Is this what is intended ? > > Should be, shouldn't it? HTML doesn't have namespaces, only XHTML does. Well... yes, and no. 
This is the old setAttributeNS(EMPTY_NS, name,value) vs setAttribute(name,value) question. The problem happens when you try to get the value back and you don't know what API was used to set it. However, using a Sax driver for this parser should help, since then the DOM builder would be able to call whatever method is deemed necessary. Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). From fdrake@acm.org Tue Jul 17 12:50:16 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 17 Jul 2001 07:50:16 -0400 (EDT) Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? In-Reply-To: <20010717071604.11011.qmail@web5105.mail.yahoo.com> References: <20010717071604.11011.qmail@web5105.mail.yahoo.com> Message-ID: <15188.9848.137663.499928@cj42289-a.reston1.va.home.com> Dirksen writes: > I need to parse a bunch of HTML documents, yet the parser is too > strict for this task. It stops at places where considered correct by > HTML rules, like unquoted attributes. Can I make the parser more > relaxed toward HTML documents? Martin C Brown writes: > The HTML parser is in htmllib and works in much the same way, and it handles > unquoted attributes without any problems. Another possibility would be to use the HTMLParser module, which is new in Python 2.2. It was originally developed for another project and is stable and well-tested. Feel free to extract the module from the Python CVS repository. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Digital Creations From noreply@sourceforge.net Tue Jul 17 14:44:44 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 17 Jul 2001 06:44:44 -0700 Subject: [XML-SIG] [ pyxml-Patches-442005 ] pDomletteReader.SaxReader patch Message-ID: Patches item #442005, was opened at 2001-07-17 06:44 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=442005&group_id=6473 Category: 4Suite Group: None Status: Open Resolution: None Priority: 5 Submitted By: Alexandre Fayolle (afayolle) Assigned to: Nobody/Anonymous (nobody) Summary: pDomletteReader.SaxReader patch Initial Comment: The attached patch fixes several bugs when using pDomletteReader.SaxReader. It was generated against 4Suite-0.11.1b3. Cheers Alexandre Fayolle ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=306473&aid=442005&group_id=6473 From Alexandre.Fayolle@logilab.fr Tue Jul 17 16:40:30 2001 From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle) Date: Tue, 17 Jul 2001 17:40:30 +0200 (CEST) Subject: [XML-SIG] ANN : xmltools 1.3 Message-ID: I've just made python xmltools 1.3 available from http://www.logilab.org/xmltools/ Python XmlTools is a set of high level tools to help using XML in Python. It features two pyGTK widgets, XmlTree and XmlEditor, which can respectively display and edit an XML document in a graphical fashion. This release should fix some compatibility problems with python 2.x that were observed in xmltools-1.2. Cheers. Alexandre Fayolle -- LOGILAB, Paris (France). http://www.logilab.com http://www.logilab.fr http://www.logilab.org Narval, the first software agent available as free software (GPL). 
From noreply@sourceforge.net Tue Jul 17 18:24:19 2001 From: noreply@sourceforge.net (noreply@sourceforge.net) Date: Tue, 17 Jul 2001 10:24:19 -0700 Subject: [XML-SIG] [ pyxml-Bugs-442087 ] parsing an XML string Message-ID: Bugs item #442087, was opened at 2001-07-17 10:24 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=442087&group_id=6473 Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: parsing an XML string Initial Comment: I'm using PyXML 0.6.5 with Python 2.0; the following code:
--- code ---
from xml.dom.ext.reader import Sax2
parser = Sax2.Reader(validate=1)
xml_dom_object = parser.fromString(VALID_XML_STRING)
--- code ---
returns:
--- traceback ---
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib/python2.0/site-packages/_xmlplus/dom/ext/reader/__init__.py", line 63, in fromString
    return self.fromStream(stream, ownerDoc)
  File "/usr/lib/python2.0/site-packages/_xmlplus/dom/ext/reader/Sax2.py", line 309, in fromStream
    self.parser.parse(s)
  File "/usr/lib/python2.0/site-packages/_xmlplus/sax/drivers2/drv_xmlproc.py", line 90, in parse
    parser.read_from(source.getByteStream(), bufsize)
  File "/usr/lib/python2.0/site-packages/_xmlplus/parsers/xmlproc/xmlval.py", line 105, in read_from
    self.parser.read_from(file,bufsize)
  File "/usr/lib/python2.0/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py", line 137, in read_from
    self.feed(buf)
  File "/usr/lib/python2.0/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py", line 185, in feed
    self.do_parse()
  File "/usr/lib/python2.0/site-packages/_xmlplus/parsers/xmlproc/xmlproc.py", line 104, in do_parse
    self.parse_doctype()
  File "/usr/lib/python2.0/site-packages/_xmlplus/parsers/xmlproc/xmlproc.py", line 494, in parse_doctype
    sys_id))
  File "/usr/lib/python2.0/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py", line 667, in join_sysids_general
    if urlparse.urlparse(base)[0]=="":
  File "/usr/lib/python2.0/urlparse.py", line 59, in urlparse
    i = find(url, ':')
  File "/usr/lib/python2.0/string.py", line 172, in find
    return s.find(*args)
AttributeError: 'None' object has no attribute 'find'
--- traceback ---
Using a non validating parser (validate=0) the code works; it also works using the fromUri() method of the parser object. Obviously the VALID_XML_STRING is a valid XML string. Thank you. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=106473&aid=442087&group_id=6473 From dieter@handshake.de Tue Jul 17 22:34:19 2001 From: dieter@handshake.de (Dieter Maurer) Date: Tue, 17 Jul 2001 23:34:19 +0200 (CEST) Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? In-Reply-To: <20010717071604.11011.qmail@web5105.mail.yahoo.com> References: <20010717071604.11011.qmail@web5105.mail.yahoo.com> Message-ID: <15188.44891.27142.683220@lindm.dm> Dirksen writes: > I need to parse a bunch of HTML documents, yet the parser is too > strict for this task. It stops at places where considered correct by > HTML rules, like unquoted attributes. Can I make the parser more > relaxed toward HTML documents? Maybe, you can use "tidy" (--> www.w3.org) beforehand to clean up your HTML. Dieter From martin@loewis.home.cs.tu-berlin.de Wed Jul 18 00:02:49 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 01:02:49 +0200 Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? In-Reply-To: (message from Martin C Brown on Tue, 17 Jul 2001 08:54:42 +0100) References: Message-ID: <200107172302.f6HN2nG01729@mira.informatik.hu-berlin.de> > > I need to parse a bunch of HTML documents, yet the parser is too > > strict for this task. It stops at places where considered correct by > > HTML rules, like unquoted attributes. Can I make the parser more > > relaxed toward HTML documents?
> > You might have more luck using the HTML parser, rather than SAX, which is > deigned for parsing XML. > > The HTML parser is in htmllib and works in much the same way, and it handles > unquoted attributes without any problems. Alternatively, you can use xml.parsers.sgmlop in the SGML mode. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Jul 18 00:13:45 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 01:13:45 +0200 Subject: [XML-SIG] xml.dom.ext.reader.HtmlLib In-Reply-To: (message from Lars Marius Garshol on 17 Jul 2001 12:22:02 +0200) References: Message-ID: <200107172313.f6HNDj701738@mira.informatik.hu-berlin.de> > Part of the problem here is that we have a separate Reader for HTML > documents. IMHO it would be much preferrable to have a SAX driver for > the HTML parser instead. That could then use the SAX Reader, and > behaviour would be consistent. > > In addition, we would get increased flexibility by having a SAX driver > for this parser. Sounds like an interesting project for a volunteer. I'd personally recommend to build this SAX driver on top of sgmlop; the true challenge is to get the events right that only result from the SGML DTD for HTML (e.g. missing closing tags, etc). Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Jul 18 00:17:14 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 01:17:14 +0200 Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? In-Reply-To: <15188.9848.137663.499928@cj42289-a.reston1.va.home.com> (fdrake@acm.org) References: <20010717071604.11011.qmail@web5105.mail.yahoo.com> <15188.9848.137663.499928@cj42289-a.reston1.va.home.com> Message-ID: <200107172317.f6HNHEp01770@mira.informatik.hu-berlin.de> > Another possibility would be to use the HTMLParser module, which is > new in Python 2.2. It was originally developed for another project > and is stable and well-tested. 
Feel free to extract the module from > the Python CVS repository. Of course, a "true" HTML parser should get the DTD right, i.e. generate closing elements where they are missing, expand entities (to unicode strings), etc. Regards, Martin From martin@loewis.home.cs.tu-berlin.de Wed Jul 18 00:00:11 2001 From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis) Date: Wed, 18 Jul 2001 01:00:11 +0200 Subject: [4suite] Re: [XML-SIG] 4Suite 0.11.1 and PyXML 0.6.5 In-Reply-To: (message from Uche Ogbuji on Mon, 16 Jul 2001 22:36:41 -0600 (MDT)) References: Message-ID: <200107172300.f6HN0BV01728@mira.informatik.hu-berlin.de> > I missed this. I'll be sure to sync all my changes from the tip to this > branch. I would indeed like to see a PyXML 0.6.6 bug-fix release to go > with the 4Suite 0.11.1 release. Ok, then I propose the following procedure: - Copy everything you want to see in 0.6.6 in the branch (it is the "o6maint" branch) - Once you are done, I'll investigate the remaining changes as to whether they contain missing pieces; I'll then try to contact the authors of these changes to see whether they should be merged (sometimes it may be clear from the check-in messages). - I'll then give advance warning of a couple of days that 0.6.6 is upcoming. Regards, Martin From douglas@paradise.net.nz Wed Jul 18 05:52:59 2001 From: douglas@paradise.net.nz (Douglas Bagnall) Date: Wed, 18 Jul 2001 16:52:59 +1200 Subject: [XML-SIG] How to get SAX to parse not well formed HTML doc? 
In-Reply-To: <200107172317.f6HNHEp01770@mira.informatik.hu-berlin.de> References: <15188.9848.137663.499928@cj42289-a.reston1.va.home.com> (fdrake@acm.org) Message-ID: <3B55BEEB.4349.1E410E2@localhost>
--Message-Boundary-5927
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Content-description: Mail message body

Hi there, I've used the attached script to turn html into xml for minidom, and it seems to work fairly well so long as the html doesn't contain text cut and pasted from Microsoft Word. fix() prints out xmlish version of the file. fixstring() does the same to a string. obviously, you'd change this somewhere around line 110. The output is tested against minidom, so if you get no traceback, it will be xml safe. Which is not to say it'll look good. Another thing I've done is put tohtml() and writehtml() methods in my version of minidom. They're the same as toxml & writexml, except they test empty elements against a tuple: br, img, link and so forth are rendered in the short form, as in <br /> (note the space), while other empty tags are written the long way, with a separate closing tag. It's really simple. Would this be of any use to anyone else, or would it just clutter up minidom.py? Douglas

--Message-Boundary-5927
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Content-description: Text from file 'rehtml.py'

#!/usr/bin/python
"""
Excerpt from experimental WCN auto page generating version of Kea editor.
copyright katipo communications ltd 2001
by douglas bagnall

fix() prints out xmlish version of the file.
fixstring() does the same to a string.
obviously, you'd change this somewhere around line 110.

The output is tested against minidom, so if you get no traceback,
it will be xml safe. Which is not to say it'll look good.

Html entities are not handled, nor are valueless attributes, like
selected in option (xhtml 1.0 asks for selected="selected").
Misunderstood attributes are omitted without notice.
"""
from xml.dom.minidom import parseString
import sys,re,string,os

singlelist=('img','br','link','hr','input','area',"meta")
wf=re.compile(r'''\w+=('|")[^'"]+\1''')

def attrify(tag):
    attrs=tag.split()
    fattrs=[re.sub("[^\w-]","x",attrs.pop(0).lower())] #deals rudely with non-alphanumeric tags
    while attrs:
        trying=attrs.pop(0)
        if wf.match(trying):
            fattrs.append(trying)
        else:
            trying=re.sub(r'[\'"]',"",trying) # clear quotes
            trying=trying.replace('=','="',1)+'"' # and requote (won't get valueless html attributes eg