From rsalz@zolera.com  Wed Oct  3 15:21:13 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 03 Oct 2001 10:21:13 -0400
Subject: [XML-SIG] Moving exceptions under Exception in yappsrt.py
Message-ID: <3BBB1ED9.EB1BDB78@zolera.com>

I sent mail to the YAPPS author a couple of days ago but haven't yet got
a reply.  Any objections to moving the SyntaxError classes in yappsrt.py and
pyxpath.py so that they inherit from Exception?
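
As a minimal sketch of what that change amounts to, assuming the error
class carries a position and a message (the class name, attributes and toy
parser below are illustrative, not copied from yappsrt.py or pyxpath.py):

```python
class ParseSyntaxError(Exception):
    """Hypothetical parser error, rooted under Exception."""

    def __init__(self, pos=-1, msg="Bad Token"):
        Exception.__init__(self, msg)
        self.pos = pos
        self.msg = msg


def parse_or_none(text):
    """Toy parser: rejects any input containing '?'."""
    try:
        if "?" in text:
            raise ParseSyntaxError(text.index("?"), "unexpected '?'")
        return text
    except Exception:
        # A generic handler now sees the parser's errors as well.
        return None
```

The practical effect is that callers who write "except Exception" also
catch the parser's errors.
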
	/r$

-- 
Zolera Systems, Your Key to Online Integrity
Securing Web services: XML, SOAP, Dig-sig, Encryption
http://www.zolera.com


From martin@loewis.home.cs.tu-berlin.de  Wed Oct  3 17:47:37 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 3 Oct 2001 18:47:37 +0200
Subject: [XML-SIG] Moving exceptions under Exception in yappsrt.py
In-Reply-To: <3BBB1ED9.EB1BDB78@zolera.com> (message from Rich Salz on Wed, 03
 Oct 2001 10:21:13 -0400)
References: <3BBB1ED9.EB1BDB78@zolera.com>
Message-ID: <200110031647.f93Glbq04571@mira.informatik.hu-berlin.de>

> I sent mail to the YAPPS author a couple of days ago but haven't yet got
> a reply.  Any objections to moving the SyntaxError classes in yappsrt.py and
> pyxpath.py so that they inherit from Exception?

No, go ahead. Please note that I have just committed the 0.11.1
changes. In the process, I noticed two things:
- the XPathParser is now pure Python (even though it is a generated
  file), so it is debatable whether pyxpath should be maintained
  (although I'm in favour of that).
- your changes to include CDATA_SECTION in a couple of places apparently
  have not been integrated into 4Suite. I'll try to maintain them after
  each merge, but there is always the potential that they'll break unless
  they get synchronized with 4Suite.

Regards,
Martin


From dmoor@technology.serco.com  Thu Oct  4 12:08:19 2001
From: dmoor@technology.serco.com (David Moor)
Date: Thu, 4 Oct 2001 12:08:19 +0100
Subject: [XML-SIG] PyXML Question
Message-ID: <195F58F118C9D311B622009027DC812F77DC1B@mail1.technology.serco.com>

Hi

I have just downloaded PyXML-0.6.6.win32-py2.1.exe and tried to install it
because I have some example code which contains:

> from xml.dom.html_builder  import HtmlBuilder
> from xml.dom.walker        import Walker
> from xml.dom.writer        import HtmlWriter

I thought I would then be able to run the example script.  Following the
install my Python21\Lib directory has not changed, although I now have a
Python21\_xmlplus and a Python21\xmldoc directory.  Is this correct?  The
script will still not run. I am using WinNT 4 and Python 2.1; do I need to
copy the _xmlplus directory contents into the Lib\xml directory?

This auto installer seems to have lulled me into a false sense of security,
any help would be greatly appreciated.

Dave Moor
This message, including attachments, is intended only for the use by the
person(s) to whom it is addressed. It may contain information which is
privileged and confidential. Copying or use by anybody else is not
authorised. If you are not the intended recipient, please contact the sender
as soon as possible. The views expressed in this communication may not
necessarily be the views held by Serco Integrated Transport.


From larsga@garshol.priv.no  Thu Oct  4 15:06:47 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 04 Oct 2001 16:06:47 +0200
Subject: [XML-SIG] drv_jython
Message-ID: <m3u1xfsdmw.fsf@lambda.garshol.priv.no>

I finally got round to making a SAX 2.0 driver for the Java SAX 2.0
parsers, for use in Jython. It is not yet complete, but I did use it
successfully last night to convert a 1.2 MB XML file into a topic map.

Should I check it into the main branch, or should I put it on some
other branch?

Also, I guess we should use different lists of default parsers in
Jython and CPython.

  Jython: drv_jython, drv_xmlproc
  CPython: expatreader, drv_xmlproc

This change is more risky, in that it will make people suddenly start
using drv_jython, before it has been properly tested. Comments?
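
A minimal sketch of how such a split could be made, assuming the
interpreter is detected from sys.platform (Jython reports a value starting
with "java"). The driver names come from the lists above; the function name
is hypothetical and the mapping to importable module paths is left open:

```python
import sys

# Driver names as listed above; full module paths are not assumed here.
JYTHON_PARSERS = ["drv_jython", "drv_xmlproc"]
CPYTHON_PARSERS = ["expatreader", "drv_xmlproc"]


def default_parser_list(platform=None):
    """Return the default driver names for a given sys.platform value."""
    if platform is None:
        platform = sys.platform
    # Jython reports a platform string that starts with "java".
    if platform.startswith("java"):
        return list(JYTHON_PARSERS)
    return list(CPYTHON_PARSERS)


print(default_parser_list())
```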

--Lars M.


From noreply@sourceforge.net  Thu Oct  4 17:35:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 04 Oct 2001 09:35:02 -0700
Subject: [XML-SIG] [ pyxml-Bugs-467937 ] 4DOM Events broken on setAttribute
Message-ID: <E15pBSg-0007Rn-00@usw-sf-web1.sourceforge.net>

Bugs item #467937, was opened at 2001-10-04 09:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=467937&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Alexandre Fayolle (afayolle)
Summary: 4DOM Events broken on setAttribute

Initial Comment:
Using setAttribute/setAttributeNS on 4DOM elements will
not cause a mutation event to be fired if the attribute
already exists. 
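
The expected behaviour can be sketched with a toy class (hypothetical, and
not 4DOM's implementation or its event API): a listener should be notified
whether or not the attribute already exists.

```python
class ObservedElement:
    """Toy element that reports every attribute change to listeners."""

    def __init__(self):
        self._attrs = {}
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def setAttribute(self, name, value):
        old = self._attrs.get(name)  # None when the attribute is new
        self._attrs[name] = value
        # Notify on both paths: newly created and already existing.
        for callback in self._listeners:
            callback(name, old, value)


events = []
elem = ObservedElement()
elem.add_listener(lambda name, old, new: events.append((name, old, new)))
elem.setAttribute("pud", "ding")  # attribute is created
elem.setAttribute("pud", "dong")  # attribute already exists
print(events)
```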

Alexandre Fayolle

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=467937&group_id=6473


From martin@loewis.home.cs.tu-berlin.de  Thu Oct  4 20:33:24 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 4 Oct 2001 21:33:24 +0200
Subject: [XML-SIG] PyXML Question
In-Reply-To: <195F58F118C9D311B622009027DC812F77DC1B@mail1.technology.serco.com>
 (message from David Moor on Thu, 4 Oct 2001 12:08:19 +0100)
References: <195F58F118C9D311B622009027DC812F77DC1B@mail1.technology.serco.com>
Message-ID: <200110041933.f94JXOD02080@mira.informatik.hu-berlin.de>

> I have just downloaded PyXML-0.6.6.win32-py2.1.exe and tried to install it
> because I have some example code which contains:
> 
> > from xml.dom.html_builder  import HtmlBuilder
> > from xml.dom.walker        import Walker
> > from xml.dom.writer        import HtmlWriter

Unfortunately, that won't help you: these interfaces disappeared with
PyXML 0.5.

> I thought I would then be able to run the example script.  Following the
> install my Python21\Lib directory has not changed although I now have a
> Python21\_xmlplus and a Python21\xmldoc directory.  Is this correct?  

It is.

> The script will still not run. I am using WinNT 4 and Python 2.1;
> do I need to copy the _xmlplus directory contents into the Lib\xml
> directory?

No, you probably need to port the script to PyXML 0.6. Alternatively,
you could try to install PyXML 0.5, although this is no longer
supported.
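
One way to see which package a script's "import xml..." lines resolve to
is to print where the package was loaded from. A minimal sketch using only
the standard library; the expectation that a PyXML install shows up as a
path ending in _xmlplus is an assumption based on the directory named
above, and the helper name is made up for illustration:

```python
import os
import xml


def xml_package_dir():
    """Return the directory the 'xml' package was actually loaded from."""
    return os.path.dirname(os.path.abspath(xml.__file__))


print(xml_package_dir())
```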

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Thu Oct  4 20:31:37 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 4 Oct 2001 21:31:37 +0200
Subject: [XML-SIG] drv_jython
In-Reply-To: <m3u1xfsdmw.fsf@lambda.garshol.priv.no> (message from Lars Marius
 Garshol on 04 Oct 2001 16:06:47 +0200)
References: <m3u1xfsdmw.fsf@lambda.garshol.priv.no>
Message-ID: <200110041931.f94JVbQ02079@mira.informatik.hu-berlin.de>

> I finally got round to making a SAX 2.0 driver for the Java SAX 2.0
> parsers, for use in Jython. It is not yet complete, but I did use it
> successfully last night to convert a 1.2 MB XML file into a topic map.
> 
> Should I check it into the main branch, or should I put it on some
> other branch?

Go ahead and check it into the mainline. PyXML 0.7.0 will have further
changes that will go into the wild with it for the first time, and
we can always issue 0.7.1 if we get complaints.

> Also, I guess we should use different lists of default parsers in
> Jython and CPython.
> 
>   Jython: drv_jython, drv_xmlproc
>   CPython: expatreader, drv_xmlproc
> 
> This change is more risky, in that it will make people suddenly start
> using drv_jython, before it has been properly tested. Comments?

Well, PyXML will already look for the python.xml.sax.parser property
to select a parser. If using the Java SAX code causes trouble, we
already know a work-around.
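
A related work-around can also be applied in code rather than through the
property. The sketch below uses the standard library's
xml.sax.make_parser, which tries the listed driver modules before its
defaults; that PyXML's copy behaves the same way is an assumption, and the
helper name is hypothetical:

```python
import xml.sax
from xml.sax import xmlreader


def make_preferred_parser(preferred=("xml.sax.expatreader",)):
    """Create an XMLReader, trying the named driver modules first."""
    # make_parser() tries each listed module before its built-in defaults.
    return xml.sax.make_parser(list(preferred))


parser = make_preferred_parser()
print(parser.__class__.__name__)
```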

I'm not sure drv_jython is a good name, though. Shouldn't it rather
indicate the specific Java API you are using, like jaxml (or what its
name is)? Or perhaps even the specific parser that you use?

Regards,
Martin


From larsga@garshol.priv.no  Thu Oct  4 22:07:05 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 04 Oct 2001 23:07:05 +0200
Subject: [XML-SIG] drv_jython
In-Reply-To: <200110041931.f94JVbQ02079@mira.informatik.hu-berlin.de>
References: <m3u1xfsdmw.fsf@lambda.garshol.priv.no> <200110041931.f94JVbQ02079@mira.informatik.hu-berlin.de>
Message-ID: <m3ofnnjerq.fsf@lambda.garshol.priv.no>

* Martin v. Loewis
| 
| Go ahead and check it into the mainline. 

Me do.
 
| Well, PyXML will already look for the python.xml.sax.parser property
| to select a parser. If using the Java SAX code causes troubles, we
| already know a work-around.

True enough. I'll check it in, then.
 
| I'm not sure drv_jython is a good name, though. Shouldn't it rather
| indicate the specific Java API you are using, like jaxml (or what
| its name is)? Or perhaps even the specific parser that you use?

The specific Java API is SAX. The driver uses JAXP to create the
parser (but not for anything else), and if JAXP doesn't find a parser
it falls back to using SAX.

We could always call it drv_javasax, I guess. That is perhaps a better
name. 

--Lars M.


From laurent.tardif@csse.monash.edu.au  Fri Oct  5 02:37:11 2001
From: laurent.tardif@csse.monash.edu.au (Laurent Tardif)
Date: Fri, 05 Oct 2001 11:37:11 +1000
Subject: [XML-SIG] hi,
Message-ID: <3BBD0EC7.EFE52BB6@csse.monash.edu.au>

With some friends, we have started to design an SVG authoring tool in Python.

And, of course, we use the xml library. 

I have some questions:
	- What's the current activity on xml?
	Can we hope for some improvements to the DOM classes, some bug fixes, and so
on?
	We will be pleased to fix some of them. 
	What is the way to contribute? 
	For the moment we have found some fields in the data structure which are
not up to date.

	- Is there a plan to do some XSLT processor engine?
	
	- How can we extend the documentation?
	 For a Python beginner, it's quite impossible to find how to:
		- load an XML document
		- write an XML document
		- set the validating property on the parser
		- .... 
		- and all the XML API is not documented; finding 
		  the classes in xml.dom.ext is very funny ;-)
	
A compliment:
	Quite surprised by the speed of the parser, very nice.
 	
	
-- 
-----------------------------------------------------------------
.                       
..                                                
                       .' @`._         Laurent Tardif
        ~       ...._.'  ,__.-         Monach University
     _..------/`           .-';        mailBox 36 - Building 26       
    :     __./'       ,  .'-'-   ~     School of Computer Science 
 ~   `---(.-'''---.    \`._  -.._      and Software engineering
   _.--'(  .______.'.-' `-.`     `.  ~ Clayton Victoria 3168   
  :      `-..____`-.               `.  Australia               
  `.             ````                ; Phone : xxx55779        
    `-.__                            ; www.inrialpes.fr/opera
         ````-----.......----   __.-'


From Alexandre.Fayolle@logilab.fr  Fri Oct  5 08:39:52 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Fri, 5 Oct 2001 09:39:52 +0200 (CEST)
Subject: [XML-SIG] hi,
In-Reply-To: <3BBD0EC7.EFE52BB6@csse.monash.edu.au>
Message-ID: <Pine.LNX.4.21.0110050928060.853-100000@sagittarius.logilab.fr>

On Fri, 5 Oct 2001, Laurent Tardif wrote:

> with some friends we start to design an SVG authoring tool in python.
> 
> And, of course, we use the xml library. 
> 
> I have some questions :
> 	- what's the current activity on xml ?

It's quite high I think. Most people here are actively using the tools in
PyXML, so this means that you'll find a high level of support on this
list. 

> 	Can we hope for some improvements to the DOM classes, some bug fixes, and so
> on?
> 	We will be pleased to fix some of them. 
> 	What is the way to contribute? 
> 	For the moment we have found some fields in the data structure which are
> not up to date.

Now this is strange. I thought I had completely debugged 4DOM ;o)  Which
DOM implementation are you using? Please report bugs on the list (or even
better on the bugtracker of the sourceforge project,
http://pyxml.sf.net/), and submit patches to the patch manager, or on the
list.
 
> 	- there is a plan to do some XSLT processor engine ?

Please check 4Suite, http://www.4suite.org/, which provides a full-blown
XSLT engine. There's also another project whose name I cannot remember
which provides Python bindings to the C++ Xalan XSLT engine
(http://xml.apache.org/), and Python bindings to the Sablotron XSLT
engine. 

> 	- How can we extend the documentation?
> 	 For a Python beginner, it's quite impossible to find how to:
> 		- load an XML document
> 		- write an XML document
> 		- set the validating property on the parser
> 		- .... 
> 		- and all the XML API is not documented; finding 
> 		  the classes in xml.dom.ext is very funny ;-)

The documentation is still being written. If you check the SIG's page, and
follow the documentation link, you'll eventually reach this page

http://py-howto.sourceforge.net/xml-howto/DOM.html

Does it answer your questions?
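
For the first two items in the list (loading and writing a document), a
minimal sketch using xml.dom.minidom from the standard library, not the
4DOM classes in PyXML; the sample document is made up. Validation is not
covered here: the parser behind minidom is expat, which does not validate.

```python
from xml.dom import minidom

SOURCE = "<drawing><circle r='5'/></drawing>"

# Load: parse a string into a Document (minidom.parse() takes a file).
doc = minidom.parseString(SOURCE)

# Modify through the normal DOM API.
circle = doc.documentElement.getElementsByTagName("circle")[0]
circle.setAttribute("r", "10")

# Write: serialise the tree back to text.
output = doc.documentElement.toxml()
print(output)
```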


Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From hannu@tm.ee  Fri Oct  5 08:51:50 2001
From: hannu@tm.ee (Hannu Krosing)
Date: Fri, 05 Oct 2001 09:51:50 +0200
Subject: [XML-SIG] hi,
References: <Pine.LNX.4.21.0110050928060.853-100000@sagittarius.logilab.fr>
Message-ID: <3BBD6696.9F630239@tm.ee>

Alexandre Fayolle wrote:
> 
> >       - there is a plan to do some XSLT processor engine ?
> 
> Please check 4Suite. http://www.4suite.org/, which provides a full blown
> XSLT engine. There's also another project whose name I cannot remember
> which provides python bindings to the C++ Xalan XSLT engine
> (http://xml.apache.org/), and python bindings to the Sablotron XSLT
> engine.

And one for libxslt too http://www.rexx.com/~dkuhlman/ just for note :)

--------
Hannu


From martin@loewis.home.cs.tu-berlin.de  Fri Oct  5 08:54:24 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 5 Oct 2001 09:54:24 +0200
Subject: [XML-SIG] hi,
In-Reply-To: <3BBD0EC7.EFE52BB6@csse.monash.edu.au> (message from Laurent
 Tardif on Fri, 05 Oct 2001 11:37:11 +1000)
References: <3BBD0EC7.EFE52BB6@csse.monash.edu.au>
Message-ID: <200110050754.f957sOS01073@mira.informatik.hu-berlin.de>

> And, of course, we use the xml library. 
> 
> I have some questions :
> 	- what's the current activity on xml ?

I'm still planning to release PyXML 0.7.0 some time in the future.

> 	Can we hope for some improvements to the DOM classes, some bug fixes,
> and so on? We will be pleased to fix some of them.

No, unless you give more detail what kind of improvements you plan,
and what kind of bugs you want to see fixed. Except for what is on SF,
I'm not aware of any potential improvements or any desirable bug
fixes.

> 	What is the way to contribute ? 

If you want to contribute patches or report bugs, please use
sf.net/projects/pyxml. For general discussions, this is the right
place.

> 	For the moment we have found some field in the data structure which are
> not up to date.

Can you give details?

> 	- there is a plan to do some XSLT processor engine ?

Yes, PyXML 0.7 will ship with 4XSLT. Please note that you can get
4XSLT today from www.4suite.org (as part of the 4Suite package).

> 	-how can we extend the documentation ?

Submit patches.

> 	 For a Python beginner, it's quite impossible to find how to:
> 		- load an XML document
> 		- write an XML document

Did you read the tutorial? This explains both aspects. Of course,
contributions of documentation are greatly welcome. Please submit them
to SF.

> 	Quite surprised by the speed of the parser, very nice.

Thanks. I assume you are using Expat here, so the glory actually goes
to James Clark and the current maintainers of Expat.

Regards,
Martin


From dmoor@technology.serco.com  Fri Oct  5 09:56:09 2001
From: dmoor@technology.serco.com (David Moor)
Date: Fri, 5 Oct 2001 09:56:09 +0100
Subject: [XML-SIG] PyXML Question
Message-ID: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>


> -----Original Message-----
> From: Martin v. Loewis
> Subject: Re: [XML-SIG] PyXML Question
> 
> > I have just downloaded PyXML-0.6.6.win32-py2.1.exe and tried to install it
> > because I have some example code which contains:
> > 
> > > from xml.dom.html_builder  import HtmlBuilder
> > > from xml.dom.walker        import Walker
> > > from xml.dom.writer        import HtmlWriter
> 

> > The script will still not run, I and using WinNT 4 and Python 2.1,
> > do I need to copy the _xmlplus directory contents into the Lib\xml
> > directory?
> 
> No, you probably need to port the script to PyXML 0.6. Alternatively,
> you could try to install PyXML 0.5, although this is no longer
> supported.
> 
> Regards,
> Martin

Thanks for the help, Martin. I am trying to write a bot to access information
from a web site which requires me to log in.  Since I only started using
Python a couple of months ago and have not used PyXML before, I was
hoping to use this sample script as a learning tool, since it was designed to
download the author's online bank statements.  Because this functionality
isn't supported any more, I think I would be best learning the new method,
which brings me to my next question.

Are there any 'Introduction to PyXML' documents, describing the different
parts and giving examples?  I have looked in the xml-howto.txt in /xmldocs,
the section I think I need is 4.5 Processing HTML, which contains 'Intro to
HTML builder' :)

TIA

Dave Moor


From Alexandre.Fayolle@logilab.fr  Fri Oct  5 10:32:18 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Fri, 5 Oct 2001 11:32:18 +0200 (CEST)
Subject: [XML-SIG] PyXML Question
In-Reply-To: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>
Message-ID: <Pine.LNX.4.21.0110051118260.853-100000@sagittarius.logilab.fr>

On Fri, 5 Oct 2001, David Moor wrote:

> Are there any 'Introduction to PyXML' documents, describing the different
> parts and giving examples?  I have looked in the xml-howto.txt in /xmldocs,
> the section I think I need is 4.5 Processing HTML, which contains 'Intro to
> HTML builder' :)

The first thing you may want to note is that it is generally difficult to
map html to xml, and even harder to extract information from the resulting
xml. The reason for this is that html is too often used for presentation,
meaning that you get tons of nested tables in a typical html document,
quite often with badly nested elements, or misquoted attributes. 

This said, let's get into solving your problem:

The official way of creating a DOM tree is by using a reader class, such
as the xml.dom.ext.reader.Sax2.Reader class. If what you want is to process
HTML, you'll want to use xml.dom.ext.reader.HtmlLib.Reader.

The first thing you want to do is build a new reader:
from xml.dom.ext.reader.HtmlLib import Reader
r = Reader()

Then you can use the reader to parse the tree. A reader has 3 methods to
achieve this: fromString, fromUri and fromStream (which does the real work
for the other 2). fromString takes a string representation of the
document, fromUri takes a URL or URI string pointing to the document, and
fromStream takes a File-like object. All three methods return a Document.

doc = r.fromUri('http://www.logilab.org/')
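
The way the three methods relate, with fromStream doing the real work, can
be sketched with a stand-in built on the standard library. MiniReader is a
hypothetical class, not part of PyXML, and it parses XML rather than HTML:

```python
import io
from urllib.request import urlopen
from xml.dom import minidom


class MiniReader:
    """Hypothetical stand-in for a reader, built on xml.dom.minidom."""

    def fromStream(self, stream):
        # Does the real work: parse a file-like object into a Document.
        return minidom.parse(stream)

    def fromString(self, text):
        # Wrap the text in a stream and delegate.
        return self.fromStream(io.BytesIO(text.encode("utf-8")))

    def fromUri(self, uri):
        # Open the URI and delegate; needs network or file access.
        with urlopen(uri) as stream:
            return self.fromStream(stream)


r = MiniReader()
doc = r.fromString("<html><body><p>hello</p></body></html>")
print(doc.documentElement.tagName)
```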

This was the easy part. Now you still have to figure out where the
information you need is. There is no generic method for this; it all
depends on the document you're processing. I suggest you take a good
look at the DOM Traversal API from the W3C site, and at XPath, both
of which can be nice tools to perform such a task. 
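
As an illustration of the XPath approach, here is a minimal sketch using
xml.etree.ElementTree from the standard library, which supports a limited
XPath subset. It is not the PyXML or 4Suite XPath API, and the page content
is made up:

```python
import xml.etree.ElementTree as ET

PAGE = """<html><body>
<table><tr><td><a href="/statement?id=1">October</a></td></tr></table>
<p>See also <a href="/help">help</a> and <a name="top">top</a>.</p>
</body></html>"""

root = ET.fromstring(PAGE)

# ".//a[@href]" selects every <a> descendant that carries an href attribute.
links = [(a.get("href"), a.text) for a in root.findall(".//a[@href]")]
print(links)
```

The path expression still has to be written for the specific page, which
is the point made above: the tools are generic, the selection is not.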

Cheers,

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From akuchlin@mems-exchange.org  Fri Oct  5 14:18:21 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 05 Oct 2001 09:18:21 -0400
Subject: [XML-SIG] Dropping xml.marshal
Message-ID: <E15pUrt-0008Sv-00@ute.cnri.reston.va.us>

The code in the xml.marshal package is out of date, and I've never
heard of anyone using it.  Therefore, I suggest it be deleted.  
Any objections?

--amk


From fdrake@acm.org  Fri Oct  5 14:50:21 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 Oct 2001 09:50:21 -0400
Subject: [XML-SIG] Dropping xml.marshal
In-Reply-To: <E15pUrt-0008Sv-00@ute.cnri.reston.va.us>
References: <E15pUrt-0008Sv-00@ute.cnri.reston.va.us>
Message-ID: <15293.47773.97621.370545@grendel.zope.com>

Andrew Kuchling writes:
 > The code in the xml.marshal package is out of date, and I've never
 > heard of anyone using it.  Therefore, I suggest it be deleted.  
 > Any objections?

  Given the recent discussion on the XML-RPC list, and the
availability of xmlrpclib, I'd say that certainly xml.marshal.xmlrpc
can go.  I don't have any opinion on the others, but see no reason to
keep them if they aren't being used.
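
For reference, a minimal sketch of the marshalling that xmlrpclib offers,
written against its Python 3 name, xmlrpc.client. The method name and
values are made up for illustration:

```python
import xmlrpc.client

params = (1, "two", [3.0, True])

# Marshal a Python tuple into an XML-RPC request document...
payload = xmlrpc.client.dumps(params, methodname="demo.echo")

# ...and unmarshal it again; loads() returns (params, methodname).
decoded, method = xmlrpc.client.loads(payload)
print(method, decoded)
```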


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From noreply@sourceforge.net  Fri Oct  5 15:42:40 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 05 Oct 2001 07:42:40 -0700
Subject: [XML-SIG] [ pyxml-Bugs-468299 ] cloneNode does not change ownerElement
Message-ID: <E15pWBU-0003W6-00@usw-sf-web3.sourceforge.net>

Bugs item #468299, was opened at 2001-10-05 07:42
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=468299&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 6
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Alexandre Fayolle (afayolle)
Summary: cloneNode does not change ownerElement

Initial Comment:
It's a long time since I found a bug in 4DOM!

>>> from xml.dom.ext.reader.Sax2 import Reader
>>> d = Reader().fromString("<plum pud='ding'/>")
>>> clone = d.documentElement.cloneNode(1)
>>> print d.documentElement
<Element Node at 822b48c: Name='plum' with 1 attributes
and 0 children>
>>> print clone                           
<Element Node at 822c42c: Name='plum' with 1 attributes
and 0 children>
>>> print clone.attributes[0].ownerElement
<Element Node at 822b48c: Name='plum' with 1 attributes
and 0 children>


This causes Events on attributes not to be properly
propagated if they occur on a cloned branch. 

I'll patch this one.
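
The expected behaviour can be written down as a small check. The sketch
below runs against xml.dom.minidom from the standard library rather than
4DOM, so it shows what a fixed implementation should report, not the bug
itself:

```python
from xml.dom import minidom

d = minidom.parseString("<plum pud='ding'/>")
original = d.documentElement
clone = original.cloneNode(True)

# The cloned attribute should point back at the clone, not the original.
attr = clone.getAttributeNode("pud")
print(attr.ownerElement is clone)
```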

Alexandre

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=468299&group_id=6473


From martin@loewis.home.cs.tu-berlin.de  Fri Oct  5 19:06:30 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 5 Oct 2001 20:06:30 +0200
Subject: [XML-SIG] Dropping xml.marshal
In-Reply-To: <E15pUrt-0008Sv-00@ute.cnri.reston.va.us> (message from Andrew
 Kuchling on Fri, 05 Oct 2001 09:18:21 -0400)
References: <E15pUrt-0008Sv-00@ute.cnri.reston.va.us>
Message-ID: <200110051806.f95I6Ue01130@mira.informatik.hu-berlin.de>

> The code in the xml.marshal package is out of date, and I've never
> heard of anyone using it.  Therefore, I suggest it be deleted.  
> Any objections?

Yes. There have been user contributions to the wddx code, so
apparently some users do care at least about wddx. I'm not so sure that
the other two marshallers have any value, but at least the generic one
needs to stay to support wddx. So if you want to remove xml-rpc, go
ahead.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Fri Oct  5 19:02:19 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 5 Oct 2001 20:02:19 +0200
Subject: [XML-SIG] PyXML Question
In-Reply-To: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>
 (message from David Moor on Fri, 5 Oct 2001 09:56:09 +0100)
References: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com>
Message-ID: <200110051802.f95I2Ji01127@mira.informatik.hu-berlin.de>

> Are there any 'Introduction to PyXML' documents, describing the
> different parts and giving examples?  I have looked in the
> xml-howto.txt in /xmldocs, the section I think I need is 4.5
> Processing HTML, which contains 'Intro to HTML builder' :)

The XML HOWTO is the right starting point. However, that section still
needs to be written/updated/replaced. You should use an
xml.dom.ext.reader.Reader instance and one of its
from{Stream,Uri,String} methods.

Then, the normal DOM operations can be used on the tree. To write back
the result, you should use xml.dom.ext.XHtmlPrettyPrint.

Note that processing HTML with XML libraries is always risky, as HTML
documents are not XML documents (unless they comply with XHTML);
often, they don't even comply with the HTML DTD. In these cases,
processors can easily get confused.
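
A small demonstration of the risk, as a sketch using the standard
library's minidom and expat (not the PyXML readers); the helper name is
made up: an HTML-style unclosed tag is rejected outright by an XML parser,
while the XHTML form is accepted.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parses_as_xml(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        minidom.parseString(markup)
    except ExpatError:
        return False
    return True


print(parses_as_xml("<p>one<br>two</p>"))   # HTML-style unclosed <br>
print(parses_as_xml("<p>one<br/>two</p>"))  # XHTML-style empty element
```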

Regards,
Martin


From pyxml@xhaus.com  Sat Oct  6 13:19:56 2001
From: pyxml@xhaus.com (Alan Kennedy)
Date: Sat, 06 Oct 2001 13:19:56 +0100
Subject: [XML-SIG] PyXML Question
References: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com> <200110051802.f95I2Ji01127@mira.informatik.hu-berlin.de>
Message-ID: <3BBEF6EC.25C51FCF@xhaus.com>

"Martin v. Loewis" wrote:

> Note that processing HTML with XML libraries is always risky, as HTML
> documents are not XML documents (unless they comply with XHTML);
> often, they don't even comply with the HTML DTD. In these cases,
> processors can easily get confused.

Although I haven't used the Python version, Dave Raggett's excellent Tidy
program will clean up malformed HTML and turn it into XHTML, which should
then be parsable by XML processors.

Marc-Andre Lemburg has provided a python interface to HTML tidy, which is
now a part of the Egenix Experimental Package. You can find it here:-

http://www.lemburg.com/files/python/index.html

My memory of my use of HTML tidy is that coverage is very good of most of
the common problems you would encounter processing malformed HTML as XML.
For example, I think it will wrap the content of <SCRIPT> elements in
<![CDATA[ ]]> markers so that your XML parser won't choke on [<>&]
characters that might be found in Javascript code.

The only thing missing from the original HTML Tidy was a way to generate
the tidied output as a SAX stream. Instead, you have to put the output into
a file or a string and parse it into whatever XML form you require, using
the standard PyXML parsing tools.
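
That second step can be sketched as follows, assuming the tidied output is
already held in a string. The sample markup is hand-written rather than
real Tidy output, the handler class is hypothetical, and the parser is the
standard library's xml.sax:

```python
import xml.sax


class ElementLister(xml.sax.ContentHandler):
    """Collect element names and the text found inside <script>."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.names = []
        self.script_text = []
        self._in_script = False

    def startElement(self, name, attrs):
        self.names.append(name)
        if name == "script":
            self._in_script = True

    def endElement(self, name):
        if name == "script":
            self._in_script = False

    def characters(self, content):
        if self._in_script:
            self.script_text.append(content)


TIDIED = b"""<html><head><title>t</title>
<script><![CDATA[ if (a < b && c > d) { go(); } ]]></script>
</head><body><p>hello</p></body></html>"""

handler = ElementLister()
xml.sax.parseString(TIDIED, handler)
print(handler.names)
```

The CDATA section lets the script's "<", ">" and "&" characters through
without upsetting the parser.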

Alan.


From larsga@garshol.priv.no  Sat Oct  6 13:55:56 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 06 Oct 2001 14:55:56 +0200
Subject: [XML-SIG] PyXML Question
In-Reply-To: <3BBEF6EC.25C51FCF@xhaus.com>
References: <195F58F118C9D311B622009027DC812F77DC1E@mail1.technology.serco.com> <200110051802.f95I2Ji01127@mira.informatik.hu-berlin.de> <3BBEF6EC.25C51FCF@xhaus.com>
Message-ID: <m3vghtdj1f.fsf@lambda.garshol.priv.no>

* Alan Kennedy
| 
| The only thing missing from the original HTML Tidy was a way to
| generate the tidied output as a SAX stream. Instead, you have to put
| the output into a file or a string and parse it into whatever XML
| form you require, using the standard PyXML parsing tools.

Yep. What I would very much like to see is a SAX driver for MAL's Tidy
module. That would solve this problem at a stroke.

--Lars M.


From wade@okaynetwork.com  Sat Oct  6 15:07:23 2001
From: wade@okaynetwork.com (Wade Leftwich)
Date: Sat, 06 Oct 2001 10:07:23 -0400
Subject: [XML-SIG] Parsing HTML to DOM
Message-ID: <200110061404.KAA29086@emerald.lightlink.com>

Alexandre Fayolle wrote:
>The official way of creating a DOM tree is by using a reader class, such
>as the xml.dom.ext.reader.Sax2.Reader class. If what you want is to process
>HTML, you'll want to use xml.dom.ext.reader.HtmlLib.Reader.
>

Because HTML Tidy (http://www.w3.org/People/Raggett/tidy/) is so good at
making sense of funky HTML, I use it to produce XHTML, which can then be
processed with XML tools.

I made a little Python module that calls the command-line version of HTML
Tidy with the appropriate arguments. Will be happy to share if anyone wants
to see it.
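
Such a wrapper might look roughly like the sketch below. This is not the
module mentioned above: the function names are hypothetical, and it assumes
a tidy executable on the PATH that accepts the -asxhtml, -numeric and
-quiet options.

```python
import subprocess


def tidy_command(executable="tidy"):
    """Build the command line asking Tidy for XHTML on standard output."""
    return [executable, "-asxhtml", "-numeric", "-quiet"]


def html_to_xhtml(html_bytes, executable="tidy"):
    """Pipe HTML through the tidy program and return the XHTML bytes."""
    result = subprocess.run(
        tidy_command(executable),
        input=html_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # Tidy exits with 0 for clean input, 1 for warnings and 2 for errors.
    if result.returncode >= 2:
        raise ValueError(result.stderr.decode("utf-8", "replace"))
    return result.stdout
```

A call such as html_to_xhtml(b"<p>one<br>two") would then return bytes
that can be handed to an XML parser.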

Wade Leftwich
Ithaca, NY


From mlh@idi.ntnu.no  Sat Oct  6 19:59:18 2001
From: mlh@idi.ntnu.no (Magnus Lie Hetland)
Date: Sat, 6 Oct 2001 20:59:18 +0200
Subject: [XML-SIG] xml.xslt.Processor
Message-ID: <008501c14e99$0058ffc0$156ff181@idi.ntnu.no>

Hi!

In the 4Suite docs, the module xml.xslt is said to contain
a class Processor, but in my installation, the class
seems to be in the module xml.xslt.Processor... Is this
simply an error in the documentation?

--

  Magnus Lie Hetland         http://www.hetland.org

 "Reality is that which, when you stop believing in
  it, doesn't go away."           -- Philip K. Dick


From mlh@idi.ntnu.no  Sat Oct  6 20:23:48 2001
From: mlh@idi.ntnu.no (Magnus Lie Hetland)
Date: Sat, 6 Oct 2001 21:23:48 +0200
Subject: [XML-SIG] cDomlette
Message-ID: <009501c14e9c$6c67c810$156ff181@idi.ntnu.no>

Wow... I tried the 4XPath to search a simple XML file
with 1000 entries, and was a bit disappointed with the
time it took to even load the file (with pDomlette):
25.540 seconds. Then I noticed cDomlette, and found
the RawExpatReader... Now it takes only 0.530 seconds.
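
A minimal sketch of how such a load time can be measured. It uses
xml.dom.minidom from the standard library as a stand-in, since the Domlette
readers are not assumed to be installed; the sample document and function
names are made up:

```python
import time
from xml.dom import minidom


def build_sample(entries=1000):
    """Generate a simple XML document with the given number of entries."""
    rows = ["<entry id='%d'>value %d</entry>" % (i, i) for i in range(entries)]
    return "<entries>%s</entries>" % "".join(rows)


def time_load(xml_text):
    """Return (document, seconds spent parsing)."""
    start = time.perf_counter()
    doc = minidom.parseString(xml_text)
    return doc, time.perf_counter() - start


doc, seconds = time_load(build_sample())
print("loaded %d entries in %.3f s"
      % (len(doc.getElementsByTagName("entry")), seconds))
```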

Yay! :)

--

  Magnus Lie Hetland         http://www.hetland.org

 "Reality is that which, when you stop believing in
  it, doesn't go away."           -- Philip K. Dick


From tpassin@home.com  Sat Oct  6 21:24:57 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sat, 6 Oct 2001 16:24:57 -0400
Subject: [XML-SIG] cDomlette
References: <009501c14e9c$6c67c810$156ff181@idi.ntnu.no>
Message-ID: <000c01c14ea4$f787cdc0$7cac1218@cj64132b>

[Magnus Lie Hetland]

>
> Wow... I tried the 4XPath to search a simple XML file
> with 1000 entries, and was a bit disappointed with the
> time it took to even load the file (with pDomlette):
> 25.540 seconds. Then I noticed cDomlette, and found
> the RawExpatReader... Now it takes only 0.530 seconds.
>

Yes, we had a thread on that last month.  Search for subjects with
4XSLT Performance Problems with Large Files

You might monitor your memory usage while the DOM is building during those
long seconds - you will probably see a major difference between pDomlette
and cDomlette.
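
One way to do that from inside Python is sketched below with the standard
library's tracemalloc module, again with minidom standing in for the
Domlette readers. Note that tracemalloc only counts memory allocated
through Python's own allocators, so a C implementation that allocates
elsewhere would be under-reported:

```python
import tracemalloc
from xml.dom import minidom


def peak_memory_while_loading(xml_text):
    """Return (document, peak bytes allocated by Python during the parse)."""
    tracemalloc.start()
    try:
        doc = minidom.parseString(xml_text)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return doc, peak


sample = "<entries>%s</entries>" % ("<entry>x</entry>" * 1000)
doc, peak = peak_memory_while_loading(sample)
print("peak: %.1f KiB" % (peak / 1024.0))
```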

Cheers,

Tom P


From martin@loewis.home.cs.tu-berlin.de  Sat Oct  6 22:50:28 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Oct 2001 23:50:28 +0200
Subject: [XML-SIG] xml.xslt.Processor
In-Reply-To: <008501c14e99$0058ffc0$156ff181@idi.ntnu.no> (mlh@idi.ntnu.no)
References: <008501c14e99$0058ffc0$156ff181@idi.ntnu.no>
Message-ID: <200110062150.f96LoS711986@mira.informatik.hu-berlin.de>

> In the 4Suite docs, the module xml.xslt is said to contain
> a class Processor, but in my installation, the class
> seems to be in the module xml.xslt.Processor... Is this
> simply an error in the documentation?

It's a documentation error. Where exactly did you read that claim?

Regards,
Martin


From Mike.Olson@fourthought.com  Sun Oct  7 16:46:40 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 07 Oct 2001 09:46:40 -0600
Subject: [XML-SIG] xml.xslt.Processor
References: <008501c14e99$0058ffc0$156ff181@idi.ntnu.no>
Message-ID: <3BC078E0.406317DB@fourthought.com>

Magnus Lie Hetland wrote:
> 
> Hi!
> 
> In the 4Suite docs, the module xml.xslt is said to contain
> a class Processor, but in my installation, the class
> seems to be in the module xml.xslt.Processor... Is this
> simply an error in the documentation?

This is a documentation error.  All of these should get fixed as we move
to PyDoc.

Thanks
Mike


-- 
Mike Olson                                Principal Consultant
mike.olson@fourthought.com                +1 303 583 9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St,                      http://4Suite.org
Boulder, CO 80301-2537, USA
XML strategy, XML tools, knowledge management


From mlh@idi.ntnu.no  Sun Oct  7 23:28:15 2001
From: mlh@idi.ntnu.no (Magnus Lie Hetland)
Date: Mon, 8 Oct 2001 00:28:15 +0200
Subject: [XML-SIG] xml.xslt.Processor
References: <008501c14e99$0058ffc0$156ff181@idi.ntnu.no> <200110062150.f96LoS711986@mira.informatik.hu-berlin.de>
Message-ID: <012401c14f7f$5b9e5ab0$156ff181@idi.ntnu.no>

From: "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>


> > In the 4Suite docs, the module xml.xslt is said to contain
> > a class Processor, but in my installation, the class
> > seems to be in the module xml.xslt.Processor... Is this
> > simply an error in the documentation?
> 
> It's a documentation error. Where exactly did you read that claim?

At the 4Suite site:
http://www.4suite.org/4Suite.org/documents/4Suite/4XSLT-Api

Perhaps I missed some details, but the page has the heading
"Module xml.xslt" and just below it "Classes", including
Processor.

> Regards,
> Martin

--

  Magnus Lie Hetland         http://www.hetland.org

 "Reality is that which, when you stop believing in
  it, doesn't go away."           -- Philip K. Dick


From martin@loewis.home.cs.tu-berlin.de  Mon Oct  8 07:58:49 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 8 Oct 2001 08:58:49 +0200
Subject: [XML-SIG] xml.xslt.Processor
In-Reply-To: <012401c14f7f$5b9e5ab0$156ff181@idi.ntnu.no> (mlh@idi.ntnu.no)
References: <008501c14e99$0058ffc0$156ff181@idi.ntnu.no> <200110062150.f96LoS711986@mira.informatik.hu-berlin.de> <012401c14f7f$5b9e5ab0$156ff181@idi.ntnu.no>
Message-ID: <200110080658.f986wnd00962@mira.informatik.hu-berlin.de>

> Perhaps I missed some details, but the page has the heading
> "Module xml.xslt" and just below it "Classes", including
> Processor.

I guess this is not to be taken literally. 4Suite uses the "a class
per module" principle in many places, so there is in general a module
for each class with the same name as the class.

Regards,
Martin


From dan_vanorden@hotmail.com  Tue Oct  9 03:07:01 2001
From: dan_vanorden@hotmail.com (Dan Van Orden)
Date: Mon, 08 Oct 2001 22:07:01 -0400
Subject: [XML-SIG] xmltok.dll
Message-ID: <F1907hx27kBnccYTjr8000001d7@hotmail.com>

Hi,
My name is Dan, and I am trying to fix my father's PC.  I noticed your link
on the web that mentioned the files I seem to have problems with. About two
months ago, my brother decided to download instant messenger to the PC, and
then we got an "xmlparse.dll file not found" error every time the PC was
turned on. After a query on the web, a gentleman sent me a file to put
back into the system folder.  I did this, and now I get the error message
"xmltok.dll is not found".
In either case our word application comes up with an error, and lord knows
what else might not be operating right now. Can you help, or is this
hopeless?  Please respond either way, thank you.
Dan



From noreply@sourceforge.net  Tue Oct  9 09:23:23 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 09 Oct 2001 01:23:23 -0700
Subject: [XML-SIG] [ pyxml-Bugs-469460 ] xmlns=''
Message-ID: <E15qsAd-0008Qo-00@usw-sf-web2.sourceforge.net>

Bugs item #469460, was opened at 2001-10-09 01:23
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=469460&group_id=6473

Category: xmlproc
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Stéphane Bidoul (sbidoul)
Assigned to: Lars Marius Garshol (larsga)
Summary: xmlns=''

Initial Comment:
The xmlproc sax2 driver fails
to parse this:

<test xmlns=''>data</test>

See the attached unittest.

-Stephane

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=469460&group_id=6473


From noreply@sourceforge.net  Tue Oct  9 09:35:48 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 09 Oct 2001 01:35:48 -0700
Subject: [XML-SIG] [ pyxml-Bugs-469463 ] XMLGenerator and xmlns=''
Message-ID: <E15qsMe-0005kZ-00@usw-sf-web3.sourceforge.net>

Bugs item #469463, was opened at 2001-10-09 01:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=469463&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Stéphane Bidoul (sbidoul)
Assigned to: Nobody/Anonymous (nobody)
Summary: XMLGenerator and xmlns=''

Initial Comment:
When parsing this

<test xmlns=''>data</test>

with the expat sax2 driver and
feeding the XMLGenerator handler,
the following output is obtained:

<?xml version="1.0" encoding="iso-8859-1"?>
<test xmlns="None">data</test>

Obviously, xmlns="None" is not correct.

The attached script illustrates
this behaviour.

-Stephane

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=469463&group_id=6473


From noreply@sourceforge.net  Tue Oct  9 09:38:46 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 09 Oct 2001 01:38:46 -0700
Subject: [XML-SIG] [ pyxml-Bugs-469464 ] parser IndexError
Message-ID: <E15qsPW-0005nK-00@usw-sf-web3.sourceforge.net>

Bugs item #469464, was opened at 2001-10-09 01:38
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=469464&group_id=6473

Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: parser IndexError

Initial Comment:
Hi,

during the parse of the submitted xml file with the
'saxtrace.py Gloger_seq04_contour.xml' command, the
program runs into an error which seems not to be a xml
error. For your information: A parse of the same xml
file with IBM's xerces C++ xml parser runs without any
problems.

Regards

Joachim Gloger
 -------- the backtrace of python ----

  File "/usr/local/lib/python2.1/site-packages/_xmlplus/sax/drivers/drv_xmlproc.py", line 31, in parse
    self.parser.parse_resource(sysID)
  File "/usr/local/lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py", line 77, in parse_resource
    self.read_from(infile,bufsize)
  File "/usr/local/lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py", line 137, in read_from
    self.feed(buf)
  File "/usr/local/lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py", line 185, in feed
    self.do_parse()
  File "/usr/local/lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlproc.py", line 96, in do_parse
    self.parse_start_tag()
  File "/usr/local/lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlproc.py", line 153, in parse_start_tag
    if self.data[self.pos]!=">" and self.data[self.pos]!="/":
IndexError: string index out of range

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=469464&group_id=6473


From gangayi@yahoo.com  Tue Oct  9 08:12:56 2001
From: gangayi@yahoo.com (Gangayi Mahesh)
Date: Tue, 9 Oct 2001 00:12:56 -0700 (PDT)
Subject: [XML-SIG] DTD specification
Message-ID: <20011009071256.59974.qmail@web20606.mail.yahoo.com>

Dear Sir,
              I have a problem specifying a DTD for the document
below. Kindly help me write a DTD for it.

"""""""""
<Type aaa="some value which may vary" bbb="some value
which may vary">

	<ChildOne>exit</ChildOne>

	<ChildTwo>exit</ChildTwo>

	<ChildThree>exit</ChildThree>

</Type>
"""""""""

Thanking You
With Regards
Mahesh




From jhe@webde-ag.de  Tue Oct  9 16:05:19 2001
From: jhe@webde-ag.de (Juergen Hermann)
Date: Tue, 09 Oct 2001 17:05:19 +0200
Subject: [XML-SIG] xmldoc.py
Message-ID: <m15qyRb-007qAmC@smtp.web.de>

Hi!

FYI, Itamar Shtull-Trauring and I work on an XML layer on top of pydoc.
He uses it to generate static HTML pages via an XSL stylesheet, I do the
same but dynamically in a Python app server.


Ciao, Jürgen

--
Jürgen Hermann, Developer (jhe@webde-ag.de)
WEB.DE AG, http://webde-ag.de/


From joachim.j.gloger@daimlerchrysler.com  Tue Oct  9 16:59:16 2001
From: joachim.j.gloger@daimlerchrysler.com (joachim.j.gloger@daimlerchrysler.com)
Date: Tue, 09 Oct 2001 17:59:16 +0200
Subject: [XML-SIG] pyxml xml parser error, help needed
Message-ID: <0057440047907858000002L482*@MHS>

Hi,

I have a problem. My xml parser crashes. The parser is based on pyXml, the
latest version. Since I thought that it was my fault I tried to check the xml
file which causes the crash with one of the pyxml demo programs 'saxtrace.py'.

But to my surprise, saxtrace did also crash with the same error message.
Enclosed you find the message. By the way, the crash happens on both Windows
and Linux. The behaviour is exactly the same. I tried also several versions of
python (2.0 and 2.1) and the last two versions of pyxml, but this makes no
difference.

If there is a possibility to give one of the developers the crash causing xml
file, I can do that. Please let me know, if you need this file.

Regards
Joachim Gloger


Traceback (most recent call last):
  File "saxtrace.py", line 70, in ?
    p.parse(sys.argv[1])
  File "C:\Python21\_xmlplus\sax\drivers\drv_xmlproc.py", line 31, in parse
    self.parser.parse_resource(sysID)
  File "C:\Python21\_xmlplus\parsers\xmlproc\xmlutils.py", line 77, in parse_resource
    self.read_from(infile,bufsize)
  File "C:\Python21\_xmlplus\parsers\xmlproc\xmlutils.py", line 137, in read_from
    self.feed(buf)
  File "C:\Python21\_xmlplus\parsers\xmlproc\xmlutils.py", line 185, in feed
    self.do_parse()
  File "C:\Python21\_xmlplus\parsers\xmlproc\xmlproc.py", line 96, in do_parse
    self.parse_start_tag()
  File "C:\Python21\_xmlplus\parsers\xmlproc\xmlproc.py", line 149, in parse_start_tag
    if self.data[self.pos]!=">" and self.data[self.pos]!="/":
IndexError: string index out of range


===========================================================
J. M. Gloger, DaimlerChrysler AG, Research Center Ulm
P.O. Box 2360, 89013 Ulm, Germany

Phone: +49 731 505 2353
Fax:   +49 731 505 4113
Email: joachim.j.gloger@daimlerchrysler.com

                                              walk the talk
===========================================================


From paul@ActiveState.com  Tue Oct  9 20:36:55 2001
From: paul@ActiveState.com (Paul Prescod)
Date: Tue, 09 Oct 2001 12:36:55 -0700
Subject: [XML-SIG] Fame, Fortune and the Python Conference
Message-ID: <3BC351D7.DF81B466@ActiveState.com>

Many of you will have submitted papers for the Python conference.
Whether you did or didn't, please consider another way to contribute to
the conference. You could submit a talk to the Web Services and
Protocols, Zope, or Tools tracks.

The Web Services and Protocols track is for anyone who is using Python
to communicate information between computers in an innovative way. You
can get involved with a simple email back to me saying you are
interested with a couple of sentences about your area of interest. I
will work with you to turn that into an abstract. You do not have to
submit a paper or anything else in advance. Just work with me on your
abstract and then show up at the conference with a fascinating talk. We
already have some well-known speakers signed up so you will be in good
company!

Here are some areas of interest:

 * Jabber peer-to-peer protocol 
 * XML-based Web Services protocols (SOAP, XML-RPC) 
 * CORBA Distributed Computing Protocol 
 * Web Services Description Language 
 * Business Integration techniques ("B2B") 
 * Enterprise Application Integration techniques ("EAI")
 * Proprietary Protocols

Just send me a note this week (why not reply right now!) describing your
project with Python and protocols or web services and we'll work out
whether there is an abstract in there.

More info is available here:

 http://www.python10.org/p10-conferenceEvents.html#webServices

Conference info (dates, location,etc.) is here:

 http://www.python10.org/

 Paul Prescod


From larsga@garshol.priv.no  Tue Oct  9 21:40:50 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 09 Oct 2001 22:40:50 +0200
Subject: [XML-SIG] pyxml xml parser error, help needed
In-Reply-To: <0057440047907858000002L482*@MHS>
References: <0057440047907858000002L482*@MHS>
Message-ID: <m3elock0ml.fsf@lambda.garshol.priv.no>

* joachim j. gloger
| 
| I have a problem. My xml parser crashes. The parser is based on
| pyXml, the latest version. Since I thought that it was my fault I
| tried to check the xml file which causes the crash with one of the
| pyxml demo programs 'saxtrace.py'.
| 
| But to my surprise, saxtrace did also crash with the same error
| message.

Most likely both use xmlproc, which is where this is happening. The
crash is happening in a very strange place, so I'd very much like to
see the file that causes this. Please attach it to the bug in
SourceForge. 

If the file is somehow confidential you can send it to me directly.

--Lars M.


From larsga@garshol.priv.no  Tue Oct  9 21:43:41 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 09 Oct 2001 22:43:41 +0200
Subject: [XML-SIG] DTD specification
In-Reply-To: <20011009071256.59974.qmail@web20606.mail.yahoo.com>
References: <20011009071256.59974.qmail@web20606.mail.yahoo.com>
Message-ID: <m3d73wk0hu.fsf@lambda.garshol.priv.no>

* Gangayi Mahesh
| 
| I have some problem in specifying DTD for below.

XML-L or comp.text.xml are probably better fora for this question.

| """""""""
| <Type aaa="some value which may vary" bbb="some value
| which may vary">
| 
| 	<ChildOne>exit</ChildOne>
| 
| 	<ChildTwo>exit</ChildTwo>
| 
| 	<ChildThree>exit</ChildThree>
| 
| </Type>
| """""""""

It is not very easy to see what you intend by this, but perhaps this
can serve as a start:

  <!ELEMENT Type (ChildOne, ChildTwo, ChildThree)>
  <!ATTLIST Type aaa CDATA #REQUIRED
                 bbb CDATA #REQUIRED>

  <!ELEMENT ChildOne (#PCDATA)>
  <!ELEMENT ChildTwo (#PCDATA)>
  <!ELEMENT ChildThree (#PCDATA)>
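
The Python standard library has no validating parser, so as a rough
sketch the same constraints can be checked by hand. The function name
conforms and the sample attribute values are made up for illustration;
a real validating parser such as xmlproc would read the DTD itself
instead of hard-coding the rules like this.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Type aaa="some value" bbb="another value">
    <ChildOne>exit</ChildOne>
    <ChildTwo>exit</ChildTwo>
    <ChildThree>exit</ChildThree>
</Type>"""


def conforms(xml_text):
    """Return True if xml_text satisfies the constraints of the DTD above."""
    root = ET.fromstring(xml_text)
    if root.tag != "Type":
        return False
    # ATTLIST: aaa and bbb are #REQUIRED, and no other attribute is declared.
    if set(root.attrib) != {"aaa", "bbb"}:
        return False
    # Content model: exactly ChildOne, ChildTwo, ChildThree, in that order.
    if [child.tag for child in root] != ["ChildOne", "ChildTwo", "ChildThree"]:
        return False
    # Element content: only whitespace may appear between the children.
    between = [root.text] + [child.tail for child in root]
    if any(text and text.strip() for text in between):
        return False
    # (#PCDATA): the children hold text only, no nested elements.
    return all(len(child) == 0 for child in root)
```

If some of the children or attributes are optional, the DTD would use
"?" in the content model or #IMPLIED in the ATTLIST, and the checks
above would have to be loosened to match.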

--Lars M.


From martin@loewis.home.cs.tu-berlin.de  Fri Oct 12 18:41:30 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Oct 2001 19:41:30 +0200
Subject: [XML-SIG] xmltok.dll
In-Reply-To: <F1907hx27kBnccYTjr8000001d7@hotmail.com>
 (dan_vanorden@hotmail.com)
References: <F1907hx27kBnccYTjr8000001d7@hotmail.com>
Message-ID: <200110121741.f9CHfUo01466@mira.informatik.hu-berlin.de>

> My name is Dan, and I am trying to fix my father's PC.  I noticed your link
> on the web that mentioned the files I seem to have problems with. About two
> months ago, my brother decided to download instant messenger to the PC, and
> then we got an "xmlparse.dll file not found" error every time the PC was
> turned on. After a query on the web, a gentleman sent me a file to put
> back into the system folder.  I did this, and now I get the error message
> "xmltok.dll is not found".
> In either case our word application comes up with an error, and lord knows
> what else might not be operating right now. Can you help, or is this
> hopeless?  Please respond either way, thank you.

Hi Dan,

Even though it appears that you are having problems with some kind of
Expat installation (Expat being an XML parser), it is unlikely that
you got it from us. PyXML, the software package we distribute, does
not include a separate xmlparse.dll. Instead, we incorporate xmlparse
and xmltok into a single DLL, pyexpat.pyd.

As to your problem: If you need "instant messenger" (what kind of
software is this?), you probably need to get xmltok.dll as well.
If you don't need that messenger, I recommend uninstalling it.

Regards,
Martin


From gerhard@bigfoot.de  Fri Oct 12 21:23:00 2001
From: gerhard@bigfoot.de (Gerhard Häring)
Date: Fri, 12 Oct 2001 22:23:00 +0200
Subject: [XML-SIG] SyncML, anyone?
Message-ID: <20011012222300.A27617@lilith.hqd-internal>

Has anybody here done SyncML (http://www.syncml.org/) related work in
Python and is maybe even willing to share code?

Gerhard
-- 
mail:   gerhard <at> bigfoot <dot> de       registered Linux user #64239
web:    http://www.cs.fhm.edu/~ifw00065/    OpenPGP public key id 86AB43C0
public key fingerprint: DEC1 1D02 5743 1159 CD20  A4B6 7B22 6575 86AB 43C0
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))


From fdrake@acm.org  Sat Oct 13 07:31:34 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat, 13 Oct 2001 02:31:34 -0400
Subject: [XML-SIG] RELAX-NG tools in Python?
Message-ID: <15303.57286.582550.339422@grendel.zope.com>

  I recall there being some RELAX-NG tools written in Python, but I
can't find them now.  Can anyone point me in the right direction?
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jh@web.de  Sat Oct 13 07:47:36 2001
From: jh@web.de (Juergen Hermann)
Date: Sat, 13 Oct 2001 08:47:36 +0200
Subject: [XML-SIG] RELAX-NG tools in Python?
In-Reply-To: <15303.57286.582550.339422@grendel.zope.com>
Message-ID: <m15sIYZ-007qGDC@smtp.web.de>

On Sat, 13 Oct 2001 02:31:34 -0400, Fred L. Drake, Jr. wrote:

>  I recall there being some RELAX-NG tools written in Python, but I
>can't find them now.  Can anyone point me in the right direction?

James Tauber ported TREX: http://www.xmlhack.com/read.php?item=1114


Ciao, Jürgen


From info@mjais.de  Mon Oct 15 21:43:24 2001
From: info@mjais.de (markus jais)
Date: Mon, 15 Oct 2001 22:43:24 +0200
Subject: [XML-SIG] pydom vs. 4DOM
Message-ID: <E15tEbV-00018j-00@mrvdom01.schlund.de>

hello
I am new to xml with python and there is something I do not
completely understand!

in my book "xml processing with python" from prentice hall
the author speaks of 2 DOM implementations for python:
pyDOM and 4DOM

I have installed 4Suite-0.11.1 and PyXML-0.6.6     
when I look into the sources of PyXML, I see, that this
dom implementation is also 4DOM

is pyDOM out of date??
the book is already more than a year old, maybe things have changed

there is also minidom??

I am a little bit confused.
can anybody tell me, what is up to date and what DOM implementation
to use??

thanks in advance


regards
markus

-- 
Markus Jais
http://www.mjais.de
info@mjais.de
The road goes ever on and on - Bilbo Baggins


From Alexandre.Fayolle@logilab.fr  Tue Oct 16 07:59:49 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Tue, 16 Oct 2001 08:59:49 +0200 (CEST)
Subject: [XML-SIG] Re: [4suite] dom question
In-Reply-To: <E15tGO8-0005Vs-00@mrvdom01.schlund.de>
Message-ID: <Pine.LNX.4.21.0110160851450.2894-100000@sagittarius.logilab.fr>

On Tue, 16 Oct 2001, markus jais wrote:

> hello
> I took the following programm out of a python book
> (slightly modified)
> ------------------
> #!/usr/local/bin/python
> 
> from xml.dom.ext.reader import Sax

The preferred module, I think, is xml.dom.ext.reader.Sax2

> from xml.dom import Node
> s="""<languages>
> <lang><name>Perl</name><kritik>good</kritik></lang>
> <lang><name>Python</name><kritik>great</kritik></lang>
> <lang><name>Ruby</name><kritik>great</kritik></lang>
> </languages>
> """
> def text_in_children(n):
> 	t = ""
> 	for c in n.childNodes:
> 		if c.nodeType == Node.TEXT_NODE:
> 			t += c.nodeValue
> 	return t
> 
> baum = Sax.FromXml(s)
> languages = baum.getElementsByTagName("languages")[0]
> 
> for lang in languages.childNodes:
> 	if lang.nodeName == "lang":
> 		for c in lang.childNodes:
> 			#if c.hasAttributes():  ########### does not work!!!!
> 			#	print "has attributs"
> 			if c.nodeName == "name":
> 				print text_in_children(c.nextSibling)
> print "done"
> ----------------------------
> 
> I played a little bit with this file and I have some questions?
> 
> 1)
> according to the dom documentation in the python library reference
> there should be a method for Node, but I get the following
> error, when I call the method in the above code:
>     AttributeError: class FtNode has no attribute 'hasAttributes'  
> is the documentation wrong, or is it my mistake????

This method is not implemented in 4DOM. On the other hand, only Element
nodes have attributes. You can also use the attributes member variable, so
your test would be (not tested):

if c.attributes:
    print 'has attributes'
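
For comparison, here is a small sketch of the same test written
against xml.dom.minidom from the Python 3 standard library rather than
4DOM. The helper name has_attributes is made up for this example. It
checks the node type first, since only Element nodes carry an
attribute map.

```python
from xml.dom import Node, minidom


def has_attributes(node):
    """True only for Element nodes that carry at least one attribute."""
    return node.nodeType == Node.ELEMENT_NODE and len(node.attributes) > 0


doc = minidom.parseString('<lang id="py"><name>Python</name></lang>')
lang = doc.documentElement   # <lang id="py">: one attribute
name = lang.firstChild       # <name>: no attributes
text = name.firstChild       # text node: not an element at all

for node in (lang, name, text):
    print(node.nodeName, has_attributes(node))
```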


> 
> 2)
> can anybody tell me, where in the sources of 4DOM
> the method "childNodes" is implemented.
> I have been trying more to find the method in the sources
> for more than an hour now.
> maybe I am just blind :-)

in FtNode, line 103. It is not a method, rather a direct attribute access.

> I really appreciate any hints!!!!

Well, since you asked: Fourthought gave the source code of 4DOM to the PyXML
project, so the right place to ask questions about 4DOM is xml-sig@python.org
(cc'ed on this mail). Lots of people have subscribed to both lists, so you
should get answers on both anyway.

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From Alexandre.Fayolle@logilab.fr  Tue Oct 16 08:08:37 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Tue, 16 Oct 2001 09:08:37 +0200 (CEST)
Subject: [XML-SIG] pydom vs. 4DOM
In-Reply-To: <E15tEbV-00018j-00@mrvdom01.schlund.de>
Message-ID: <Pine.LNX.4.21.0110160859560.2894-100000@sagittarius.logilab.fr>

On Mon, 15 Oct 2001, markus jais wrote:

> hello
> I am new to xml with python and there is something I do not
> completely understand!
> 
> in my book "xml processing with python" from prentice hall
> the author speaks of 2 DOM implementations for python:
> pyDOM and 4DOM

PyDOM was shipped with PyXML < 0.6. It disappeared about a year ago,
so I guess your book is slightly out of date. The core of 4DOM
has not changed much, although I think the way of parsing XML
documents into DOMs has changed a bit (though the old API is still there
for backward compatibility). You should now use:
from xml.dom.ext.reader.Sax2 import Reader

r = Reader()
document = r.fromString(mystr)
or 
document = r.fromUri(someUri)

 
> I am a little bit confused.
> can anybody tell me, what is up to date and what DOM implementation
> to use??

miniDOM and 4DOM are both OK to use. 4DOM is more spec compliant than miniDOM,
so if you need fairly complete support of the spec, you should use
4DOM. However, there is a cost: 4DOM trees eat up more memory, and
manipulations are slower (a lot of checking is done that is not done in
miniDOM).

If you install 4Suite, you'll also find pDomlette and cDomlette, which are
other implementations of the DOM, not as compliant as 4DOM, but
faster. cDomlette is written in C, currently read-only, but *very* fast.

I think PyXML also features PullDom, but I've never used it. The main
advantage of PullDom is that the document is not parsed completely until
you actually try to access the data from the DOM.
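
To give an idea of that pull style, here is a short sketch using
xml.dom.pulldom from the Python 3 standard library, which may differ in
detail from the PullDom shipped with PyXML. The function name
lang_names and the sample document are made up for illustration. Only
the <name> elements are expanded into DOM subtrees; everything else is
seen as a stream of events.

```python
from xml.dom import pulldom

XML = ("<languages>"
       "<lang><name>Perl</name><kritik>good</kritik></lang>"
       "<lang><name>Python</name><kritik>great</kritik></lang>"
       "</languages>")


def lang_names(xml_text):
    """Collect the text of every <name> element, one subtree at a time."""
    names = []
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "name":
            # Build a DOM subtree for this element only.
            events.expandNode(node)
            names.append("".join(child.data for child in node.childNodes
                                 if child.nodeType == child.TEXT_NODE))
    return names
```

Calling lang_names(XML) walks the whole document once but never holds
more than one <name> subtree at a time.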

So the answer is : tell us what you want to do, and we'll tell you which
implementation to use.


Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From Jan.Delgado@unamite.com  Tue Oct 16 14:12:31 2001
From: Jan.Delgado@unamite.com (Jan Delgado)
Date: Tue, 16 Oct 2001 15:12:31 +0200
Subject: [XML-SIG] minidom and validating parser error messages
Message-ID: <HDEPJHADHDIEGPOCLLLEAEAGCAAA.Jan.Delgado@unamite.com>

hi,
i want to read and validate an xml file into a dom object.
i was trying the following (among approx. 1000 other things ;-):
(i am using PyXML 0.6.6-2)

----------------------------------------------------------
from xml.dom.minidom import parse
import xml.sax.saxexts
import xml.sax.saxlib
import sys

# ask any validating parser from the XML validating parser factory
parser = xml.sax.saxexts.XMLValParserFactory.make_parser()
print "Using parser:", parser
dom = parse(sys.stdin, parser)
----------------------------------------------------------

when i run the program, i always get an error:

...
 File "/usr/lib/python2.0/site-packages/_xmlplus/dom/pulldom.py", line 219,
in
reset
    self.parser.setFeature(xml.sax.handler.feature_namespaces, 1)
AttributeError: 'SAX_XPValParser' instance has no attribute 'setFeature'


unfortunately, the documentation is *more than poor* (sorry guys).

so what is wrong here ?

what is this "sax2exts" module used for ? when i tried that module for
the parser generation, i noticed that there was a different error message
(about a missing feed() method)

greetings
	jan


From Alexandre.Fayolle@logilab.fr  Tue Oct 16 16:27:54 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Tue, 16 Oct 2001 17:27:54 +0200 (CEST)
Subject: [XML-SIG] [ANN] Narval 1.1
Message-ID: <Pine.LNX.4.21.0110161727270.4380-100000@sagittarius.logilab.fr>

Logilab (www.logilab.com) announces the release of

	Narval 1.1

	GPL'd Intelligent Personal Assistant Framework
	
	http://www.logilab.org/narval


News
----

    The whole project has moved from python 1.5.2 to python 2.1

    A few new features have made it into the engine, such as automatically
    reloading modules that have changed on disk, as well as some speed
    improvements. Several new modules (automatic classification, LDAP
    directory access, checksum computing) have made it into the standard
    library.

    The Horn GUI has been completely redesigned for better flexibility,
    speed and better ease of use.

    The infopal application (available separately) has received a large
    number of improvements.

    The mailing lists have been moved to use mailman. More information can
    be found at http://lists.logilab.org/mailman/listinfo

Description
-----------

Narval is a framework (language + interpreter + GUI/IDE) dedicated to the
setting up of intelligent personal assistants (IPAs).

An Intelligent Personal Assistant is a companion that will help you in your
daily work in the information world. It runs on your machine or on a remote
server, and you can communicate with it via all standard means (email, web,
telnet, phone, specific GUI, etc). It executes recipes (sequences of actions)
you wrote, to perform a wide range of tasks, such as prepare your morning
newspaper, help you surf the web by filtering out junk ads, keep searching
the web day after day for things you want, participate in on-line auctions,
learn your interests and bring you back valuable information, take care of
repetitive chores, answer e-mail, negotiate the date and time of a meeting,
and much more... It is easy to extend the built-in action library by writing
new actions in Python.

Infopal, your information pal, is a Narval application that implements part of
the above, but Narval makes it easy for you to set up new assistants. Other
applications will soon be available from Logilab.

Logilab S.A. is a French company that specializes in the fields of artificial
intelligence, knowledge management, data analysis and natural language
processing.


More info
---------

Please see

        http://www.logilab.org/narval
        http://www.logilab.com
	http://www.logilab.fr

or contact	contact@logilab.fr


Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From Alexandre.Fayolle@logilab.fr  Wed Oct 17 08:05:58 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 17 Oct 2001 09:05:58 +0200 (CEST)
Subject: [XML-SIG] Re: [4suite] dom question
In-Reply-To: <E15tZbX-0005It-00@mrvdom03.schlund.de>
Message-ID: <Pine.LNX.4.21.0110170855470.5296-100000@sagittarius.logilab.fr>

On Tue, 16 Oct 2001, markus jais wrote:

> hello
> thanks for your question.
> 
> one more:
> you said, that the attribute childNodes is in FtNode, line 103
> but there is:
> 
> def _get_childNodes(self):
>         return self.__dict__['__childNodes']
> 
> the method has an underscore at the beginning
> and the attribute has two "__"
> but there is not "childNodes"
> 
> I am not a python wizard, so maybe this is a python idiom
> I do not understand

The _get_XXX is part of the IDL to python mapping (DOM is defined as a set
of IDL interfaces). Not all Python DOM implementations provide this
mapping. The _get_childNodes method is called through the __getattr__
method (this is a python hook that is called when the interpreter was
not able to find an attribute or method by the normal lookup, see
http://www.python.org/doc/current/ref/attribute-access.html#l2h-115).

The __getattr__ method of FtNode (line 68) looks up a dictionary called
_readComputedAttrs (defined line 425), which says that the method to read
childNodes is _get_childNodes, and then calls the method and returns the
result.

Now, the question is, why do we do that? It's because we want to protect
access to the childNodes attribute, especially write access, while
preserving the attribute notation. So this python idiom is used to provide
a read-only attribute. There is also a _writeComputedAttrs dictionary,
which enables write access to some attributes, through methods
that check the arguments.
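
A stripped-down sketch of the idiom, for illustration only. SketchNode
is a made-up class, not the FtNode source, and the nodeValue setter
with its type check is an assumption about what such a checking method
might look like. It is written for Python 3.

```python
class SketchNode:
    """Toy node with computed attributes, loosely modelled on the idiom above."""

    def __init__(self):
        # Store the real data under keys that plain attribute access never hits.
        self.__dict__['__childNodes'] = []
        self.__dict__['__nodeValue'] = None

    def _get_childNodes(self):
        return self.__dict__['__childNodes']

    def _get_nodeValue(self):
        return self.__dict__['__nodeValue']

    def _set_nodeValue(self, value):
        # The setter is the place to validate arguments.
        if not isinstance(value, str):
            raise TypeError("nodeValue must be a string")
        self.__dict__['__nodeValue'] = value

    # Attribute name -> method used to read or write it.
    _readComputedAttrs = {'childNodes': _get_childNodes,
                          'nodeValue': _get_nodeValue}
    _writeComputedAttrs = {'nodeValue': _set_nodeValue}

    def __getattr__(self, name):
        # Only called when the normal attribute lookup has failed.
        getter = self._readComputedAttrs.get(name)
        if getter is None:
            raise AttributeError(name)
        return getter(self)

    def __setattr__(self, name, value):
        setter = self._writeComputedAttrs.get(name)
        if setter is not None:
            setter(self, value)
        elif name in self._readComputedAttrs:
            raise AttributeError("%s is read-only" % name)
        else:
            self.__dict__[name] = value
```

With this, node.childNodes reads like a plain attribute, assigning to
it raises AttributeError, and node.nodeValue = "x" goes through the
checking setter.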

Does this answer your question?

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From Martin.v.Loewis@t-online.de  Wed Oct 17 08:49:15 2001
From: Martin.v.Loewis@t-online.de (Martin v. Loewis)
Date: Wed, 17 Oct 2001 09:49:15 +0200
Subject: [XML-SIG] minidom and validating parser error messages
In-Reply-To: <HDEPJHADHDIEGPOCLLLEAEAGCAAA.Jan.Delgado@unamite.com>
References: <HDEPJHADHDIEGPOCLLLEAEAGCAAA.Jan.Delgado@unamite.com>
Message-ID: <200110170749.f9H7nFN01516@mira.informatik.hu-berlin.de>

> parser = xml.sax.saxexts.XMLValParserFactory.make_parser()
[...]
> when i run the program, i always get an error:
> 
> ...
>  File "/usr/lib/python2.0/site-packages/_xmlplus/dom/pulldom.py", line 219,
> in
> reset
>     self.parser.setFeature(xml.sax.handler.feature_namespaces, 1)
> AttributeError: 'SAX_XPValParser' instance has no attribute 'setFeature'

Hi Jan,

With the line above, you get a SAX1 driver; SAX1 did not provide the
setFeature operation, as used by pulldom. Instead of saxexts, you need
to use sax2exts.

> unfortunately, the documentation is *more than poor* (sorry guys).

We know; contributions are welcome.

> what is this "sax2exts" module used for ? when i tried that module for
> the parser generation, i noticed that there was a different error message
> (about a missing feed() method)

I see. Unfortunately, the xmlproc parser as shipped with PyXML 0.6.6
does not support "incremental" operation, which is also required by
pulldom.

So it seems your options are the following:

- do not use a validating parser. Then, a plain
  minidom.parse(sys.stdin) invocation will be sufficient.

- do not use minidom. The 4DOM builders support validation without
  requiring an incremental parser. Use

  xml.dom.ext.reader.Sax2.FromXmlStream(sys.stdin, validate=1)

- use the PyXML CVS code.
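
The first option can be sketched as below. It uses the standard library's
minidom and parseString so that it is self-contained; for a stream, the
minidom.parse(sys.stdin) call mentioned above plays the same role. The
root_tag helper is made up for the example, and no validation takes place.

```python
from xml.dom import minidom


def root_tag(xml_text):
    """Parse without validation and return the document element's tag name."""
    doc = minidom.parseString(xml_text)
    try:
        return doc.documentElement.tagName
    finally:
        doc.unlink()  # break reference cycles in the DOM tree
```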

HTH,
Martin


From uye-istanbul@mmo.org.tr  Fri Oct 19 01:47:03 2001
From: uye-istanbul@mmo.org.tr (uye-mmo)
Date: Fri, 19 Oct 2001 03:47:03 +0300
Subject: [XML-SIG] =?iso-8859-9?Q?TMMOB-SANAY=DD_KONGRES=DD_2001_VE_PANEL?=
Message-ID: <NFBBKNBMELEGCFJLLACLOEHMLDAB.uye-istanbul@mmo.org.tr>

Details of the panel to be held on 20 October 2001 in the Cinema Hall of
the Ataturk Kultur Merkezi are given below.

............................................................................

TOWARDS THE INDUSTRY CONGRESS 2001 (SANAYI KONGRESI 2001'e DOGRU)

Venue : Ataturk Kultur Merkezi, Taksim / Istanbul
Date  : 20 October 2001, 10.30-17.00

10.30 - 13.00	Panel 1

THE NEW WORLD ORDER AND GLOBALIZATION

Session Chair: Yavuz BAYULKEN

Panelists:
	Aydin CUBUKCU
	Assoc. Prof. Dr. Emin GURSES
	Assoc. Prof. Dr. Hayri KOZANOGLU
	Prof. Dr. I. Resat OZKAN

13.00 - 14.30	Break

14.30 - 15.00	PRESENTATION (Philosophy and Technology)
	Prof. Dr. Safak URAL

15.00 - 17.30	Panel 2

TURKEY: GLOBALIZATION AND INDUSTRY

Session Chair: Emin KORAMAZ

Panelists:
	Assoc. Prof. Dr. Fuat ERCAN
	Asst. Prof. Dr. Aziz KONUKMAN
	Kemal OZDEN
	Prof. Dr. Isaya USUR

For detailed information about the Congress and the Panel, you can contact
the Congress Secretary, Haydar Boyali.
Tel: 0212 245 03 63-64, ext. 124

mailto:uye-istanbul@mmo.org.tr
www.mmo.org.tr/istanbul
............................................................................

TMMOB Makina Mühendisleri Odası (Chamber of Mechanical Engineers),
Istanbul Branch

Branch: Hüseyin Ağa Mah., Sakız Ağacı Cad. No: 16, 80080 Beyoğlu-Istanbul
Tel: (0212) 245 03 63 - 64 / 252 95 00 - 01   Fax: (0212) 249 86 74
Email: istanbul@mmo.org.tr

Headquarters: Sümer Sok. No: 36/1-A, 06440 Demirtepe-Ankara
Tel: (0312) 231 31 59 - 231 31 64 - 230 11 66   Fax: (0312) 231 31 65

The TMMOB Makina Mühendisleri Odası is a professional organization with
public-institution status, as defined in Article 135 of the Constitution,
established under Law No. 6235 as amended by Decree-Laws No. 66 and 85
and by Law No. 7303.


From noreply@sourceforge.net  Fri Oct 19 04:20:11 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 18 Oct 2001 20:20:11 -0700
Subject: [XML-SIG] [ pyxml-Bugs-472636 ] PyXML BooleanObj doesn't init ob_refcnt
Message-ID: <E15uQCh-0004Pv-00@usw-sf-web2.sourceforge.net>

Bugs item #472636, was opened at 2001-10-18 20:20
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=472636&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: PyXML BooleanObj doesn't init ob_refcnt

Initial Comment:
In PyXML-0.6.6/extensions/boolean.c, line 85, in boolean_NEW(),
object->ob_refcnt is not initialized, which causes an uninitialized
memory read report from Purify.

Adding the following line fixes the problem:

      object->ob_refcnt = 0;

Neal


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=472636&group_id=6473


From janssen@parc.xerox.com  Fri Oct 19 17:58:30 2001
From: janssen@parc.xerox.com (Bill Janssen)
Date: Fri, 19 Oct 2001 09:58:30 PDT
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
Message-ID: <01Oct19.095837pdt."3456"@watson.parc.xerox.com>

Hi, folks.

I was thinking of writing a new Python codec which took HTML in a
UTF-8 encoding, but still containing escaped character entity
references, and output UTF-8 with all of the entity refs replaced by
their UTF-8 characters, and in the other direction took UTF-8 and came
out with all characters above ASCII replaced with the HTML character
entity ref.
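
A rough sketch of the second direction (text in, ASCII with entity
references out) might look like this. It is not the htmlcodec.py discussed
later in the thread; it assumes Python 3 and the standard html.entities
table, and the function name escape_non_ascii is made up.

```python
from html.entities import codepoint2name


def escape_non_ascii(text):
    """Replace every character above ASCII with an HTML entity reference."""
    pieces = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            pieces.append(ch)
        elif code in codepoint2name:
            # A named entity exists, e.g. 0xE9 -> &eacute;
            pieces.append("&%s;" % codepoint2name[code])
        else:
            # Fall back to a numeric character reference.
            pieces.append("&#%d;" % code)
    return "".join(pieces)
```

The result is pure ASCII, so it is also valid UTF-8.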

First off, this seems like an obvious thing to do, so has someone
already done it?  Or is there some obvious flaw in the idea which
I just haven't seen?

Secondly, is there any documentation on the _codecs module, which
seems full of interesting and useful stuff for this purpose?

Thirdly, what's the equivalent of chr() for Unicode characters?

Thanks in advance!

Bill


From fdrake@acm.org  Fri Oct 19 17:59:39 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 19 Oct 2001 12:59:39 -0400
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
In-Reply-To: <01Oct19.095837pdt."3456"@watson.parc.xerox.com>
References: <01Oct19.095837pdt."3456"@watson.parc.xerox.com>
Message-ID: <15312.23547.607289.728876@grendel.zope.com>

Bill Janssen writes:
 > First off, this seems like an obvious thing to do, so has someone
 > already done it?  Or is there some obvious flaw in the idea which
 > I just haven't seen?

  I haven't seen it, either, but it would be really nice.  Most people
don't want to end up with &#...; character references; they'd rather
have the general entity references.

 > Secondly, is there any documentation on the _codecs module, which
 > seems full of interesting and useful stuff for this purpose?

  No.  There is limited documentation on the codecs module, though.
If you'd like to extend that while you're at it, I'd certainly
appreciate it!

 > Thirdly, what's the equivalent of chr() for Unicode characters?

  unichr() is a built-in function which does this; see the docs if you
need more information.
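
As a small illustration, numeric character references can be expanded with
that function. Note the assumption: unichr() is the Python 2 built-in; the
sketch below targets Python 3, where chr() covers the full Unicode range
and unichr() no longer exists. The helper name is made up.

```python
import re

# Matches decimal numeric character references such as &#233;
_NUMERIC_REF = re.compile(r"&#(\d+);")


def expand_numeric_refs(text):
    """Replace each decimal &#NNN; reference with the character it names."""
    return _NUMERIC_REF.sub(lambda m: chr(int(m.group(1))), text)
```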


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mal@lemburg.com  Fri Oct 19 19:08:48 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Oct 2001 20:08:48 +0200
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
References: <01Oct19.095837pdt."3456"@watson.parc.xerox.com> <15312.23547.607289.728876@grendel.zope.com>
Message-ID: <3BD06C30.3C844FBC@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> Bill Janssen writes:
>  > First off, this seems like an obvious thing to do, so has someone
>  > already done it?  Or is there some obvious flaw in the idea which
>  > I just haven't seen?
> 
>   I haven't seen it, either, but it would be really nice.  Most people
> don't want to end up with &#...; character references; they'd rather
> have the general entity references.

I've written one of these for a customer; can't release it though.

Note that even though humans tend to like named entities a lot,
numeric entities are usually much easier to handle and parse
(just think of the hoops that are needed to get these thingies
parsed correctly in XML...).
 
>  > Secondly, is there any documentation on the _codecs module, which
>  > seems full of interesting and useful stuff for this purpose?
> 
>   No.  There is limited documentation on the codecs module, though.
> If you'd like to extend that while you're at it, I'd certainly
> appreciate it!

The _codecs module is basically just a helper to make the internal
codecs available. All of these are documented in great detail 
in the C API reference and the unicodeobject.h header file.
 
>  > Thirdly, what's the equivalent of chr() for Unicode characters?
> 
>   unichr() is a built-in function which does this; see the docs if you
> need more information.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From janssen@parc.xerox.com  Sat Oct 20 01:59:29 2001
From: janssen@parc.xerox.com (Bill Janssen)
Date: Fri, 19 Oct 2001 17:59:29 PDT
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
In-Reply-To: Your message of "Fri, 19 Oct 2001 11:08:48 PDT."
 <3BD06C30.3C844FBC@lemburg.com>
Message-ID: <01Oct19.175936pdt."3456"@watson.parc.xerox.com>

Perhaps you'd be kind enough to review my sample code at
ftp://ftp.parc.xerox.com/transient/janssen/htmlcodec.py, and advise of
glaring errors or any interesting improvements that occur to you?

Thanks in advance!

Bill


From doc@pcola.gulf.net  Sat Oct 20 03:55:08 2001
From: doc@pcola.gulf.net (Watson, Brian)
Date: Fri, 19 Oct 2001 21:55:08 -0500
Subject: [XML-SIG] pyexpat.c help
Message-ID: <3BD0E78C.6A5A3C95@pcola.gulf.net>

Hello,

I'm trying to get ParsedXML to work in Zope 2.4.1 with Python 2.1.  At
the ParsedXML site at zope.org, there is a zip file with the VC++
project file to build pyexpat.pyd.  I get this error when building:

   Creating library Debug/pyexpat.lib and object Debug/pyexpat.exp
pyexpat.obj : error LNK2001: unresolved external symbol __Py_Dealloc
pyexpat.obj : error LNK2001: unresolved external symbol __Py_RefTotal
pyexpat.obj : error LNK2001: unresolved external symbol __Py_NoneStruct
pyexpat.obj : error LNK2001: unresolved external symbol _PyExc_IOError
pyexpat.obj : error LNK2001: unresolved external symbol _PyExc_TypeError
pyexpat.obj : error LNK2001: unresolved external symbol _PyFile_Type
pyexpat.obj : error LNK2001: unresolved external symbol _PyExc_ValueError
pyexpat.obj : error LNK2001: unresolved external symbol _PyString_Type
pyexpat.obj : error LNK2001: unresolved external symbol _PyExc_AttributeError
pyexpat.obj : error LNK2001: unresolved external symbol _PyExc_RuntimeError
pyexpat.obj : error LNK2001: unresolved external symbol _Py_InitModule4TraceRefs
pyexpat.obj : error LNK2001: unresolved external symbol _PyType_Type
../pyexpat.pyd : fatal error LNK1120: 12 unresolved externals
Error executing link.exe.

pyexpat.pyd - 13 error(s), 0 warning(s)

Is this .c written for Python 1.5 only?  I put the Python 2.1 files in the
include/lib path.  Do I need a version made to compile with 2.1?  Also, what
is pyexpat.pyd's role in Zope (what does it do, and what is it used for)?
ANY help would be very much appreciated; I've been trying for hours to make
it work...

Thanks in advance,
Brian W.


From Martin.v.Loewis@t-online.de  Sat Oct 20 09:54:10 2001
From: Martin.v.Loewis@t-online.de (Martin v. Loewis)
Date: Sat, 20 Oct 2001 10:54:10 +0200
Subject: [XML-SIG] pyexpat.c help
In-Reply-To: <3BD0E78C.6A5A3C95@pcola.gulf.net> (doc@pcola.gulf.net)
References: <3BD0E78C.6A5A3C95@pcola.gulf.net>
Message-ID: <200110200854.f9K8sAI01814@mira.informatik.hu-berlin.de>

> I'm trying to get ParsedXML to work in Zope 2.4.1 with Python 2.1.  At
> the ParsedXML site at zope.org, there is a zip file with the VC++
> project file to build pyexpat.pyd.

I strongly recommend using the pyexpat that comes with Python
2.1. Depending on where you got Python 2.1 from, it should be
pre-built already. If not, you might consider downloading the Python
2.1 sources and building it yourself.

>    Creating library Debug/pyexpat.lib and object Debug/pyexpat.exp

The problem here certainly is that you are trying to create a Debug
binary. Don't do that unless you also make use of a Debug Python
installation (which you might not even have yet).

> pyexpat.obj : error LNK2001: unresolved external symbol _PyString_Type

That's strange. What version of the pythonXY.lib were you linking
with?

> Is this .c written for python1.5 only?  I put py2.1 files in the
> include/lib path.  Do I need a ver made to compile with 2.1?  Also, what
> is pyexpat.pyd's role in zope (what does it do/used for)?  

I don't know about the pyexpat.c that you can get from Zope; the
current one certainly works with all Python versions. pyexpat.pyd is
the module that interfaces to the Expat XML parser, which is in turn
used to parse XML.
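
As a rough illustration of that role, the sketch below drives Expat through
the standard library's xml.parsers.expat (the public name of the pyexpat
module). The element_names helper is invented for the example.

```python
import xml.parsers.expat


def element_names(xml_text):
    """Return the element names in document order, as reported by Expat."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat calls this handler once for every start tag it encounters.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)  # True: this is the final chunk of input
    return names
```

Higher-level packages such as SAX drivers and DOM builders sit on top of
callbacks like this one.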

Regards,
Martin


From Martin.v.Loewis@t-online.de  Sat Oct 20 10:11:39 2001
From: Martin.v.Loewis@t-online.de (Martin v. Loewis)
Date: Sat, 20 Oct 2001 11:11:39 +0200
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
In-Reply-To: <01Oct19.175936pdt."3456"@watson.parc.xerox.com> (message from
 Bill Janssen on Fri, 19 Oct 2001 17:59:29 PDT)
References: <01Oct19.175936pdt."3456"@watson.parc.xerox.com>
Message-ID: <200110200911.f9K9Bdl01906@mira.informatik.hu-berlin.de>

> Perhaps you'd be kind enough to review my sample code at
> ftp://ftp.parc.xerox.com/transient/janssen/htmlcodec.py, and advise of
> glaring errors or any interesting improvements that occur to you?

Hi Bill,

The script looks quite alright, AFAICT. There is one issue of
correctness: If you also want to support XHTML, you may encounter
CDATA sections, in which case "&" does not denote markup.

Another correctness issue is the role of UTF-8 here; it appears that
your Codec does not deal with UTF-8 at all. On encoding, there
wouldn't be any need to ever use HTML entities, since you could encode
everything as UTF-8. Not doing so is fine - except that you could then
declare that the output is US-ASCII as well. On decoding, you might
need to pay attention to UTF-8. While doing so, it is advisable not to
mix Unicode and byte strings in a single operation. E.g. when you
write

  if input[i] == u'&'

then I believe input is a byte string, so this would be better

  if input[i] == '&'

The former will fail if ord(input[i])>127.

There is a (perhaps more important) issue of efficiency: Building up a
large string by adding a character at a time is not particularly
efficient. Assuming that non-ASCII characters are rare, you may try to
find large substrings of your input that need no processing. For
example, in decode, doing input.find("&") might be better: you can add
large chunks of input to the output.

Even with these improvements, building up a string by adding tail
segments requires repeated copying of the string head. This can be
avoided with a pre-allocated list L, using string.join(L,"") when
done.
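
The two efficiency suggestions can be combined in a sketch like the one
below. It is written for Python 3 and is not Bill's htmlcodec.py: the
function name is made up, it handles only named and decimal references, and
it ignores the CDATA issue mentioned earlier.

```python
from html.entities import name2codepoint


def decode_entities(text):
    """Expand entity references, copying untouched text in large chunks."""
    chunks = []
    pos = 0
    while True:
        amp = text.find("&", pos)
        if amp == -1:
            chunks.append(text[pos:])      # no more markup: copy the tail
            break
        chunks.append(text[pos:amp])       # copy everything before the "&"
        semi = text.find(";", amp)
        ref = text[amp + 1:semi] if semi != -1 else ""
        if (ref.startswith("#") and ref[1:].isdecimal()
                and int(ref[1:]) <= 0x10FFFF):
            chunks.append(chr(int(ref[1:])))
            pos = semi + 1
        elif ref in name2codepoint:
            chunks.append(chr(name2codepoint[ref]))
            pos = semi + 1
        else:
            chunks.append("&")             # not a reference: keep it literally
            pos = amp + 1
    # A single join at the end avoids recopying the head of the string.
    return "".join(chunks)
```

In Python 3 the str method "".join(L) takes the place of
string.join(L, "").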

HTH,
Martin


From tpassin@home.com  Sat Oct 20 15:05:45 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sat, 20 Oct 2001 10:05:45 -0400
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
References: <01Oct19.175936pdt."3456"@watson.parc.xerox.com> <200110200911.f9K9Bdl01906@mira.informatik.hu-berlin.de>
Message-ID: <000601c15970$501a61c0$7cac1218@cj64132b>

[Martin v. Loewis]

>
> Even with these improvements, building up a string by adding tail
> segments requires repeated copying of the string head. This can be
> avoided with a pre-allocated list L, using string.join(L,"") when
> done.
>

We had a thread on this a while ago (I think it was this year, but maybe it
was last).  When the string is small it makes no noticeable difference, but
as the string gets larger the speedup of using a list can be dramatic, even
more than an order of magnitude.  Anything more than a few KB for the final
string would probably benefit from the list approach - especially when you
add to the string a character at a time.

Cheers,

Tom P


From Benjamin.Schollnick@usa.xerox.com  Sat Oct 20 15:08:55 2001
From: Benjamin.Schollnick@usa.xerox.com (Schollnick, Benjamin)
Date: Sat, 20 Oct 2001 10:08:55 -0400
Subject: [XML-SIG] HTML<->UTF-8 'codec'? [Slightly-OT]
Message-ID: <E2D1934575D1D411973D00508BB02F1730246F@usa0129ms1.ess.mc.xerox.com>

I've got a perfect example of that....

I just wrote a quick RPG utility to act as a dice roller (via the random
module)....

I created a audit trail, by using:

results = results + "\nCurrent Dice roll = " + str(dice_roll)

And while being reasonably fast, it quickly slowed down after 1000 or so
rolls... (10,000 took roughly 1-2 minutes to process) [That's assuming I
recall correctly, it was certainly over 40 seconds...] .

When I converted the results audit trail to a list:

results.insert (1, "Current Dice Roll ...")

A 10,000 roll sequence only takes roughly 1.5 seconds....

There are other things going on, of course, but for strings that you
are appending data to, it's a safer bet to use a list (for large
groups of data).

It's quite simple to:

string.join(list_variable) 

To produce a string result from the list...

			- Benjamin


From rsalz@zolera.com  Sat Oct 20 15:14:18 2001
From: rsalz@zolera.com (Rich Salz)
Date: Sat, 20 Oct 2001 10:14:18 -0400
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
References: <01Oct19.175936pdt."3456"@watson.parc.xerox.com> <200110200911.f9K9Bdl01906@mira.informatik.hu-berlin.de> <000601c15970$501a61c0$7cac1218@cj64132b>
Message-ID: <3BD186BA.38ECF9F1@zolera.com>

I thought StringIO was also a win.

> as the string gets larger the speedup of using a list can be dramatic, even
> more than an order of magnitude.  Anything more than a few KB for the final
> string would probably benefit from the list approach - especially when you
> add to the string a character at a time.

-- 
Zolera Systems, Securing web services (XML, SOAP, Signatures,
Encryption)
http://www.zolera.com


From tpassin@home.com  Sat Oct 20 16:20:06 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sat, 20 Oct 2001 11:20:06 -0400
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
References: <01Oct19.175936pdt."3456"@watson.parc.xerox.com> <200110200911.f9K9Bdl01906@mira.informatik.hu-berlin.de> <000601c15970$501a61c0$7cac1218@cj64132b> <3BD186BA.38ECF9F1@zolera.com>
Message-ID: <001d01c1597a$b2f94f90$7cac1218@cj64132b>

[Rich Salz]

> I thought StringIO was also a win.
>
It is, as long as it is cStringIO.  I made some tests once comparing
list.append()/string.join with cStringIO for this business of adding to a
string character by character.  My post is dated August 23,2000 - it should
be in the archives.  Here are the results - Method 1 was str=str+char,
method 2 was list.append() + string.join(list), and method 3 used cStringIO:

"The results are dramatic.  Method 1) is as good as or better than anything
until the string length exceeds about 1000 bytes.  Then Method 1 starts
slowing down.  Above about 4000 bytes, it's really getting ssslllooowww.
Here is a table of the results on my system - 450 MHz PIII running Win98,
Python 1.5.2.

        Rate of generating output string, char/sec

length of input    Method 1    Method 2    Method 3
    50-1000         3.3e5       1.8e5       2.3e5
    1200            3.2e5       1.8e5       2.6e5
    1500            1.2e5       1.8e5       2.5e5
    2000            1.2e5       2.7e5       2.6e5
    4000            6.1e4       1.8e5       2.6e5
    8000            3.6e4       1.9e5       2.5e5
    15000           1.7e4       1.4e5       2.5e5
    30,000          8200        1.8e5       2.7e5
    40,000          6600        1.8e5       2.4e5
    60,000          4500        2.1e5       2.2e5
    100,000         ---         1.8e5       2.4e5
    200,000         ---         1.8e5       2.4e5

These figures include some averaging.  The few numbers that are a little
different - like Method 2 at 60,000 char - probably don't mean anything.
Oh, yes, plain StringIO was definitely slower than cStringIO, as you might
think - I don't have any figures, though."

So cStringIO is faster than list.append(), but it's not a giant difference.
The nice thing is the constant behavior vs string size for methods 2 and 3.
I suppose the details would be different for Python 2.1, but I doubt that
the overall picture is much different.
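
For anyone who wants to repeat the comparison, here is a rough harness for
the three methods. It is a sketch for Python 3, not the original test
script: io.StringIO stands in for cStringIO (which no longer exists), the
function names are made up, and current CPython may optimize the str = str
+ char case in place, so the numbers need not resemble the table above.

```python
import io
import timeit


def by_concatenation(chars):
    out = ""
    for ch in chars:
        out = out + ch            # method 1: str = str + char
    return out


def by_list_join(chars):
    parts = []
    for ch in chars:
        parts.append(ch)          # method 2: list.append() + join
    return "".join(parts)


def by_stringio(chars):
    buf = io.StringIO()
    for ch in chars:
        buf.write(ch)             # method 3: in-memory file object
    return buf.getvalue()


def compare(length=20000, repeat=5):
    """Return a dict mapping method name to its rate in characters/second."""
    chars = "x" * length
    rates = {}
    for func in (by_concatenation, by_list_join, by_stringio):
        seconds = min(timeit.repeat(lambda: func(chars),
                                    number=1, repeat=repeat))
        rates[func.__name__] = length / seconds
    return rates
```

Calling compare() with several lengths gives one row of such a table per
call.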

It was a post by Bjorn Pettersen on the speed of StringIO that got me
started trying this out.

Cheers,

Tom P

> > as the string gets larger the speedup of using a list can be dramatic, even
> > more than an order of magnitude.  Anything more than a few KB for the final
> > string would probably benefit from the list approach - especially when you
> > add to the string a character at a time.
>
> --


From Martin.v.Loewis@t-online.de  Sat Oct 20 16:49:46 2001
From: Martin.v.Loewis@t-online.de (Martin v. Loewis)
Date: Sat, 20 Oct 2001 17:49:46 +0200
Subject: [XML-SIG] HTML<->UTF-8 'codec'? [Slightly-OT]
In-Reply-To: <E2D1934575D1D411973D00508BB02F1730246F@usa0129ms1.ess.mc.xerox.com>
 (Benjamin.Schollnick@usa.xerox.com)
References: <E2D1934575D1D411973D00508BB02F1730246F@usa0129ms1.ess.mc.xerox.com>
Message-ID: <200110201549.f9KFnkJ04405@mira.informatik.hu-berlin.de>

> When I converted the results audit trail to a list:
> 
> results.insert (1, "Current Dice Roll ...")
> 
> A 10,000 roll sequence only takes roughly 1.5 seconds....

Notice that using insert/append does not help at all when you add a
single character at a time, since the list is copied on each resize,
also.

Regards,
Martin


From mal@lemburg.com  Sat Oct 20 17:20:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 20 Oct 2001 18:20:14 +0200
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
References: <01Oct19.175936pdt."3456"@watson.parc.xerox.com> <200110200911.f9K9Bdl01906@mira.informatik.hu-berlin.de> <000601c15970$501a61c0$7cac1218@cj64132b> <3BD186BA.38ECF9F1@zolera.com>
Message-ID: <3BD1A43E.30001@lemburg.com>

> I thought StringIO was also a win.

StringIO is the fastest variant around -- if you're building large
strings this is the way to go.

> > as the string gets larger the speedup of using a list can be dramatic, even
> > more than an order of magnitude.  Anything more than a few KB for the final
> > string would probably benefit from the list approach - especially when you
> > add to the string a character at a time.


From noreply@sourceforge.net  Sat Oct 20 18:05:22 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 20 Oct 2001 10:05:22 -0700
Subject: [XML-SIG] [ pyxml-Bugs-473195 ] Uninit Memory Read in pyexpat.c
Message-ID: <E15uzYo-0007yI-00@usw-sf-web2.sourceforge.net>

Bugs item #473195, was opened at 2001-10-20 10:05
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=473195&group_id=6473

Category: expat
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: Uninit Memory Read in pyexpat.c

Initial Comment:
UMR: Uninitialized memory read (18 times)

Also note that self->handlers[i]=NULL; is done twice in
clear_handlers(), once in if (decref), once after.

What's happening is that while iterating through the loop on the 1st
(StartElement), before the 2nd (EndElement) is initialized, EndElement
is checked in pyxml_SetStartElementHandler.

Not sure how to fix, other than create another loop to initialize the
handler or do a calloc, instead of malloc in pyexpat_ParserCreate
[pyexpat.c:1407].

Neal
--
      This is occurring while in:
            pyxml_SetStartElementHandler [pyexpat.c:1680]
                       && self->handlers[endHandler] != Py_None) {
                       start_handler = handler_info[startHandler].handler;
                   }
            =>     if (self->handlers[EndElement]
                       && self->handlers[EndElement] != Py_None) {
                       end_handler = handler_info[endHandler].handler;
                   }
            clear_handlers [pyexpat.c:1661]
                           Py_XDECREF(temp);
                       }
                       self->handlers[i]=NULL;
            =>         handler_info[i].setter(self->itself, NULL);
                   }
               }

            newxmlparseobject [pyexpat.c:1178]
            pyexpat_ParserCreate [pyexpat.c:1407]


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=473195&group_id=6473


From tpassin@home.com  Sat Oct 20 18:59:46 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sat, 20 Oct 2001 13:59:46 -0400
Subject: [XML-SIG] HTML<->UTF-8 'codec'? [Slightly-OT]
References: <E2D1934575D1D411973D00508BB02F1730246F@usa0129ms1.ess.mc.xerox.com> <200110201549.f9KFnkJ04405@mira.informatik.hu-berlin.de>
Message-ID: <00a601c15991$0175c570$7cac1218@cj64132b>

[Martin v. Loewis]

> > When I converted the results audit trail to a list:
> >
> > results.insert (1, "Current Dice Roll ...")
> >
> > A 10,000 roll sequence only takes roughly 1.5 seconds....
>
> Notice that using insert/append does not help at all when you add a
> single character at the time, since the list is copied on each resize,
> also.
>

Not so, Martin.  The tests I quoted in my last post added a single character
at a time, and insert/append gave dramatic improvements.  The list may be
copied on each resize but apparently it doesn't have to resize very often,
just like a dictionary doesn't have to resize on each new addition.

Cheers,

Tom P


From faassen@vet.uu.nl  Sat Oct 20 21:26:06 2001
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Sat, 20 Oct 2001 22:26:06 +0200
Subject: [XML-SIG] pyexpat.c help
In-Reply-To: <200110200854.f9K8sAI01814@mira.informatik.hu-berlin.de>
References: <3BD0E78C.6A5A3C95@pcola.gulf.net> <200110200854.f9K8sAI01814@mira.informatik.hu-berlin.de>
Message-ID: <20011020222606.A12972@vet.uu.nl>

Martin v. Loewis wrote:
> > Is this .c written for python1.5 only?  I put py2.1 files in the
> > include/lib path.  Do I need a ver made to compile with 2.1?  Also, what
> > is pyexpat.pyd's role in zope (what does it do/used for)?  
> 
> I don't know about the pyexpat.c that you can get from Zope; the
> current one certainly works with all Python versions. pyexpat.pyd is
> the module that interfaces to the Expat XML parser, which is in turn
> used to parse XML.

ParsedXML's pyexpat.c apparently exists because it has some extra
bells and whistles needed by ParsedXML. The general desire is to phase
this out as quickly as possible. I don't even know what these extra bells
and whistles are, but Fred Drake should know.

Of course I am getting mysterious segfaults when running some parser unit tests
against either version of pyexpat. I see a recent bug on SourceForge
(Uninit memory read in pyexpat.c) that could be related; the segfault
I'm seeing looks pretty memory related; even adding and removing debugging
print statements in the test code makes more or fewer tests succeed
semi-randomly.

Regards,

Martijn
 

From Martin.v.Loewis@t-online.de  Sat Oct 20 22:07:08 2001
From: Martin.v.Loewis@t-online.de (Martin v. Loewis)
Date: Sat, 20 Oct 2001 23:07:08 +0200
Subject: [XML-SIG] HTML<->UTF-8 'codec'? [Slightly-OT]
In-Reply-To: <00a601c15991$0175c570$7cac1218@cj64132b> (tpassin@home.com)
References: <E2D1934575D1D411973D00508BB02F1730246F@usa0129ms1.ess.mc.xerox.com> <200110201549.f9KFnkJ04405@mira.informatik.hu-berlin.de> <00a601c15991$0175c570$7cac1218@cj64132b>
Message-ID: <200110202107.f9KL78i05517@mira.informatik.hu-berlin.de>

> Not so, Martin.  The tests I quoted in my last post added a single
> character at a time, and insert/append gave dramatic improvements.
> The list may be cpoied on each resize but apparently it doesn't have
> to resize very often, just like a dictionary doesn't have to, resize
> on each new addition.

Right, I forgot about the overallocation. Note that repeated
list.append calls still gave an algorithm of quadratic time in Python
1.5, since the overallocation extended the list size by at most 100
elements. Only in 2.1 is such an algorithm of linear complexity,
since the overallocation grows with the list size.
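
The overallocation can be observed indirectly with a sketch like this one.
It assumes a current CPython, where sys.getsizeof() reflects the allocated
capacity of a list; the helper name is made up, and the exact count depends
on the interpreter version.

```python
import sys


def count_resizes(n):
    """Append n items and count how often the list's allocation changes."""
    items = []
    resizes = 0
    last_size = sys.getsizeof(items)
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:   # the underlying array was reallocated
            resizes += 1
            last_size = size
    return resizes
```

Because each reallocation grows the array in proportion to its current
length, the number of resizes stays far below the number of appends.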

Regards,
Martin


From Martin.v.Loewis@t-online.de  Sat Oct 20 22:46:28 2001
From: Martin.v.Loewis@t-online.de (Martin v. Loewis)
Date: Sat, 20 Oct 2001 23:46:28 +0200
Subject: [XML-SIG] pyexpat.c help
In-Reply-To: <20011020222606.A12972@vet.uu.nl> (message from Martijn Faassen
 on Sat, 20 Oct 2001 22:26:06 +0200)
References: <3BD0E78C.6A5A3C95@pcola.gulf.net> <200110200854.f9K8sAI01814@mira.informatik.hu-berlin.de> <20011020222606.A12972@vet.uu.nl>
Message-ID: <200110202146.f9KLkSc05947@mira.informatik.hu-berlin.de>

> Of course I am getting mysterious segfaults when running some parser
> unit tests against either version of pyexpat.. I see a recent bug on
> sourceforge (Uninit memory read in pyexpat.c) that could be related;

I cannot understand that report, but I somewhat doubt it indicates a
real problem. There seems to be a real bug in UpdatePairedHandlers,
but none that could cause a segfault.

> the segfault I'm seeing looks pretty memory related; even adding and
> removing debugging print statements in the test code makes more or
> less tests succeed semi-randomly.

If you have more details, don't hesitate to share them with us.

Regards,
Martin


From noreply@sourceforge.net  Sun Oct 21 00:23:51 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Sat, 20 Oct 2001 16:23:51 -0700
Subject: [XML-SIG] [ pyxml-Bugs-473288 ] pychecker on Ft.Lib.cDomlettec crashs
Message-ID: <E15v5T5-0004Sh-00@usw-sf-web2.sourceforge.net>

Bugs item #473288, was opened at 2001-10-20 16:23
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=473288&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Neal Norwitz (nnorwitz)
Assigned to: Nobody/Anonymous (nobody)
Summary: pychecker on Ft.Lib.cDomlettec crashs

Initial Comment:
Running pychecker (pychecker.sf.net) on
Ft.Lib.cDomlettec crashes the python interpreter.

Here's the 1 line script that pychecker (v0.8.5) is run
against:
   import Ft.Lib.cDomlettec

This causes the interpreter to crash in 
            try_rich_compare [object.c:382]
            do_richcmp     [object.c:836]
            PyObject_RichCompare [object.c:883]
            cmp_outcome    [ceval.c:3443]

Program received signal SIGSEGV, Segmentation fault.
0x1d6ec in try_rich_compare (v=0xff05b784, w=0xe2dd8, op=2)
    at Objects/object.c:382
382             if ((f = RICHCOMPARE(v->ob_type)) != NULL) {
(gdb) p *v
$4 = {ob_refcnt = 3, ob_type = 0x0}

Notice that ob_type is NULL.

I'm not sure whose fault this is:  pychecker, 4Suite,
or the interpreter.
pychecker is written entirely in Python, so I think
the fault is in cDomlettec.so or the interpreter.

Neal


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=473288&group_id=6473


From mal@lemburg.com  Mon Oct 22 14:50:47 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Oct 2001 15:50:47 +0200
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
References: <01Oct19.175936pdt."3456"@watson.parc.xerox.com>
Message-ID: <3BD42437.31D130A9@lemburg.com>

Bill Janssen wrote:
> 
> Perhaps you'd be kind enough to review my sample code at
> ftp://ftp.parc.xerox.com/transient/janssen/htmlcodec.py, and advise of
> glaring errors or any interesting improvements that occur to you?
> 
> Thanks in advance!

Here are some comments:

First of all, you are encoding Unicode to an 8-bit string, right ?
If so, then you don't need to use Unicode for output.

    def encode(self,input,errors='strict'):

        output = u''
        i = 0
        input_len = len(input)
        while (i < input_len):
            if ord(input[i]) > 0x7F:
                output = output + u'&#' + unicode(str(ord(input[i]))) + u';'

Wouldn't this be easier: u"&#%i;" % ord(input[i]) ?!

            else:
                output = output + unicode(input[i])
            i = i + 1
        return (str(output), len(output))

This should be return (str(output), i) -- (returnvalue, bytes_consumed).

Same for decode().

A note about the search function: if you give the codec module
a name like 'html_utf_8.py' then you can have the search function
in encodings/__init__.py find it.
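
Put together, the two suggestions above could look roughly like the
following. This is only a sketch, not Bill's codec: it is written for a
current Python 3 interpreter (where str is the Unicode type and the result
is bytes), and the function name html_encode is made up for illustration.

```python
def html_encode(text, errors='strict'):
    """Encode text to ASCII bytes, writing non-ASCII characters as &#N;.

    Hypothetical helper for illustration; 'errors' is accepted only to
    mirror the codec signature and is not used here.
    """
    pieces = []
    for char in text:
        if ord(char) > 0x7F:
            # Numeric character reference, built with a format string.
            pieces.append('&#%i;' % ord(char))
        else:
            pieces.append(char)
    output = ''.join(pieces).encode('ascii')
    # Codec convention: (encoded object, number of input items consumed),
    # so the second value is the input length, not len(output).
    return output, len(text)


encoded, consumed = html_encode('caf\xe9')
```

Note that the second element of the result counts input characters, which
is why it differs from the length of the encoded output as soon as a
character reference is emitted.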

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From janssen@parc.xerox.com  Mon Oct 22 23:05:28 2001
From: janssen@parc.xerox.com (Bill Janssen)
Date: Mon, 22 Oct 2001 15:05:28 PDT
Subject: [XML-SIG] HTML<->UTF-8 'codec'?
In-Reply-To: Your message of "Sat, 20 Oct 2001 02:11:39 PDT."
 <200110200911.f9K9Bdl01906@mira.informatik.hu-berlin.de>
Message-ID: <01Oct22.150537pdt."3456"@watson.parc.xerox.com>

> While doing so, it is advisable not to
> mix Unicode and byte strings in a single operation. E.g. when you
> write
> 
>   if input[i] == u'&'
> 
> then I believe input is a byte string, so this would be better
> 
>   if input[i] == u'&'
> 
> The former will fail if ord(input[i])>127.

I was uncertain as to whether "input" was a byte string or not, but in
any case I fail to see the difference between the two lines?  Did you mean

  if (unicode(input[i]) == u'&'):

Bill


From m_mariappanX@trillium.com  Tue Oct 23 10:02:49 2001
From: m_mariappanX@trillium.com (Mariappan, MaharajanX)
Date: Tue, 23 Oct 2001 02:02:49 -0700
Subject: [XML-SIG] newbie question on loading/saving xml files
Message-ID: <53A7943A5BD8D411B6930002A5073155013F604A@bgsmsx90.iind.intel.com>

Hi Folks,

I have just started searching for documentation on the expat modules.

I want to
* load an XML file and show it in a GUI using Python, and
* save the XML file to disk.

I went through

  http://py-howto.sourceforge.net/xml-howto/DOM.html

as Alexander suggested earlier, but I couldn't make sense of it.

Pointers to any documents with examples would really help me.

TIA,
Maharajan


From mark.humphrey@acm.org  Tue Oct 23 16:19:00 2001
From: mark.humphrey@acm.org (Mark Humphrey)
Date: Tue, 23 Oct 2001 10:19:00 -0500
Subject: [XML-SIG] newbie question on loading/saving xml files
In-Reply-To: <53A7943A5BD8D411B6930002A5073155013F604A@bgsmsx90.iind.intel.com>; from m_mariappanX@trillium.com on Tue, Oct 23, 2001 at 02:02:49AM -0700
References: <53A7943A5BD8D411B6930002A5073155013F604A@bgsmsx90.iind.intel.com>
Message-ID: <20011023101900.B16617@galadriel>

On Tue, Oct 23, 2001 at 02:02:49AM -0700, Mariappan, MaharajanX wrote:
> Hi Folks,
> 
> I just now start searching for documents for expat modules.
> 
> I want to 
> * load a xml file and show in GUI using python and
> * save the xml files in disk 
> 
> I went through 
> 
>  <<DOM.html.url>> 
> as Alexander told earlier, But I couldn't make out 
> 
> Pointer to any documents with examples will really help me

I'm checking my own code for this.  Seems like these are the functions that
you're wanting:

	Printer.PrintVisitor(...)
- and -
	FromXmlStream(...)

Writing works something like this:

	printer = Printer.PrintVisitor(f, 'UTF-8')
	printer.visit(domTree)

Here, f is an output file stream.  In my particular case,
	f = open(..., 'w')

You'll need to do this...
	from xml.dom.ext import Printer
...to get this object.

The FromXmlStream function takes a file stream as an input and outputs a DOM
tree, if the file stream was properly structured XML.  You get to it with:
	from xml.dom.ext.reader.Sax import FromXmlStream

After that, everything you do just consists of DOM tree manipulations, and
you should be able to find good documentation on the DOM from the w3c web
site (www.w3c.org).  PyXML has an odd mix of DOM level 1 and level 2
functions implemented, and I think I've even found a few random level 3
functions have been implemented.  Not sure what the exact pattern is,
though.

-- 
Mark "Markus" Humphrey		mark.humphrey@acm.org
http://galadriel.ath.cx:88/
GPG Public Key A54BC06F, available on www.keyserver.net
If you can't say anything bad about Microsoft, don't say anything at all.


From paul@boddie.net  Wed Oct 24 08:59:11 2001
From: paul@boddie.net (paul@boddie.net)
Date: 24 Oct 2001 07:59:11 -0000
Subject: [XML-SIG] newbie question on loading/saving xml files
Message-ID: <20011024075911.8709.qmail@www1.nameplanet.com>

Mark Humphrey <mark.humphrey@acm.org> wrote:
>

[Loading files]

>The FromXmlStream function takes a file stream as an input and outputs a DOM
>tree, if the file stream was properly structured XML.  You get to it with:
>	from xml.dom.ext.reader.Sax import FromXmlStream

Shouldn't this be...

  from xml.dom.ext.reader.Sax2 import Reader
  # Get a Reader.
  reader = Reader()
  # Use the fromStream method of reader to get a document.
  doc = reader.fromStream(stream)

This is off the top of my head, but I did change my code in the past few days 
to use fromStream, and not the FromXmlStream function, which is apparently 
deprecated. Also, I think the Sax2 module is recommended over Sax these days.
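
For readers without PyXML at hand, the same load/modify/save round trip
can be sketched with the standard library's xml.dom.minidom. This swaps
minidom in for the PyXML Reader and Printer classes discussed in this
thread; the helper names load_document and save_document, and the sample
document, are made up for illustration.

```python
import io
import xml.dom.minidom


def load_document(stream):
    """Parse XML from a binary file-like object into a minidom Document."""
    return xml.dom.minidom.parse(stream)


def save_document(doc, stream, encoding='UTF-8'):
    """Serialise doc to a binary file-like object in the given encoding."""
    # toxml() returns bytes when an encoding is supplied.
    stream.write(doc.toxml(encoding=encoding))


# In-memory streams stand in for open(..., 'rb') and open(..., 'wb').
source = io.BytesIO(b'<library><book title="Dune"/></library>')
doc = load_document(source)
doc.documentElement.setAttribute('count', '1')

target = io.BytesIO()
save_document(doc, target)
saved = target.getvalue()
```

With real files, the two BytesIO objects would simply be replaced by
files opened in binary mode.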

>After that, everything you do just consists of DOM tree manipulations, and
>you should be able to find good documentation on the DOM from the w3c web
>site (www.w3c.org).  PyXML has an odd mix of DOM level 1 and level 2
>functions implemented, and I think I've even found a few random level 3
>functions have been implemented.  Not sure what the exact pattern is, though.

Sadly, it's a case of reading the source for the most part, but I've become 
used to that. Having said that, there are a few books out which could be 
helpful, the most recent of which seems to be "Definitive XML Application 
Development" by Lars Marius Garshol, and a presentation of that book seems to 
be taking place where I work tomorrow. It looks like Python is emphasised quite 
a bit in that work.

Paul

-- 
Get your firstname@lastname email for FREE at http://Nameplanet.com/?su


From larsga@garshol.priv.no  Wed Oct 24 13:27:13 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 24 Oct 2001 14:27:13 +0200
Subject: [XML-SIG] newbie question on loading/saving xml files
In-Reply-To: <20011024075911.8709.qmail@www1.nameplanet.com>
References: <20011024075911.8709.qmail@www1.nameplanet.com>
Message-ID: <m3bsixb4v2.fsf@lambda.garshol.priv.no>

* paul@boddie.net
| 
| Having said that, there are a few books out which could be helpful,
| the most recent of which seems to be "Definitive XML Application
| Development" by Lars Marius Garshol, 

It's bound to be the most recent, given that it's not published
yet. :-)  

I'm glad to see people mention it, though.

| It looks like Python is emphasised quite a bit in that work.

It is. It's actually all Python, except for one part, which shows how
to do the same things in Java. There are also some Jython bits.

It has a home page at
  <URL: http://www.garshol.priv.no/download/text/ph1/ >
where you can find out a bit more about it, BTW.

--Lars M.


From rodsenra@gpr.com.br  Wed Oct 24 14:27:00 2001
From: rodsenra@gpr.com.br (Rodrigo Senra)
Date: Wed, 24 Oct 2001 11:27:00 -0200
Subject: [XML-SIG] DOM Documentation
Message-ID: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>

Hi,
  is there anybody still working with DOM Documentation (Python
  XML Howto release 0.6.1) ?
  I'm having to dig it up now, and I could use this opportunity to advance
  a bit the documentation process. But it would be better to sync it up
  with current efforts.
TIA
Rod Senra

Rodrigo Senra
Computer Engineer (GPr Sistemas Ltda) rodsenra@gpr.com.br
MSc Student (IC - UNICAMP) Rodrigo.Senra@ic.unicamp.br
http://www.ic.unicamp.br/~921234 (LinUxer 217.243) (ICQ 114477550)


From martin@v.loewis.de  Wed Oct 24 20:54:21 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Wed, 24 Oct 2001 21:54:21 +0200
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>
 (message from Rodrigo Senra on Wed, 24 Oct 2001 11:27:00 -0200)
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>
Message-ID: <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>

>   is there anybody still working with DOM Documentation (Python
>   XML Howto release 0.6.1) ?
>   I'm having to dig it up now, and I could use this opportunity to advance
>   a bit the documentation process. But it would be better to sync it up
>   with current efforts.

I'm pretty certain that everything that has been written is in the
PyXML CVS, and that not much has changed for the last months. So
contributions are welcome.

Please note that the .tex file is the primary source; everything
else is generated.

Regards,
Martin


From akuchlin@mems-exchange.org  Wed Oct 24 21:00:49 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 24 Oct 2001 16:00:49 -0400
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Wed, Oct 24, 2001 at 09:54:21PM +0200
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br> <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>
Message-ID: <20011024160049.A23168@ute.mems-exchange.org>

On Wed, Oct 24, 2001 at 09:54:21PM +0200, Martin v. Loewis wrote:
>I'm pretty certain that everything that has been written is in the
>PyXML CVS, and that not much has changed for the last months. So
>contributions are welcome.

Actually there's a partially updated version in the py-howto CVS tree
(see py-howto.sourceforge.net for instructions on accessing it), so
start with that and not with the version in the PyXML CVS.

--amk


From fdrake@acm.org  Wed Oct 24 21:04:05 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 24 Oct 2001 16:04:05 -0400
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <20011024160049.A23168@ute.mems-exchange.org>
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>
 <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>
 <20011024160049.A23168@ute.mems-exchange.org>
Message-ID: <15319.7861.726037.753103@grendel.zope.com>

Andrew Kuchling writes:
 > Actually there's a partially updated version in the py-howto CVS tree
 > (see py-howto.sourceforge.net for instructions on accessing it), so
 > start with that and not with the version in the PyXML CVS.

  We should remove one of these; having both causes confusion.  I've
no real preference for which is kept, though.
  Given the original question, it may be useful to note that the
Python DOM reference material is located in the Library Reference
these days.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From lloyd@lancaster.lib.pa.us  Wed Oct 24 21:58:16 2001
From: lloyd@lancaster.lib.pa.us (Eron Lloyd)
Date: Wed, 24 Oct 2001 16:58:16 -0400
Subject: [XML-SIG] DOM Documentation
Message-ID: <200110241644301.SM00109@there>

Isn't it true that an O'Reilly Python & XML title will be out Very Soon Now, 
covering PyXML? This would be a good place to begin, IMHO.

Regards,

Eron


From fdrake@acm.org  Wed Oct 24 21:58:40 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 24 Oct 2001 16:58:40 -0400
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <200110241644301.SM00109@there>
References: <200110241644301.SM00109@there>
Message-ID: <15319.11136.222171.938736@grendel.zope.com>

Eron Lloyd writes:
 > Isn't it true that an O'Reilly Python & XML title will be out Very
 > Soon Now, covering PyXML? This would be a good place to begin,

  I encourage everyone to buy a copy (of course!), but it does not
replace the reference documentation in the Library Reference.  PyXML
is required for most examples, and 4Suite is used as well.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From rodrigo.senra@terra.com.br  Wed Oct 24 22:22:59 2001
From: rodrigo.senra@terra.com.br (Rodrigo Dias Arruda Senra)
Date: Wed, 24 Oct 2001 19:22:59 -0200
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <20011024160049.A23168@ute.mems-exchange.org>
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>
 <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>
 <20011024160049.A23168@ute.mems-exchange.org>
Message-ID: <20011024192259.1885aed4.rodrigo.senra@terra.com.br>

 |Andrew Kuchling <akuchlin@mems-exchange.org>,
 |on Wed, 24 Oct 2001 16:00:49 -0400
 |about Re: [XML-SIG] DOM Documentation

 > On Wed, Oct 24, 2001 at 09:54:21PM +0200, Martin v. Loewis wrote:
 > >I'm pretty certain that everything that has been written is in the
 > >PyXML CVS, and that not much has changed for the last months. So
 > >contributions are welcome.
 > 
 > Actually there's a partially updated version in the py-howto CVS tree
 > (see py-howto.sourceforge.net for instructions on accessing it), so
 > start with that and not with the version in the PyXML CVS.

 Ok, thx. I'm on it.

 I'd like to know the status of the implementation, is it shifting
 recently or is it stable ? Just to know how often should I download
 versions from CVS ;o)

 regards,
 Senra

___
Rodrigo Senra         
Computer Engineer   (GPr Sistemas Ltda)         rodsenra@gpr.com.br 
MSc Student              (IC - UNICAMP) Rodrigo.Senra@ic.unicamp.br
http://www.ic.unicamp.br/~921234  (LinUxer 217.243) (ICQ 114477550)


From akuchlin@mems-exchange.org  Wed Oct 24 22:45:12 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 24 Oct 2001 17:45:12 -0400
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <15319.7861.726037.753103@grendel.zope.com>; from fdrake@acm.org on Wed, Oct 24, 2001 at 04:04:05PM -0400
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br> <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de> <20011024160049.A23168@ute.mems-exchange.org> <15319.7861.726037.753103@grendel.zope.com>
Message-ID: <20011024174512.C23168@ute.mems-exchange.org>

On Wed, Oct 24, 2001 at 04:04:05PM -0400, Fred L. Drake, Jr. wrote:
>  We should remove one of these; having both causes confusion.  I've
>no real preference for which is kept, though.

The one in the HOWTO CVS seems the obvious choice, as that way it can
be assembled into PDF/text/etc. along with all the other HOWTOs.

--amk


From martin@v.loewis.de  Wed Oct 24 22:49:47 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Wed, 24 Oct 2001 23:49:47 +0200
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <20011024160049.A23168@ute.mems-exchange.org> (message from
 Andrew Kuchling on Wed, 24 Oct 2001 16:00:49 -0400)
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br> <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de> <20011024160049.A23168@ute.mems-exchange.org>
Message-ID: <200110242149.f9OLnlR02464@mira.informatik.hu-berlin.de>

> Actually there's a partially updated version in the py-howto CVS tree
> (see py-howto.sourceforge.net for instructions on accessing it), so
> start with that and not with the version in the PyXML CVS.

I thought we agreed the copy in PyXML was the master copy :-(

Martin


From fdrake@acm.org  Wed Oct 24 22:45:22 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 24 Oct 2001 17:45:22 -0400
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <20011024174512.C23168@ute.mems-exchange.org>
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>
 <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>
 <20011024160049.A23168@ute.mems-exchange.org>
 <200110242149.f9OLnlR02464@mira.informatik.hu-berlin.de>
 <15319.7861.726037.753103@grendel.zope.com>
 <20011024174512.C23168@ute.mems-exchange.org>
Message-ID: <15319.13938.95329.86148@grendel.zope.com>

Andrew Kuchling writes:
 > The one in the HOWTO CVS seems the obvious choice, as that way it can
 > be assembled into PDF/text/etc. along with all the other HOWTOs.

Martin v. Loewis writes:
 > I thought we agreed the copy in PyXML was the master copy :-(

  Aaaarrrrggghhhhh!!!!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From martin@v.loewis.de  Wed Oct 24 22:58:33 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Wed, 24 Oct 2001 23:58:33 +0200
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <20011024192259.1885aed4.rodrigo.senra@terra.com.br> (message
 from Rodrigo Dias Arruda Senra on Wed, 24 Oct 2001 19:22:59 -0200)
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br>
 <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de>
 <20011024160049.A23168@ute.mems-exchange.org> <20011024192259.1885aed4.rodrigo.senra@terra.com.br>
Message-ID: <200110242158.f9OLwXA02468@mira.informatik.hu-berlin.de>

>  I'd like to know the status of the implementation, is it shifting
>  recently or is it stable ? Just to know how often should I download
>  versions from CVS ;o)

The DOM API is very stable. It may be that DOM level 3 gets supported
in a uniform way at some point (replacing the proprietary load/store
interfaces), but the current API will certainly continue to function
for the next few years.

Regards,
Martin


From noreply@sourceforge.net  Thu Oct 25 02:46:22 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 24 Oct 2001 18:46:22 -0700
Subject: [XML-SIG] [ pyxml-Bugs-474708 ] Unicode 'junk' bug
Message-ID: <E15wZbC-00072j-00@usw-sf-web2.sourceforge.net>

Bugs item #474708, was opened at 2001-10-24 18:46
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=474708&group_id=6473

Category: expat
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Unicode 'junk' bug

Initial Comment:
Expat will not properly parse an XML document that contains
UTF-8 encoded Unicode.  Anything more than one pair
of XML tags will result in 'junk after document
element' on the second element.

<?xml version="1.0" encoding="UTF-8"?>
<text>ascii</text>
<text>
 [UTF-8 encoded Japanese text, garbled in the archive]
Above is some Japanese text.</text>

The above file results in
xml.parsers.expat.ExpatError: junk after document element: line 3, column 0
whereas removing the first <text></text> pair
results in a properly parsed file.
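
A small experiment may help narrow this down. The sketch below (current
Python 3 and the standard pyexpat bindings; the helper name parse_error
and both sample documents are made up) feeds expat an ASCII-only document
with two top-level elements, and then a document with Japanese characters
wrapped in a single root element. It rests on the assumption, not stated
in the report, that the second top-level element rather than the encoding
triggers the error.

```python
import xml.parsers.expat


def parse_error(data):
    """Return expat's error text for data, or None if it parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None


# Two top-level elements, ASCII only.
two_roots = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<text>ascii</text>\n'
    '<text>more ascii</text>\n'
).encode('utf-8')

# One root element wrapping both, including non-ASCII characters.
one_root = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<doc><text>ascii</text>\n'
    '<text>\u3042\u3044\u3046</text></doc>\n'
).encode('utf-8')

print(parse_error(two_roots))
print(parse_error(one_root))
```

If the first call reports "junk after document element" while the second
parses, the message points at the document having more than one root
element, which XML does not allow, and not at the UTF-8 content.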


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=474708&group_id=6473


From m_mariappanX@trillium.com  Thu Oct 25 12:06:43 2001
From: m_mariappanX@trillium.com (Mariappan, MaharajanX)
Date: Thu, 25 Oct 2001 04:06:43 -0700
Subject: [XML-SIG] newbie question on loading/saving xml files
Message-ID: <53A7943A5BD8D411B6930002A5073155013F604D@bgsmsx90.iind.intel.com>

Hi Folks,

I tried the DOM module to load the file, as Mark suggested. I'm trying to load
the XML elements into a tree control using the code below:

        def LoadFileUsingDom(self, filename):
            reader = PyExpat.Reader()
            doc = reader.fromUri(filename)
            nit = doc.createNodeIterator(doc, NodeFilter.SHOW_ELEMENT, None, 0)

            ct_node = nit.nextNode()
            previous_node = None
            while ct_node:
                if ct_node.parentNode == doc:
                    self.nodeStack = [self.AddRoot(ct_node.nodeName)]
                    print "Root element is %s"%(ct_node.nodeName)
                else:
                    #id = self.AppendItem(self.nodeStack[-1], ct_node.nodeName)
                    #self.nodeStack.append(id)
                    #if not previous_node:
                    #    self.nodeStack = self.nodeStack[:-1]
                    #noChilds=1
                    if (previous_node and
                        self.GetItemParent(self.nodeStack[-1]) != self.GetRootItem()):
                        if (ct_node.parentNode.nodeName ==
                            previous_node.parentNode.nodeName):
                            par = self.GetItemParent(self.nodeStack[-1])
                            noChilds = self.GetChildrenCount(par, 1)
                            print "%s --> %s <==> %s -->%s"%(
                                previous_node.nodeName,
                                previous_node.parentNode.nodeName,
                                ct_node.nodeName,
                                ct_node.parentNode.nodeName)
                            print "childs --> %s"%(noChilds)
                            print "Par --> %s"%(self.GetItemText(par))
                            print "last node --> %s"%(
                                self.GetItemText(self.nodeStack[-1]))
                            print "last but one node --> %s"%(
                                self.GetItemText(self.nodeStack[-2]))
                            #self.nodeStack = self.nodeStack[:-noChilds]
                            print "now last node is --> %s"%(
                                self.GetItemText(self.nodeStack[-1]))
                            id = self.AppendItem(self.nodeStack[-(noChilds+1)],
                                                 ct_node.nodeName)
                            self.nodeStack.append(id)
                        else:
                            par = self.GetItemParent(self.nodeStack[-1])
                            noChilds = self.GetChildrenCount(par, 1)
                            print "%s --> %s <==> %s -->%s"%(
                                previous_node.nodeName,
                                previous_node.parentNode.nodeName,
                                ct_node.nodeName,
                                ct_node.parentNode.nodeName)
                            print "childs --> %s"%(noChilds)
                            print "Par --> %s"%(self.GetItemText(par))
                            print "last node --> %s"%(
                                self.GetItemText(self.nodeStack[-1]))
                            print "last but one node --> %s"%(
                                self.GetItemText(self.nodeStack[-2]))
                            self.nodeStack = self.nodeStack[:-(noChilds+1)]
                            print "now last node is --> %s"%(
                                self.GetItemText(self.nodeStack[-1]))
                            id = self.AppendItem(self.nodeStack[-1],
                                                 ct_node.nodeName)
                            self.nodeStack.append(id)

                    #if previous_node:
                    #    print "%s --> %s <==> %s -->%s"%(previous_node.nodeName,previous_node.parentNode.nodeName,ct_node.nodeName,ct_node.parentNode.nodeName)
                    else:
                        id = self.AppendItem(self.nodeStack[-1],
                                             ct_node.nodeName)
                        self.nodeStack.append(id)
                    previous_node = ct_node

                    #print "Element %s"%(ct_node.nodeName)
                ct_node = nit.nextNode()

But it does not seem to work as I expect. Is there a direct way to load
the DOM tree into a tree control?

TIA
Maharajan


-----Original Message-----
From: Mark Humphrey [mailto:mark.humphrey@acm.org]
Sent: Tuesday, October 23, 2001 8:49 PM
To: Mariappan, MaharajanX
Cc: 'xml-sig@python.org'
Subject: Re: [XML-SIG] newbie question on loading/saving xml files


On Tue, Oct 23, 2001 at 02:02:49AM -0700, Mariappan, MaharajanX wrote:
> Hi Folks,
> 
> I just now start searching for documents for expat modules.
> 
> I want to 
> * load a xml file and show in GUI using python and
> * save the xml files in disk 
> 
> I went through 
> 
>  <<DOM.html.url>> 
> as Alexander told earlier, But I couldn't make out 
> 
> Pointer to any documents with examples will really help me

I'm checking my own code for this.  Seems like these are the functions that
you're wanting:

	Printer.PrintVisitor(...)
- and -
	FromXmlStream(...)

Writing works something like this:

	printer = Printer.PrintVisitor(f, 'UTF-8')
	printer.visit(domTree)

Here, f is an output file stream.  In my particular case,
	f = open(..., 'w')

You'll need to do this...
	from xml.dom.ext import Printer
...to get this object.

The FromXmlStream function takes a file stream as an input and outputs a DOM
tree, if the file stream was properly structured XML.  You get to it with:
	from xml.dom.ext.reader.Sax import FromXmlStream

After that, everything you do just consists of DOM tree manipulations, and
you should be able to find good documentation on the DOM from the w3c web
site (www.w3c.org).  PyXML has an odd mix of DOM level 1 and level 2
functions implemented, and I think I've even found a few random level 3
functions have been implemented.  Not sure what the exact pattern is,
though.

-- 
Mark "Markus" Humphrey		mark.humphrey@acm.org
http://galadriel.ath.cx:88/
GPG Public Key A54BC06F, available on www.keyserver.net
If you can't say anything bad about Microsoft, don't say anything at all.


From Alexandre.Fayolle@logilab.fr  Thu Oct 25 15:47:50 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Thu, 25 Oct 2001 16:47:50 +0200 (CEST)
Subject: [XML-SIG] newbie question on loading/saving xml files
In-Reply-To: <53A7943A5BD8D411B6930002A5073155013F604D@bgsmsx90.iind.intel.com>
Message-ID: <Pine.LNX.4.21.0110251646010.2817-100000@gemini.logilab.fr>

On Thu, 25 Oct 2001, Mariappan, MaharajanX wrote:

> Hi Folks,
> 
> I tried DOM module to load as Mark told. I'm trying to load the xml elements
> to treecontrol using below code

What graphical toolkit are you using?

If you want to see how to load a DOM tree to a GTK CTree widget, you can
check the code in XML tools (ftp://ftp.logilab.org/pub/xmltools/)
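
Independently of the toolkit, the mapping itself can be a short recursive
walk rather than a node iterator plus a hand-managed stack. The sketch
below is not the XML tools code; it uses the standard library's
xml.dom.minidom, and append_item is a hypothetical stand-in for whatever
the widget offers (for a tree control with AddRoot and AppendItem methods,
it would wrap those two calls).

```python
import xml.dom.minidom


def add_elements(node, parent_item, append_item):
    """Mirror the element children of node under parent_item, recursively.

    append_item(parent_item, label) must create a tree item and return
    a handle that can be used as the parent of further items.
    """
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            item = append_item(parent_item, child.nodeName)
            add_elements(child, item, append_item)


# Stand-in for a real tree widget: it just records (parent, label) pairs
# and uses the label itself as the item handle.
rows = []


def append_item(parent_item, label):
    rows.append((parent_item, label))
    return label


doc = xml.dom.minidom.parseString('<a><b><c/></b><d/></a>')
add_elements(doc, None, append_item)
```

Because each recursive call carries its own parent item, there is no need
to work out how many entries to pop from a stack when the walk moves back
up the tree.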

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From hinsen@cnrs-orleans.fr  Thu Oct 25 16:09:31 2001
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Thu, 25 Oct 2001 17:09:31 +0200
Subject: [XML-SIG] Validating parser?
Message-ID: <200110251509.f9PF9Vv17273@chinon.cnrs-orleans.fr>

After a major system upgrade, I am trying to get my documentation system
back into a working state, and one part of it is the Python XML package.

My documents are XML files that include other XML files, and my
understanding is that this works only with a validating parser. In my
previous setup (Python 1.5.2/PyXML 0.5) I used nsgmls to generate an
esis representation that was parsed into a DOM tree by a class called
EsisBuilder. I can't find this any more - has it disappeared? What
options are there now to read an XML file with a validating parser?
The documentation only mentions Expat (non-validating), but also claims
to be incomplete.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From Juergen Hermann" <jhe@webde-ag.de  Thu Oct 25 16:34:10 2001
From: Juergen Hermann" <jhe@webde-ag.de (Juergen Hermann)
Date: Thu, 25 Oct 2001 17:34:10 +0200
Subject: [XML-SIG] Validating parser?
In-Reply-To: <200110251509.f9PF9Vv17273@chinon.cnrs-orleans.fr>
Message-ID: <m15wmWJ-007qaQC@smtp.web.de>

On Thu, 25 Oct 2001 17:09:31 +0200, Konrad Hinsen wrote:

>options are there now to read an XML file with a validating parser?
>The documentation only mentions Expat (non-validating), but also claims
>to be incomplete.

You can either use xmlproc or my wrapper for Xerces (SAX2 only and 
faster).


Ciao, Jürgen

--
Jürgen Hermann, Developer (jhe@webde-ag.de)
WEB.DE AG, http://webde-ag.de/


From mike@nthwave.net  Thu Oct 25 17:47:44 2001
From: mike@nthwave.net (Michael Mell)
Date: Thu, 25 Oct 2001 09:47:44 -0700
Subject: [XML-SIG] copy node to new dom
Message-ID: <3BD8422E.749F7F10@nthwave.net>

Hi, 
I want to use xml.dom.minidom to parse a file, extract the document
element and then duplicate that element in a second dom. Ideally, these
four lines would do the job:
	srcDom  = xml.dom.minidom.parse(f1)
	destDom  = xml.dom.minidom.parse(f2)
	srcElement = srcDom.getElementsByTagName('myElement')[0]

	destDom.documentElement.appendChild(srcElement)
The last line fails, it seems, because the srcElement is meaningless in
the destDom.

The only workable method I came up with was to extract all the data from
srcElement, build a new destElement, and then append the destElement to
the destDom:
	srcDom  = xml.dom.minidom.parse(f1)
	destDom  = xml.dom.minidom.parse(f2)
	srcElement = srcDom.getElementsByTagName('myElement')[0]

	destElement = destDom.createElement('myElement')
	srcItem = srcElement.getElementsByTagName('item')[0]
	destElement.setAttribute('value', srcItem.getAttribute('value'))
	destDom.documentElement.appendChild(destElement)
The last four lines need to be repeated for each and every attribute of
each child element.
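
The repetition can at least be confined to one recursive helper. What
follows is only a sketch of that idea: copy_into is a hypothetical name,
the sample documents are invented, and it copies just elements,
attributes and plain text nodes (comments, CDATA sections and other node
types are skipped).

```python
import xml.dom.minidom
from xml.dom import Node


def copy_into(dest_doc, node):
    """Rebuild node (elements, attributes, text) with dest_doc's factory methods."""
    if node.nodeType == Node.TEXT_NODE:
        return dest_doc.createTextNode(node.data)
    if node.nodeType != Node.ELEMENT_NODE:
        return None  # comments, processing instructions etc. are skipped here
    copy = dest_doc.createElement(node.tagName)
    attrs = node.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        copy.setAttribute(attr.name, attr.value)
    for child in node.childNodes:
        child_copy = copy_into(dest_doc, child)
        if child_copy is not None:
            copy.appendChild(child_copy)
    return copy


# Invented sample documents standing in for f1 and f2.
src_dom = xml.dom.minidom.parseString(
    '<src><myElement><item value="1">text</item></myElement></src>')
dest_dom = xml.dom.minidom.parseString('<dest/>')

src_element = src_dom.getElementsByTagName('myElement')[0]
dest_element = copy_into(dest_dom, src_element)
dest_dom.documentElement.appendChild(dest_element)
print(dest_dom.documentElement.toxml())
```

Every node is created by the destination document, so none of the
copies still belongs to srcDom, and the source tree is left untouched.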

Is there a more efficient method?

thanks.
-- 
Choice links to sensible thinking:
http://www.nthwave.net:81/warAndSense/index.html

mike@nthwave.net 
llemekim         YahooIM 
415.455.8812     voice
419.735.1167     fax


From akuchlin@mems-exchange.org  Thu Oct 25 18:47:05 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 25 Oct 2001 13:47:05 -0400
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <200110242149.f9OLnlR02464@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Wed, Oct 24, 2001 at 11:49:47PM +0200
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br> <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de> <20011024160049.A23168@ute.mems-exchange.org> <200110242149.f9OLnlR02464@mira.informatik.hu-berlin.de>
Message-ID: <20011025134705.E24775@ute.mems-exchange.org>

On Wed, Oct 24, 2001 at 11:49:47PM +0200, Martin v. Loewis wrote:
>I thought we agreed the copy in PyXML was the master copy :-(

<bonks head into desk> OK.  I've checked in the partially-rewritten
version into the PyXML CVS.

--amk


From thehaas@matrix.binary.net  Thu Oct 25 19:12:37 2001
From: thehaas@matrix.binary.net (Mike Hostetler)
Date: Thu, 25 Oct 2001 13:12:37 -0500
Subject: [XML-SIG] copy node to new dom
In-Reply-To: <3BD8422E.749F7F10@nthwave.net>; from mike@nthwave.net on Thu, Oct 25, 2001 at 09:47:44AM -0700
References: <3BD8422E.749F7F10@nthwave.net>
Message-ID: <20011025131237.A68285@matrix.binary.net>

On Thu, October 25, 2001 at 09:47:44AM -0700, Michael Mell wrote:
> Hi,
> I want to use xml.dom.minidom to parse a file, extract the document
> element and then duplicate that element in a second dom. Ideally, these
> four lines would do the job:
>  srcDom  = xml.dom.minidom.parse(f1)
>  destDom  = xml.dom.minidom.parse(f2)
>  srcElement = srcDom.getElementsByTagName('myElement')[0]
>
>  destDom.documentElement.appendChild(srcElement)

I used the Sax reader from PyXML and just did:
   imported = destDom.importNode(srcElement, deep=1)  # copies all child nodes as well
   destDom.documentElement.appendChild(imported)

However, minidom doesn't have an 'importNode' method.  I've found minidom
is not up to the task when doing advanced things (and taking part of a
DOM tree and putting it into another DOM tree *is* advanced).  I hate to
sound elitist, but change to PyXML -- it will make things easier.
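
For illustration, here is the import-then-append pattern as a minimal
sketch. It is written against the standard library's xml.dom.minidom on
the assumption that the installed Python provides Document.importNode
there (the release discussed in this thread did not); the two sample
documents are invented.

```python
import xml.dom.minidom

# Invented sample documents standing in for f1 and f2.
src_dom = xml.dom.minidom.parseString(
    '<src><myElement><item value="1"/></myElement></src>')
dest_dom = xml.dom.minidom.parseString('<dest/>')

src_element = src_dom.getElementsByTagName('myElement')[0]

# importNode returns a copy owned by dest_dom; the original stays in src_dom.
imported = dest_dom.importNode(src_element, True)
dest_dom.documentElement.appendChild(imported)

print(dest_dom.documentElement.toxml())
```

The node to append is the one importNode returns, not the original
element, which still belongs to the source document.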

Mike
-- 


From fdrake@acm.org  Thu Oct 25 19:16:13 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 25 Oct 2001 14:16:13 -0400
Subject: [XML-SIG] copy node to new dom
In-Reply-To: <20011025131237.A68285@matrix.binary.net>
References: <3BD8422E.749F7F10@nthwave.net>
 <20011025131237.A68285@matrix.binary.net>
Message-ID: <15320.22253.599233.607230@grendel.zope.com>

Mike Hostetler writes:
 > However, minidom doesn't have an 'importNode' method.  I've found minidom
 > is not up to the task when doing advanced things (and taking part of a
 > DOM tree and putting it into another DOM tree *is* advanced).  I hate to
 > sound elitist, but change to PyXML -- it will make things easier.

  I've filed a bug against minidom for this omission:

http://sourceforge.net/tracker/index.php?func=detail&aid=474986&group_id=5470&atid=105470


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From thehaas@matrix.binary.net  Thu Oct 25 19:44:53 2001
From: thehaas@matrix.binary.net (Mike Hostetler)
Date: Thu, 25 Oct 2001 13:44:53 -0500
Subject: [XML-SIG] copy node to new dom
In-Reply-To: <15320.22253.599233.607230@grendel.zope.com>; from fdrake@acm.org on Thu, Oct 25, 2001 at 02:16:13PM -0400
References: <3BD8422E.749F7F10@nthwave.net> <20011025131237.A68285@matrix.binary.net> <15320.22253.599233.607230@grendel.zope.com>
Message-ID: <20011025134453.B70399@matrix.binary.net>

On Thu, Oct 25, 2001 at 02:16:13PM -0400, Fred L. Drake, Jr. wrote:
> 
> Mike Hostetler writes:
>  > However, minidom doesn't have an 'importNode' method.  I've found minidom
>  > is not up to the task when doing advanced things (and taking part of a
>  > DOM tree and putting it into another DOM tree *is* advanced).  I hate to
>  > sound elitist, but change to PyXML -- it will make things easier.
> 
>   I've filed a bug against minidom for this omission:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=474986&group_id=5470&atid=105470
> 

I remember thinking about that when I had to figure out the importNode
stuff . . . and then I looked at the DOM spec.

'minidom' is only DOM Level 1 compatible with some Level 2 in it (for
some namespace stuff, according to the python documentation).  'importNode'
is part of the Level 2 spec.  I'm not sure that 'importNode' belongs to
'minidom'.  But, hey, if you guys want to put it in . . . . =)

URLs for reference:
=======================
Explanation of minidom and the DOM standard:
	http://www.python.org/doc/current/lib/minidom-and-dom.html

DOM Level 1 spec:
	http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html

DOM Level 2 spec:
	http://www.w3.org/TR/DOM-Level-2-Core/core.html


Mike


From fdrake@acm.org  Thu Oct 25 19:50:14 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 25 Oct 2001 14:50:14 -0400
Subject: [XML-SIG] copy node to new dom
In-Reply-To: <20011025134453.B70399@matrix.binary.net>
References: <3BD8422E.749F7F10@nthwave.net>
 <20011025131237.A68285@matrix.binary.net>
 <15320.22253.599233.607230@grendel.zope.com>
 <20011025134453.B70399@matrix.binary.net>
Message-ID: <15320.24294.139676.249707@grendel.zope.com>

Mike Hostetler writes:
 > 'minidom' is only DOM Level 1 compatible with some Level 2 in it (for
 > some namespace stuff, according to the python documentation).  'importNode'
 > is part of the Level 2 spec.  I'm not sure that 'importNode' belongs to
 > 'minidom'.  But, hey, if you guys want to put it in . . . . =)

  I'd call minidom being "kind of" Level 2 (the namespace stuff) a bug
in itself; it makes more sense to make it a Level 2 DOM.

 > URLs for reference:
 > =======================
 > Explanation of minidom and the DOM standard:
 > 	http://www.python.org/doc/current/lib/minidom-and-dom.html
 > 
 > DOM Level 1 spec:
 > 	http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html
 > 
 > DOM Level 2 spec:
 > 	http://www.w3.org/TR/DOM-Level-2-Core/core.html

  Yeah, I've written a DOM before; I've seen those.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From martin@v.loewis.de  Thu Oct 25 20:32:00 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Thu, 25 Oct 2001 21:32:00 +0200
Subject: [XML-SIG] copy node to new dom
In-Reply-To: <15320.24294.139676.249707@grendel.zope.com> (fdrake@acm.org)
References: <3BD8422E.749F7F10@nthwave.net>
 <20011025131237.A68285@matrix.binary.net>
 <15320.22253.599233.607230@grendel.zope.com>
 <20011025134453.B70399@matrix.binary.net> <15320.24294.139676.249707@grendel.zope.com>
Message-ID: <200110251932.f9PJW0l01677@mira.informatik.hu-berlin.de>

>   I'd call minidom being "kind of" Level 2 (the namespace stuff) a bug
> in itself; it makes more sense to make it a Level 2 DOM.

Agreed. More precisely, DOM 2 Core; we don't do traversal, ranges,
HTML, events, etc (this still makes it "mini").
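
Which modules an implementation claims can be asked of the
implementation itself through hasFeature(). A small sketch, assuming the
standard library's xml.dom.minidom; probe is a hypothetical helper name,
and what gets reported depends on the installed version.

```python
import xml.dom.minidom

impl = xml.dom.minidom.getDOMImplementation()


def probe(implementation, features, version="2.0"):
    """Map each feature name to what hasFeature() reports for that version."""
    return {name: bool(implementation.hasFeature(name, version))
            for name in features}


report = probe(impl, ["Core", "XML", "Traversal", "Range", "Events", "HTML"])
for name, supported in sorted(report.items()):
    print(name, supported)
```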

Regards,
Martin


From martin@v.loewis.de  Thu Oct 25 20:20:48 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Thu, 25 Oct 2001 21:20:48 +0200
Subject: [XML-SIG] DOM Documentation
In-Reply-To: <20011025134705.E24775@ute.mems-exchange.org> (message from
 Andrew Kuchling on Thu, 25 Oct 2001 13:47:05 -0400)
References: <5.1.0.14.0.20011024112345.00a60590@pop.sao.terra.com.br> <200110241954.f9OJsLa01892@mira.informatik.hu-berlin.de> <20011024160049.A23168@ute.mems-exchange.org> <200110242149.f9OLnlR02464@mira.informatik.hu-berlin.de> <20011025134705.E24775@ute.mems-exchange.org>
Message-ID: <200110251920.f9PJKmT01516@mira.informatik.hu-berlin.de>

> On Wed, Oct 24, 2001 at 11:49:47PM +0200, Martin v. Loewis wrote:
> >I thought we agreed the copy in PyXML was the master copy :-(
> 
> <bonks head into desk> OK.  I've checked in the partially-rewritten
> version into the PyXML CVS.

Thanks! In case somebody is worried about the synchronization: I'm
still ready to forward any changes made to the PyXML documentation to
the python-howtos. It's just that there haven't been any changes in a
long time (that I knew of).

Regards,
Martin


From fdrake@acm.org  Thu Oct 25 21:16:29 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 25 Oct 2001 16:16:29 -0400
Subject: [XML-SIG] copy node to new dom
In-Reply-To: <200110251932.f9PJW0l01677@mira.informatik.hu-berlin.de>
References: <3BD8422E.749F7F10@nthwave.net>
 <20011025131237.A68285@matrix.binary.net>
 <15320.22253.599233.607230@grendel.zope.com>
 <20011025134453.B70399@matrix.binary.net>
 <15320.24294.139676.249707@grendel.zope.com>
 <200110251932.f9PJW0l01677@mira.informatik.hu-berlin.de>
Message-ID: <15320.29469.404386.465342@grendel.zope.com>

Martin v. Loewis writes:
 > Agreed. More precisely, DOM 2 Core; we don't do traversal, ranges,
 > HTML, events, etc (this still makes it "mini").

  I wouldn't have a problem if it supported the "XML" feature, but I
don't plan on doing ranges and traversal for it.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From hannu@tm.ee  Thu Oct 25 19:28:34 2001
From: hannu@tm.ee (Hannu Krosing)
Date: Thu, 25 Oct 2001 23:28:34 +0500
Subject: [XML-SIG] copy node to new dom
References: <3BD8422E.749F7F10@nthwave.net> <20011025131237.A68285@matrix.binary.net>
Message-ID: <3BD859D2.62377F9A@tm.ee>

Mike Hostetler wrote:
> 
> On Thu, October 25, 2001 at 09:47:44AM -0700, Michael Mell wrote:
> > Hi,
> > I want to use xml.dom.minidom to parse a file, extract the document
> > element and then duplicate that element in a second dom. Ideally, these
> > four lines would do the job:
> >  srcDom  = xml.dom.minidom.parse(f1)
> >  destDom  = xml.dom.minidom.parse(f2)
> >  srcElement = srcDom.getElementsByTagName('myElement')[0]
> >
> >  destDom.documentElement.appendChild(srcElement)
> 
> I used the Sax reader from PyXML and just did:
>    imported = destDom.importNode(srcElement, deep=1)  # copies all child nodes as well
>    destDom.documentElement.appendChild(imported)
> 
> However, minidom doesn't have an 'importNode' method.  I've found minidom
> is not up to the task when doing advanced things (and taking part of a
> DOM tree and putting it into another DOM tree *is* advanced).  I hate to
> sound elitist, but change to PyXML -- it will make things easier.

It _would_ probably make things easier if it were documented ;)

I've just spent a few hours searching for documentation and the only
more or less complete docs I found were for minidom in the standard
Python docs.

The docs that come with PyXML had just placeholders for most things.

------------------
Hannu


From hinsen@cnrs-orleans.fr  Fri Oct 26 11:38:29 2001
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 26 Oct 2001 12:38:29 +0200
Subject: [XML-SIG] Validating parser?
In-Reply-To: <m15wmWJ-007qaQC@smtp.web.de> (jh@web.de)
References: <m15wmWJ-007qaQC@smtp.web.de>
Message-ID: <200110261038.f9QAcT929204@chinon.cnrs-orleans.fr>

> You can either use xmlproc or my wrapper for Xerces (SAX2 only and 
> faster).

Thanks! I tried Xerces/Pirxx because speed matters, but... no success.
When I run:

  from xml.sax.sax2exts import make_parser
  from xml.dom.minidom import parse

  parser = make_parser(['pirxx'])
  document = parse('scientific_python.xml', parser)

I get:

  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/usr/tmp/python-264004-c", line 1, in ?
      document = parse('scientific_python.xml', parser)
    File "/users1/hinsen/lib/python/_xmlplus/dom/minidom.py", line 908, in parse
      return _doparse(pulldom.parse, args, kwargs)
    File "/users1/hinsen/lib/python/_xmlplus/dom/minidom.py", line 900, in _doparse
      toktype, rootNode = events.getEvent()
    File "/users1/hinsen/lib/python/_xmlplus/dom/pulldom.py", line 251, in getEvent
      self.parser.feed(buf)
  AttributeError: feed

This is with Python 2.1 and PyXML 0.6.6. Any suggestions?

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From hinsen@cnrs-orleans.fr  Fri Oct 26 16:40:22 2001
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 26 Oct 2001 17:40:22 +0200
Subject: [XML-SIG] Is this a bug?
Message-ID: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr>

Suppose I have an element node:

(Pdb) p node
<Element Node at 868b6cc: Name='xref' with 1 attributes and 1 children>
(Pdb) p node.attributes
<NamedNodeMap at 868bc04: {(None, u'linkend'): <Attribute Node at 868bed4: Name="linkend", Value="Class:Scientific.BSP.ParValue">}>

According to my understanding of the manual, I should be able to get the
value of the attribute in the following way:

(Pdb) p node.getAttribute(u'linkend')
''

What does work is the following:

(Pdb) node.getAttributeNS(None, u'linkend')
u'Class:Scientific.BSP.ParValue'

Isn't that a bug? I am using the 4DOM implementation from PyXML 0.6.6.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From jh@web.de  Fri Oct 26 19:14:20 2001
From: jh@web.de (Juergen Hermann)
Date: Fri, 26 Oct 2001 20:14:20 +0200
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr>
Message-ID: <m15xBSs-007qoFC@smtp.web.de>

On Fri, 26 Oct 2001 17:40:22 +0200, Konrad Hinsen wrote:

>Isn't that a bug? I am using the 4DOM implementation from PyXML 0.6.6.

Yes, in your code. ;)

You have to either use the non-NS or the NS-aware mode of the parser, 
and the related functions, i.e. you cannot mix them.


Ciao, Jürgen


From hinsen@cnrs-orleans.fr  Fri Oct 26 19:20:20 2001
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 26 Oct 2001 20:20:20 +0200
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <m15xBSs-007qoFC@smtp.web.de> (jh@web.de)
References: <m15xBSs-007qoFC@smtp.web.de>
Message-ID: <200110261820.f9QIKKA30588@chinon.cnrs-orleans.fr>

> >Isn't that a bug? I am using the 4DOM implementation from PyXML 0.6.6.
> 
> Yes, in your code. ;)
> 
> You have to either use the non-NS or the NS-aware mode of the parser, 
> and the related functions, i.e. you cannot mix them.

I'd happily do it correctly if I knew how! The XML manual doesn't
mention this, or at least not in any obvious place.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From fdrake@acm.org  Fri Oct 26 19:23:16 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Oct 2001 14:23:16 -0400
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <200110261820.f9QIKKA30588@chinon.cnrs-orleans.fr>
References: <m15xBSs-007qoFC@smtp.web.de>
 <200110261820.f9QIKKA30588@chinon.cnrs-orleans.fr>
Message-ID: <15321.43540.94990.426417@grendel.zope.com>

Konrad Hinsen writes:
 > I'd happily do it correctly if I knew how! The XML manual doesn't
 > mention this, or at least not in any obvious place.

  What manual were you looking at?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From martin@v.loewis.de  Fri Oct 26 19:38:12 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Fri, 26 Oct 2001 20:38:12 +0200
Subject: [XML-SIG] Validating parser?
In-Reply-To: <200110261038.f9QAcT929204@chinon.cnrs-orleans.fr> (message from
 Konrad Hinsen on Fri, 26 Oct 2001 12:38:29 +0200)
References: <m15wmWJ-007qaQC@smtp.web.de> <200110261038.f9QAcT929204@chinon.cnrs-orleans.fr>
Message-ID: <200110261838.f9QIcCu01355@mira.informatik.hu-berlin.de>

>       self.parser.feed(buf)
>   AttributeError: feed
> 
> This is with Python 2.1 and PyXML 0.6.6. Any suggestions?

The problem apparently is that Pirxx does not implement the
IncrementalParser interface, and that minidom.parse requires such a
parser.
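
Whether a given parser can be driven this way can be checked before
handing it to minidom.parse. A rough sketch using only the standard
library's xml.sax; supports_feed is a hypothetical helper name, and the
bare XMLReader instance merely stands in for a driver that lacks the
incremental interface.

```python
import xml.sax
from xml.sax.xmlreader import IncrementalParser, XMLReader


def supports_feed(parser):
    """True if the parser offers the incremental feed()/close() interface."""
    return isinstance(parser, IncrementalParser) or (
        callable(getattr(parser, "feed", None))
        and callable(getattr(parser, "close", None)))


expat_parser = xml.sax.make_parser()   # default driver
plain_reader = XMLReader()             # base class: parse() only, no feed()

print(supports_feed(expat_parser))
print(supports_feed(plain_reader))
```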

You could try the 4DOM builders, 

from xml.sax.sax2exts import make_parser
import xml.dom.ext.reader.Sax2

parser = make_parser(['pirxx'])
p = xml.dom.ext.reader.Sax2.Reader(parser=parser)
p.FromStream(open('scientific_python.xml'))

HTH,
Martin


From martin@v.loewis.de  Fri Oct 26 19:40:32 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Fri, 26 Oct 2001 20:40:32 +0200
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr> (message from
 Konrad Hinsen on Fri, 26 Oct 2001 17:40:22 +0200)
References: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr>
Message-ID: <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de>

> According to my understanding of the manual, I should be able to get the
> value of the attribute in the following way:
> 
> (Pdb) p node.getAttribute(u'linkend')
> ''

> 
> What does work is the following:
> 
> (Pdb) node.getAttributeNS(None, u'linkend')
> u'Class:Scientific.BSP.ParValue'
> 
> Isn't that a bug? 

I have never verified this against the DOM spec, but the common theory
is that this is not a bug in PyXML; if anything, it is a bug in the
DOM.

In short, you are not supposed to mix namespace and non-namespace
calls. If you build the tree through a parser that reports namespaces,
you can only find the attributes through the namespace API.
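
A small sketch of sticking to the namespace API throughout. It uses the
standard library's xml.dom.minidom rather than 4DOM, so it only shows
the calling convention; how the Level 1 getAttribute() behaves on such a
tree differs between implementations. The sample document and the xlink
attribute are invented.

```python
import xml.dom.minidom

XLINK = "http://www.w3.org/1999/xlink"

doc = xml.dom.minidom.parseString(
    '<para xmlns:xlink="%s">'
    '<xref linkend="Class:Scientific.BSP.ParValue" xlink:href="#target"/>'
    '</para>' % XLINK)
node = doc.getElementsByTagName("xref")[0]

# Attribute without a prefix: it is in no namespace, so the URI is None.
linkend = node.getAttributeNS(None, "linkend")

# Prefixed attribute: looked up by namespace URI plus local name.
href = node.getAttributeNS(XLINK, "href")

print(linkend, href)
```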

HTH,
Martin


From hinsen@cnrs-orleans.fr  Fri Oct 26 19:44:40 2001
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Fri, 26 Oct 2001 20:44:40 +0200
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <15321.43540.94990.426417@grendel.zope.com> (fdrake@acm.org)
References: <m15xBSs-007qoFC@smtp.web.de>
 <200110261820.f9QIKKA30588@chinon.cnrs-orleans.fr> <15321.43540.94990.426417@grendel.zope.com>
Message-ID: <200110261844.f9QIie230669@chinon.cnrs-orleans.fr>

> Konrad Hinsen writes:
>  > I'd happily do it correctly if I knew how! The XML manual doesn't
>  > mention this, or at least not in any obvious place.
> 
>   What manual were you looking at?

The Python 2.1 library reference, section 13.5. In 13.5.2.6,
getAttribute() and getAttributeNS() look like perfectly equivalent
alternatives, no mention of any parser parameters. And I wouldn't even
know where to look for a description of the parser options, there are
so many different parsers and options and interfaces, and the one I
use isn't in there anyway.

Not that I want to criticize the work of the XML-SIG, it's a
tremendous piece of work. But the current state of the documentation
is the most chaotic stuff I have seen in the Python universe. A
special award goes to the "Package summary" in the PyXML
documentation, it makes learning Perl seem like an attractive
alternative.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From hinsen@cnrs-orleans.fr  Fri Oct 26 19:50:45 2001
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: 26 Oct 2001 20:50:45 +0200
Subject: [XML-SIG] Validating parser?
In-Reply-To: <200110261838.f9QIcCu01355@mira.informatik.hu-berlin.de>
References: <m15wmWJ-007qaQC@smtp.web.de>
 <200110261038.f9QAcT929204@chinon.cnrs-orleans.fr>
 <200110261838.f9QIcCu01355@mira.informatik.hu-berlin.de>
Message-ID: <m3hesm44my.fsf@chinon.cnrs-orleans.fr>

"Martin v. Loewis" <martin@v.loewis.de> writes:

> You could try the 4DOM builders, 
> 
> parser = make_parser(['pirxx'])
> p = xml.dom.ext.reader.Sax2.Reader(parser=parser)
> p.FromStream(open('scientific_python.xml'))

That is what I ended up using - I wish I had found it in the 4DOM
documentation.

Konrad.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From fdrake@acm.org  Fri Oct 26 20:26:23 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Oct 2001 15:26:23 -0400
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <200110261844.f9QIie230669@chinon.cnrs-orleans.fr>
References: <m15xBSs-007qoFC@smtp.web.de>
 <200110261820.f9QIKKA30588@chinon.cnrs-orleans.fr>
 <15321.43540.94990.426417@grendel.zope.com>
 <200110261844.f9QIie230669@chinon.cnrs-orleans.fr>
Message-ID: <15321.47327.171873.595650@grendel.zope.com>

Konrad Hinsen writes:
 > The Python 2.1 library reference, section 13.5. In 13.5.2.6,
 > getAttribute() and getAttributeNS() look like perfectly equivalent
 > alternatives, no mention of any parser parameters. And I wouldn't even

  I'll add something to the XML package docs about the NS vs. non-NS
methods; it's a pretty flakey distinction in my estimation.  The
Namespaces spec is pretty hopeless to start with, and the treatment of
it by DOM Level 2 just makes it worse.

 > Not that I want to criticize the work of the XML-SIG, it's a
 > tremendous piece of work. But the current state of the documentation
 > is the most chaotic stuff I have seen in the Python universe. A
 > special award goes to the "Package summary" in the PyXML
 > documentation, it makes learning Perl seem like an attractive
 > alternative.

  I think everyone here knows we need to do a better job on the
documentation.  It's a matter of time, and I'm sure you know how hard
that is to come by as much as we do.  ;-(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From faassen@vet.uu.nl  Fri Oct 26 23:44:54 2001
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Sat, 27 Oct 2001 00:44:54 +0200
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de>
References: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr> <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de>
Message-ID: <20011027004454.A15534@vet.uu.nl>

Martin v. Loewis wrote:
> > What does work is the following:
> > 
> > (Pdb) node.getAttributeNS(None, u'linkend')
> > u'Class:Scientific.BSP.ParValue'
> > 
> > Isn't that a bug? 
> 
> I have never verified this against the DOM spec, but the common theory
> is that this is not a bug in PyXML; if anything, it is a bug in the
> DOM.

The DOM lvl 2 spec has this to say:

"""
Note: DOM Level 1 methods are namespace ignorant. Therefore, while it is safe
to use these methods when not dealing with namespaces, using them and the new
ones at the same time should be avoided. DOM Level 1 methods solely identify
attribute nodes by their nodeName. On the contrary, the DOM Level 2 methods
related to namespaces, identify attribute nodes by their namespaceURI and
localName. Because of this fundamental difference, mixing both sets of methods
can lead to unpredictable results. In particular, using setAttributeNS, an
element may have two attributes (or more) that have the same nodeName,
but different namespaceURIs. Calling getAttribute with that nodeName could
then return any of those attributes. The result depends on the implementation.
Similarly, using setAttributeNode, one can set two attributes (or more) that
have different nodeNames but the same prefix and namespaceURI. In this case
getAttributeNodeNS will return either attribute, in an implementation dependent
manner. The only guarantee in such cases is that all methods that access a
named item by its nodeName will access the same item, and all methods which
access a node by its URI and local name will access the same node. For
instance, setAttribute and setAttributeNS affect the node that getAttribute
and getAttributeNS, respectively, return.
"""

I don't really understand the part that says "one can set two attributes (or
more) that have different nodeNames but the same prefix and namespaceURI. In
this case getAttributeNodeNS will return either attribute,"

It doesn't seem to make sense to me; a different nodeName like 'foo' means
getAttributeNodeNS asking for 'bar' in some namespace will never return
'foo', right?

Anyway, it does seem to indicate that getAttribute() should return *something*;
it's just not defined which one if there are multiple attributes with the
same nodeName but in different namespaces.
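
A small sketch of the two lookup styles the spec text describes. It uses the
standard library's xml.dom.minidom rather than 4DOM, so it only shows how one
implementation behaves, not what PyXML does; the document and the attribute
names are made up for illustration.

```python
from xml.dom.minidom import parseString

NS = "http://www.python.org/ns"  # illustrative namespace URI
doc = parseString(
    '<root xmlns:foo="%s"><elem foo:attr="1" plain="x"/></root>' % NS
)
elem = doc.getElementsByTagName("elem")[0]

# DOM Level 1 lookup: keyed on nodeName, i.e. the qualified name.
by_name = elem.getAttribute("foo:attr")

# DOM Level 2 lookup: keyed on (namespaceURI, localName).
by_ns = elem.getAttributeNS(NS, "attr")

# An unprefixed attribute is in no namespace, so its namespaceURI is None.
plain_by_ns = elem.getAttributeNS(None, "plain")

print(by_name, by_ns, plain_by_ns)
```

In minidom both calls find the parsed attribute, which is the behaviour the
recommendation seems to expect from getAttribute().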

> In short, you are not supposed to mix namespace and non-namespace
> calls. If you build the tree through a parser that reports namespaces,
> you can only find the attributes through the namespace API.

This in itself isn't a bad policy to follow, and perhaps making
getAttribute() work according to the recommendation would only
confuse people more (provided the current behavior is spelled out in some
prominent place in the docs). It is not what the recommendation
seems to say, though.

I'm only replying because I've been examining this spec too much the last
week, and I knew I just read something about this. :)
 
Regards,

Martijn


From martin@v.loewis.de  Sat Oct 27 09:53:40 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Sat, 27 Oct 2001 10:53:40 +0200
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <20011027004454.A15534@vet.uu.nl> (message from Martijn Faassen
 on Sat, 27 Oct 2001 00:44:54 +0200)
References: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr> <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de> <20011027004454.A15534@vet.uu.nl>
Message-ID: <200110270853.f9R8reN04087@mira.informatik.hu-berlin.de>

> I don't really understand the part that says "one can set two attributes (or
> more) that have different nodeNames but the same prefix and namespaceURI. In
> this case getAttributeNodeNS will return either attribute,"
> 
> It doesn't seem to make sense to me; a different nodeName like 'foo' means
> getAttributeNodeNS asking for 'bar' in some namespace will never return
> 'foo', right?

This probably deals with the case

<root xmlns:foo="http://www.python.org/ns">
  <node xmlns:bar="http://www.python.org/ns">
    <elem foo:attr="1"/>
  </node>
</root>

Given elem, you can now do

  a1 = elem.ownerDocument.createAttribute("bar:attr")
  a1.nodeValue = "2"
  elem.setAttributeNode(a1)

Now, you do

  elem.getAttributeNS("http://www.python.org/ns", "attr")

This could return either "1" or "2", depending on the implementation.

> Anyway, it does seem to indicate that getAttribute() should return
> *something*; it's just not defined which one if there are multiple
> attributes with the same nodeName but in different namespaces.

I agree. So to me, it seems that there are a number of bugs in 4DOM in
that respect.

> I'm only replying because I've been examining this spec too much the
> last week, and I knew I just read something about this. :)

That is much appreciated. Having an independent analysis really helps.

Regards,
Martin


From info@mjais.de  Sun Oct 28 09:49:05 2001
From: info@mjais.de (markus jais)
Date: Sun, 28 Oct 2001 10:49:05 +0100
Subject: [XML-SIG] FromXml question
Message-ID: <E15xma3-0006Nz-00@mrvdom01.schlund.de>

hello

I just downloaded the new PyXML tutorial about DOM from IBM's developerworks
and read, that 
FromXmlStream
FromXml
FromXmlFile
FromXmlUrl

are the main functions for parsing.
but when I look at the source code of Sax2.py
these functions are marked 
"Deprecated "

but I can not find another place where these functions are defined.
can somebody tell me if the tutorial is out of date or if these are still
the right functions to use??

thanks in advance

regards
markus

-- 
Markus Jais
http://www.mjais.de
info@mjais.de
The road goes ever on and on - Bilbo Baggins


From martin@v.loewis.de  Sun Oct 28 18:16:07 2001
From: martin@v.loewis.de (Martin v. Loewis)
Date: Sun, 28 Oct 2001 19:16:07 +0100
Subject: [XML-SIG] FromXml question
In-Reply-To: <E15xma3-0006Nz-00@mrvdom01.schlund.de> (message from markus jais
 on Sun, 28 Oct 2001 10:49:05 +0100)
References: <E15xma3-0006Nz-00@mrvdom01.schlund.de>
Message-ID: <200110281816.f9SIG7e01259@mira.informatik.hu-berlin.de>

> I just downloaded the new PyXML tutorial about DOM from IBM's developerworks
> and read, that 
> FromXmlStream
> FromXml
> FromXmlFile
> FromXmlUrl
> 
> are the main functions for parsing.
> but when I look at the source code of Sax2.py
> these functions are marked 
> "Deprecated "
> 
> but I can not find another place where these functions are defined.
> can somebody tell me if the tutorial is out of date or if these are still
> the right functions to use??

Neither. The functions still work, and their location
hasn't changed; the code in the tutorial works absolutely fine.

However, the 4DOM authors want you to use a class-based API instead of
these convenience functions; i.e. you should instantiate a Sax2.Reader
object (passing a parser if desired), configure the reader, then
invoke the reader's From* methods.

The new reader interface is more flexible, since you can add new
functionality (e.g. additional parametrization) to the base reader
class, and it will become available to all derived classes. The
deprecated status just means that such new functionality won't be
exposed through the wrapper functions.
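
A rough sketch of that arrangement, not the actual Sax2.py code: the Reader
class, its strip_whitespace option and the helper below are all hypothetical,
and the parsing is done with the standard library's minidom instead of 4DOM.
It only illustrates how a deprecated wrapper keeps working while new options
appear on the class alone.

```python
import io
import warnings
from xml.dom.minidom import parse


def _drop_blank_text(node):
    # Remove whitespace-only text nodes below `node` (illustrative option).
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        else:
            _drop_blank_text(child)


class Reader:
    """Hypothetical class-based reader; options live on the instance."""

    def __init__(self, strip_whitespace=False):
        self.strip_whitespace = strip_whitespace

    def fromStream(self, stream):
        doc = parse(stream)
        if self.strip_whitespace:
            _drop_blank_text(doc.documentElement)
        return doc


def FromXmlStream(stream):
    """Deprecated convenience wrapper: fixed signature, default options only."""
    warnings.warn("use Reader().fromStream()", DeprecationWarning, stacklevel=2)
    return Reader().fromStream(stream)


xml = b"<a>\n  <b/>\n</a>"
old = FromXmlStream(io.BytesIO(xml))
new = Reader(strip_whitespace=True).fromStream(io.BytesIO(xml))
print(len(old.documentElement.childNodes), len(new.documentElement.childNodes))
```

The wrapper still returns a usable document; it just never learns about
options that were added to the class after it was frozen.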

It is inherently difficult to predict the future, so I cannot say when
these functions will be removed. Possibly, the entire reader structure
will be replaced with DOM 3 load/store interface one day, removing the
need for xml.dom.ext. Even at that time, the extensions are likely to
stay as long as existing applications use them.

Regards,
Martin


From paul@boddie.net  Mon Oct 29 10:48:54 2001
From: paul@boddie.net (paul@boddie.net)
Date: 29 Oct 2001 10:48:54 -0000
Subject: [XML-SIG] FromXml question
Message-ID: <20011029104854.4402.qmail@www1.nameplanet.com>

markus jais <info@mjais.de> wrote:
>
>I just downloaded the new PyXML tutorial about DOM from IBM's developerworks
>and read, that 
>FromXmlStream
>FromXml
>FromXmlFile
>FromXmlUrl
>
>are the main functions for parsing.
>but when I look at the source code of Sax2.py
>these functions are marked 
>"Deprecated "

Just the other day, I wrote a message to this list [1] about the deprecated 
FromXmlStream and the seemingly supported fromStream method in the Reader class 
located in the Sax2.py module. I think that you have to replace...

  from xml.dom.ext.reader.Sax2 import FromXmlStream # or whatever
  doc = FromXmlStream(someStream)

...with...

  from xml.dom.ext.reader.Sax2 import Reader
  doc = Reader().fromStream(someStream)

At least, that's how it seems to me, anyway.

Paul

[1] http://mail.python.org/pipermail/xml-sig/2001-October/006232.html



From faassen@vet.uu.nl  Mon Oct 29 13:36:23 2001
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Mon, 29 Oct 2001 14:36:23 +0100
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <200110270853.f9R8reN04087@mira.informatik.hu-berlin.de>
References: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr> <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de> <20011027004454.A15534@vet.uu.nl> <200110270853.f9R8reN04087@mira.informatik.hu-berlin.de>
Message-ID: <20011029143623.A22623@vet.uu.nl>

Martin v. Loewis wrote:
> > I don't really understand the part that says "one can set two attributes (or
> > more) that have different nodeNames but the same prefix and namespaceURI. In
> > this case getAttributeNodeNS will return either attribute,"
> > 
> > It doesn't seem to make sense to me; a different nodeName like 'foo' means
> > getAttributeNodeNS asking for 'bar' in some namespace will never return
> > 'foo', right?
> 
> This probably deals with the case
> 
> <root xmlns:foo="http://www.python.org/ns">
>   <node xmlns:bar="http://www.python.org/ns">
>     <elem foo:attr="1"/>
>   </node>
> </root>
> 
> Given elem, you can now do
> 
>   a1 = elem.ownerDocument.createAttribute("bar:attr")
>   a1.nodeValue = "2"
>   elem.setAttributeNode(a1)
> 
> Now, you do
> 
>   elem.getAttributeNS("http://www.python.org/ns", "attr")
> 
> This could return either "1" or "2", depending on the implementation.

Hm, yeah, I didn't realize nodeName would include the prefix here, but
I see in the DOM spec that Element.tagName is the qualifiedName, which
can include prefix.

Note by the way that doing:

elem.ownerDocument.createAttribute("bar:attr") is unlikely to actually
put attr into any namespace. I've been hunting the DOM recommendation
for clarification recently, and it's vague, but I think the DOM
actually does not need to have any knowledge of namespace scoping
rules; it doesn't need to know 'bar' is the prefix for the
"http://www.python.org/ns" namespace URI in that spot. Namespaces in the
DOM are always created explicitly, never seem to be deduced from
scope in the document. At least this is my interpretation, I've asked
a question on www-dom for some clarification on this issue.

Some evidence:

namespaceURI of type DOMString, readonly, introduced in DOM Level 2

 The namespace URI of this node, or null if it is unspecified.
 This is not a computed value that is the result of a namespace lookup based
 on an examination of the namespace declarations in scope. It is merely the
 namespace URI given at creation time.
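
To make this concrete, here is a sketch that replays the quoted example with
the standard library's xml.dom.minidom (an assumption on my part: 4DOM may
well behave differently). The second attribute, "bar:other", is invented to
contrast the Level 1 and Level 2 creation calls.

```python
from xml.dom.minidom import parseString

NS = "http://www.python.org/ns"
doc = parseString(
    '<root xmlns:foo="%s"><node xmlns:bar="%s"><elem foo:attr="1"/></node></root>'
    % (NS, NS)
)
elem = doc.getElementsByTagName("elem")[0]

# Level 1 creation: the name is "bar:attr", but no namespace is recorded,
# even though the prefix "bar" is declared in scope.
a1 = elem.ownerDocument.createAttribute("bar:attr")
a1.nodeValue = "2"
elem.setAttributeNode(a1)

# Level 2 creation: the namespace URI is given explicitly at creation time.
a2 = elem.ownerDocument.createAttributeNS(NS, "bar:other")
a2.nodeValue = "3"
elem.setAttributeNodeNS(a2)

print(a1.namespaceURI, a2.namespaceURI)
print(elem.getAttributeNS(NS, "attr"), elem.getAttributeNS(NS, "other"))
```

With minidom, a1 ends up in no namespace at all, so the namespace lookup for
"attr" still finds only the parsed foo:attr attribute.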

I actually appreciate the explicitness of namespaces in the DOM, even
though there's a mismatch with usage in XML. It simplifies implementation
a lot, which is badly needed as so many parts of the spec *complicate*
implementation (liveness issues are just one example).

> > Anyway, it does seem to indicate that getAttribute() should return
> > *something*; it's just not defined which one if there are multiple
> > attributes with the same nodeName but in different namespaces.
> 
> I agree. So to me, it seems that there are a number of bugs in 4DOM in
> that respect.

Zope's ParsedXML has a large DOM unit test suite which can be run against
4DOM as well. I've recently been advocating getting this test suite out
of ParsedXML and into PyXML instead. This way we can make sure the Python
DOM implementations are up to spec much better.

Of course the test suite can contain wrong interpretations of the DOM
spec as well, but a lot of care was taken during the development of it,
and it can be further improved should problems appear.

Regards,

Martijn


From Sylvain.Thenault@logilab.fr  Mon Oct 29 16:25:19 2001
From: Sylvain.Thenault@logilab.fr (Sylvain Thenault)
Date: Mon, 29 Oct 2001 17:25:19 +0100 (CET)
Subject: [XML-SIG] pb with sax2 and name spaces
Message-ID: <Pine.LNX.4.21.0110291718140.741-100000@cygnus.logilab.fr>

Hello!

I'm trying to write a sax2 handler, but when I want to parse an xml file
using :

    from xml.sax import make_parser
    from xml.sax.handler import feature_external_ges, \
         feature_namespaces, feature_validation
    p = make_parser(["xml.sax.drivers2.drv_xmlproc"])
    p.setFeature(feature_namespaces, 1)
    p.setFeature(feature_validation, 0)
    p.setFeature(feature_external_ges,1)
    p.setContentHandler(handler)
    p.parse(f)

Everything is ok while there are no namespaces in the parsed
document or while the namespaces feature is disabled. On the first 
namespace (with the namespaces feature), I just obtain the following
error:

Traceback (most recent call last):
  File "parser.py", line 174, in ?
    p.parse(f)
  File
"/home/syt//lib/python2.1/site-packages/_xmlplus/sax/drivers2/drv_xmlproc.py",
line 90, in parse
    parser.read_from(source.getByteStream(), bufsize)
  File
"/home/syt//lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py",
line 137, in read_from
    self.feed(buf)
  File
"/home/syt//lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlutils.py",
line 185, in feed
    self.do_parse()
  File
"/home/syt//lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlproc.py",
line 96, in do_parse
    self.parse_start_tag()
  File
"/home/syt//lib/python2.1/site-packages/_xmlplus/parsers/xmlproc/xmlproc.py",
line 187, in parse_start_tag
    self.app.handle_start_tag(name,attrs)
  File
"/home/syt//lib/python2.1/site-packages/_xmlplus/sax/drivers2/drv_xmlproc.py",
line 336, in handle_start_tag
    AttributesNSImpl(attrs, rawnames))
  File "parser.py", line 64, in startElementNS
    self._handler.startElementNS(name, qname, attrs)
  File "/home/syt/lib/python2.1/site-packages/Ft/Xml/pDomletteReader.py",
line 238, in startElementNS
    (foo,bar, baz) = self._handleStartElementNss(name, attribs)
  File "/home/syt/lib/python2.1/site-packages/Ft/Xml/ReaderBase.py", line
225, in _handleStartElementNss
    raise "Namespaces in validating docs not supported"
Namespaces in validating docs not supported    

(notice I have disabled validation !)

Anybody has an idea ?

TIA

-- 
Sylvain Thenault

LOGILAB


From larsga@garshol.priv.no  Mon Oct 29 16:45:05 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 29 Oct 2001 17:45:05 +0100
Subject: [XML-SIG] pb with sax2 and name spaces
In-Reply-To: <Pine.LNX.4.21.0110291718140.741-100000@cygnus.logilab.fr>
References: <Pine.LNX.4.21.0110291718140.741-100000@cygnus.logilab.fr>
Message-ID: <m31yjm2y5q.fsf@lambda.garshol.priv.no>

* Sylvain Thenault
| 
| I'm trying to write a sax2 handler, but when I want to parse an xml file
| using :

Actually, it seems that you are using pDomlette, rather than making
your own SAX 2.0 handler.
 
|     from xml.sax import make_parser
|     from xml.sax.handler import feature_external_ges 
|     p = make_parser(["xml.sax.drivers2.drv_xmlproc"])
|     p.setFeature(feature_namespaces, 1)
|     p.setFeature(feature_validation, 0)
|     p.setFeature(feature_external_ges,1)
|     p.setContentHandler(handler)
|     p.parse(f)

You don't say where this "handler" is coming from, but from your
traceback it seems to be the pDomlette DOM builder.
 
|   File "parser.py", line 64, in startElementNS
|     self._handler.startElementNS(name, qname, attrs)
|   File "/home/syt/lib/python2.1/site-packages/Ft/Xml/pDomletteReader.py",
| line 238, in startElementNS
|     (foo,bar, baz) = self._handleStartElementNss(name, attribs)
|   File "/home/syt/lib/python2.1/site-packages/Ft/Xml/ReaderBase.py", line
| 225, in _handleStartElementNss
|     raise "Namespaces in validating docs not supported"
| Namespaces in validating docs not supported    

What version of 4Suite are you using? 
 
| (notice I have disabled validation !)

You have, and so this looks very much like a pDomlette bug to me.
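
For comparison, a minimal namespace-aware handler that does not involve
pDomlette or xmlproc at all. This is only a sketch using the standard
library's default driver (normally expat); the NameCollector class and the
sample document are made up for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Records the (namespace URI, local name) pair of each element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With feature_namespaces on, name is a (uri, localname) tuple.
        self.names.append(name)


handler = NameCollector()
p = xml.sax.make_parser()               # default driver, normally expat
p.setFeature(feature_namespaces, True)  # deliver startElementNS events
p.setContentHandler(handler)
p.parse(io.BytesIO(b'<r xmlns:n="urn:example"><n:item/><plain/></r>'))
print(handler.names)
```

If a handler like this works with your parser setup, the problem is more
likely in the DOM builder than in the SAX driver.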

--Lars M.


From fdrake@acm.org  Mon Oct 29 16:41:53 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 29 Oct 2001 11:41:53 -0500
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <20011029143623.A22623@vet.uu.nl>
References: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr>
 <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de>
 <20011027004454.A15534@vet.uu.nl>
 <200110270853.f9R8reN04087@mira.informatik.hu-berlin.de>
 <20011029143623.A22623@vet.uu.nl>
Message-ID: <15325.34513.890857.333427@grendel.zope.com>

Martijn Faassen writes:
 > namespaceURI of type DOMString, readonly, introduced in DOM Level 2
...
 > I actually appreciate the explicitness of namespaces in the DOM, even
 > though there's a mismatch with usage in XML. It simplifies implementation

  The catch with this is that the DOM ends up offering no integrity
assurances internally.  I can see where this would be a pain for some
applications, but it's really impossible to add this --- more
applications benefit from the freedom to make complicated edits and then
force a check (using normalizeNS() or other methods added in DOM Level
3) when the edits are complete.
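
  A short sketch of what "no integrity assurances" can look like in
practice, using the standard library's xml.dom.minidom as the example
implementation (other DOMs may differ); the namespace URI and the prefix
are made up.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

doc = parseString("<root/>")
root = doc.documentElement

# Nothing checks that the prefix "p" is declared anywhere in scope.
root.setAttributeNS("urn:example", "p:attr", "1")
serialized = root.toxml()

# The tree is usable in memory, but the serialized form carries an unbound
# prefix, so a namespace-aware parser rejects it.
try:
    parseString(serialized)
    reparsed_ok = True
except ExpatError:
    reparsed_ok = False

print(serialized, reparsed_ok)
```

  The edit itself goes through without complaint; the inconsistency only
shows up when something later checks or re-reads the result.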

 > a lot, which is badly needed as so many parts of the spec *complicate*
 > implementation (liveness issues are just one example).

  Those are a nice can of worms, aren't they?  I know how to address
them in ParsedXML.DOM, and plan to do so, but no time is currently
scheduled for that.

 > Zope's ParsedXML has a large DOM unit test suite which can be run against
 > 4DOM as well. I've recently been advocating getting this test suite out
 > of ParsedXML and into PyXML instead. This way we can make sure the Python
 > DOM implementations are up to spec much better.

  I'll take this as an opportunity to voice my support of any
initiative to split the ParsedXML DOM tests into a separate package so
that it may be more easily used without having to grab a ParsedXML
distribution.  Martijn Pieters put a *lot* of good work into those
tests, and we pulled a number of clarifications from the W3C to
achieve it.

 > Of course the test suite can contain wrong interpretations of the DOM
 > spec as well, but a lot of care was taken during the development of it,
 > and it can be further improved should problems appear.

  Yet another reason to split it out from ParsedXML, given the
availability of time to work on that project.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From Sylvain.Thenault@logilab.fr  Tue Oct 30 11:35:03 2001
From: Sylvain.Thenault@logilab.fr (Sylvain Thenault)
Date: Tue, 30 Oct 2001 12:35:03 +0100 (CET)
Subject: [XML-SIG] pb with sax2 and name spaces
In-Reply-To: <m31yjm2y5q.fsf@lambda.garshol.priv.no>
Message-ID: <Pine.LNX.4.21.0110301227210.741-100000@cygnus.logilab.fr>

On 29 Oct 2001, Lars Marius Garshol wrote:

> 
> * Sylvain Thenault
> | 
> | I'm trying to write a sax2 handler, but when I want to parse an xml file
> | using :
> 
> Actually, it seems that you are using pDomlette, rather than making
> your own SAX 2.0 handler.
> You don't say where this "handler" is coming from, but from your
> traceback it seems to be the pDomlette DOM builder.

in fact, I have a sax2 handler which may delegate work to another handler,
initialized on elements which are at a given level in the xml
tree. Depending on the node name, the handler should be a pDomlette handler
or not (I want to produce different objects from a single xml document).
I have attached the parser.py file to this mail if someone wants to take a
look ...
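
A stripped-down sketch of that delegation idea, written against the
standard library only. The Dispatcher and TagCounter names are hypothetical
and this is not the attached parser.py; a plain ContentHandler stands in
for the pDomlette handler.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TagCounter(ContentHandler):
    """Hypothetical sub-handler: counts the elements in one subtree."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


class Dispatcher(ContentHandler):
    """Hands each subtree rooted at depth `level` to a fresh sub-handler."""

    def __init__(self, level, factories, default):
        ContentHandler.__init__(self)
        self._target_level = level
        self._factories = factories   # element name -> handler class
        self._default = default       # used when the name is not listed
        self._depth = 0
        self._sub = None
        self.results = []             # (element name, finished sub-handler)

    def startElement(self, name, attrs):
        if self._depth == self._target_level:
            self._sub = self._factories.get(name, self._default)()
            self._sub.startDocument()
        if self._sub is not None:
            self._sub.startElement(name, attrs)
        self._depth += 1

    def endElement(self, name):
        self._depth -= 1
        if self._sub is not None:
            self._sub.endElement(name)
            if self._depth == self._target_level:
                self._sub.endDocument()
                self.results.append((name, self._sub))
                self._sub = None

    def characters(self, content):
        if self._sub is not None:
            self._sub.characters(content)


dispatcher = Dispatcher(level=1, factories={"recipe": TagCounter},
                        default=ContentHandler)
source = io.BytesIO(b"<all><recipe><step/><step/></recipe><other/></all>")
xml.sax.parse(source, dispatcher)
print([(name, type(h).__name__) for name, h in dispatcher.results])
```

Events outside any delegated subtree (here the root element) are simply
ignored, so closing the root does not touch a missing sub-handler.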
 
> |   File "parser.py", line 64, in startElementNS
> |     self._handler.startElementNS(name, qname, attrs)
> |   File "/home/syt/lib/python2.1/site-packages/Ft/Xml/pDomletteReader.py",
> | line 238, in startElementNS
> |     (foo,bar, baz) = self._handleStartElementNss(name, attribs)
> |   File "/home/syt/lib/python2.1/site-packages/Ft/Xml/ReaderBase.py", line
> | 225, in _handleStartElementNss
> |     raise "Namespaces in validating docs not supported"
> | Namespaces in validating docs not supported    
> 
> What version of 4Suite are you using? 

I have the same result with 0.11.1 and a cvssnapshot from 2 weeks ago.

> | (notice I have disabled validation !)
> 
> You have, and so this looks very much like a pDomlette bug to me.

I wonder whether I forgot something in the handler initialisation...
Note I managed to have this code working with the pulldom sax handler
instead of the pdomlette handler !

> --Lars M.

-- 
Sylvain Thenault

  LOGILAB           http://www.logilab.org


From Sylvain.Thenault@logilab.fr  Tue Oct 30 11:39:00 2001
From: Sylvain.Thenault@logilab.fr (Sylvain Thenault)
Date: Tue, 30 Oct 2001 12:39:00 +0100 (CET)
Subject: [XML-SIG] pb with sax2 and name spaces
In-Reply-To: <Pine.LNX.4.21.0110301227210.741-100000@cygnus.logilab.fr>
Message-ID: <Pine.LNX.4.21.0110301237500.741-200000@cygnus.logilab.fr>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

---1463757823-664713576-1004441940=:741
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Tue, 30 Oct 2001, Sylvain Thenault wrote:

> I have attached the parser.py file to this mail if someone want to take a
> look

oops, I forgot it! here it is ...

-- 
Sylvain Thenault

  LOGILAB           http://www.logilab.org


---1463757823-664713576-1004441940=:741
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="parser.py"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.21.0110301239000.741@cygnus.logilab.fr>
Content-Description: 
Content-Disposition: attachment; filename="parser.py"

IyBDb3B5cmlnaHQgKGMpIDIwMDAtMjAwMSBMT0dJTEFCIFMuQS4gKFBhcmlz
LCBGUkFOQ0UpLg0KIyBodHRwOi8vd3d3LmxvZ2lsYWIuZnIvIC0tIG1haWx0
bzpjb250YWN0QGxvZ2lsYWIuZnINCiMNCiMgVGhpcyBwcm9ncmFtIGlzIGZy
ZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBt
b2RpZnkgaXQgdW5kZXINCiMgdGhlIHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJh
bCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkgdGhlIEZyZWUgU29m
dHdhcmUNCiMgRm91bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiBvZiB0aGUg
TGljZW5zZSwgb3IgKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXINCiMgdmVy
c2lvbi4NCiMNCiMgVGhpcyBwcm9ncmFtIGlzIGRpc3RyaWJ1dGVkIGluIHRo
ZSBob3BlIHRoYXQgaXQgd2lsbCBiZSB1c2VmdWwsIGJ1dCBXSVRIT1VUDQoj
IEFOWSBXQVJSQU5UWTsgd2l0aG91dCBldmVuIHRoZSBpbXBsaWVkIHdhcnJh
bnR5IG9mIE1FUkNIQU5UQUJJTElUWSBvciBGSVRORVNTDQojIEZPUiBBIFBB
UlRJQ1VMQVIgUFVSUE9TRS4gU2VlIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMg
TGljZW5zZSBmb3IgbW9yZSBkZXRhaWxzLg0KIw0KIyBZb3Ugc2hvdWxkIGhh
dmUgcmVjZWl2ZWQgYSBjb3B5IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMg
TGljZW5zZSBhbG9uZyB3aXRoDQojIHRoaXMgcHJvZ3JhbTsgaWYgbm90LCB3
cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uLCBJbmMuLA0K
IyA1OSBUZW1wbGUgUGxhY2UgLSBTdWl0ZSAzMzAsIEJvc3RvbiwgTUEgIDAy
MTExLTEzMDcsIFVTQS4NCg0KX19yZXZpc2lvbl9fID0gJyRJZDokJw0KZnJv
bSBBTEFic3RyYWN0aW9uIGltcG9ydCBFbGVtZW50LCBSZWNpcGVFbGVtZW50
LCBBY3Rpb25FbGVtZW50LFwNCiAgICAgVHJhbnNmb3JtRWxlbWVudCwgU3Rl
cCwgVHJhbnNpdGlvbg0KZnJvbSB4bWwuc2F4IGltcG9ydCBDb250ZW50SGFu
ZGxlcg0KDQojICAjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjDQp0
cnk6DQogICAgZnJvbSBGdC5YbWwucERvbWxldHRlUmVhZGVyIGltcG9ydCBT
YXgySGFuZGxlcg0KZXhjZXB0Og0KICAgIGZyb20gRnQuTGliLnBEb21sZXR0
ZVJlYWRlciBpbXBvcnQgU2F4MkhhbmRsZXINCiNmcm9tIHhtbC5kb20ucHVs
bGRvbSBpbXBvcnQgU0FYMkRPTQ0KY2xhc3MgQUxEZWZhdWx0SGFuZGxlcihT
YXgySGFuZGxlcik6ICNTQVgyRE9NKToNCiAgICBkZWYgX19pbml0X18oc2Vs
Zik6DQogICAgICAgIFNheDJIYW5kbGVyLl9faW5pdF9fKHNlbGYpDQogICAg
ICAgIHNlbGYuaW5pdFN0YXRlKCkgIyBvd25lckRvYz1Ob25lLCBzdHJpcEVs
ZW1lbnRzPU5vbmUsIHJlZlVyaT0nJykNCiAgICAgICAgc2VsZi5lbHQgPSBF
bGVtZW50KCkNCiAgICAgICAgI1NBWDJET00uX19pbml0X18oc2VsZikNCiAg
ICAgICAgIyBzZWxmLmVsdCA9IA0KDQojICAjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjDQpjbGFzcyBBTFNheEhhbmRsZXIoQ29udGVudEhhbmRs
ZXIpOg0KICAgICIiIg0KICAgIE5hcnZhbCBlbGVtZW50cyBtYWluIHNheCBo
YW5kbGVyIChtYXkgcHJvZHVjZSBtb3JlZSB0aGFuIG9uZSBvYmplY3QgZnJv
bSBhDQogICAgc2luZ2xlIHhtbCBkb291bWVudA0KICAgICIiIg0KICAgIGRl
ZiBfX2luaXRfXyhzZWxmLCBsZXZlbD0wLCBkb21faGFuZGxlcl9jbGFzcz1B
TERlZmF1bHRIYW5kbGVyKToNCiAgICAgICAgc2VsZi5fZG9tX2hhbmRsZXIg
PSBkb21faGFuZGxlcl9jbGFzcw0KICAgICAgICBzZWxmLl9lbF9sZXZlbCA9
IGxldmVsDQogICAgICAgIHNlbGYuX2xldmVsID0gMA0KICAgICAgICAjIHJl
c3VsdHMgbGlzdCAoY29udGFpbnMgZXh0cmFjdGVkIGVsZW1lbnRzKQ0KICAg
ICAgICBzZWxmLmVsZW1lbnRzID0gW10NCiAgICAgICAgIyBzdGF0ZSBoYW5k
bGVyDQogICAgICAgIHNlbGYuX2hhbmRsZXIgPSBOb25lDQogICAgICAgICMg
bmFtZSBzcGFjZXMNCiAgICAgICAgI3NlbGYuX25hbWVzcGFjZXMgPSB7fQ0K
ICAgICAgICANCiAgICBkZWYgZ2V0X2hhbmRsZXIoc2VsZiwgbmFtZSk6DQog
ICAgICAgICMgRklYTUU6IGNvcnJlY3QgbmFtZSBzcGFjZXMgaGFuZGxpbmcg
Pw0KICAgICAgICBpZiB0eXBlKG5hbWUpIGlzIHR5cGUoKCkpOg0KICAgICAg
ICAgICAgbmFtZSA9IG5hbWVbMV0NCiAgICAgICAgcHJpbnQgJ3JldHVybiBo
YW5kbGVyIGZvcicsIG5hbWUNCiAgICAgICAgaWYgbmFtZSA9PSAncmVjaXBl
JzoNCiAgICAgICAgICAgIHJldHVybiBBTFJlY2lwZVNheEhhbmRsZXIoKQ0K
ICAgICAgICBpZiBuYW1lID09ICdhY3Rpb24nOg0KICAgICAgICAgICAgcmV0
dXJuIEFMQWN0aW9uU2F4SGFuZGxlcigpDQogICAgICAgIGlmIG5hbWUgPT0g
J3RyYW5zZm9ybSc6DQogICAgICAgICAgICByZXR1cm4gQUxUcmFuc2Zvcm1T
YXhIYW5kbGVyKCkNCiAgICAgICAgZWxzZToNCiAgICAgICAgICAgIHJldHVy
biBzZWxmLl9kb21faGFuZGxlcigpDQogICAgICAgIA0KICAgICAgICANCiAg
ICBkZWYgc3RhcnRFbGVtZW50KHNlbGYsIG5hbWUsIGF0dHJzKToNCiAgICAg
ICAgI3ByaW50ICdzdGFydCcgLCBuYW1lDQogICAgICAgIGlmIHNlbGYuX2Vs
X2xldmVsID09IHNlbGYuX2xldmVsOg0KICAgICAgICAgICAgc2VsZi5faGFu
ZGxlciA9IHNlbGYuZ2V0X2hhbmRsZXIobmFtZSkNCiAgICAgICAgICAgIHNl
bGYuX2hhbmRsZXIuc3RhcnREb2N1bWVudCgpDQogICAgICAgIGlmIHNlbGYu
X2hhbmRsZXI6DQogICAgICAgICAgICBzZWxmLl9oYW5kbGVyLnN0YXJ0RWxl
bWVudChuYW1lLCBhdHRycykNCiAgICAgICAgc2VsZi5fbGV2ZWwgKz0gMQ0K
ICAgIA0KICAgIGRlZiBzdGFydEVsZW1lbnROUyhzZWxmLCBuYW1lLCBxbmFt
ZSwgYXR0cnMpOg0KICAgICAgICBwcmludCAnc3RhcnROUyAlcyAocXVhbGlm
aWVkOiAlcyknICUgKG5hbWUsIHFuYW1lKQ0KICAgICAgICBpZiBzZWxmLl9l
bF9sZXZlbCA9PSBzZWxmLl9sZXZlbDoNCiAgICAgICAgICAgIHNlbGYuX2hh
bmRsZXIgPSBzZWxmLmdldF9oYW5kbGVyKG5hbWUpDQogICAgICAgICAgICBz
ZWxmLl9oYW5kbGVyLnN0YXJ0RG9jdW1lbnQoKQ0KICAgICAgICBpZiBzZWxm
Ll9oYW5kbGVyOg0KICAgICAgICAgICAgI2lmIHFuYW1lIGlzIE5vbmU6DQog
ICAgICAgICAgICAjICAgIHFuYW1lID0gJycNCiAgICAgICAgICAgIHNlbGYu
X2hhbmRsZXIuc3RhcnRFbGVtZW50TlMobmFtZSwgcW5hbWUsIGF0dHJzKQ0K
ICAgICAgICBzZWxmLl9sZXZlbCArPSAxDQoNCiAgICBkZWYgZW5kRWxlbWVu
dChzZWxmLCBuYW1lKToNCiAgICAgICAgc2VsZi5fbGV2ZWwgLT0gMQ0KICAg
ICAgICBpZiBzZWxmLl9oYW5kbGVyOg0KICAgICAgICAgICAgc2VsZi5faGFu
ZGxlci5lbmRFbGVtZW50KG5hbWUpDQogICAgICAgIGlmIHNlbGYuX2VsX2xl
dmVsID09IHNlbGYuX2xldmVsOg0KICAgICAgICAgICAgc2VsZi5faGFuZGxl
ci5lbmREb2N1bWVudCgpDQogICAgICAgICAgICBzZWxmLmVsZW1lbnRzLmFw
cGVuZChzZWxmLl9oYW5kbGVyLmVsdCkNCiAgICAgICAgICAgIHNlbGYuX2hh
bmRsZXIgPSBOb25lDQoNCiAgICBkZWYgZW5kRWxlbWVudE5TKHNlbGYsIG5h
bWUsIHFuYW1lKToNCiAgICAgICAgc2VsZi5fbGV2ZWwgLT0gMQ0KICAgICAg
ICBwcmludCAnZW5kTlMgJXMgKHF1YWxpZmllZDogJXMpJyAlIChuYW1lLCBx
bmFtZSkNCiAgICAgICAgaWYgc2VsZi5faGFuZGxlcjoNCiAgICAgICAgICAg
IGlmIHFuYW1lIGlzIE5vbmU6IHFuYW1lID0gJycNCiAgICAgICAgICAgIHNl
bGYuX2hhbmRsZXIuZW5kRWxlbWVudE5TKG5hbWUsIHFuYW1lKQ0KICAgICAg
ICBpZiBzZWxmLl9lbF9sZXZlbCA9PSBzZWxmLl9sZXZlbDoNCiAgICAgICAg
ICAgIHNlbGYuX2hhbmRsZXIuZW5kRG9jdW1lbnQoKQ0KICAgICAgICAgICAg
c2VsZi5lbGVtZW50cy5hcHBlbmQoc2VsZi5faGFuZGxlci5lbHQpDQogICAg
ICAgICAgICBzZWxmLl9oYW5kbGVyID0gTm9uZQ0KDQogICAgDQojICAjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjDQpjbGFzcyBBTFJlY2lwZVNh
eEhhbmRsZXIoQ29udGVudEhhbmRsZXIpOg0KICAgIGZyb20gQUxBYnN0cmFj
dGlvbiBpbXBvcnQgU3RlcCwgVHJhbnNpdGlvbg0KICAgIGRlZiBfX2luaXRf
XyhzZWxmKToNCiAgICAgICAgc2VsZi5fbm9kZV9zdGFjayA9IFtdDQogICAg
ICAgIHNlbGYuZWx0ID0gUmVjaXBlRWxlbWVudCgpDQogICAgICAgIA0KICAg
IGRlZiBzdGFydEVsZW1lbnQoc2VsZiwgbmFtZSwgYXR0cnMpOg0KICAgICAg
ICBpZiBuYW1lID09ICdzdGVwJzoNCiAgICAgICAgICAgIHNlbGYucyA9IFN0
ZXAoKQ0KICAgICAgICBlbGlmIG5hbWUgPT0gJ3RyYW5zaXRpb24nOg0KICAg
ICAgICAgICAgc2VsZi50ID0gVHJhbnNpdGlvbigpDQogICAgZGVmIHN0YXJ0
RWxlbWVudE5TKHNlbGYsIG5hbWUsIHFuYW1lLCBhdHRycyk6DQogICAgICAg
IHBhc3MNCiAgICBkZWYgZW5kRWxlbWVudChzZWxmLCBuYW1lKToNCiAgICAg
ICAgcGFzcw0KICAgICAgICAjc2VsZi5fbm9kZV9zdGFjay5wb3AoKQ0KICAg
IGRlZiBlbmRFbGVtZW50TlMoc2VsZiwgbmFtZSwgcW5hbWUsIGF0dHJzKToN
CiAgICAgICAgcGFzcw0KDQogICAgZGVmIGNoYXJhY3RlcnMoc2VsZiwgY2gp
Og0KICAgICAgICBwYXNzDQogICAgDQojICAjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjDQpjbGFzcyBBTEFjdGlvblNheEhhbmRsZXIoQ29udGVu
dEhhbmRsZXIpOg0KDQogICAgZGVmIF9faW5pdF9fKHNlbGYpOg0KICAgICAg
ICBzZWxmLl9ub2RlX3N0YWNrID0gW10NCiAgICAgICAgc2VsZi5lbHQgPSBB
Y3Rpb25FbGVtZW50KCkNCg0KICAgIGRlZiBzdGFydEVsZW1lbnQoc2VsZiwg
bmFtZSwgYXR0cnMpOg0KICAgICAgICBwYXNzDQoNCiAgICBkZWYgZW5kRWxl
bWVudChzZWxmLCBuYW1lKToNCiAgICAgICAgcGFzcw0KICAgICAgICAjc2Vs
Zi5fbm9kZV9zdGFjay5wb3AoKQ0KDQogICAgZGVmIHN0YXJ0RWxlbWVudE5T
KHNlbGYsIG5hbWUsIHFuYW1lLCBhdHRycyk6DQogICAgICAgIHBhc3MNCiAg
ICBkZWYgZW5kRWxlbWVudE5TKHNlbGYsIG5hbWUsIHFuYW1lLCBhdHRycyk6
DQogICAgICAgIHBhc3MNCiAgICBkZWYgY2hhcmFjdGVycyhzZWxmLCBjaCk6
DQogICAgICAgIHBhc3MNCiAgICANCiMgICMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMNCmNsYXNzIEFMVHJhbnNmb3JtU2F4SGFuZGxlcihDb250
ZW50SGFuZGxlcik6DQoNCiAgICBkZWYgX19pbml0X18oc2VsZik6ICMsIGRv
bV9oYW5kbGVyKToNCiAgICAgICAgc2VsZi5fbm9kZV9zdGFjayA9IFtdDQog
ICAgICAgICNzZWxmLmRvbV9oYW5kbGVyID0gZG9tX2hhbmRsZXINCiAgICAg
ICAgc2VsZi5lbHQgPSBUcmFuc2Zvcm1FbGVtZW50KCkNCg0KICAgIGRlZiBz
dGFydEVsZW1lbnQoc2VsZiwgbmFtZSwgYXR0cnMpOg0KICAgICAgICBwYXNz
DQogICAgDQogICAgZGVmIGVuZEVsZW1lbnQoc2VsZiwgbmFtZSk6DQogICAg
ICAgIHBhc3MNCiAgICBkZWYgc3RhcnRFbGVtZW50TlMoc2VsZiwgbmFtZSwg
cW5hbWUsIGF0dHJzKToNCiAgICAgICAgcGFzcw0KICAgIGRlZiBlbmRFbGVt
ZW50TlMoc2VsZiwgbmFtZSwgcW5hbWUsIGF0dHJzKToNCiAgICAgICAgcGFz
cw0KDQoNCiMgICMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMj
IyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMNCmlm
IF9fbmFtZV9fID09ICdfX21haW5fXyc6DQogICAgaW1wb3J0IHN5cw0KICAg
IGYgPSBvcGVuKHN5cy5hcmd2WzFdLCAncicpDQogICAgZnJvbSB4bWwuc2F4
IGltcG9ydCBtYWtlX3BhcnNlcg0KDQogICAgZnJvbSB4bWwucGFyc2VycyBp
bXBvcnQgZXhwYXQNCiAgICBmcm9tIHhtbC5zYXguaGFuZGxlciBpbXBvcnQg
ZmVhdHVyZV9uYW1lc3BhY2VzLCBcDQogICAgICAgICBmZWF0dXJlX3ZhbGlk
YXRpb24sIGZlYXR1cmVfbmFtZXNwYWNlX3ByZWZpeGVzDQoNCiAgICBoYW5k
bGVyID0gQUxTYXhIYW5kbGVyKGxldmVsPTEpDQoNCiAgICBmcm9tIHhtbC5z
YXggaW1wb3J0IG1ha2VfcGFyc2VyDQogICAgZnJvbSB4bWwuc2F4LmhhbmRs
ZXIgaW1wb3J0IGZlYXR1cmVfZXh0ZXJuYWxfZ2VzDQogICAgcCA9IG1ha2Vf
cGFyc2VyKFsieG1sLnNheC5kcml2ZXJzMi5kcnZfeG1scHJvYyJdKQ0KICAg
IHAuc2V0RmVhdHVyZShmZWF0dXJlX25hbWVzcGFjZXMsIDEpDQogICAgI3Au
c2V0RmVhdHVyZShmZWF0dXJlX25hbWVzcGFjZV9wcmVmaXhlcywgMSkNCiAg
ICBwLnNldEZlYXR1cmUoZmVhdHVyZV92YWxpZGF0aW9uLCAwKQ0KICAgIHAu
c2V0RmVhdHVyZShmZWF0dXJlX2V4dGVybmFsX2dlcywxKQ0KICAgIHAuc2V0
Q29udGVudEhhbmRsZXIoaGFuZGxlcikNCiAgICBwLnBhcnNlKGYpDQogICAg
DQo=
---1463757823-664713576-1004441940=:741--


From Sylvain.Thenault@logilab.fr  Tue Oct 30 11:44:39 2001
From: Sylvain.Thenault@logilab.fr (Sylvain Thenault)
Date: Tue, 30 Oct 2001 12:44:39 +0100 (CET)
Subject: [XML-SIG] pb with sax2 and name spaces
In-Reply-To: <Pine.LNX.4.21.0110301227210.741-100000@cygnus.logilab.fr>
Message-ID: <Pine.LNX.4.21.0110301243420.741-100000@cygnus.logilab.fr>

On Tue, 30 Oct 2001, Sylvain Thenault wrote:

> On 29 Oct 2001, Lars Marius Garshol wrote:
> 
> > 
> > * Sylvain Thenault
> > | 
> > | I'm trying to write a sax2 handler, but when I want to parse an xml file
> > | using :
> > 
> > Actually, it seems that you are using pDomlette, rather than making
> > your own SAX 2.0 handler.
> > You don't say where this "handler" is coming from, but from your
> > traceback it seems to be the pDomlette DOM builder.
> 
> in fact, I have a sax2 handler which may delegate work to another handler,
> initialized on elements which are on a given level in the xml
> tree. Depending on the node name, the handler should be a pDomlette handler
> or not (I want to produce different objects from a single xml document).
> I have attached the parser.py file to this mail if someone wants to take a
> look ...
>  
> > |   File "parser.py", line 64, in startElementNS
> > |     self._handler.startElementNS(name, qname, attrs)
> > |   File "/home/syt/lib/python2.1/site-packages/Ft/Xml/pDomletteReader.py",
> > | line 238, in startElementNS
> > |     (foo,bar, baz) = self._handleStartElementNss(name, attribs)
> > |   File "/home/syt/lib/python2.1/site-packages/Ft/Xml/ReaderBase.py", line
> > | 225, in _handleStartElementNss
> > |     raise "Namespaces in validating docs not supported"
> > | Namespaces in validating docs not supported    
> > 
> > What version of 4Suite are you using? 
> 
> I have the same result with 0.11.1 and a cvssnapshot from 2 weeks ago.
> 
> > | (notice I have disabled validation !)
> > 
> > You have, and so this looks very much like a pDomlette bug to me.
> 
> I wonder if i didn't believe something in the handler initialisation...
                       ^^^^^^^ I mean _forgive_, of course

-- 
Sylvain Thenault

  LOGILAB           http://www.logilab.org


From larsga@garshol.priv.no  Tue Oct 30 12:07:16 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 30 Oct 2001 13:07:16 +0100
Subject: [XML-SIG] pb with sax2 and name spaces
In-Reply-To: <Pine.LNX.4.21.0110301243420.741-100000@cygnus.logilab.fr>
References: <Pine.LNX.4.21.0110301243420.741-100000@cygnus.logilab.fr>
Message-ID: <m3r8rll4az.fsf@lambda.garshol.priv.no>

* Sylvain Thenault
|
| > I wonder if i didn't believe something in the handler initialisation...
|                        ^^^^^^^ I mean _forgive_, of course

Actually, I'm beginning to suspect that you are confusing the words
"forgive" and "forget". 

To forgive means to pardon for a wrong someone has done, "to give up
resentment" when presented with an excuse.

To forget means to overlook something one should have done or to no
longer remember something one used to know.

Probably it was the latter you meant, or else there was a really
serious bug somewhere. :-)

--Lars M.


From Sylvain.Thenault@logilab.fr  Tue Oct 30 12:22:44 2001
From: Sylvain.Thenault@logilab.fr (Sylvain Thenault)
Date: Tue, 30 Oct 2001 13:22:44 +0100 (CET)
Subject: [XML-SIG] pb with sax2 and name spaces
In-Reply-To: <m3r8rll4az.fsf@lambda.garshol.priv.no>
Message-ID: <Pine.LNX.4.21.0110301321340.741-100000@cygnus.logilab.fr>

On 30 Oct 2001, Lars Marius Garshol wrote:

> 
> * Sylvain Thenault
> |
> | > I wonder if i didn't believe something in the handler initialisation...
> |                        ^^^^^^^ I mean _forgive_, of course
> 
> Actually, I'm beginning to suspect that you are confusing the words
> "forgive" and "forget". 
> 
> To forgive means to pardon for a wrong someone has done, "to give up
> resentment" when presented with an excuse.
> 
> To forget means to overlook something one should have done or to no
> longer remember something one used to know.
> 
> Probably it was the latter you meant, or else there was a really
> serious bug somewhere. :-)
>
You're right! I have to get out my English/French dictionary  ;)

> --Lars M.
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig
> 

-- 
Sylvain Thenault

  LOGILAB           http://www.logilab.org


From faassen@vet.uu.nl  Tue Oct 30 13:20:54 2001
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Tue, 30 Oct 2001 14:20:54 +0100
Subject: [XML-SIG] Is this a bug?
In-Reply-To: <15325.34513.890857.333427@grendel.zope.com>
References: <200110261540.f9QFeM329836@chinon.cnrs-orleans.fr> <200110261840.f9QIeWl01358@mira.informatik.hu-berlin.de> <20011027004454.A15534@vet.uu.nl> <200110270853.f9R8reN04087@mira.informatik.hu-berlin.de> <20011029143623.A22623@vet.uu.nl> <15325.34513.890857.333427@grendel.zope.com>
Message-ID: <20011030142054.A26159@vet.uu.nl>

Fred L. Drake, Jr. wrote:
> 
> Martijn Faassen writes:
>  > namespaceURI of type DOMString, readonly, introduced in DOM Level 2
> ...
>  > I actually appreciate the explicitness of namespaces in the DOM, even
>  > though there's a mismatch with usage in XML. It simplifies implementation
> 
>   The catch with this is that the DOM ends up offering no integrity
> assurances internally.  I can see where this would be a pain for some
> applications, but it's really impossible to add this --- more
> applications benefit from the freedom to make complicated edits and then
> force a check (using normalizeNS() or other methods added in DOM Level
> 3) when the edits are complete.

Right, I agree. That's the 'mismatch' I was talking about; you can't really
check whether a prefix really refers to a declared namespace currently,
etc.
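
To make the mismatch concrete, here is a minimal sketch. It assumes the
standard library's xml.dom.minidom purely for illustration (ParsedXML and
4DOM may differ in detail), and the helper names are made up. The DOM
builds and serializes an element whose prefix was never declared; only a
namespace-aware re-parse of the output notices.

```python
from xml.dom.minidom import getDOMImplementation, parseString
from xml.parsers.expat import ExpatError


def build_unchecked_tree():
    """Build a tree whose 'ex' prefix is never declared with xmlns:ex."""
    doc = getDOMImplementation().createDocument(None, "root", None)
    # The namespace URI is stored on the node itself; nothing checks that
    # the 'ex' prefix is bound anywhere in the tree.
    item = doc.createElementNS("urn:example", "ex:item")
    doc.documentElement.appendChild(item)
    return doc, item


def reparses_cleanly(xml_text):
    """Return True if a namespace-aware parse accepts the serialized text."""
    try:
        parseString(xml_text)
    except ExpatError:
        return False
    return True


doc, item = build_unchecked_tree()
serialized = doc.documentElement.toxml()
print(item.namespaceURI)             # the node knows its namespace
print(serialized)                    # but no xmlns:ex declaration is written
print(reparses_cleanly(serialized))  # so the output is not namespace-well-formed
```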
 
>  > a lot, which is badly needed as so many parts of the spec *complicate*
>  > implementation (liveness issues are just one example).
> 
>   Those are a nice can of worms, aren't they?  I know how to address
> them in ParsedXML.DOM, and plan to do so, but no time is currently
> scheduled for that.

I'm willing to live with 'dead' getElementsByTagName eternally; liveness
was a serious design bug in my opinion (for backwards compatibility I
believe, but still).
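
For readers unfamiliar with the liveness issue: the DOM spec wants the
NodeList returned by getElementsByTagName to track later changes to the
tree. A minimal sketch of what a 'dead' list looks like, assuming the
standard library's xml.dom.minidom (which computes the list once, at call
time):

```python
from xml.dom.minidom import parseString

doc = parseString("<root><item/></root>")
snapshot = doc.getElementsByTagName("item")  # computed once, at call time

# Mutate the tree after taking the list.
doc.documentElement.appendChild(doc.createElement("item"))

stale_count = len(snapshot)                          # the old result
fresh_count = len(doc.getElementsByTagName("item"))  # a new query sees both
print(stale_count, fresh_count)
```

A live implementation would have to report the same count in both places,
which is where the implementation cost comes from.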

>  > Zope's ParsedXML has a large DOM unit test suite which can be run against
>  > 4DOM as well. I've recently been advocating getting this test suite out
>  > of ParsedXML and into PyXML instead. This way we can make sure the Python
>  > DOM implementations are up to spec much better.
> 
>   I'll take this as an opportunity to voice my support of any
> initiative to split the ParsedXML DOM tests into a separate package so
> that it may be more easily used without having to grab a ParsedXML
> distribution.  Martijn Pieters put a *lot* of good work into those
> tests, and we pulled a number of clarifications from the W3C to
> achieve it.

Agreed, this is just too good a piece of work to let it idle away hidden
inside ParsedXML.
 
>  > Of course the test suite can contain wrong interpretations of the DOM
>  > spec as well, but a lot of care was taken during the development of it,
>  > and it can be further improved should problems appear.
> 
>   Yet another reason to split it out from ParsedXML, given the
> availability of time to work on that project.

Okay, I'm glad I have your support! So, where should I be sending this
code? :)

Regards,

Martijn


From stuartd@alerton.com  Tue Oct 30 17:32:21 2001
From: stuartd@alerton.com (Stuart Donaldson)
Date: Tue, 30 Oct 2001 09:32:21 -0800
Subject: [XML-SIG] Attaching files sent to the list.
Message-ID: <A19EEC21DB90D411B40900D0B7B4F8E7369434@alermx.alerton.com>

Please do not "attach" files such that they are mime-encoded to the list.

For those of us that get this mailing list in a digest format, this makes it
much more difficult to see the file. We get a mail message with a bunch of
gibberish included inline.

Thanks...
-S-


From uche.ogbuji@fourthought.com  Tue Oct 30 20:15:17 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 13:15:17 -0700
Subject: [XML-SIG] Tutorials on 4Suite/PyXML DOM and XSLT
Message-ID: <3BDF0A55.3F0DABF4@fourthought.com>

By Chimezie and me.  Free registration required at IBM developerWorks.

http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/28BEDEE3E7219EB386256AE300743B69?OpenDocument
http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/BE1A7E60838F9F7686256AF400523C58?OpenDocument

There will be more of these tutorials, pretty much covering most of
4Suite by the time we're done.

Please do pass this on.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From uche.ogbuji@fourthought.com  Tue Oct 30 20:37:53 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 13:37:53 -0700
Subject: [XML-SIG] Attribute namespace bug
In-Reply-To: Message from "Juergen Hermann" <jh@web.de>
 of "Fri, 31 Aug 2001 20:42:23 +0200." <m15ctEg-007psuC@smtp.web.de>
Message-ID: <200110302037.f9UKbre04986@localhost.localdomain>

> Hi!
> 
> This fixes the bug I found, can I commit it?
> 
> Index: saxutils.py
> ===================================================================
> RCS file: /cvsroot/pyxml/xml/xml/sax/saxutils.py,v
> retrieving revision 1.20
> diff -u -r1.20 saxutils.py
> --- saxutils.py 2001/07/19 16:15:44     1.20
> +++ saxutils.py 2001/08/31 18:40:09
> @@ -205,7 +205,13 @@
>          self._undeclared_ns_maps = []
> 
>          for (name, value) in attrs.items():
> -            name = self._current_context[name[0]] + ":" + name[1]
> +            if name[0] is None:
> +                name = name[1]
> +            elif self._current_context[name[0]] is None:
> +                # default namespace
> +                name = name[1]
> +            else:
> +                name = self._current_context[name[0]] + ":" + name[1]
>              self._out.write(' %s="%s"' % (name, escape(value)))
>          self._out.write('>')

Whoa!  I'm way behind, and still catching up, but I hope someone stopped you 
from committing this.  You would have introduced a bug because unqualified 
attributes are *not* in the default namespace.
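
The point can be checked with a small SAX sketch. It assumes the standard
library's expat-based xml.sax parser with feature_namespaces switched on;
the class and function names are hypothetical. An unprefixed attribute is
reported with a namespace of None even when the element itself sits in a
default namespace.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Record the (uri, localname) pairs of elements and their attributes."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        # attrs.keys() yields (namespace_uri, local_name) tuples.
        self.seen.append((name, list(attrs.keys())))


def collect_names(xml_bytes):
    """Parse xml_bytes in namespace mode and return the recorded names."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.seen


names = collect_names(b'<doc xmlns="urn:x" id="1"/>')
print(names)
```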


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From Juergen Hermann" <jh@web.de  Tue Oct 30 20:53:44 2001
From: Juergen Hermann" <jh@web.de (Juergen Hermann)
Date: Tue, 30 Oct 2001 21:53:44 +0100
Subject: [XML-SIG] Attribute namespace bug
In-Reply-To: <200110302037.f9UKbre04986@localhost.localdomain>
Message-ID: <E15yfrM-0006E9-00@smtp.web.de>

On Tue, 30 Oct 2001 13:37:53 -0700, Uche Ogbuji wrote:

>Whoa!  I'm way behind, and still catching up, but I hope someone stopped you
>from committing this.  You would have introduced a bug because unqualified
>attributes are *not* in the default namespace.

Too late. ;)

>          for (name, value) in attrs.items():
> -            name = self._current_context[name[0]] + ":" + name[1]
> +            if name[0] is None:
> +                name = name[1]
> +            elif self._current_context[name[0]] is None:
> +                # default namespace
> +                name = name[1]
> +            else:
> +                name = self._current_context[name[0]] + ":" + name[1]
>              self._out.write(' %s="%s"' % (name, escape(value)))

But do you agree that it is wrong to always try to echo a namespace prefix,
even if an attribute originally has none?

Maybe just my comment is wrong (s/# default namespace/# no namespace/), or the
elif is wrong, but the assumption that "self._current_context.has_key(None)" is
always true is also wrong.


Ciao, Jürgen


From uche.ogbuji@fourthought.com  Tue Oct 30 21:02:02 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 14:02:02 -0700
Subject: [XML-SIG] Attribute namespace bug
In-Reply-To: Message from "Juergen Hermann" <jh@web.de>
 of "Tue, 30 Oct 2001 21:53:44 +0100." <E15yfrM-0006E9-00@smtp.web.de>
Message-ID: <200110302102.f9UL22F05220@localhost.localdomain>

> On Tue, 30 Oct 2001 13:37:53 -0700, Uche Ogbuji wrote:
>
> >Whoa!  I'm way behind, and still catching up, but I hope someone stopped you
> >from committing this.  You would have introduced a bug because unqualified
> >attributes are *not* in the default namespace.
>
> Too late. ;)
>
> >          for (name, value) in attrs.items():
> > -            name = self._current_context[name[0]] + ":" + name[1]
> > +            if name[0] is None:
> > +                name = name[1]
> > +            elif self._current_context[name[0]] is None:
> > +                # default namespace
> > +                name = name[1]
> > +            else:
> > +                name = self._current_context[name[0]] + ":" + name[1]
> >              self._out.write(' %s="%s"' % (name, escape(value)))
>
> But do you agree that it is wrong to always try to echo a namespace prefix,
> even if an attribute originally has none?
>
> Maybe just my comment is wrong (s/# default namespace/# no namespace/), or the
> elif is wrong, but the assumption that "self._current_context.has_key(None)" is
> always true is also wrong.

You're right that your fix is mostly right, but you didn't quite go far enough.

I've pored over the code a bit, and here's what I'm about to check in.  What
do you think?

diff -u -r1.23 saxutils.py
--- xml/sax/saxutils.py	2001/09/27 21:42:28	1.23
+++ xml/sax/saxutils.py	2001/10/30 21:00:45
@@ -155,6 +155,8 @@
     def _outputwrapper(stream,encoding):
         return stream
 
+GENERATED_PREFIX = "genprefix%s"
+
 class XMLGenerator(handler.ContentHandler):
 
     def __init__(self, out=None, encoding="iso-8859-1"):
@@ -167,6 +169,8 @@
         self._current_context = self._ns_contexts[-1]
         self._undeclared_ns_maps = []
         self._encoding = encoding
+        self._generated_prefix_ctr = 0
+        return
 
     # ContentHandler methods
 
@@ -214,7 +218,13 @@
                 name = name[1]
             elif self._current_context[name[0]] is None:
                 # default namespace
-                name = name[1]
+                #If an attribute has a nsuri but not a prefix, we must
+                #create a prefix and add a nsdecl
+                prefix = GENERATED_PREFIX % self._generated_prefix_ctr
+                self._generated_prefix_ctr = self._generated_prefix_ctr + 1
+                name = prefix + ':' + name[1]
+                self._out.write(' xmlns:%s=%s' % (prefix, name[0]))
+                self._current_context[name[0]] = prefix
             else:
                 name = self._current_context[name[0]] + ":" + name[1]
             self._out.write(' %s=%s' % (name, quoteattr(value)))


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From uche.ogbuji@fourthought.com  Tue Oct 30 21:16:23 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 14:16:23 -0700
Subject: [XML-SIG] 4XSLT Performance Problems with Large Files
In-Reply-To: Message from "Thomas B. Passin" <tpassin@home.com>
 of "Sat, 15 Sep 2001 01:07:15 EDT." <000301c13da4$4a211ee0$7cac1218@reston1.va.home.com>
Message-ID: <200110302116.f9ULGNf05255@localhost.localdomain>

> Here is the memory used by the various processors during processing.
> 
> processor       decrease in free memory, MB
> msxsl                    17
> saxon                    21
> 4xslt.py                 32                   (Py 1.5.2)
> 4xslt.py                 45                   (Py 2.1.1)
> 4xslt.bat             > 194
> 
> The 167 seconds using my script is not acceptable for my particular
> application, but the behavior when the transformation is launched by
> 4xslt.bat is impossible.  Why should the very same transformation take ten
> times the memory that msxsl or Saxon use?  You can't have an application run
> down your memory like this.  And I don't even know how much virtual memory
> was used on top of the 194 MB.  These results have been reasonably
> repeatable tonight.
> 
> I hope something can be done to improve the performance and memory usage for
> large files.  How about it, Uche and Mike?  Any thoughts about what is
> happening here?

I'm late to this as well, but I bet in your 4xslt.py files you used cDomlette 
rather than pDomlette.  pDomlette uses ridiculous amounts of memory.  
cDomlette is much better behaved.  cDomlette will be the default in 4Suite 
0.12.0, and pDomlette will probably be merged with minidom or eliminated.


> I'll be happy to send my files for testing if anyone likes.  The source file
> is pretty horrid ( I don't have any control over that, I'm afraid).  It has
> very long paths, and the element names are extremely long, the result of
> machine translation of some CORBA IDL.

This would be great for testing.  Please do send along.

Thanks.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From Juergen Hermann" <jh@web.de  Tue Oct 30 21:28:44 2001
From: Juergen Hermann" <jh@web.de (Juergen Hermann)
Date: Tue, 30 Oct 2001 22:28:44 +0100
Subject: [XML-SIG] Attribute namespace bug
In-Reply-To: <200110302102.f9UL22F05220@localhost.localdomain>
Message-ID: <E15ygPI-0002kJ-00@smtp.web.de>

On Tue, 30 Oct 2001 14:02:02 -0700, Uche Ogbuji wrote:

>--- xml/sax/saxutils.py	2001/09/27 21:42:28	1.23
>+++ xml/sax/saxutils.py	2001/10/30 21:00:45
>@@ -155,6 +155,8 @@
>     def _outputwrapper(stream,encoding):
>         return stream
> 
>+GENERATED_PREFIX = "genprefix%s"
>+
> class XMLGenerator(handler.ContentHandler):

===> use class attribute, less namespace pollution,
use self.GENERATED_PREFIX below!

class XMLGenerator(handler.ContentHandler):
    GENERATED_PREFIX = "genprefix%d"

Also, do people realize where "genprefix" comes from when
it "suddenly pops up"?! Thus, maybe:

    GENERATED_PREFIX = "xml.sax.saxutils.prefix%d"

(it _is_ long, but it should occur in rare cases only anyway)

>             elif self._current_context[name[0]] is None:
>                 # default namespace
>-                name = name[1]
>+                #If an attribute has a nsuri but not a prefix, we must
>+                #create a prefix and add a nsdecl
>+                prefix = GENERATED_PREFIX % self._generated_prefix_ctr
>+                self._generated_prefix_ctr = self._generated_prefix_ctr + 1
>+                name = prefix + ':' + name[1]
>+                self._out.write(' xmlns:%s=%s' % (prefix, name[0]))

This looks like a bug to me, fix:

                  self._out.write(' xmlns:%s=%s' % (prefix, quoteattr(name[0])))

>+                self._current_context[name[0]] = prefix
>             else:
>                 name = self._current_context[name[0]] + ":" + name[1]
>             self._out.write(' %s=%s' % (name, quoteattr(value)))
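
Pulling the pieces of this thread together, the attribute-naming rule
could be sketched roughly as below. This is a standalone illustration, not
the code that went into saxutils.py: name_attributes and its parameters
are hypothetical, only quoteattr is a real xml.sax.saxutils function, and
the sketch handles a single start tag without checking whether a generated
prefix collides with one already in use. It keeps the URI and local name
in separate variables so the URI is still at hand when the declaration is
written.

```python
from xml.sax.saxutils import quoteattr

GENERATED_PREFIX = "genprefix%d"


def name_attributes(attr_names, context, counter=0):
    """Return (qualified_names, declarations) for one start tag.

    attr_names -- list of SAX2 (namespace_uri, local_name) pairs
    context    -- dict mapping namespace URI to prefix; a value of None
                  means the URI is bound only as the default namespace
    counter    -- first number to use for generated prefixes
    """
    qualified = []
    declarations = []
    generated = {}  # prefixes invented for this tag only
    for uri, local in attr_names:
        if uri is None:
            # Unqualified attribute: in no namespace, so no prefix.
            qualified.append(local)
            continue
        prefix = context.get(uri) or generated.get(uri)
        if prefix is None:
            # The default namespace does not apply to attributes, so a
            # prefix has to be invented and declared.
            prefix = GENERATED_PREFIX % (counter + len(generated))
            generated[uri] = prefix
            declarations.append(" xmlns:%s=%s" % (prefix, quoteattr(uri)))
        qualified.append(prefix + ":" + local)
    return qualified, declarations


context = {"urn:x": None, "urn:y": "y"}
attrs = [(None, "id"), ("urn:x", "a"), ("urn:x", "b"), ("urn:y", "c")]
qualified, declarations = name_attributes(attrs, context)
print(qualified)
print(declarations)
```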


Ciao, Jürgen


From uche.ogbuji@fourthought.com  Tue Oct 30 21:38:31 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 14:38:31 -0700
Subject: [XML-SIG] Attribute namespace bug
In-Reply-To: Message from "Juergen Hermann" <jh@web.de>
 of "Tue, 30 Oct 2001 22:28:44 +0100." <E15ygPI-0002kJ-00@smtp.web.de>
Message-ID: <200110302138.f9ULcWc05387@localhost.localdomain>

Checked in with your suggestions.  Thanks.

--Uche


> On Tue, 30 Oct 2001 14:02:02 -0700, Uche Ogbuji wrote:
>
> >--- xml/sax/saxutils.py	2001/09/27 21:42:28	1.23
> >+++ xml/sax/saxutils.py	2001/10/30 21:00:45
> >@@ -155,6 +155,8 @@
> >     def _outputwrapper(stream,encoding):
> >         return stream
> >
> >+GENERATED_PREFIX = "genprefix%s"
> >+
> > class XMLGenerator(handler.ContentHandler):
>
> ===> use class attribute, less namespace pollution,
> use self.GENERATED_PREFIX below!
>
> class XMLGenerator(handler.ContentHandler):
>     GENERATED_PREFIX = "genprefix%d"
>
> Also, do people realize where "genprefix" comes from when
> it "suddenly pops up"?! Thus, maybe:
>
>     GENERATED_PREFIX = "xml.sax.saxutils.prefix%d"
>
> (it _is_ long, but it should occur in rare cases only anyway)
>
> >             elif self._current_context[name[0]] is None:
> >                 # default namespace
> >-                name = name[1]
> >+                #If an attribute has a nsuri but not a prefix, we must
> >+                #create a prefix and add a nsdecl
> >+                prefix = GENERATED_PREFIX % self._generated_prefix_ctr
> >+                self._generated_prefix_ctr = self._generated_prefix_ctr + 1
> >+                name = prefix + ':' + name[1]
> >+                self._out.write(' xmlns:%s=%s' % (prefix, name[0]))
>
> This looks like a bug to me, fix:
>
>                   self._out.write(' xmlns:%s=%s' % (prefix, quoteattr(name[0])))
>
> >+                self._current_context[name[0]] = prefix
> >             else:
> >                 name = self._current_context[name[0]] + ":" + name[1]
> >             self._out.write(' %s=%s' % (name, quoteattr(value)))
>
>
> Ciao, Jürgen

From uche.ogbuji@fourthought.com  Tue Oct 30 21:42:03 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 14:42:03 -0700
Subject: [XML-SIG] Using xpath/xslt on proprietary object structures.
In-Reply-To: Message from Alexandre Fayolle <Alexandre.Fayolle@logilab.fr>
 of "Thu, 20 Sep 2001 17:42:08 +0200." <Pine.LNX.4.21.0109201741040.22365-100000@orion.logilab.fr>
Message-ID: <200110302142.f9ULg3I05407@localhost.localdomain>

> On Thu, 20 Sep 2001, Thomas B. Passin wrote:
> 
> > [Alexandre Fayolle]
> > >
> > > You don't give much details on what your application is made of. However,
> > > if you're using 4DOM, try switching to pDomlette (or even to cDomlette),
> > > which are both more lightweight and faster implementations of DOM (though
> > > not as compliant as 4DOM)
> > >
> > Considering the memory usage issues with pDomlette that I posted about last
> > week, I'd say you would want to try cDomlette instead, based on what Alan
> > Kennedy said about his system.
> 
> Sure, but, AFAIK, cDomlette is still read-only, so if he needs to change
> things in the DOM during processing, cDomlette won't let him. 

Just so everyone knows, cDomlette will be read/write (well, with limitations) 
in 4Suite 0.12.0.  This is Chimezie's main task.  My own main task is 
completing the external entity and XInclude support, and building in some very 
exciting XPath facilities that will result in some huge speed-ups in 4Suite.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From tpassin@home.com  Tue Oct 30 22:17:04 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Tue, 30 Oct 2001 17:17:04 -0500
Subject: [XML-SIG] 4XSLT Performance Problems with Large Files
References: <200110302116.f9ULGNf05255@localhost.localdomain>
Message-ID: <003801c16190$9b027920$7cac1218@cj64132b>

[Uche Ogbuji]

>
> > Here is the memory used by the various processors during processing.
> >
> > processor       decrease in free memory, MB
> > msxsl                    17
> > saxon                    21
> > 4xslt.py                 32                   (Py 1.5.2)
> > 4xslt.py                 45                   (Py 2.1.1)
> > 4xslt.bat             > 194
> >
> > The 167 seconds using my script is not acceptable for my particular
> > application, but the behavior when the transformation is launched by
> > 4xslt.bat is impossible.  Why should the very same transformation take ten
> > times the memory that msxsl or Saxon use?  You can't have an application run
> > down your memory like this.  And I don't even know how much virtual memory
> > was used on top of the 194 MB.  These results have been reasonably
> > repeatable tonight.
> >
> > I hope something can be done to improve the performance and memory usage for
> > large files.  How about it, Uche and Mike?  Any thoughts about what is
> > happening here?
>
> I'm late to this as well, but I bet in your 4xslt.py files you used cDomlette
> rather than pDomlette.  pDomlette uses ridiculous amounts of memory.
> cDomlette is much better behaved.  cDomlette will be the default in 4Suite
> 0.12.0, and pDomlette will probably be merged with minidom or eliminated.
>
>

Yes, I later verified that it was pDomlette that was associated with the
disastrously large memory usage and low speed.

Cheers,

Tom P


From Juergen Hermann" <jh@web.de  Tue Oct 30 23:38:30 2001
From: Juergen Hermann" <jh@web.de (Juergen Hermann)
Date: Wed, 31 Oct 2001 00:38:30 +0100
Subject: [XML-SIG] Binary files in CVS
Message-ID: <E15yiQt-00062P-00@smtp.web.de>

Hi!

When you add binary data to the repository, please don't forget the -kb
flag. I used "cvs admin -kb" on these:

doc/xmlproc/basicapi.gif
doc/xmlproc/cmdline.gif
doc/xmlproc/wxval.gif
xml/dom/de/LC_MESSAGES/4Suite.mo
xml/dom/fr_FR/LC_MESSAGES/4Suite.mo


Ciao, Jürgen


From uche.ogbuji@fourthought.com  Wed Oct 31 02:27:14 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 19:27:14 -0700
Subject: [XML-SIG] Using xpath/xslt on proprietary object structures.
In-Reply-To: Message from Xhaus Main Account <pyxml@xhaus.com>
 of "Fri, 21 Sep 2001 09:01:19 BST." <3BAAF3CF.83F2A749@xhaus.com>
Message-ID: <200110310227.f9V2REA07568@localhost.localdomain>

> Thanks for the tips Alexander.
> 
> However, from what I can see, pDomlette and cDomlette are not suitable for my
> needs due to incomplete DOM support (I'll briefly explain my app below).

[snip]

> Brief summary of the app:
> 
> I have ~250 members of a scientific association, all of whom have contact
> details, and lists of publications. I have to generate a home page for each
> member, "directory lists" for all members by name, by country (of which there
> are 24) and by "Research speciality", of which there are 9.

[snip]

Looks as if most of this can be done using XInclude and XSLT.  It is an 
interesting usage pattern indeed.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From uche.ogbuji@fourthought.com  Wed Oct 31 02:40:11 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 19:40:11 -0700
Subject: [XML-SIG] Moving exceptions under Exception in yappsrt.py
In-Reply-To: Message from "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
 of "Wed, 03 Oct 2001 18:47:37 +0200." <200110031647.f93Glbq04571@mira.informatik.hu-berlin.de>
Message-ID: <200110310240.f9V2eBf07607@localhost.localdomain>

> > I sent mail to the YAPPS author a couple of days ago but haven't get got
> > a reply.  Any object to moving the SyntaxError classes in yappsrt.py and
> > pyxpath.py so that they inherit from Exception?
> 
> No, go ahead. Please note that I have just committed the 0.11.1
> changes. In the process, I noticed two things:
> - the XPathParser is now a pure Python (even though a generated file);
>   so it is debatable whether pyxpath should be maintained (although
>   I'm in favour of that).
> - your changes to include CDATA_SECTION in a couple of places apparently
>   have not been integrated into 4Suite. I'll try to maintain them after
>   each merge, there is always the potential that they'll break unless
>   they get synchronized with 4Suite.

I'd like to see these diffs in order to evaluate them.  I think I remember 
being uncertain of the overall idea.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From Juergen Hermann" <jh@web.de  Wed Oct 31 03:01:44 2001
From: Juergen Hermann" <jh@web.de (Juergen Hermann)
Date: Wed, 31 Oct 2001 04:01:44 +0100
Subject: [XML-SIG] Moving exceptions under Exception in yappsrt.py
In-Reply-To: <200110310240.f9V2eBf07607@localhost.localdomain>
Message-ID: <E15ylbs-0007u0-00@smtp.web.de>

On Tue, 30 Oct 2001 19:40:11 -0700, Uche Ogbuji wrote:

>> - your changes to include CDATA_SECTION in a couple of places...

As a general note, while we're talking about CDATA, please use code 
like this:

'<![CDATA[%s]]>' % string.replace(markup, ']]>', ']]>]]&gt;<![CDATA[') 

to emit "markup" as a CDATA section. Most people tend to forget to 
handle the case that ]]> appears in the text.
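
The same idea as a minimal sketch for newer Pythons, where str.replace
takes the place of string.replace. The function names are hypothetical,
and the round trip through xml.etree.ElementTree is only there to show
that the parser hands back the original text.

```python
import xml.etree.ElementTree as ET


def cdata_section(text):
    """Wrap text in a CDATA section, splitting around any embedded ']]>'."""
    # Each ']]>' closes the current section, is emitted as ordinary
    # escaped text, and then a fresh section is opened for the rest.
    return "<![CDATA[%s]]>" % text.replace("]]>", "]]>]]&gt;<![CDATA[")


def roundtrip(text):
    """Parse the CDATA form inside a dummy element and return its text."""
    return ET.fromstring("<r>%s</r>" % cdata_section(text)).text


sample = "a ]]> b <tag> & c"
print(cdata_section(sample))
print(roundtrip(sample) == sample)
```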


Ciao, Jürgen


From uche.ogbuji@fourthought.com  Wed Oct 31 03:23:10 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 30 Oct 2001 20:23:10 -0700
Subject: [XML-SIG] FromXml question
In-Reply-To: Message from markus jais <info@mjais.de>
 of "Sun, 28 Oct 2001 10:49:05 +0100." <E15xma3-0006Nz-00@mrvdom01.schlund.de>
Message-ID: <200110310323.f9V3NAJ07812@localhost.localdomain>

> hello
> 
> I just downloaded the new PyXML tutorial about DOM from IBM's developerworks
> and read, that 
> FromXmlStream
> FromXml
> FromXmlFile
> FromXmlUrl

Yikes!!!!

OK, one historical note that I was going to keep under wraps:

Chime and I actually wrote these tutorials last year.  The editors of IBM dW 
were just able to get it all vetted and posted last week.  We did browse 
through it to check that there was nothing that had changed in the interim, 
but it looks as if we missed this part.

We'll ask the editors to fix it.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From pyxml@xhaus.com  Wed Oct 31 08:40:46 2001
From: pyxml@xhaus.com (Alan Kennedy)
Date: Wed, 31 Oct 2001 08:40:46 +0000
Subject: [XML-SIG] Using xpath/xslt on proprietary object structures.
References: <200110310227.f9V2REA07568@localhost.localdomain>
Message-ID: <3BDFB90E.28EF5157@xhaus.com>

Uche Ogbuji wrote:

> Looks as if most of this can be done using XInclude and XSLT.  It is an
> interesting usage pattern indeed.

Uche, is that "interesting" as in "interesting problem to solve" or "interesting"
as in "what a poor solution you've chosen" :-)

If it's the former, check www.paratuberculosis.org to see the output of the current
implementation. If it's the latter, I know I've currently got a *very* poor design.
I can only plead lack of time and still incomplete knowledge of the way to
pythonically process XML. Also, I originally wrote all this processing using the
MSXSL beta from Fall 1998 (anyone remember that?). But I'm learning fast, thanks to
the excellence of python and the excellence of the available XML tools.

Actually, on Alexandre's suggestion, I've started looking into 4Suite Server, and
I'm mightily impressed.

When I get the time, I'm going to migrate over to 4SS. It's a very powerful suite
of software.

Also, I'm intrigued by the possibility of a read/write cDomlette. Do you have any
idea when 4Suite 0.12 might be available?

Many thanks to all in the Python/XML initiative,

Alan Kennedy.


From m_mariappanX@trillium.com  Wed Oct 31 09:23:12 2001
From: m_mariappanX@trillium.com (Mariappan, MaharajanX)
Date: Wed, 31 Oct 2001 04:23:12 -0500
Subject: [XML-SIG] newbie question on loading/saving xml files
Message-ID: <53A7943A5BD8D411B6930002A5073155013F6067@bgsmsx90.iind.intel.com>


-----Original Message-----
From: Alexandre Fayolle [mailto:Alexandre.Fayolle@logilab.fr]
Sent: Thursday, October 25, 2001 8:18 PM
To: Mariappan, MaharajanX
Cc: 'Mark Humphrey'; 'xml-sig@python.org'
Subject: RE: [XML-SIG] newbie question on loading/saving xml files


On Thu, 25 Oct 2001, Mariappan, MaharajanX wrote:

> Hi Folks,
> 
> I tried DOM module to load as Mark told. I'm trying to load the xml elements
> to treecontrol using below code

What graphical toolkit are you using?

I'm using wxTreeCtrl widget of wxWindows toolkit.

If you want to see how to load a DOM tree to a GTK CTree widget, you can
check the code in XML tools (ftp://ftp.logilab.org/pub/xmltools/)

On Windows 2000, the XmlTree test run throws ImportError "No Module named GDK",
and the same for gtk.

I'm using python-2.0 and wxPython-2.3 

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).


From uche.ogbuji@fourthought.com  Wed Oct 31 16:07:25 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 31 Oct 2001 09:07:25 -0700
Subject: [XML-SIG] Using xpath/xslt on proprietary object structures.
In-Reply-To: Message from Alan Kennedy <pyxml@xhaus.com>
 of "Wed, 31 Oct 2001 08:40:46 GMT." <3BDFB90E.28EF5157@xhaus.com>
Message-ID: <200110311607.f9VG7Rx09863@localhost.localdomain>

> Uche Ogbuji wrote:
> 
> > Looks as if most of this can be done using XInclude and XSLT.  It is an
> > interesting usage pattern indeed.
> 
> Uche, is that "interesting" as in "interesting problem to solve" or "interesting"
> as in "what a poor solution you've chosen" :-)

You've found ways to adapt to XML solutions when they were not even really 
available.  I certainly couldn't bring myself to criticize that.  I do mean 
the former.


> If it's the former, check www.paratuberculosis.org to see the output of the current
> implementation. If it's the latter, I know I've currently got a *very* poor design.
> I can only plead lack of time and still incomplete knowledge of the way to
> pythonically process XML. Also, I originally wrote all this processing using the
> MSXSL beta from Fall 1998 (anyone remember that?). But I'm learning fast, thanks to
> the excellence of python and the excellence of the available XML tools.
> 
> Actually, on Alexandre's suggestion, I've started looking into 4Suite Server, and
> I'm mightily impressed.

I hate to say this, but I would wait a little bit.  We've made another round 
of fundamental changes to finish its evolution.  The internal 4SS code base 
that we've been developing and testing is a huge leap from the last one, and 
is the architecture that we'll be carrying to 1.0.

We'll probably have public CVS open again this week, so all can have a 
preview.  The final release of 0.12.0 is scheduled for early December.


> When I get the time, I'm going to migrate over to 4SS. It's a very powerful suite
> of software.
> 
> Also, I'm intrigued by the possibility of a read/write cDomlette. Do you have any
> idea when 4Suite 0.12 might be available?

See above.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Boulder, CO 80301-2537, USA
XML strategy, XML tools (http://4Suite.org), knowledge management


From Jan.Delgado@unamite.com  Wed Oct 31 17:36:47 2001
From: Jan.Delgado@unamite.com (Jan Delgado)
Date: Wed, 31 Oct 2001 18:36:47 +0100
Subject: [XML-SIG] Relative URI
Message-ID: <HDEPJHADHDIEGPOCLLLEMEALCAAA.Jan.Delgado@unamite.com>

hello,
i am parsing (validating) xml code with the
xml.dom.ext.reader.Sax2.FromXml() function.
the xml code contains a reference to a dtd.

the parser process stops with the following
message:
 
<unknown>:2:47: Cannot resolve relative URI 'order.dtd' 
when document URI unknown

how can i set the document URI ?

the strange thing here is that when i use 
FromXmlStream(sys.stdin, validate=1) and pipe the
file into the python script, everything is fine.
 
--Jan


From Jan.Delgado@unamite.com  Wed Oct 31 18:08:29 2001
From: Jan.Delgado@unamite.com (Jan Delgado)
Date: Wed, 31 Oct 2001 19:08:29 +0100
Subject: [XML-SIG] getAttributeNS
Message-ID: <HDEPJHADHDIEGPOCLLLECEAMCAAA.Jan.Delgado@unamite.com>

hello,
i created  DOM trees with calls to 

a) dom = xml.dom.ext.reader.Sax2.FromXmlStream( sys.stdin, validate=1 )
b) dom = xml.dom.ext.reader.Sax2.FromXmlStream( sys.stdin, validate=0 )

when i try to access attributes i noticed a few problems:

1. getAttribute() does not find my attributes
2. for case a) i have to call getAttributeNS(None, "attrname")
3. for case b) i have to call getAttributeNS('', "attrname")

what is the problem here ? what is the correct way to get the
attributes ?

greetings 
	jan


From fdrake@acm.org  Wed Oct 31 18:20:48 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 31 Oct 2001 13:20:48 -0500
Subject: [XML-SIG] getAttributeNS
In-Reply-To: <HDEPJHADHDIEGPOCLLLECEAMCAAA.Jan.Delgado@unamite.com>
References: <HDEPJHADHDIEGPOCLLLECEAMCAAA.Jan.Delgado@unamite.com>
Message-ID: <15328.16640.849493.151692@grendel.zope.com>

Jan Delgado writes:
 > 1. getAttribute() does not find my attributes

  Whether to use the *NS flavor (DOM Level 2) or the Level 1 flavor
(no NS) depends on whether you've enabled namespaces.

 > 2. for case a) i have to call getAttributeNS(None, "attrname")
 > 3. for case b) i have to call getAttributeNS('', "attrname")

  This looks like a bug to me; if namespaces are being used, case 2
should work consistently.  In the Python DOM binding, "no namespace"
is spelled using None rather than the empty string.
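
A small sketch of the None convention, using the standard library's xml.dom.minidom as a stand-in for the PyXML Sax2 reader (an assumption; the two builders may not behave identically). The sample document and the urn:example namespace are invented for illustration:

```python
import xml.dom.minidom

# One attribute in no namespace, one in a (hypothetical) namespace.
doc = xml.dom.minidom.parseString(
    '<order xmlns:x="urn:example" id="42" x:code="A1"/>')
order = doc.documentElement

# "No namespace" is spelled None in the Python DOM binding.
plain_id = order.getAttributeNS(None, 'id')

# A namespaced attribute is looked up by namespace URI and local name.
coded = order.getAttributeNS('urn:example', 'code')

# The DOM Level 1 accessor looks the attribute up by its qualified name.
level1_id = order.getAttribute('id')
```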


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From larsga@garshol.priv.no  Wed Oct 31 18:32:12 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 31 Oct 2001 19:32:12 +0100
Subject: [XML-SIG] Relative URI
In-Reply-To: <HDEPJHADHDIEGPOCLLLEMEALCAAA.Jan.Delgado@unamite.com>
References: <HDEPJHADHDIEGPOCLLLEMEALCAAA.Jan.Delgado@unamite.com>
Message-ID: <m3668vmzir.fsf@lambda.garshol.priv.no>

* Jan Delgado
|
| <unknown>:2:47: Cannot resolve relative URI 'order.dtd' 
| when document URI unknown
| 
| how can i set the document URI ?

You could use FromXmlFile or FromXmlUrl. There really ought to be a
baseUri parameter to FromXmlStream and FromXml, but there isn't now.
 
| the strange thing here is that when i use 
| FromXmlStream(sys.stdin, validate=1) and pipe the
| file into the python script, everything is fine.

That would be a bug if this code actually did use the validating
parser. Have you checked, by making a validity error, that it's
actually using the validating parser?

--Lars M.


From jh@web.de  Wed Oct 31 18:57:52 2001
From: jh@web.de (Juergen Hermann)
Date: Wed, 31 Oct 2001 19:57:52 +0100
Subject: [XML-SIG] Relative URI
Message-ID: <E15z0Ws-0004WF-00@smtp.web.de>

On Wed, 31 Oct 2001 18:36:47 +0100, Jan Delgado wrote:

>i am parsing (validating) xml code with the
>xml.dom.ext.reader.Sax2.FromXml() function.
>the xml code contains a reference to a dtd.

Try this (code not tested):

	s = xmlreader.InputSource("path/to/your/dir/dummy.xml")
	s.setByteStream(cStringIO.StringIO(your_xml_string))
	p.FromXmlStream(s)
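
For comparison, a sketch of the same InputSource idea using only the standard library's xml.sax in current Python. It does not call the PyXML reader. The SystemIdRecorder class and the sample document are made up for illustration, and external entity loading is switched off so the sketch runs without a real order.dtd on disk:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges
from xml.sax.xmlreader import InputSource


class SystemIdRecorder(ContentHandler):
    """Record the system ID the parser reports for the document."""

    def __init__(self):
        super().__init__()
        self.locator = None
        self.system_id = None

    def setDocumentLocator(self, locator):
        self.locator = locator

    def startElement(self, name, attrs):
        # Ask the locator while the parse is still in progress.
        if self.system_id is None and self.locator is not None:
            self.system_id = self.locator.getSystemId()


xml_bytes = b'<!DOCTYPE order SYSTEM "order.dtd">\n<order/>'

# The system ID gives the in-memory document a base URI; the bytes
# themselves come from the stream, not from that path.
source = InputSource('path/to/your/dir/dummy.xml')
source.setByteStream(io.BytesIO(xml_bytes))

parser = xml.sax.make_parser()
# Assumption for this sketch: do not fetch order.dtd at all.
parser.setFeature(feature_external_ges, False)
handler = SystemIdRecorder()
parser.setContentHandler(handler)
parser.parse(source)
```

With external entities enabled, a parser would resolve order.dtd relative to the system ID set on the InputSource.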


Ciao, Jürgen


From larsga@garshol.priv.no  Wed Oct 31 19:10:06 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 31 Oct 2001 20:10:06 +0100
Subject: [XML-SIG] Relative URI
In-Reply-To: <E15z0Ws-0004WF-00@smtp.web.de>
References: <E15z0Ws-0004WF-00@smtp.web.de>
Message-ID: <m34rofmxrl.fsf@lambda.garshol.priv.no>

* Juergen Hermann
| 
| Try this (code not tested):
| 
| 	s = xmlreader.InputSource("path/to/your/dir/dummy.xml")
| 	s.setByteStream(cStringIO.StringIO(your_xml_string))
| 	p.FromXmlStream(s)

This is the same as 

  FromXmlFile("path/to/your/dir/dummy.xml")

(Note that the original message says that the data are coming from a
file.)

--Lars M.


From paul@prescod.net  Wed Oct 31 19:53:36 2001
From: paul@prescod.net (Paul Prescod)
Date: Wed, 31 Oct 2001 11:53:36 -0800
Subject: [XML-SIG] Uncle Alex Needs You!
Message-ID: <3BE056BF.76F5B1E3@prescod.net>

The Python cookbook is in the stages leading up to publishing as a book
and our XML section is pretty weak. Alex Martelli is managing this part
of the project. I've contributed more than half of the XML-related
recipes. This doesn't make it seem like much of a community project!
Please contribute.

 Here are some benefits to contribution:

 * help promote Python as the excellent XML processing environment that
it is!
 * get your name into print.
 * add "Python cookbook contributor" to your resume
 * help the PSF: 
     * http://www.onlamp.com/pub/a/python/2001/03/07/pythonnews.html
 * publicize your favorite XML processing package

Please submit recipes that depend on Python 2.x and/or PyXML and/or
4Suite and/or Redfoot and/or anything else. Just be explicit about what
package your recipe depends upon. If you are a big fan of 4Suite (or
creator of 4Suite) it would be great to show how to use Python with
XSLT or XPath or XPointer or ...

Where do you find recipes? I go to my "scripts" or "temp" directory and
look to see what little XML-related test files I've done in the last few
months. If doing some small task was useful to you, or investigating
some corner of Python's XML support made sense to you, then it probably
makes sense to others. So turn it into a recipe.

Please contribute in the next few days -- by the end of this week.
Thanks!

 Paul Prescod


From richard@starfighter.freeuk.com  Wed Oct 31 22:14:14 2001
From: richard@starfighter.freeuk.com (Richard Townsend)
Date: Wed, 31 Oct 2001 22:14:14 -0000
Subject: [XML-SIG] Re: [Image-SIG] Display Image from ImageDraw
Message-ID: <MBBBKPCGJCKKMJDOCHEGAEGFCAAA.richard@starfighter.freeuk.com>

> to register the image with Tk, you must wrap it in an
> ImageTk.PhotoImage object.

Thanks Fredrik, that works perfectly!

regards,
Richard Townsend


From dieter@handshake.de  Wed Oct 31 20:15:05 2001
From: dieter@handshake.de (Dieter Maurer)
Date: Wed, 31 Oct 2001 21:15:05 +0100
Subject: [XML-SIG] Attaching files sent to the list.
In-Reply-To: <A19EEC21DB90D411B40900D0B7B4F8E7369434@alermx.alerton.com>
References: <A19EEC21DB90D411B40900D0B7B4F8E7369434@alermx.alerton.com>
Message-ID: <15328.23497.127547.507302@linux.local>

Stuart Donaldson writes:
 > For those of us that get this mailing list in a digest format, this makes it
 > much more difficult to see the file. We get a mail message with a bunch of
 > gibberish included inline.
If you have a MIME compatible email reader, you can tell "mailman"
(i.e. the mailing list management program) to send you
MIME digests.

The easiest way is to send mail to "xml-sig-request@python.org"
with content:

     set plain off <your password>


You can also use the HTML page for configuration, if you prefer
WWW over email.


Works very well for me...


Dieter