From faassen at infrae.com  Tue Feb  1 20:03:53 2005
From: faassen at infrae.com (Martijn Faassen)
Date: Tue Feb  1 20:03:55 2005
Subject: [XML-SIG] SOAPpy streaming base64
In-Reply-To: <200501311627.14973.erik@cq2.nl>
References: <200501311627.14973.erik@cq2.nl>
Message-ID: <41FFD299.1030902@infrae.com>

Erik J. Groeneveld wrote:
> I am new to this list.  I am developing a web site that harvests OAI 
> repositories using the OAI-PMH protocol, and uploads the records to an 
> indexing service using SOAPpy.

In case you hadn't seen it yet: have you looked at Infrae's oaipmh 
module? Our software stack does much more than that module (including 
indexing using Zope and CMS integration), but it may be interesting to 
you. It is all open source.

The python module:

http://www.infrae.com/download/oaipmh/

Our stack of OAI stuff:

http://www.infrae.com/products/oaipack

Regards,

Martijn
From prasad_st at beceem.com  Wed Feb  2 10:40:00 2005
From: prasad_st at beceem.com (Prasad PS)
Date: Wed Feb  2 10:40:10 2005
Subject: [XML-SIG] Re: Could somebody help me?
Message-ID: <E4750F13EE9A5C4F8CDCDCCC388F94720F0DFE@becm-in-mx-srvr.beceem.com>

Hi,
 Using the code below, I have created a file "TempView.xml". Could
anybody tell me how to append another "Employee" to the existing XML
file?
Here's the snippet I used to create the XML file.

import getopt
import os
import string
import sys
import xml.dom.minidom
from xml.dom.minidom import Node
from xml.dom import minidom
from xml.dom.ext.reader.Sax2 import FromXmlStream
from xml.dom.ext.reader import Sax2
from xml.dom.ext import PrettyPrint
from xml.dom.DOMImplementation import implementation
import xml.sax.writer
import xml.utils

class LogView:
    def __init__(self):
        self.LogViewFile = open("TempView.xml",'w')
        self.document = implementation.createDocument(None,None,None)
        self.logViews = self.document.createElement("EmpDetails")
        self.document.appendChild(self.logViews)
    
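    def createViewFile(self):
        # The original used an undefined name "doc"; bind it to the
        # document created in __init__.
        doc = self.document
        self.logViews.appendChild(doc.createTextNode("\n  "))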
        logdetail = doc.createElement("Address")
        self.logViews.appendChild(logdetail)
    
        logdetail.appendChild(doc.createTextNode("\n   "))
        
        tcidNode = doc.createElement("Name")
        tcidNode.appendChild(doc.createTextNode("Prasad"))
        logdetail.appendChild(tcidNode)
        logdetail.appendChild(doc.createTextNode("\n "))
        
        grpNode = doc.createElement("Age")
        grpNode.appendChild(doc.createTextNode("28"))
        logdetail.appendChild(grpNode)
        logdetail.appendChild(doc.createTextNode("\n  "))
        
        
    def finalStep(self):
        t = self.document.createTextNode("\n")
        self.logViews.appendChild(t)
        PrettyPrint(self.document, self.LogViewFile)
        self.LogViewFile.write("\n")
        self.LogViewFile.close()  # make sure the output reaches the disk


Prasad.p.s.


-----Original Message-----
From: Uche Ogbuji [mailto:Uche.Ogbuji@fourthought.com] 
Sent: Friday, January 28, 2005 8:43 PM
To: Prasad PS
Cc: XML-SIG
Subject: RE: [XML-SIG] Re: Could somebody help me?

On Fri, 2005-01-28 at 15:48 +0530, Prasad PS wrote:
> Sure, here is the code
> 
> In the code below, what I am doing is - I am opening an xml file and
> appending a node to the root document. Then I add this root document to
> the xml file
> fp = open (string.strip(self.cnfDtls.GetLogFilePath()), 'w')
> xml.dom.ext.PrettyPrint(doc, self.xmlFile)
> self.xmlFile.write("\n") 
> fp.close().

So you tried the first choice (PyXML) rather than the second (Amara).
OK.  You were not clear on that.

Your first problem is that you're using xml.dom.ext.reader.FromXmlStream
rather than 

from xml.dom import minidom
doc = minidom.parse(string.strip(self.cnfDtls.GetLogFilePath()))

...

doc.toprettyxml() (rather than xml.dom.ext.PrettyPrint)

That's the fault of the PyXML docs, which should really be updated.
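
The minidom route suggested above can be sketched as follows. This is a minimal illustration for current Python 3 (not the Python 2 of this thread), assuming the element names from Prasad's code; the helper name append_employee is hypothetical. It writes with toxml() rather than toprettyxml() because repeated pretty-printing of an already indented file tends to accumulate whitespace.

```python
from xml.dom import minidom


def append_employee(path, name, age):
    """Parse the XML file at path, append one <Address> entry to the
    document element, and write the document back to the same file."""
    doc = minidom.parse(path)
    root = doc.documentElement  # the <EmpDetails> element, not the Document

    entry = doc.createElement("Address")
    for tag, value in (("Name", name), ("Age", str(age))):
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(value))
        entry.appendChild(child)
    root.appendChild(entry)

    # Reopen for writing only after parsing has finished, so the existing
    # content is not truncated before it has been read.
    with open(path, "w", encoding="utf-8") as out:
        out.write(doc.toxml())
    return doc
```

Note that the new node is appended to the document element, which matches the side question below about not appending to the document itself.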

Side question: you mean you're appending a node to the document element,
right?  Not the root document.  The latter would result in an invalid
XML document entity.

In the code you posted, it looks as if you only append to subsidiary
nodes, so that should be OK.

Even using 4DOM, your general approach should work, and I've used it
oftentimes before (in the far-off past), with no problem, so I wonder:
Are you sure self.xmlFile is "empty" at the point of the
xml.dom.ext.PrettyPrint?

If so, I suggest you whittle down a test case that reveals the apparent
bug, and post data and complete, runnable code (preferably after
switching to minidom).  If it seems a clear bug, you can use the PyXML
bug tracker.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use CSS to display XML -
http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html
Introducing the Amara XML Toolkit -
http://www.xml.com/pub/a/2005/01/19/amara.html
Be humble, not imperial (in design) -
http://www.adtmag.com/article.asp?id=10286
UBL 1.0 -
http://www-106.ibm.com/developerworks/xml/library/x-think28.html
Manage XML collections with XAPI -
http://www-106.ibm.com/developerworks/xml/library/x-xapi.html
Default and error handling in XSLT lookup tables -
http://www.ibm.com/developerworks/xml/library/x-tiplook.html
Packaging XSLT lookup tables as EXSLT functions -
http://www.ibm.com/developerworks/xml/library/x-tiplook2.html


From Chandra.Reddy01 at ca.com  Wed Feb  2 12:54:00 2005
From: Chandra.Reddy01 at ca.com (Reddy, Chandra B)
Date: Wed Feb  2 12:54:05 2005
Subject: [XML-SIG] import Error No module named ext.reader.Sax2
Message-ID: <16C3BD3BBB0FA04D967519E9FED1E8C0840144@inhyms21.ca.com>

Hi,

  When I try to import xml.dom.ext.reader.Sax2, I get the
following error. Can anyone help me solve this problem?

 
from xml.dom.ext.reader.Sax2 import FromXmlStream

ImportError: No module named ext.reader.Sax2

 
Thanks & Regards,

B. Chandra Reddy

 
From fredrik at pythonware.com  Wed Feb  2 18:50:33 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Feb  2 18:51:14 2005
Subject: [XML-SIG] Re: import Error No module named ext.reader.Sax2
References: <16C3BD3BBB0FA04D967519E9FED1E8C0840144@inhyms21.ca.com>
Message-ID: <ctr3rr$ut6$1@sea.gmane.org>

"Reddy, Chandra B" wrote:

> When I try to import the xml.dom.ext.reader.Sax2 I am getting the
> following error.Can any one help me how to solve this problem.
>
> from xml.dom.ext.reader.Sax2 import FromXmlStream,
>
> ImportError: No module named ext.reader.Sax2

have you installed the PyXML extension?

    http://pyxml.sourceforge.net/
    http://pyxml.sourceforge.net/topics/howto/section-install.html
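
A quick way to check whether the extension package is visible to the interpreter is sketched below. This assumes a current Python 3 with importlib; the helper name has_module is hypothetical.

```python
import importlib.util


def has_module(dotted_name):
    """Return True if dotted_name can be located on the import path.

    find_spec imports the parent packages but not the module itself.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # A parent package (for example xml.dom.ext) is itself missing.
        return False


if not has_module("xml.dom.ext.reader.Sax2"):
    print("xml.dom.ext.reader.Sax2 not found; is PyXML installed?")
```

The standard library ships xml.dom but not xml.dom.ext, so the check fails on any installation without PyXML.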

</F> 


From jedp at ilm.com  Wed Feb  2 18:59:49 2005
From: jedp at ilm.com (Jed Parsons)
Date: Wed Feb  2 18:59:56 2005
Subject: [XML-SIG] chaining sax handlers
Message-ID: <20050202095949.T11196@ilm.com>


Hi, all,

I would like to do with sax processors what I can do with the document()
function in xslt, namely include other documents into the one that's being
parsed.

Here's a sample handler, and three xml files.  This approach seems to work for
simple cases, but appears to break the innards of the handler (described
below):

# ----------------------------------------------------------------------
# chaining handler

    import xml.sax
    import xml.sax.handler

    class FooHandler(xml.sax.handler.ContentHandler):
        
        def characters(self, data):
            print data

        def include_proc(self, href):
            filter = xml.sax.make_parser()
            filter.setContentHandler(self)
            filter.parse(href)
            
        def startElement(self, name, attrs):
            if name == 'include':
                self.include_proc(attrs.get('href'))

# ----------------------------------------------------------------------
# some xml files to work with:

# file1.xml:
#
<?xml version='1.0'?>
<foo>
 <bar>This is file1</bar>
 <include href='file3.xml' />
</foo>

# file2.xml:
#
<?xml version='1.0'?>
<foo>
 <bar>This is file2</bar>
 <include href='file3.xml' />
 <bar>Back in file2 again after include</bar>
</foo>

# file3.xml:
#
<?xml version='1.0'?>
<foo>
 <bar>This is file3.</bar>
</foo>

# ----------------------------------------------------------------------
# results (with whitespace removed):

>>> filter = xml.sax.make_parser()
>>> handler = FooHandler()
>>> filter.setContentHandler(handler)
>>> filter.parse('file1.xml')
This is file1
This is file3.
>>> filter.parse('file2.xml')
This is file2
This is file3.
Back in file2 again after include
>>> 

So this seems to work in a simple case.  w00t!  But in a more involved handler,
I get errors like "weakly-referenced object no longer exists" when I try to
access the document locator after re-entering.

Can anyone tell me what I'm doing wrong?
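
One plausible explanation, not confirmed in this thread: the nested parser passes its own locator to the shared handler through setDocumentLocator, and with the expat-based reader that locator only holds a weak reference to its parser. Once the nested parser is discarded, the handler is left with a dead locator. A minimal sketch of a workaround for current Python 3 follows, saving and restoring the outer locator around the include. The in-memory DOCS table and the names IncludeHandler and run are hypothetical stand-ins for the files and handler above.

```python
import io
import xml.sax
import xml.sax.handler
from xml.sax.xmlreader import InputSource

# Hypothetical in-memory stand-ins for file2.xml and file3.xml.
DOCS = {
    "file2.xml": b"<foo><bar>This is file2</bar>"
                 b"<include href='file3.xml'/>"
                 b"<bar>Back in file2</bar></foo>",
    "file3.xml": b"<foo><bar>This is file3.</bar></foo>",
}


def make_source(name):
    """Build an InputSource whose system id is name and whose bytes come from DOCS."""
    source = InputSource(name)
    source.setByteStream(io.BytesIO(DOCS[name]))
    return source


class IncludeHandler(xml.sax.handler.ContentHandler):
    def __init__(self):
        super().__init__()
        self.events = []  # (system id, text) pairs

    def include_proc(self, href):
        outer = self._locator  # locator of the parser we are inside of
        parser = xml.sax.make_parser()
        parser.setContentHandler(self)  # replaces self._locator
        try:
            parser.parse(make_source(href))
        finally:
            self._locator = outer  # restore before the outer parse resumes

    def startElement(self, name, attrs):
        if name == "include":
            self.include_proc(attrs.get("href"))

    def characters(self, data):
        if data.strip():
            self.events.append((self._locator.getSystemId(), data.strip()))


def run(name):
    handler = IncludeHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(make_source(name))
    return handler.events
```

With the restore step in place, the locator consulted after the include belongs to the outer parser again, so locator access in the "Back in file2" part refers to file2.xml.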

Many thanks for any help,

Jed

-- 
Jed Parsons                        / Industrial Light + Magic : 415.448.2974
  
grep(do{for(ord){$o+=$_&7;grep(vec($j,+$o++,1)=1,5..($_>>3||print"$j\n"))}},
(split(//,"))*))2+29*2:.*4:1A1+9,1))2*:..)))2*:31.-1)4131)1))2*:\7Glug!")));

From uche.ogbuji at fourthought.com  Wed Feb  2 23:57:04 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Wed Feb  2 23:57:30 2005
Subject: [XML-SIG] ANN: Amara XML Toolkit 0.9.4
Message-ID: <1107385024.4527.3.camel@borgia>

http://uche.ogbuji.net/tech/4Suite/amara
ftp://ftp.4suite.org/pub/Amara/

Changes in this release:

* Add binderytools.type_inference rule which automatically converts XML
  nodes to native Python objects such as int, float and datetime
* Improve threading and signal behavior of pushdom and pushbind
* Add support for attributes() method on nodes.
  Can now call Ft.Xml.Domlette.PrettyPrint on bindery nodes
* Add lazy attributes support by default.
  amara.binderytools.preserve_attribute_details rule now obsolete
  XPath always supports attribute access, now
* rename prefixes node property to xmlns_prefixes
* Update demos and tests
* Add CherryPy demo (CherryPy rocks: http://www.cherrypy.org/)
* Bug fixes

The new binderytools.type_inference is similar to what's popularly
called "XML marshalling":

    TYPE_MIX = """\
    <?xml version="1.0" encoding="utf-8"?>
    <a a1="1">
      <b b1="2.1"/>
      <c c1="2005-01-31">
        <d>5</d>
        <e>2003-01-30T17:48:07.848769Z</e>
      </c>
      <g>good</g>
    </a>"""

    import datetime
    from amara import binderytools

    rules=[binderytools.type_inference()]
    doc = binderytools.bind_string(TYPE_MIX, rules=rules)
    doc.a.a1 == 1     #type int
    doc.a.b.b1 == 2.1 #type float
    doc.a.c.c1 == datetime.datetime(2005, 1, 31) #type datetime.

So wherever it's reasonable to interpret an XML node as one of these
simple Python types, this new rule will work them naturally into the
data binding.
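
The idea behind such a rule can be illustrated without Amara. The sketch below is not Amara's implementation or API; it uses the standard library's ElementTree in current Python 3 and a hypothetical infer helper that tries int, then float, then a plain YYYY-MM-DD date, and otherwise leaves the string alone.

```python
import datetime
import xml.etree.ElementTree as ET

TYPE_MIX = """\
<a a1="1">
  <b b1="2.1"/>
  <c c1="2005-01-31">
    <d>5</d>
  </c>
  <g>good</g>
</a>"""


def _parse_date(text):
    # Only the plain date form is handled in this sketch.
    return datetime.datetime.strptime(text, "%Y-%m-%d")


def infer(text):
    """Return text converted to the first type that accepts it."""
    for convert in (int, float, _parse_date):
        try:
            return convert(text)
        except ValueError:
            pass
    return text  # fall back to the original string


root = ET.fromstring(TYPE_MIX)
a1 = infer(root.get("a1"))             # int
b1 = infer(root.find("b").get("b1"))   # float
c1 = infer(root.find("c").get("c1"))   # datetime.datetime
g = infer(root.find("g").text)         # stays a str
```

The order of the converters matters: int has to be tried before float, otherwise "1" would come back as 1.0.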


Amara XML Toolkit is a collection of Python tools for XML processing--
not just tools that happen to be written in Python, but tools built from
the ground up to use Python idioms and take advantage of Python's many
strengths.

Amara builds on 4Suite [http://4Suite.org], but whereas 4Suite focuses
more on literal implementation of XML standards in Python, Amara
focuses on Pythonic idiom.  It provides tools you can trust to conform
with XML standards without losing the familiar Python feel.

The components of Amara are:

* Bindery: data binding tool (a very Pythonic XML API)
* Scimitar: implementation of the ISO Schematron schema language for
            XML; converts Schematron files to Python scripts
* domtools: set of tools to augment Python DOMs
* saxtools: set of tools to make SAX easier to use in Python
* Flextyper: user-defined datatypes in Python for XML processing

There's a lot in Amara, but here are highlights:

Amara Bindery: XML as easy as py
--------------------------------

Based on the retired project Anobind, but updated to use SAX rather than
DOM to create bindings.  Bindery reads an XML document and returns a
data structure of Python objects corresponding to the vocabulary used
in the XML document, for maximum clarity.

Bindery turns the document

<monty>
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>

Into a set of objects such that you can write

binding.monty.python.spam

In order to get the value "eggs" or

binding.monty.python[1]

In order to get the value "But I was looking for argument".

There are other such tools for Python; what makes Bindery unique is
that it's driven by a declarative, rules-based system for binding
XML to the Python data.  You can register rules, triggered by
XPattern expressions, for specialized binding behavior.  It includes
XPath support and supports mutation.  Bindery is very efficient, using
SAX to generate bindings.

Scimitar: Schematron for Python
--------------------------------

Merged in from a separate project, Scimitar is an implementation of ISO
Schematron that compiles a Schematron schema into a Python validator
script.

You typically use scimitar in two phases.  Say you have a schematron
schema schema1.stron and you want to validate multiple XML files
against it, instance1.xml, instance2.xml, instance3.xml.

First you run schema1.stron through the scimitar compiler script,
scimitar.py:

scimitar.py schema1.stron

The generated file, schema1.py, can be used to validate XML instances:

python schema1.py instance1.xml

Which emits a validation report.

Amara DOM Tools: giving DOM a more Pythonic face
------------------------------------------------

DOM came from the Java world, hardly the most Pythonic API possible.
Some DOM-like implementations such as 4Suite's Domlettes mix in some
Pythonic idiom. Amara DOM Tools goes even further.

Amara DOM Tools feature pushdom, similar to xml.dom.pulldom, but
easier to use.  It also includes Python generator-based tools for
DOM processing, and a function to return an XPath location for
any DOM node.
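
For comparison, the standard library's xml.dom.pulldom, which pushdom is said to resemble, can be used as sketched below. This shows only the stdlib module in current Python 3, not Amara's pushdom API; the helper names iter_elements and text_of are hypothetical, and the sample document is the one from the Bindery section.

```python
from xml.dom import pulldom

MONTY = (
    "<monty>"
    '<python spam="eggs">What do you mean "bleh"</python>'
    '<python ministry="abuse">But I was looking for argument</python>'
    "</monty>"
)


def iter_elements(xml_text, tag):
    """Yield each <tag> element as a small, fully expanded DOM subtree."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            events.expandNode(node)  # read ahead to the matching end tag
            yield node


def text_of(element):
    """Concatenate the direct text children of element."""
    return "".join(
        child.data for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


texts = [text_of(el) for el in iter_elements(MONTY, "python")]
```

Only the subtrees that are asked for get built, which is the main attraction of this style over parsing the whole document into a DOM.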

Amara SAX Tools: SAX without the brain explosion
------------------------------------------------

Tenorsax (amara.saxtools.tenorsax) is a framework for "linearizing" SAX
logic so that it flows more naturally, and needs a lot less state
machine wizardry.

License
-------

Amara is open source, provided under the 4Suite variant of the Apache
license.  See the file COPYING for details.

Installation
------------

Amara requires Python 2.3 or more recent and 4Suite 1.0a4 or more
recent.  Make sure these are installed, unpack Amara to a convenient
location and run

python setup.py install


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From uche.ogbuji at fourthought.com  Fri Feb  4 07:23:21 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Feb  4 07:23:29 2005
Subject: [XML-SIG] Article on converting WordNet to XML using Python
Message-ID: <1107498201.4527.45.camel@borgia>

Thought I should mention it, since it's not in a spot where you'd
usually find out about Python/XML.

http://www.ibm.com/developerworks/xml/library/x-think29.html

-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From uche.ogbuji at fourthought.com  Fri Feb  4 18:20:56 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Feb  4 18:21:00 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <41FE8261.4020705@v.loewis.de>
References: <1106898215.8243.44.camel@borgia> <41FCA80D.4050709@v.loewis.de>
	<1107096130.8243.172.camel@borgia> <41FD23AB.1020302@v.loewis.de>
	<1107182375.8243.194.camel@borgia>  <41FE8261.4020705@v.loewis.de>
Message-ID: <1107537656.4527.74.camel@borgia>

On Mon, 2005-01-31 at 20:09 +0100, "Martin v. Löwis" wrote:
> Uche Ogbuji wrote:
> > Well, I've done the last few Web page updates, anyway, and I'm already
> > set up as a developer.  Besides the 1.2 discussion, it's light enough
> > work that I'm willing to take responsibility as XBEL maintainer.
> 
> Very good! If I can help with more infrastructure (mailing lists on SF
> or python.org, etc) please let me know.

Other XBEL folks, what do you think of:

* An XBEL SF project of its own
* Its own SF mailing list
* Its own SF home page, file releases, etc.

?

If this sounds good, I'll need some help getting it all set up.  My time
is limited.  I'm OK making the basic SF project request, and some
initial set-up.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From junkc at fh-trier.de  Sat Feb  5 11:57:36 2005
From: junkc at fh-trier.de (Christian Junk)
Date: Sat Feb  5 11:57:26 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <1107537656.4527.74.camel@borgia>
References: <1106898215.8243.44.camel@borgia> <41FE8261.4020705@v.loewis.de>
	<1107537656.4527.74.camel@borgia>
Message-ID: <200502051157.36582.junkc@fh-trier.de>

On Friday, 4 February 2005 18:20, Uche Ogbuji wrote:
> Other XBEL folks, what do you think of:
>
> * An XBEL SF project of its own
> * Its own SF mailing list
> * Its own SF home page, file releases, etc.
>
> ?
>
> If this sounds good, I'll need some help getting it all set up.  My time
> is limited.  I'm OK making the basic SF project request, and some
> initial set-up.

Hi!

I think it is a very good idea and this was my intention when I created the 
site: http://xbel.webinternals.de

I'm able to help you with the design of the home page.

Regards,
Christian

-- 
Christian Junk <junkc@fh-trier.de>
FH Trier, University of Applied Sciences
Faculty of Design and Applied Computer Science

http://christianjunk.webinternals.de
http://xbel.webinternals.de
From fredrik at pythonware.com  Sat Feb  5 14:51:16 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Feb  5 14:51:12 2005
Subject: [XML-SIG] ANN: ElementTidy 1.0 beta 1 (january 3, 2005)
Message-ID: <cu2iub$8u5$1@sea.gmane.org>

The ElementTidy library is an add-on to ElementTree that provides an
alternative tree builder that can read (almost) arbitrary HTML, and turn
it into well-formed XHTML element trees.

The ElementTidy library uses a library version of Dave Raggett's HTML
Tidy utility to do the cleanup (source code is included), and does not rely
on external utilities.

The beta 1 release adds improved support for source document encoding,
and more aggressive tidying (producing output also for seriously malformed
HTML).

For downloads and more information, see:

    http://effbot.org/downloads#elementtidy
    http://effbot.org/zone/element-tidylib.htm

enjoy /F 


From Sylvain.Thenault at logilab.fr  Mon Feb  7 12:18:48 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Mon Feb  7 12:18:51 2005
Subject: [XML-SIG] prepare_input_source and relative path
Message-ID: <20050207111848.GA4540@logilab.fr>

Hey,

I've been hitting a bug which is already registered as #616431 in the
bug tracker. I find it very annoying, and I patched the function to
make it work before noticing that a patch was already available. Is
there any reason to keep waiting before applying it?
Anyway, I've attached my version of the fix to this mail; it fixes the
following cases:

- prepare_input_source('relative.xml', '/base') -> /base/relative.xml
  The patch submitted on SF fixes this one too.

- prepare_input_source('file:relative.xml', '/base') ->
  file:/base/relative.xml


This allows an XML file to contain relative system identifiers
such as:

  <!ENTITY  plans SYSTEM "file:plans.xml">
  <!ENTITY  chatbot SYSTEM "chatbot.xml">

where parse(open('path to my xml file')) should not fail as it currently
does.

If this patch sounds good to you, I can check it in.

-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

-------------- next part --------------
--- /usr/lib/python2.3/site-packages/_xmlplus/sax/saxutils.py	2004-11-29 13:36:36.000000000 +0100
+++ cvs_work/_xmlplus/sax/saxutils.py	2005-02-07 12:01:42.000000000 +0100
@@ -5,7 +5,7 @@
 $Id: saxutils.py,v 1.35 2004/03/20 07:46:04 fdrake Exp $
 """
 
-import os, urlparse, urllib2, types
+import os, urlparse, urllib, urllib2, types
 import handler
 import xmlreader
 import sys, _exceptions, saxlib
@@ -511,14 +511,24 @@
         source.setByteStream(f)
         if hasattr(f, "name"):
             source.setSystemId(f.name)
-
     if source.getByteStream() is None:
         sysid = source.getSystemId()
-        if os.path.isfile(sysid):
+        # if a base is given, sysid may be relative to it, make the
+        # join before isfile() test
+        if base:
             basehead = os.path.split(os.path.normpath(base))[0]
-            source.setSystemId(os.path.join(basehead, sysid))
-            f = open(sysid, "rb")
+            path = os.path.join(basehead, sysid)
+        else:
+            path = sysid
+        if os.path.isfile(path):
+            source.setSystemId(path)
+            f = open(path, "rb")
         else:
+            # if sysid is an url while base isn't, urljoin will fail, so
+            # insert the protocol identifier into base
+            proto = urlparse.urlparse(sysid)[0]
+            if proto and not urlparse.urlparse(base)[0]:
+                base = '%s:%s' % (proto, urllib.pathname2url(base))
             source.setSystemId(urlparse.urljoin(base, sysid))
             f = urllib2.urlopen(source.getSystemId())
 
From Uche.Ogbuji at fourthought.com  Mon Feb  7 18:04:34 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Mon Feb  7 18:04:43 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050207111848.GA4540@logilab.fr>
References: <20050207111848.GA4540@logilab.fr>
Message-ID: <1107795874.4527.140.camel@borgia>

On Mon, 2005-02-07 at 12:18 +0100, Sylvain Thénault wrote:
> Hey,
> 
> I've been hitting a bug which is already registered as #616431 in the
> bug tracker. I find it very annoying, and I patched the function to
> make it work before noticing that a patch was already available. Is
> there any reason to keep waiting before applying it?
> Anyway, I've attached my version of the fix to this mail; it fixes the
> following cases:
> 
> - prepare_input_source('relative.xml', '/base') -> /base/relative.xml
>   The patch submitted on SF fixes this one too.
> 
> - prepare_input_source('file:relative.xml', '/base') ->
>   file:/base/relative.xml
> 
> 
> This allows an XML file to contain relative system identifiers
> such as:
> 
>   <!ENTITY  plans SYSTEM "file:plans.xml">
>   <!ENTITY  chatbot SYSTEM "chatbot.xml">
> 
> where parse(open('path to my xml file')) should not fail as it currently
> does.
> 
> If this patch sounds good to you, I can check it in.

Wow.  I'm always amazed at some of the bugs that have lived on for so
long in PyXML.

Your patch seems fine to me, but there is one area that is probably
worth discussion.  I hope Mike Brown has a moment to chip in because
he's an expert at such matters.

For the case of the file: URL scheme (BTW, you might want to consider
replacing your variable name "proto" with "scheme"), it's probably OK to
have 

file:///base + file:relative.xml -> file:///base/relative.xml

Since the file scheme's semantics are so wooly.  But this wouldn't make
sense if you replaced "file" with "http".

Then there's the matter of a base URI given as 

/base

In 4Suite we require all base URIs to be proper base URIs (so they must
at least have a scheme).  I think this is a reasonable restriction based
on RFC requirements.  Is there a valid use case where there would not
be a proper base URI, anyway?


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From Sylvain.Thenault at logilab.fr  Mon Feb  7 18:17:36 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Mon Feb  7 18:17:40 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <1107795874.4527.140.camel@borgia>
References: <20050207111848.GA4540@logilab.fr>
	<1107795874.4527.140.camel@borgia>
Message-ID: <20050207171736.GA5096@logilab.fr>

On Monday 07 February at 10:04, Uche Ogbuji wrote:
> On Mon, 2005-02-07 at 12:18 +0100, Sylvain Thénault wrote:
> > Hey,
> > 
> > I've been hitting a bug which is already registered as #616431 in the
> > bug tracker. I find it very annoying, and I patched the function to
> > make it work before noticing that a patch was already available. Is
> > there any reason to keep waiting before applying it?
> > Anyway, I've attached my version of the fix to this mail; it fixes the
> > following cases:
> > 
> > - prepare_input_source('relative.xml', '/base') -> /base/relative.xml
> >   The patch submitted on SF fixes this one too.
> > 
> > - prepare_input_source('file:relative.xml', '/base') ->
> >   file:/base/relative.xml
> > 
> > 
> > This allows an XML file to contain relative system identifiers
> > such as:
> > 
> >   <!ENTITY  plans SYSTEM "file:plans.xml">
> >   <!ENTITY  chatbot SYSTEM "chatbot.xml">
> > 
> > where parse(open('path to my xml file')) should not fail as it currently
> > does.
> > 
> > If this patch sounds good to you, I can check it in.
> 
> Wow.  I'm always amazed at some of bugs that have lived on for so long
> in PyXML.

isn't it...
 
> Your patch seems fine to me, but there is one area that is probably
> worth discussion.  I hope Mike Brown has a moment to chip in because
> he's an expert at such matters.
> 
> For the case of the file: URL scheme (BTW, you might want to consider
> replacing your variable name "proto" with "scheme"), it's probably OK to
> have 

Thanks for fixing my URL vocabulary :)
 
> file:///base + file:relative.xml -> file:///base/relative.xml
> 
> Since the file scheme's semantics are so wooly.  But this wouldn't make
> sense if you replaced "file" with "http".

Yep. But notice that my patch doesn't change anything in that case, which
will still follow urlparse.urljoin's behaviour:

>>> urlparse.urljoin('file:///base', 'file:relative.xml')
'file:///relative.xml'
>>> urlparse.urljoin('file:///base', 'http:relative.xml')
'http:relative.xml'
 
> Then there's the matter of a base URI given as 
> 
> /base
> 
> in 4Suite we require all base URIs to be proper base URIs (so they must
> at least have a scheme).  I think this is a reasonable restriction based
> on RFC requirements.  Is there a valid user case where there would not
> be a proper base URI, anyway?

Always having a proper URI as the base sounds like a reasonable
restriction to me too, and I can't see a use case where there would not
be one. But we may have a backward-compatibility problem here if we
decide to enforce it. Maybe InputSource.setSystemId could check for the
presence of a scheme and, if there is none, add file: and issue a
deprecation warning?
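
The check proposed here could look roughly like the sketch below. It is written for current Python 3 and is not part of PyXML or the standard library; ensure_scheme is a hypothetical helper, and it ignores Windows drive letters, which urlsplit would mistake for a scheme.

```python
import warnings
from urllib.parse import quote, urlsplit


def ensure_scheme(system_id):
    """Return system_id unchanged if it has a URI scheme; otherwise warn
    and treat it as a local path under the file: scheme."""
    if urlsplit(system_id).scheme:
        return system_id
    warnings.warn(
        "system identifiers without a URI scheme are deprecated; "
        "assuming file:",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller
    )
    return "file:" + quote(system_id)
```

Callers that already pass proper URIs are unaffected, while bare paths keep working but are flagged.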

-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

From uche.ogbuji at fourthought.com  Tue Feb  8 00:33:01 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Tue Feb  8 00:33:06 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050207171736.GA5096@logilab.fr>
References: <20050207111848.GA4540@logilab.fr>
	<1107795874.4527.140.camel@borgia> <20050207171736.GA5096@logilab.fr>
Message-ID: <1107819181.4527.151.camel@borgia>

On Mon, 2005-02-07 at 18:17 +0100, Sylvain Thénault wrote:
> On Monday 07 February at 10:04, Uche Ogbuji wrote:
> > file:///base + file:relative.xml -> file:///base/relative.xml
> > 
> > Since the file scheme's semantics are so wooly.  But this wouldn't make
> > sense if you replaced "file" with "http".
> 
> yep. But notice my patch doesn't change anything in that case, which
> will so behave according to urlparse.urljoin's behaviour:
> 
> >>> urlparse.urljoin('file:///base', 'file:relative.xml')
> 'file:///relative.xml'
> >>> urlparse.urljoin('file:///base', 'http:relative.xml')
> 'http:relative.xml'

Bleah.  I guess that's why Mike Brown has had to create fixed versions
of all the Python stdlib URI functions for 4Suite :-)


> > Then there's the matter of a base URI given as 
> > 
> > /base
> > 
> > in 4Suite we require all base URIs to be proper base URIs (so they must
> > at least have a scheme).  I think this is a reasonable restriction based
> > on RFC requirements.  Is there a valid user case where there would not
> > be a proper base URI, anyway?
> 
> always having proper URI as base sounds like a reasonable restriction to
> me too, and I can't see user case where it would not. But we may have
> backward compat problem here if decide to care about it. Maybe
> InputSource.setSystemId could check for scheme presence, and if not add
> a file: and issue a deprecation warning ?

I do like the idea of a deprecation warning for this case, but what
about backwards compat?  The warnings module dates from Python 2.1.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html
Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
Querying WordNet as XML - http://www.ibm.com/developerworks/xml/library/x-think29.html
Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html
Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html
Packaging XSLT lookup tables as EXSLT functions - http://www.ibm.com/developerworks/xml/library/x-tiplook2.html

From mike at skew.org  Tue Feb  8 12:13:01 2005
From: mike at skew.org (Mike Brown)
Date: Tue Feb  8 12:13:17 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050207111848.GA4540@logilab.fr>
Message-ID: <200502081113.j18BD1Zw090277@chilled.skew.org>

Sylvain Thénault wrote:
> - prepare_input_source('relative.xml', '/base') -> /base/relative.xml
>   the patch submitted on sf fixes this one too.

Under no circumstances should '/base' + 'relative.xml' == '/base/relative.xml'.
It would only be an acceptable result if you had '/base/' instead of '/base'.


> - prepare_input_source('file:relative.xml', '/base') ->
>   file:/base/relative.xml

Same here. This is incorrect.


> this allows an xml file to contain relative system identifiers
> such as:
> 
>   <!ENTITY  plans SYSTEM "file:plans.xml">

(1) 'file:plans.xml' is not a relative URI reference.

(2) The result of merging the reference 'file:plans.xml' with *any* base URI
    must be 'file:plans.xml'.  RFC 3986 sec. 5 governs this resolution.

>   <!ENTITY  chatbot SYSTEM "chatbot.xml">
> 
> where parse(open('path to my xml file')) should not fail as it currently
> does.

Trust me, you'll find that it is much easier to implement RFC 3986 sec. 5 than 
it is to work around bugs in urllib and urlparse. I suggest porting Absolutize()
and BaseJoin() from 4Suite's Ft.Lib.Uri.


-Mike
From mike at skew.org  Tue Feb  8 12:30:53 2005
From: mike at skew.org (Mike Brown)
Date: Tue Feb  8 12:30:57 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <1107819181.4527.151.camel@borgia>
Message-ID: <200502081130.j18BUr3r090319@chilled.skew.org>

Uche Ogbuji wrote:
> Bleah.  I guess that's why Mike Brown has had to create fixed versions
> of all the Python stdlib URI functions for 4Suite :-)

Yes. All of the URL functions in stdlib are either undocumented and for use
within stdlib only, or are about 8 years out of date. Or both.

I'm using Ft.Lib.Uri as proving grounds for APIs that I'll eventually propose
for inclusion in urllib2. I don't anticipate making any headway on such
proposals for a while, though.

In Ft.Lib.Uri everything is RFC 3986 compliant (I was tracking development
of the RFC), except for the percent-encoding APIs, which, like every other,
are fraught with various gotchas that I wouldn't want to have to explain to
anyone in any more detail than "everything you know is wrong" :) I hope to
have those looking better "soon" but it involves some serious brain twisting.

Relevant to this discussion, the API for resolution of URI references to
absolute form -- Ft.Lib.Uri.Absolutize() -- is stable, and the algorithm it
implements is well-defined by the RFC. The algorithm does not change for
different URI schemes; it works the same for 'file' as for 'http'.

It would not be too hard to copy Absolutize() and BaseJoin() from Ft.Lib.Uri
over into PyXML as a temporary workaround until urllib2 is knocked into shape.
I would just make it and its dependent functions semi-private, and change the
exceptions to ValueErrors.

> > > Then there's the matter of a base URI given as 
> > > 
> > > /base
> > > 
> > > in 4Suite we require all base URIs to be proper base URIs (so they must
> > > at least have a scheme).  I think this is a reasonable restriction based
> > > on RFC requirements.  Is there a valid use case where there would not
> > > be a proper base URI, anyway?
> > 
> > always having a proper URI as base sounds like a reasonable restriction to
> > me too, and I can't see a use case where it would not. But we may have
> > a backward compat problem here if we decide to care about it. Maybe
> > InputSource.setSystemId could check for scheme presence, and if not add
> > a file: and issue a deprecation warning?

Adding 'file:' blindly can cause difficulties or unexpected results.
These are all very different things:

'xyz'         - relative URI reference (relative path)
'/xyz'        - relative URI reference (absolute path)
'file:xyz'    - absolute URI (undef authority, non-hierarchical path)
'file:/xyz'   - absolute URI (undef authority, absolute path; dubious usage)
'file://xyz'  - absolute URI (authority xyz; no path)
'file:///xyz' - absolute URI (empty authority, absolute path)

And then there's what happens when you start throwing in dot segments
('file:./xyz')... and people guessing at how to convert an OS path into a
URI reference... it gets ugly.
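
To make the differences between those six forms concrete, here is a minimal
sketch, written for present-day Python 3 and not taken from 4Suite, that
splits each one with the regular expression given in RFC 3986, Appendix B.
The name split_uri_ref is made up for this illustration. None stands for an
undefined component and '' for an empty one.

```python
import re

# Regular expression from RFC 3986, Appendix B.
_URI_REF = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri_ref(ref):
    """Return (scheme, authority, path, query, fragment) for a URI reference.

    Hypothetical helper for illustration; undefined components are None.
    """
    m = _URI_REF.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


for ref in ('xyz', '/xyz', 'file:xyz', 'file:/xyz', 'file://xyz', 'file:///xyz'):
    scheme, authority, path = split_uri_ref(ref)[:3]
    print('%-14r scheme=%r authority=%r path=%r' % (ref, scheme, authority, path))
```

Note how 'file://xyz' ends up with authority 'xyz' and an empty path, while
'file:///xyz' has an empty authority and the path '/xyz'.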

It is better to just check for the presence of a scheme and reject the
base if it doesn't have one. Or, if you can tolerate receiving a result
that has no scheme, prepend a dummy scheme, apply the proper resolution
algorithm, and strip the scheme from the result. Again this may not give
the results that the user expected, but IMHO there's no need to give the
user what they expect when what they expect is wrong :)
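
A minimal sketch of the first option (check for a scheme and reject the base
otherwise), assuming present-day Python 3. The function names are
hypothetical and the pattern follows the scheme grammar of RFC 3986
sec. 3.1; this is not the 4Suite code.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )   (RFC 3986 sec. 3.1)
_SCHEME = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*:')


def has_scheme(uri):
    """True if the string starts with a syntactically valid scheme."""
    return _SCHEME.match(uri) is not None


def require_absolute_base(base):
    """Return base unchanged, or raise ValueError if it has no scheme."""
    if not has_scheme(base):
        raise ValueError('base URI %r has no scheme' % base)
    return base


print(has_scheme('file:///base/'))   # True
print(has_scheme('/base'))           # False
print(has_scheme('./notes:v2.xml'))  # False: the ':' is inside the path
```

Looking only for a ':' anywhere in the string would wrongly accept the last
example, which is why the check is anchored at the start of the string.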

-Mike
From Sylvain.Thenault at logilab.fr  Tue Feb  8 16:42:10 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Tue Feb  8 16:42:13 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <200502081113.j18BD1Zw090277@chilled.skew.org>
References: <20050207111848.GA4540@logilab.fr>
	<200502081113.j18BD1Zw090277@chilled.skew.org>
Message-ID: <20050208154210.GA4113@logilab.fr>

On Tuesday 08 February à 04:13, Mike Brown wrote:
> Sylvain Thénault wrote:
> > - prepare_input_source('relative.xml', '/base') -> /base/relative.xml
> >   the patch submitted on sf fixes this one too.
> 
> Under no circumstances should '/base' + 'relative.xml' == '/base/relative.xml'.
> It would only be an acceptable result if you had '/base/' instead of '/base'.
> 
> 
> > - prepare_input_source('file:relative.xml', '/base') ->
> >   file:/base/relative.xml
> 
> Same here. This is incorrect.
 
Yes, sorry for the wrong examples. Anyway, in pyxml the base argument is
usually something like '/base/starthere.xml', so the patch correctly fixes
this case. However, you're right that it should probably be fixed to
handle the "no trailing slash" problem.
 
> > this allows an xml file to contain relative system identifiers
> > such as:
> > 
> >   <!ENTITY  plans SYSTEM "file:plans.xml">
> 
> (1) 'file:plans.xml' is not a relative URI reference.
> 
> (2) The result of merging the reference 'file:plans.xml' with *any* base URI
>     must be 'file:plans.xml'.  RFC 3986 sec. 5 governs this resolution.
> 
> >   <!ENTITY  chatbot SYSTEM "chatbot.xml">
> > 
> > where parse(open('path to my xml file')) should not fail as it currently
> > does.
> 
> Trust me, you'll find that it is much easier to implement RFC 3986 sec. 5 than 
> it is to work around bugs in urllib and urlparse. I suggest porting Absolutize()
> and BaseJoin() from 4Suite's Ft.Lib.Uri.
 
I guess you're right. I wrote this patch because it was fixing my
problem. Now if it doesn't take too much time to have every case
correctly fixed by implementing RFC 3986, I may take some time to do so
or to help get it done. And if part of the job is already done in
4suite, that's great. However, what's in 4suite and what still needs to
be implemented is not yet clear to me.

-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

From mike at skew.org  Wed Feb  9 03:01:35 2005
From: mike at skew.org (Mike Brown)
Date: Wed Feb  9 03:01:44 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050208154210.GA4113@logilab.fr>
Message-ID: <200502090201.j1921ZwP092976@chilled.skew.org>

Sylvain Thénault wrote:
> I guess you're right. I wrote this patch because it was fixing my
> problem. Now if it doesn't take too much time to have every case
> correctly fixed by implementing RFC 3986, I may take some time to do so
> or to help get it done. And if part of the job is already done in
> 4suite, that's great. However, what's in 4suite and what still needs to
> be implemented is not yet clear to me.

The current version of Ft.Lib.Uri is here:
http://cvs.4suite.org/viewcvs/4Suite/Ft/Lib/Uri.py?view=markup [1]

If you see "rfc2396bis" in the doc strings, you may safely interpret
them to mean "RFC 3986".


The functions that you should look at are the following:

MakeUrllibSafe(uriRef)
======================
This exists in order to convert a proper URI reference into one that
can be handled by urllib.urlopen(). It does the following:
1. If the reference contains an Internationalized Domain Name,
   recodes it so that it is resolvable. (Py 2.3+ only)
2. Strips the fragment component, if any. 
3. Ensures that the reference is a byte string, not unicode.
4. On Windows, assumes that the first ':' appearing in the path
   component is part of a drivespec, and converts it to '|'.

If you port this function, the reference to PercentDecode() may be replaced 
with urllib.unquote(), but you must move the byte string check (#3, above) to 
occur before calling unquote. The references to the functions SplitUriRef and 
UnsplitUriRef can be replaced with urlsplit() and urlunsplit() from the 
urlparse module.
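
As a rough illustration of steps 2 and 3 only, here is a small sketch for
present-day Python 3, where all strings are unicode and the byte string
check becomes an ASCII check. The function name is hypothetical, and the IDN
recoding (step 1) and the Windows drivespec handling (step 4) are
deliberately left out. It is not the 4Suite implementation.

```python
def strip_fragment_and_check_ascii(uri_ref):
    """Sketch of steps 2 and 3: drop the fragment, insist on ASCII."""
    # Step 2: the fragment starts at the first '#' (RFC 3986 sec. 3.5).
    without_fragment = uri_ref.partition('#')[0]
    # Step 3, adapted: make sure the result can be expressed in ASCII.
    try:
        without_fragment.encode('ascii')
    except UnicodeError:
        raise ValueError('uri %r must consist of ASCII characters.' % uri_ref)
    return without_fragment


print(strip_fragment_and_check_ascii('http://example.org/doc.xml#sec2'))
# prints http://example.org/doc.xml
```

The ASCII check runs before any percent-decoding would take place, in line
with the ordering advice above.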


Absolutize(uriRef, baseUri)
===========================
This does strict merging of a URI reference and a base URI. The base URI 
*must* be absolute (must have a scheme). If you port this function, the
UriException may be replaced with a ValueError, and SplitUriRef &
UnsplitUriRef may be replaced with their urlparse equivalents, as
mentioned above. The RemoveDotSegments function must also be ported and
should be made semi-private because it is not for general use. I've
implemented it using two segment stacks, as alluded to in the spec,
rather than the explicit string-walking algorithm that would be too
inefficient.


BaseJoin(base, UriRef)
======================
This does lenient merging of a base URI and a URI reference (note the
argument order is different than that of Absolutize). It allows the base
URI to be a relative reference. In such cases, we use a dummy scheme
(we don't say "assume 'file:'" because the spec says all schemes must be
resolved the same), run it through Absolutize, and then remove the scheme
from the result. If you port this function, you will need to port the
IsAbsolute function, which just checks to see if the URI has a scheme.
I prefer to use a regex for this, as it is fast and accurate (':' can
appear in more than one place in a URI reference, so it is not safe to
assume that its presence means there is a scheme).
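
The strict and lenient merging described in the last two sections can be
sketched as follows, for present-day Python 3. This is an independent
illustration of RFC 3986 sec. 5.2 and 5.3, not the 4Suite source: all
function names are hypothetical, the dummy scheme name is an arbitrary
choice, and dot segments are removed with the RFC's step-by-step string
procedure rather than the segment stacks mentioned above.

```python
import re

_URI_REF = re.compile(  # RFC 3986, Appendix B
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')

def split_uri_ref(ref):
    m = _URI_REF.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)

def unsplit_uri_ref(scheme, authority, path, query, fragment):
    # Component recomposition, RFC 3986 sec. 5.3 (None means undefined).
    out = ''
    if scheme is not None:
        out += scheme + ':'
    if authority is not None:
        out += '//' + authority
    out += path
    if query is not None:
        out += '?' + query
    if fragment is not None:
        out += '#' + fragment
    return out

def remove_dot_segments(path):
    # Step-by-step procedure of RFC 3986 sec. 5.2.4.
    out = []
    while path:
        if path.startswith('../'):
            path = path[3:]
        elif path.startswith('./') or path.startswith('/./'):
            path = path[2:]
        elif path == '/.':
            path = '/'
        elif path.startswith('/../') or path == '/..':
            path = '/' + path[4:]
            if out:
                out.pop()
        elif path in ('.', '..'):
            path = ''
        else:
            cut = path.find('/', 1)
            if cut == -1:
                cut = len(path)
            out.append(path[:cut])
            path = path[cut:]
    return ''.join(out)

def absolutize(ref, base):
    """Strict merge (RFC 3986 sec. 5.2.2); the base must have a scheme."""
    bs, ba, bp, bq, _ = split_uri_ref(base)
    if bs is None:
        raise ValueError('base URI %r has no scheme' % base)
    rs, ra, rp, rq, rf = split_uri_ref(ref)
    if rs is not None:            # reference has a scheme: base is ignored
        return unsplit_uri_ref(rs, ra, remove_dot_segments(rp), rq, rf)
    if ra is not None:            # network-path reference
        return unsplit_uri_ref(bs, ra, remove_dot_segments(rp), rq, rf)
    if rp == '':                  # same-document or query-only reference
        return unsplit_uri_ref(bs, ba, bp, bq if rq is None else rq, rf)
    if not rp.startswith('/'):    # merge with the base path (sec. 5.2.3)
        if ba is not None and bp == '':
            rp = '/' + rp
        else:
            rp = bp[:bp.rfind('/') + 1] + rp
    return unsplit_uri_ref(bs, ba, remove_dot_segments(rp), rq, rf)

def base_join(base, ref):
    """Lenient merge: a base without a scheme gets a dummy one temporarily."""
    if split_uri_ref(base)[0] is not None:
        return absolutize(ref, base)
    dummy = 'dummy:'              # arbitrary placeholder scheme
    result = absolutize(ref, dummy + base)
    if result.startswith(dummy):
        result = result[len(dummy):]
    return result

print(absolutize('file:plans.xml', 'file:///base/start.xml'))  # file:plans.xml
print(absolutize('relative.xml', 'file:///base'))              # file:///relative.xml
print(base_join('/base/start.xml', 'chatbot.xml'))             # /base/chatbot.xml
```

The second line of output shows why a base of '/base' without a trailing
slash never yields '/base/relative.xml': the last segment of the base path
is dropped before merging.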


-Mike

  [1] ...well, not really. The current version is on my hard drive :)

From Sylvain.Thenault at logilab.fr  Wed Feb  9 15:39:38 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Wed Feb  9 16:26:19 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <200502090201.j1921ZwP092976@chilled.skew.org>
References: <20050208154210.GA4113@logilab.fr>
	<200502090201.j1921ZwP092976@chilled.skew.org>
Message-ID: <20050209143938.GA4381@logilab.fr>

On Tuesday 08 February à 19:01, Mike Brown wrote:
> Sylvain Thénault wrote:
> > I guess you're right. I wrote this patch because it was fixing my
> > problem. Now if it doesn't take too much time to have every case
> > correctly fixed by implementing RFC 3986, I may take some time to do so
> > or to help get it done. And if part of the job is already done in
> > 4suite, that's great. However, what's in 4suite and what still needs to
> > be implemented is not yet clear to me.
> 
> The current version of Ft.Lib.Uri is here:
> http://cvs.4suite.org/viewcvs/4Suite/Ft/Lib/Uri.py?view=markup [1]
> 
> If you see "rfc2396bis" in the doc strings, you may safely interpret
> them to mean "RFC 3986".
> 
> 
> The functions that you should look at are the following:
> 
> MakeUrllibSafe(uriRef)
> ======================
> This exists in order to convert a proper URI reference into one that
> can be handled by urllib.urlopen(). It does the following:
> 1. If the reference contains an Internationalized Domain Name,
>    recodes it so that it is resolvable. (Py 2.3+ only)
> 2. Strips the fragment component, if any. 
> 3. Ensures that the reference is a byte string, not unicode.
> 4. On Windows, assumes that the first ':' appearing in the path
>    component is part of a drivespec, and converts it to '|'.
> 
> If you port this function, the reference to PercentDecode() may be replaced 
> with urllib.unquote(), but you must move the byte string check (#3, above) to 
> occur before calling unquote. The references to the functions SplitUriRef and 
> UnsplitUriRef can be replaced with urlsplit() and urlunsplit() from the 
> urlparse module.
> 
> 
> Absolutize(uriRef, baseUri)
> ===========================
> This does strict merging of a URI reference and a base URI. The base URI 
> *must* be absolute (must have a scheme). If you port this function, the
> UriException may be replaced with a ValueError, and SplitUriRef &
> UnsplitUriRef may be replaced with their urlparse equivalents, as
> mentioned above. The RemoveDotSegments function must also be ported and
> should be made semi-private because it is not for general use. I've
> implemented it using two segment stacks, as alluded to in the spec,
> rather than the explicit string-walking algorithm that would be too
> inefficient.
> 
> 
> BaseJoin(base, UriRef)
> ======================
> This does lenient merging of a base URI and a URI reference (note the
> argument order is different than that of Absolutize). It allows the base
> URI to be a relative reference. In such cases, we use a dummy scheme
> (we don't say "assume 'file:'" because the spec says all schemes must be
> resolved the same), run it through Absolutize, and then remove the scheme
> from the result. If you port this function, you will need to port the
> IsAbsolute function, which just checks to see if the URI has a scheme.
> I prefer to use a regex for this, as it is fast and accurate (':' can
> appear in more than one place in a URI reference, so it is not safe to
> assume that its presence means there is a scheme).

Thanks a lot. Actually, almost all the work is already done right there.
Here is what I've worked on. Once we reach a consensus, I'll add it
to pyxml. I've attached to this mail:

- a light version of 4Suite Uri.py including the following functions:
  SplitUriRef, UnsplitUriRef (it was really less annoying to use those
  two functions than the equivalent urllib ones), Absolutize,
  MakeUrllibSafe, _RemoveDotSegments, BaseJoin, GetScheme and
  IsAbsolute. With the presented solution, the last three are not used
  and could be removed, but I've kept them in for now. All the tests for
  Absolutize from 4suite still pass.

- a modified version of saxutils, expecting the Uri module above to be
  in the _xmlplus directory (ie importable as xml.Uri). I've refactored
  prepare_input_source to ease testing of the URI merging stuff.

- a unittest file, which includes some test cases for the URI merging
  function. Please take a look at the existing test cases to check that
  everything looks fine to you. If you have other cases to add, please let
  me know (or maybe I can add this file to the cvs first). Notice that
  to run the tests, you should have a "quotes.xml" file in the same
  directory as the test file (there is one in the test directory of
  pyxml). As a bonus, I've converted the escape function test from
  test_utils into a unittest in the same file.

Anyway, having SplitUriRef/UnsplitUriRef replacing 
urlparse.urlsplit/urlunsplit and Absolutize or BaseJoin replacing
urlparse.urljoin would definitely be the right thing.

-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_saxutils.py
Type: text/x-python
Size: 2062 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050209/631a585c/test_saxutils-0001.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Uri.py
Type: text/x-python
Size: 16423 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050209/631a585c/Uri-0001.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: saxutils.py
Type: text/x-python
Size: 24925 bytes
Desc: not available
Url : http://mail.python.org/pipermail/xml-sig/attachments/20050209/631a585c/saxutils-0001.py
From john at nmt.edu  Wed Feb  9 20:46:19 2005
From: john at nmt.edu (John W. Shipman)
Date: Wed Feb  9 20:46:28 2005
Subject: [XML-SIG] Generating XML from scratch
Message-ID: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>

I've been all through the python.org site and carefully read ``Python
& XML'' by Jones and Drake, but I can't find any body of practice
about the generation of XML files from scratch.  All the existing
practice seems to be about reading or modifying existing XML
documents.  I want to capture data from a GUI or other source and
store it as an XML document.

I've been doing this for a while, using the minidom in 2.2, but
apparently all the (admittedly undocumented) features I was using
went away in 2.3, and the new methods are a lot uglier.  This
means that when we upgrade to 2.3 or 2.4 locally, I have to go
back and rewrite a lot of existing, working scripts.

Here's the document I wrote that describes how I did it in 2.2:

    http://www.nmt.edu/tcc/help/pubs/pyxml/

Look under the last chapter, ``Creating a document from scratch.''
I use the constructors such as Document() and Element() in that
minidom version, but now they want me to use the .createElement()
and other factory methods from the Document object.

This is much more awkward.  Either I have to pass the Document
object to any piece of code that needs to create an Element
object, or the code needs to dig the .ownerDocument attribute out
of some handy Node object so it has access to the factory
methods.

There's one situation where even this approach doesn't work.
I have a script that generates a document fragment that gets
included in an XHTML page using server-side includes.
I can't instantiate it as a Document object, because then I
would get an <?xml...?> processing instruction at the top,
which is not something I want inside the <body> element of
a web page.

Previously I was getting around this problem by using a
DocumentFragment object, but such objects in the minidom have an
.ownerDocument attribute set to None.  So I have to instantiate
an empty Document object *just* to get access to the factory
methods.  This is what we software old-timers call a KLUGE. 
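
One way to at least confine the kluge, sketched for present-day Python 3
with the standard xml.dom.minidom: keep a single throwaway Document as a
module-level node factory and serialize elements directly, since calling
toxml() on an Element (rather than on the Document) writes no <?xml ...?>
declaration. The helper name make_element and the 'dummy' root name are
made up for this example.

```python
from xml.dom.minidom import getDOMImplementation

# A throwaway Document used only for its factory methods.
_factory = getDOMImplementation().createDocument(None, 'dummy', None)


def make_element(tag, text=None, attrs=None):
    """Build an Element without the caller needing a Document object."""
    elem = _factory.createElement(tag)
    for name, value in (attrs or {}).items():
        elem.setAttribute(name, value)
    if text is not None:
        elem.appendChild(_factory.createTextNode(text))
    return elem


para = make_element('p', 'Updated daily.', {'class': 'note'})
# Serializing the element itself yields a bare fragment.
print(para.toxml())   # <p class="note">Updated daily.</p>
```

This does not remove the need for a Document behind the scenes; it only
keeps it out of the calling code.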

Comments?  Is there something out there I don't know about?

Best regards,
John Shipman (john@nmt.edu), Applications Specialist, NM Tech Computer Center,
Speare 119, Socorro, NM 87801, (505) 835-5950, http://www.nmt.edu/~john
  ``Let's go outside and commiserate with nature.''  --Dave Farber

From fredrik at pythonware.com  Wed Feb  9 20:59:45 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Feb  9 20:59:49 2005
Subject: [XML-SIG] Re: Generating XML from scratch
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
Message-ID: <cudq00$urr$1@sea.gmane.org>

John W. Shipman wrote:

> I want to capture data from a GUI or other source and store
> it as an XML document.
>
> I've been doing this for a while, using the minidom in 2.2, but
> apparently all the (admittedly undocumented) features I was using
> went away in 2.3, and the new methods are a lot uglier.  This
> means that when we upgrade to 2.3 or 2.4 locally, I have to go
> back and rewrite a lot of existing, working scripts.

> Comments?  Is there something out there I don't know about?

rule 1: don't use DOM, if you can avoid it.
rule 2: you can always avoid it.

some alternatives:

    http://www.xml.com/pub/a/2003/04/09/py-xml.html
    http://www.xml.com/pub/a/2003/10/15/py-xml.html
    http://effbot.org/zone/xml-writer.htm
    http://effbot.org/zone/element-index.htm

etc.
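
For a flavour of the ElementTree approach, here is a minimal sketch using
xml.etree.ElementTree, the version of ElementTree that has shipped with the
standard library since Python 2.5 (written here for Python 3). The element
and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Elements are created directly; no Document factory is involved.
root = ET.Element('employees')
emp = ET.SubElement(root, 'employee', id='42')
name = ET.SubElement(emp, 'name')
name.text = 'Ada & Co.'          # escaped automatically on output

# encoding='unicode' returns a str without an XML declaration.
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
# <employees><employee id="42"><name>Ada &amp; Co.</name></employee></employees>
```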

</F> 


From frans.englich at telia.com  Wed Feb  9 21:19:27 2005
From: frans.englich at telia.com (Frans Englich)
Date: Wed Feb  9 21:11:41 2005
Subject: [XML-SIG] Re: Generating XML from scratch
In-Reply-To: <cudq00$urr$1@sea.gmane.org>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
	<cudq00$urr$1@sea.gmane.org>
Message-ID: <200502092019.27937.frans.englich@telia.com>

On Wednesday 09 February 2005 19:59, Fredrik Lundh wrote:

> rule 1: don't use DOM, if you can avoid it.

		What's wrong with DOM?

What makes one want to avoid the DOM interface? Do you know any docs/links 
which discuss this further? How tied to Python is your opinion on DOM?


Cheers,

		Frans

From hostetlerm at gmail.com  Wed Feb  9 21:19:52 2005
From: hostetlerm at gmail.com (Mike Hostetler)
Date: Wed Feb  9 21:20:28 2005
Subject: [XML-SIG] Generating XML from scratch
In-Reply-To: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
Message-ID: <c60e627c05020912194df39d26@mail.gmail.com>

On Wed, 9 Feb 2005 12:46:19 -0700 (MST), John W. Shipman <john@nmt.edu> wrote:
> I've been all through the python.org site and carefully read ``Python
> & XML'' by Jones and Drake, but I can't find any body of practice
> about the generation of XML files from scratch.  All the existing
> practice seems to be about reading or modifying existing XML
> documents.  I want to capture data from a GUI or other source and
> store it as an XML document.
[snip]

Here is a snippet of how I did it with the Sax parser a few years ago.
At the time, minidom didn't do all I needed, but in Py > 2.1 minidom
has matured . . .

                from xml.dom.ext.reader import Sax

                dom = Sax.FromXml("<root />")
                assert dom.documentElement.tagName == 'root'


-- 
Mike Hostetler
http://www.binary.net/thehaas
From rsalz at datapower.com  Wed Feb  9 21:22:09 2005
From: rsalz at datapower.com (Rich Salz)
Date: Wed Feb  9 21:21:15 2005
Subject: [XML-SIG] Re: Generating XML from scratch
In-Reply-To: <200502092019.27937.frans.englich@telia.com>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>	<cudq00$urr$1@sea.gmane.org>
	<200502092019.27937.frans.englich@telia.com>
Message-ID: <420A70F1.8070903@datapower.com>

> What makes one want to avoid the DOM interface? Do you know any docs/links 
> which discuss this further? How tied to Python is your opinion on DOM?

I think that last question is the key point.

DOM is very much un-python.

If you are "just" generating XML, then you will probably go faster if 
you use things that naturally fit into the python programming idioms.
	/r$
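
As one illustration of "just" generating XML without a DOM, here is a
minimal sketch using XMLGenerator from the standard xml.sax.saxutils
module, written for present-day Python 3. It is only one possible reading
of the advice above. Because startDocument() is never called, no
<?xml ...?> declaration is written, so the output can serve as a fragment.

```python
import io
from xml.sax.saxutils import XMLGenerator

out = io.StringIO()
gen = XMLGenerator(out, encoding='utf-8')

# Emit a bare fragment: start tag, escaped character data, end tag.
gen.startElement('p', {'class': 'note'})
gen.characters('Fish & chips')
gen.endElement('p')

fragment = out.getvalue()
print(fragment)   # <p class="note">Fish &amp; chips</p>
```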

-- 
Rich Salz, Chief Security Architect
DataPower Technology                           http://www.datapower.com
XS40 XML Security Gateway   http://www.datapower.com/products/xs40.html
XML Security Overview  http://www.datapower.com/xmldev/xmlsecurity.html
From frans.englich at telia.com  Wed Feb  9 21:35:38 2005
From: frans.englich at telia.com (Frans Englich)
Date: Wed Feb  9 21:27:52 2005
Subject: [XML-SIG] Re: Generating XML from scratch
In-Reply-To: <420A70F1.8070903@datapower.com>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
	<200502092019.27937.frans.englich@telia.com>
	<420A70F1.8070903@datapower.com>
Message-ID: <200502092035.38585.frans.englich@telia.com>

On Wednesday 09 February 2005 20:22, Rich Salz wrote:
> > What makes one want to avoid the DOM interface? Do you know any
> > docs/links which discuss this further? How tied to Python is your opinion
> > on DOM?
>
> I think that last question is the key point.
>
> DOM is very much un-python.

I would say so too; it follows the usual "function interfacing" which IMO is 
strongly present in languages like Java and C++. I'm wondering if there are any 
disadvantages beyond its un-pithonity (now _that's_ duck typing), and/or if 
DOM should be avoided in other languages too.


Cheers,

		Frans

From mike at skew.org  Thu Feb 10 00:06:31 2005
From: mike at skew.org (Mike Brown)
Date: Thu Feb 10 00:07:32 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050209143938.GA4381@logilab.fr>
Message-ID: <200502092306.j19N6VJR003704@chilled.skew.org>

Sylvain Thénault wrote:
> Thanks a lot. Actually, almost all the work is already done right there.
> Here is what I've worked on. Once we reach a consensus, I'll add it
> to pyxml. I've attached to this mail:
> 
> - a light version of 4Suite Uri.py including the following functions:
>   SplitUriRef, UnsplitUriRef (it was really less annoying to use those
>   two functions than the equivalent urllib ones), Absolutize,
>   MakeUrllibSafe, _RemoveDotSegments, BaseJoin, GetScheme and
>   IsAbsolute. With the presented solution, the last three are not used
>   and could be removed, but I've kept them in for now.

Doc strings will need to be updated to reflect the promotion from
"rfc2396bis" to RFC 3986. Also there's one place where I have "RFC
(newline)2396bis" which should also be fixed.

In MakeUrllibSafe, you should catch the UnicodeError that could result
from the attempt to force unicode to a byte string:

    if isinstance(uri, unicode):
        try:
            uri = uri.encode('us-ascii')
        except UnicodeError:
            raise ValueError("uri %r must consist of ASCII characters." % uri)

> All the tests for Absolutize from 4suite still pass.

I forgot to point you to my tests. They do not use unittest, so they
would need to be adapted, but it would be easy since the comparisons
are string-in to string-out (or exception), and I've labeled them
pretty clearly:

  http://cvs.4suite.org/viewcvs/4Suite/test/Lib/test_uri.py?view=markup

As you will see, they are fairly comprehensive.

> - a modified version of saxutils, expecting the Uri module above to be
>   in the _xmlplus directory (ie importable as xml.Uri). I've refactored
>   prepare_input_source to ease testing of the URI merging stuff.

You might want to grep for "emacspymodestink" in your code. :)

> - a unittest file, which includes some test cases for the URI merging
>   function. Please take a look at the existing test cases to check that
>   everything looks fine to you. If you have other cases to add, please let
>   me know (or maybe I can add this file to the cvs first). Notice that
>   to run the tests, you should have a "quotes.xml" file in the same
>   directory as the test file (there is one in the test directory of
>   pyxml). As a bonus, I've converted the escape function test from
>   test_utils into a unittest in the same file.
> 
> Anyway, having SplitUriRef/UnsplitUriRef replacing 
> urlparse.urlsplit/urlunsplit and Absolutize or BaseJoin replacing
> urlparse.urljoin would definitely be the right thing.

On python-dev in Sep 2004, I was discussing with Martin v. Löwis what 
principles we think should be embraced by urlparse, urllib and urllib2. He 
feels that we should simultaneously shoot for both URI and IRI support 
according to the RFCs (3986 and 3987), with unicode arguments being assumed to 
be IRIs.

I would hold off on any stdlib changes until the APIs can be discussed in 
more detail.
From Sylvain.Thenault at logilab.fr  Thu Feb 10 11:02:17 2005
From: Sylvain.Thenault at logilab.fr (Sylvain Thénault)
Date: Thu Feb 10 11:02:20 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <200502092306.j19N6VJR003704@chilled.skew.org>
References: <20050209143938.GA4381@logilab.fr>
	<200502092306.j19N6VJR003704@chilled.skew.org>
Message-ID: <20050210100217.GE3811@logilab.fr>

On Wednesday 09 February à 16:06, Mike Brown wrote:
> Sylvain Thénault wrote:
> > Thanks a lot. Actually, almost all the work is already done right there.
> > Here is what I've worked on. Once we reach a consensus, I'll add it
> > to pyxml. I've attached to this mail:
> > 
> > - a light version of 4Suite Uri.py including the following functions:
> >   SplitUriRef, UnsplitUriRef (it was really less annoying to use those
> >   two functions than the equivalent urllib ones), Absolutize,
> >   MakeUrllibSafe, _RemoveDotSegments, BaseJoin, GetScheme and
> >   IsAbsolute. With the presented solution, the last three are not used
> >   and could be removed, but I've kept them in for now.
> 
> Doc strings will need to be updated to reflect the promotion from
> "rfc2396bis" to RFC 3986. Also there's one place where I have "RFC
> (newline)2396bis" which should also be fixed.

done. However, do the sections of rfc 2396bis match the sections of RFC 3986?

> In MakeUrllibSafe, you should catch the UnicodeError that could result
> from the attempt to force unicode to a byte string:
> 
>     if isinstance(uri, unicode):
>         try:
>             uri = uri.encode('us-ascii')
>         except UnicodeError:
>             raise ValueError("uri %r must consist of ASCII characters." % uri)

done.
 
> > All the tests for Absolutize from 4suite still pass.
> 
> I forgot to point you to my tests. They do not use unittest, so they
> would need to be adapted, but it would be easy since the comparisons
> are string-in to string-out (or exception), and I've labeled them
> pretty clearly:
> 
>   http://cvs.4suite.org/viewcvs/4Suite/test/Lib/test_uri.py?view=markup
> 
> As you will see, they are fairly comprehensive.

I did find them. As I said, I've run the relevant tests against the restricted
version of Uri.py and all of them pass.

> > - a modified version of saxutils, expecting the Uri module above to be
> >   in the _xmlplus directory (ie importable as xml.Uri). I've refactored
> >   prepare_input_source to ease testing of the URI merging stuff.
> 
> You might want to grep for "emacspymodestink" in your code. :)

Right, forgot that :) And I've also made the following modification to
prepare_input_source since I sent it here:

@@ -510,7 +510,7 @@
         source = xmlreader.InputSource()
         source.setByteStream(f)
         if hasattr(f, "name"):
-            source.setSystemId(f.name)
+            source.setSystemId('file:%s' % f.name)
     if source.getByteStream() is None:
         sysid = absolute_system_id(source.getSystemId(), base)
         source.setSystemId(sysid)

 
> > - a unittest file, which includes some test cases for the URI merging
> >   function. Please take a look at the existing test cases to check that
> >   everything looks fine to you. If you have other cases to add, please let
> >   me know (or maybe I can add this file to the cvs first). Notice that
> >   to run the tests, you should have a "quotes.xml" file in the same
> >   directory as the test file (there is one in the test directory of
> >   pyxml). As a bonus, I've converted the escape function test from
> >   test_utils into a unittest in the same file.

Did you take a look at those tests? Do they look good to everyone here? Any
more tests to add?

> > Anyway, having SplitUriRef/UnsplitUriRef replacing 
> > urlparse.urlsplit/urlunsplit and Absolutize or BaseJoin replacing
> > urlparse.urljoin would definitely be the right thing.
> 
> On python-dev in Sep 2004, I was discussing with Martin v. Löwis what 
> principles we think should be embraced by urlparse, urllib and urllib2. He 
> feels that we should simultaneously shoot for both URI and IRI support 
> according to the RFCs (3986 and 3987), with unicode arguments being assumed to 
> be IRIs.
> 
> I would hold off on any stdlib changes until the APIs can be discussed in 
> more detail.

ok.
-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

From mike at skew.org  Thu Feb 10 21:15:25 2005
From: mike at skew.org (Mike Brown)
Date: Thu Feb 10 21:15:34 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <20050210100217.GE3811@logilab.fr>
Message-ID: <200502102015.j1AKFPsR009831@chilled.skew.org>

Sylvain Thénault wrote:
> Done. However, do the sections of rfc2396bis match the sections of RFC 3986?

Yes. There were only very minor editorial changes in the last drafts before
rfc2396bis became RFC 3986.

> I did find them. As I said, I've run the relevant tests against the restricted 
> version of Uri.py and all of them pass.

Ah, OK. I wasn't sure what you meant at first.

> And I've also added the following modification to
> prepare_input_source since I sent it here:
> 
> @@ -510,7 +510,7 @@
>          source = xmlreader.InputSource()
>          source.setByteStream(f)
>          if hasattr(f, "name"):
> -            source.setSystemId(f.name)
> +            source.setSystemId('file:%s' % f.name)
>      if source.getByteStream() is None:
>          sysid = absolute_system_id(source.getSystemId(), base)
>          source.setSystemId(sysid)

I'm not sure without seeing it in action, but this does not look
right to me (the change, as well as its context). I need to look at
what it's doing more closely.

If you need to be lenient, be lenient with the base URI. When you
prepend 'file:' to something, you're making it be absolute, which
probably isn't what you wanted, and probably won't be ideal.
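
To illustrate the point, here is a minimal sketch using Python 3's
urllib.parse rather than the Uri module discussed in this thread. The base
URI and the file name are made-up values. A bare 'myfile.xml' has no scheme,
so it is a relative reference that can still be resolved against a base,
whereas 'file:myfile.xml' carries a scheme and is formally an absolute URI:

```python
from urllib.parse import urljoin, urlsplit

base = "file:///home/user/docs/"   # hypothetical base URI
relative = "myfile.xml"            # hypothetical relative system id

# A reference without a scheme is relative.
has_scheme_plain = bool(urlsplit(relative).scheme)

# Prepending 'file:' gives it a scheme, i.e. it now claims to be absolute.
has_scheme_prefixed = bool(urlsplit("file:" + relative).scheme)

# Resolving the untouched relative reference against the base URI.
resolved = urljoin(base, relative)
print(resolved)
```

Keeping the system id relative and supplying a sensible base URI lets the
resolver do the merging.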

> did you take a look at those tests ?

Not yet, sorry. :) Busy.
From john at nmt.edu  Fri Feb 11 01:27:16 2005
From: john at nmt.edu (John W. Shipman)
Date: Fri Feb 11 05:07:59 2005
Subject: [XML-SIG] More Pythonic XML creation
Message-ID: <Pine.LNX.4.44.0502101723330.17570-100000@minnie.tcct.nmt.edu>

Thanks for all the replies to my inquiry about creation
of documents from scratch using the DOM.

I've rewritten my document "Python and the XML DOM" to
conform to the way the Python 2.3 xml.dom.minidom module
wants you to use factory methods: see section 6, `Creating
a document from scratch' in this document:

    http://www.nmt.edu/tcc/help/pubs/pyxml/

Also included in this document is a module that makes
document creation more Pythonic.  It is described in
section 7 of the document, and section 8 contains a
"literate programming" presentation of the code of the
new module.

I would greatly appreciate any comments.

Best regards,
John Shipman (john@nmt.edu), Applications Specialist, NM Tech Computer Center,
Speare 119, Socorro, NM 87801, (505) 835-5950, http://www.nmt.edu/~john
  ``Let's go outside and commiserate with nature.''  --Dave Farber

From Sylvain.Thenault at logilab.fr  Fri Feb 11 09:46:31 2005
From: Sylvain.Thenault at logilab.fr (Sylvain =?iso-8859-1?Q?Th=E9nault?=)
Date: Fri Feb 11 09:46:35 2005
Subject: [XML-SIG] prepare_input_source and relative path
In-Reply-To: <200502102015.j1AKFPsR009831@chilled.skew.org>
References: <20050210100217.GE3811@logilab.fr>
	<200502102015.j1AKFPsR009831@chilled.skew.org>
Message-ID: <20050211084631.GA3844@logilab.fr>

On Thursday 10 February at 13:15, Mike Brown wrote:
> Sylvain Thénault wrote:
> 
> > And I've also added the following modification to
> > prepare_input_source since I sent it here:
> > 
> > @@ -510,7 +510,7 @@
> >          source = xmlreader.InputSource()
> >          source.setByteStream(f)
> >          if hasattr(f, "name"):
> > -            source.setSystemId(f.name)
> > +            source.setSystemId('file:%s' % f.name)
> >      if source.getByteStream() is None:
> >          sysid = absolute_system_id(source.getSystemId(), base)
> >          source.setSystemId(sysid)
> 
> I'm not sure without seeing it in action, but this does not look
> right to me (the change, as well as its context). I need to look at
> what it's doing more closely.
> 
> If you need to be lenient, be lenient with the base URI. When you
> prepend 'file:' to something, you're making it be absolute, which
> probably isn't what you wanted, and probably won't be ideal.

To be honest, I don't feel really good about this either. What I wanted
to solve here is the case where prepare_input_source gets an open file
as argument, which is a really common case since it happens each time
we do parser.parse(open('myfile.xml')). If the parsed file contains any
reference to an external resource, its system id will be used as the base
URI, and that may be a problem if it's just 'myfile.xml', as in the
example. Maybe adding "file:" only if exists(abspath(f.name)) would be a
good compromise.
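
A minimal sketch of that compromise, assuming Python 3's pathlib instead of
the saxutils code under discussion. The helper name system_id_for_file is
hypothetical and is not part of PyXML. It only produces a file: URI when the
name really points at an existing path, and it makes the path absolute first
so the result can safely serve as a base URI:

```python
from pathlib import Path


def system_id_for_file(f):
    """Return a system id for an open file object (hypothetical helper)."""
    name = getattr(f, "name", None)
    if not isinstance(name, str):
        # No usable file name, e.g. an in-memory stream or a raw descriptor.
        return None
    path = Path(name).resolve()
    if path.exists():
        # as_uri() needs an absolute path, which resolve() provides.
        return path.as_uri()
    # Leave names that do not point at a real file untouched.
    return name
```

Whether an absolute file: URI is the right system id in every case is the
open question of this thread; the sketch only shows the mechanics.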
 
> > did you take a look at those tests ?
> 
> Not yet, sorry. :) Busy.

OK. It's just that since this is a sensitive part of pyxml, I wanted to
get some code review before checking anything in. Now I guess that other
people on this list may also have an opinion on this... ;)

-- 
Sylvain Thénault                               LOGILAB, Paris (France).

http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org

From dkuhlman at cutter.rexx.com  Fri Feb 11 18:48:26 2005
From: dkuhlman at cutter.rexx.com (Dave Kuhlman)
Date: Fri Feb 11 18:48:28 2005
Subject: [XML-SIG] Re: Generating XML from scratch
In-Reply-To: <420A70F1.8070903@datapower.com>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
	<cudq00$urr$1@sea.gmane.org>
	<200502092019.27937.frans.englich@telia.com>
	<420A70F1.8070903@datapower.com>
Message-ID: <20050211174826.GA29929@cutter.rexx.com>

On Wed, Feb 09, 2005 at 03:22:09PM -0500, Rich Salz wrote:
> >What makes one want to avoid the DOM interface? Do you know any docs/links 
> >which discuss this further? How tied to Python is your opinion on DOM?
> 
> I think that last question is the key point.
> 
> DOM is very much un-python.

OK, I'll bite.  Which characteristics make DOM un-pythonic?  Or
are we just talking about a general ick-factor here?  Maybe the
minidom API is somewhat of a mess, but then so are XML and the XML
documents that minidom must be able to represent.

> 
> If you are "just" generating XML, then you will probably go faster if 
> you use things that naturally fit into the python programming idioms.

Which things are those "that naturally fit into the python
programming idioms"?  Is it the writer idiom stuff?

My understanding is that ElementTree occupies the same niche and
satisfies the same needs as the Python implementation of DOM (for
example, minidom).  I'd like to see some sort of comparison of
minidom and ElementTree.  Are there some real reasons why I should
choose ElementTree over minidom for future work?

Is there a consensus that we should be using ElementTree instead
of minidom?  If so, it seems that this should be mentioned in the
standard "Python Library Reference" sections on DOM and minidom
rather than being a "secret known but to a few" on this list.

Dave


-- 
Dave Kuhlman
http://www.rexx.com/~dkuhlman
From rsalz at datapower.com  Fri Feb 11 21:26:22 2005
From: rsalz at datapower.com (Rich Salz)
Date: Fri Feb 11 21:25:20 2005
Subject: [XML-SIG] Re: Generating XML from scratch
In-Reply-To: <20050211174826.GA29929@cutter.rexx.com>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>	<cudq00$urr$1@sea.gmane.org>	<200502092019.27937.frans.englich@telia.com>	<420A70F1.8070903@datapower.com>
	<20050211174826.GA29929@cutter.rexx.com>
Message-ID: <420D14EE.80902@datapower.com>

> OK, I'll bite.  Which characteristics make DOM un-pythonic?

Quick reply, with some items off the top of my head.

XML says that the order of attributes and namespace nodes doesn't 
matter, just the name and value.  This maps naturally to Python 
dictionary.  On the other hand, the order of an element's children does 
matter.  This maps naturally to a Python list.

Starting from those two basic concepts, think about how much simpler many 
things become -- no addBefore, addAfter, etc, just standard Python list 
slices.  Much other stuff can be thrown out.
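
As an illustration of the dictionary-and-list idea, here is a minimal sketch
using xml.etree.ElementTree as shipped with current Python 3 (in 2005 it was
a separate package). The element and attribute names are invented for the
example:

```python
import xml.etree.ElementTree as ET

# Attributes are exposed as a plain dictionary.
root = ET.Element("employees", {"dept": "xml"})
root.attrib["site"] = "paris"

# Children behave like a list.
for name in ("alice", "carol"):
    ET.SubElement(root, "employee", name=name)

# Put "bob" between the two existing children with an ordinary insert.
root.insert(1, ET.Element("employee", name="bob"))

names = [child.get("name") for child in root]
first_two = root[:2]   # slicing returns a list of child elements
print(names)
```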

The element object should have a "resolve_qname" method which takes a 
'foo:bar' qname and returns a (nsuri,localname) tuple.  This removes the 
need for many of the DOM get.../get...NS routines.
    for k,v in curelt.attributes.items():
        (ns,localname) = curelt.resolve_qname(k)
        # ... now look at all attributes, by qname, ns, or localname

And so on.
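
A minimal sketch of what such a resolver could look like, written as a
standalone function. The nsmap argument (a prefix-to-URI dictionary for the
namespaces in scope) and the example URI are assumptions made for
illustration, not an existing API. Unprefixed names are treated as
attribute names, which are in no namespace:

```python
def resolve_qname(qname, nsmap):
    """Split 'prefix:local' into (namespace URI, local name).

    nsmap is a hypothetical dict mapping in-scope prefixes to URIs.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix: an unprefixed attribute is in no namespace.
        return (None, qname)
    # An undeclared prefix raises KeyError, which is what we want here.
    return (nsmap[prefix], local)


in_scope = {"foo": "http://example.org/ns"}
print(resolve_qname("foo:bar", in_scope))
```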

>>If you are "just" generating XML, then you will probably go faster if 
>>you use things that naturally fit into the python programming idioms.

Don't call complex APIs.  Instead set attributes on objects.  That 
seems to be how ElementTree and amara work, for example.  But I think 
that generating XML is not a very hard or interesting problem, and that 
it is very application specific -- i.e., it depends too much on what the 
local object that you are trying to serialize is.  But I'm apparently in 
a real minority here, so don't listen to me. :)

	/r$

-- 
Rich Salz, Chief Security Architect
DataPower Technology                           http://www.datapower.com
XS40 XML Security Gateway   http://www.datapower.com/products/xs40.html
XML Security Overview  http://www.datapower.com/xmldev/xmlsecurity.html
From and-xml at doxdesk.com  Sat Feb 12 15:37:36 2005
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sat Feb 12 15:33:40 2005
Subject: [XML-SIG] More Pythonic XML creation
In-Reply-To: <Pine.LNX.4.44.0502101723330.17570-100000@minnie.tcct.nmt.edu>
References: <Pine.LNX.4.44.0502101723330.17570-100000@minnie.tcct.nmt.edu>
Message-ID: <420E14B0.6040804@doxdesk.com>

John W. Shipman <john@nmt.edu> wrote:

> I've rewritten my document "Python and the XML DOM" to
> conform to the way the Python 2.3 xml.dom.minidom module
> wants you to use factory methods: see section 6, `Creating
> a document from scratch'

Actually you haven't quite gone far enough. Document and DocumentType 
should themselves be created from factory methods. You're supposed to 
use minidom.getDOMImplementation(), or the 'implementation' property of 
an existing Document to get a DOMImplementation object, then call 
createDocument() and createDocumentType() on it.

These constructors work for now, but can't be guaranteed; there are no 
constructors in the W3C DOM standard itself. Using the constructor for 
DocumentFragment, on the other hand, could well cause errors (like what 
you get with Element etc). Use Document.createDocumentFragment().
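
For reference, a minimal sketch of the factory-method route with the
xml.dom.minidom that ships with current Python 3. The XHTML doctype
identifiers and the element names are only example values:

```python
from xml.dom import minidom

impl = minidom.getDOMImplementation()

# DocumentType and Document both come from the DOMImplementation.
doctype = impl.createDocumentType(
    "html",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
)
doc = impl.createDocument(None, "html", doctype)

# Everything else is created through the Document.
body = doc.createElement("body")
doc.documentElement.appendChild(body)

frag = doc.createDocumentFragment()
frag_owner = frag.ownerDocument          # set by the factory method
frag.appendChild(doc.createTextNode("Hello"))
body.appendChild(frag)                   # moves the fragment's children
```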

> I would greatly appreciate any comments.

>> XML (eXtended Markup Language) and SGML (Standard General Markup Language)

eXtensible Markup Language and Standard Generalized Markup Language.

Possibly you wanted less nit-picky comments, but you've got to take what 
you can get eh?

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From and-xml at doxdesk.com  Sat Feb 12 15:49:12 2005
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sat Feb 12 15:45:16 2005
Subject: [XML-SIG] Generating XML from scratch
In-Reply-To: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
Message-ID: <420E1768.1090402@doxdesk.com>

John W. Shipman <john@nmt.edu> wrote:

> Look under the last chapter, ``Creating a document from scratch.''
> I use the constructors such as Document() and Element() in that
> minidom version, but now they want me to use the .createElement()
> and other factory methods from the Document object.

They always did - it's the DOM standard. Minidom was just less fussy 
about it a long time ago; you're more likely to get errors about it 
these days.

> I can't instantiate it as a Document object, because then I
> would get an <?xml...?> processing instruction at the top,
> which is not something I want inside the <body> element of
> a web page.

That's not a good reason not to use a Document. An XML serializer *may* 
allow a Document to be output without the XML declaration (pxdom 
supports the DOM Level 3 LS parameter 'xml-declaration', for example). 
Alternatively, just serialise the Document.documentElement or its 
children instead of the Document object itself.
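
A minimal sketch of that second alternative with xml.dom.minidom from
current Python 3. The div and p elements stand in for whatever goes inside
the page's <body>:

```python
from xml.dom import minidom

doc = minidom.getDOMImplementation().createDocument(None, "div", None)
p = doc.createElement("p")
p.appendChild(doc.createTextNode("Hello"))
doc.documentElement.appendChild(p)

# Serialising the Document includes the XML declaration...
with_declaration = doc.toxml()

# ...serialising the root element, or only its children, does not.
without_declaration = doc.documentElement.toxml()
children_only = "".join(
    child.toxml() for child in doc.documentElement.childNodes
)
print(without_declaration)
```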

> Previously I was getting around this problem by using a
> DocumentFragment object, but such objects in the minidom have an
> .ownerDocument attribute set to None.

A DocumentFragment still has to have an owner Document. Minidom 
DocumentFragments only have a null ownerDocument if you have constructed 
them wrong, using minidom's own private constructors.

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From fredrik at pythonware.com  Sat Feb 12 18:22:22 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat Feb 12 18:23:49 2005
Subject: [XML-SIG] Re: Re: Generating XML from scratch
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu><cudq00$urr$1@sea.gmane.org><200502092019.27937.frans.englich@telia.com><420A70F1.8070903@datapower.com>
	<20050211174826.GA29929@cutter.rexx.com>
Message-ID: <culds1$3tr$1@sea.gmane.org>

Dave Kuhlman wrote:

> Maybe the minidom API is somewhat of a mess, but then so are
> XML and the XML documents that minidom must be able to represent.

that's a popular myth.

other popular myths are that XML parsers have to be slow, because they
process Unicode; that XML DOM representations have to use tons of
memory, because they have to; and that tools that don't fully support all
kinds of XML processing are unusable for any kind of XML processing.

> I'd like to see some sort of comparison of minidom and ElementTree.
> Are there some real reasons why I should choose ElementTree over
> minidom for future work?

that's a "python vs. perl" or "static typing vs. dynamic typing" question.  I suggest
trying it, to see if it fits your brain, and the kind of XML programming you do.

> Is there a consensus that we should be using ElementTree instead
> of minidom?

if you ask toolmakers, they'll tell you that their own tool is the best one.  if you
ask users, you may get more consistent answers ;-)

</F> 


From wunder at verity.com  Sat Feb 12 20:07:43 2005
From: wunder at verity.com (Walter Underwood)
Date: Sat Feb 12 20:07:41 2005
Subject: [XML-SIG] Re: Re: Generating XML from scratch
In-Reply-To: <culds1$3tr$1@sea.gmane.org>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu><cudq00$urr$1@sea.gmane.org><200502092019.27937.frans.englich@telia.com><420A70F1.8070903@datapower.com>	<20050211174826.GA29929@cutter.rexx.com>
	<culds1$3tr$1@sea.gmane.org>
Message-ID: <01980FCE930EFC5AC55A83BD@adsl-64-166-133-243.dsl.snfc21.pacbell.net>

Might want to use something designed for generating XML. The DOM is really
designed for representing it, which isn't quite the same thing.

GenX:  <http://www.tbray.org/ongoing/When/200x/2004/02/20/GenxStatus>
Python wrapper:  <http://software.translucentcode.org/pygenx/>
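
The pygenx API is not shown in this thread, so as a rough illustration of
the same write-as-you-go style, here is a sketch that uses the standard
library's xml.sax.saxutils.XMLGenerator instead (Python 3). It is a
stand-in, not GenX, and the element names are invented:

```python
import io
from xml.sax.saxutils import XMLGenerator

out = io.StringIO()
gen = XMLGenerator(out, encoding="utf-8")

# Events are written out as they are issued; no tree is built in memory.
gen.startDocument()
gen.startElement("employees", {})
gen.startElement("employee", {"name": "alice"})
gen.characters("R&D")            # escaped automatically on output
gen.endElement("employee")
gen.endElement("employees")
gen.endDocument()

xml_text = out.getvalue()
print(xml_text)
```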

wunder

--On February 12, 2005 6:22:22 PM +0100 Fredrik Lundh <fredrik@pythonware.com> wrote:

> Dave Kuhlman wrote:
> 
>> Maybe the minidom API is somewhat of a mess, but then so are
>> XML and the XML documents that minidom must be able to represent.
> 
> that's a popular myth.
> 
> other popular myths are that XML parsers have to be slow, because they
> process Unicode; that XML DOM representations have to use tons of
> memory, because they have to; and that tools that don't fully support all
> kinds of XML processing are unusable for any kind of XML processing.
> 
>> I'd like to see some sort of comparison of minidom and ElementTree.
>> Are there some real reasons why I should choose ElementTree over
>> minidom for future work?
> 
> that's a "python vs. perl" or "static typing vs. dynamic typing" question.  I suggest
> trying it, to see if it fits your brain, and the kind of XML programming you do.
> 
>> Is there a consensus that we should be using ElementTree instead
>> of minidom?
> 
> if you ask toolmakers, they'll tell you that their own tool is the best one.  if you
> ask users, you may get more consistent answers ;-)
> 
> </F> 

--
Walter Underwood
Principal Architect, Verity
From and-xml at doxdesk.com  Sun Feb 13 10:42:21 2005
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sun Feb 13 10:38:25 2005
Subject: [XML-SIG] Generating XML from scratch
In-Reply-To: <Pine.LNX.4.44.0502121457540.20856-100000@minnie.tcct.nmt.edu>
References: <Pine.LNX.4.44.0502121457540.20856-100000@minnie.tcct.nmt.edu>
Message-ID: <420F20FD.1090908@doxdesk.com>

John W. Shipman <john@nmt.edu> wrote:

> True for Python 2.3, but my principal workstation still has
> Python 2.2, and even when I use the factory methods, the
> .ownerDocument attribute of the DocumentFragment is None.

Ugh, you're right, it's a typo in createDocumentFragment:

         d = DocumentFragment()
         d.ownerDoc = self

(instead of ownerDocument.)

Could you perhaps install PyXML on the 2.2 setup? The 2.2 minidom has a 
few other bugs you might also wish to avoid.

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From noreply at sourceforge.net  Mon Feb 14 12:04:00 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon Feb 14 12:04:02 2005
Subject: [XML-SIG] [ pyxml-Patches-1122297 ] ASP.NET interoperability
Message-ID: <E1D0e1I-0002ux-9d@sc8-sf-web4.sourceforge.net>

Patches item #1122297, was opened at 2005-02-14 12:04
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=1122297&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: lode leroy (lode_leroy)
Assigned to: Nobody/Anonymous (nobody)
Summary: ASP.NET interoperability

Initial Comment:
# this patch adds composition of the SOAPAction header
# as expected by ASP.NET

--- SOAPpy/Client.py-0.11.6    2005-02-14 11:58:17.858539200 +0100
+++ SOAPpy/Client.py   2005-02-14 12:06:57.876288000 +0100
@@ -317,7 +317,10 @@
             if self.soapaction:
                 sa = self.soapaction
             else:
-                sa = ns + name
+               if ns and self.config.DotNetSoapAction:
+                       sa = ns + name
+               else:
+                       sa = name

         if hd: # Get header
             if type(hd) == TupleType:

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=1122297&group_id=6473
From noreply at sourceforge.net  Mon Feb 14 23:05:51 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Mon Feb 14 23:05:54 2005
Subject: [XML-SIG] [ pyxml-Bugs-1122726 ] Cannot find ext.reader module
Message-ID: <E1D0oLn-0007ZL-Jo@sc8-sf-web1.sourceforge.net>

Bugs item #1122726, was opened at 2005-02-14 17:05
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1122726&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: wildsolution (mfjacobs)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cannot find ext.reader module

Initial Comment:
I was able to compile and install python 2.4
I built and installed  PyXML-0.8.4.
No errors.

When I try to  import the Sax2 module I get an error.

Does anyone have any suggestions on why this is 
happening?  Security permissions, pathing?

I do not have much experience with XML; any suggestions 
would be appreciated.

Thanks,
Mike


Python 2.4 (#1, Feb 14 2005, 12:27:33) 
[GCC 3.2.2] on irix6
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import sys
>>> from xml.dom.ext.reader import Sax2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: No module named ext.reader
>>> 


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1122726&group_id=6473
From Uche.Ogbuji at fourthought.com  Wed Feb 16 19:55:59 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Wed Feb 16 19:56:03 2005
Subject: [XML-SIG] Generating XML from scratch
In-Reply-To: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
Message-ID: <1108580159.27858.24.camel@borgia>

On Wed, 2005-02-09 at 12:46 -0700, John W. Shipman wrote:
> I've been all through python.org site and carefully read ``Python
> & XML'' by Jones and Drake, but I can't find any body of practice
> about the generation of XML files from scratch.  All the existing
> practice seems to be about reading or modifying existing XML
> documents.  I want to capture data from a GUI or other source and
> store it as an XML document.

http://www.xml.com/pub/a/2002/11/13/py-xml.html
http://www.xml.com/pub/a/2003/03/12/py-xml.html
http://www.xml.com/pub/a/2003/10/15/py-xml.html
http://software.translucentcode.org/pygenx/

etc.

DOM can be a pretty awkward way to generate XML.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Use CSS to display XML - http://www.ibm.com/developerworks/edu/x-dw-x-xmlcss-i.html
Introducing the Amara XML Toolkit - http://www.xml.com/pub/a/2005/01/19/amara.html
Be humble, not imperial (in design) - http://www.adtmag.com/article.asp?id=10286
Querying WordNet as XML - http://www.ibm.com/developerworks/xml/library/x-think29.html
Manage XML collections with XAPI - http://www-106.ibm.com/developerworks/xml/library/x-xapi.html
Default and error handling in XSLT lookup tables - http://www.ibm.com/developerworks/xml/library/x-tiplook.html
Packaging XSLT lookup tables as EXSLT functions - http://www.ibm.com/developerworks/xml/library/x-tiplook2.html

From rtomayko at gmail.com  Wed Feb 16 20:39:59 2005
From: rtomayko at gmail.com (Ryan Tomayko)
Date: Wed Feb 16 20:40:14 2005
Subject: [XML-SIG] Generating XML from scratch
In-Reply-To: <1108580159.27858.24.camel@borgia>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
	<1108580159.27858.24.camel@borgia>
Message-ID: <6e9a74e6bb0cc0df3236f750980fae84@gmail.com>

You may also want to consider using an XML aware template language:

<http://lesscode.org/projects/kid/>

Ryan

On Feb 16, 2005, at 1:55 PM, Uche Ogbuji wrote:
> On Wed, 2005-02-09 at 12:46 -0700, John W. Shipman wrote:
>> I've been all through python.org site and carefully read ``Python
>> & XML'' by Jones and Drake, but I can't find any body of practice
>> about the generation of XML files from scratch.  All the existing
>> practice seems to be about reading or modifying existing XML
>> documents.  I want to capture data from a GUI or other source and
>> store it as an XML document.
>
> http://www.xml.com/pub/a/2002/11/13/py-xml.html
> http://www.xml.com/pub/a/2003/03/12/py-xml.html
> http://www.xml.com/pub/a/2003/10/15/py-xml.html
> http://software.translucentcode.org/pygenx/
>
> etc.
>
> DOM can be a pretty awkward way to generate XML.

From uche.ogbuji at fourthought.com  Wed Feb 16 21:45:28 2005
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Wed Feb 16 21:45:42 2005
Subject: [XML-SIG] Generating XML from scratch
In-Reply-To: <6e9a74e6bb0cc0df3236f750980fae84@gmail.com>
References: <Pine.LNX.4.44.0502091234190.14929-100000@minnie.tcct.nmt.edu>
	<1108580159.27858.24.camel@borgia>
	<6e9a74e6bb0cc0df3236f750980fae84@gmail.com>
Message-ID: <1108586728.27858.46.camel@borgia>

On Wed, 2005-02-16 at 14:39 -0500, Ryan Tomayko wrote:
> You may also want to consider using an XML aware template language:
> 
> <http://lesscode.org/projects/kid/>

For my own preference, I really dislike hybrid XML template languages.
They seem hacky and too much of a blurring of the layers to me.  I
prefer a chain of Python feeding XSLT every time.

But to each his own, of course.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From junkc at fh-trier.de  Thu Feb 17 13:43:23 2005
From: junkc at fh-trier.de (Christian Junk)
Date: Thu Feb 17 13:43:05 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <1107537656.4527.74.camel@borgia>
References: <1106898215.8243.44.camel@borgia> <41FE8261.4020705@v.loewis.de>
	<1107537656.4527.74.camel@borgia>
Message-ID: <200502171343.23559.junkc@fh-trier.de>

Am Freitag, 4. Februar 2005 18:20 schrieb Uche Ogbuji:
> [..]
> If this sounds good, I'll need some help getting it all set up.  My time
> is limited.  I'm OK making the basic SF project request, and some
> initial set-up.

Hi, there!

I would like to ask whether there has been any interim development. Can we help? What is
the next step?

Regards,
Christian
-- 
Christian Junk <junkc@fh-trier.de>
FH Trier, University of Applied Sciences
Faculty of Design and Applied Computer Science

http://christianjunk.webinternals.de
http://xbel.webinternals.de
From Uche.Ogbuji at fourthought.com  Fri Feb 18 00:45:43 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Feb 18 00:46:10 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <200502171343.23559.junkc@fh-trier.de>
References: <1106898215.8243.44.camel@borgia> <41FE8261.4020705@v.loewis.de>
	<1107537656.4527.74.camel@borgia>
	<200502171343.23559.junkc@fh-trier.de>
Message-ID: <1108683943.27858.71.camel@borgia>

On Thu, 2005-02-17 at 13:43 +0100, Christian Junk wrote:
> Am Freitag, 4. Februar 2005 18:20 schrieb Uche Ogbuji:
> > [..]
> > If this sounds good, I'll need some help getting it all set up.  My time
> > is limited.  I'm OK making the basic SF project request, and some
> > initial set-up.
> 
> Hi, there!
> 
> I would like to ask whether there has been any interim development. Can we help? What is
> the next step?

Well, I posted the idea, and you and Martin responded positively.
That's not (yet) an overwhelming endorsement, given the number of people
I've seen post on XBEL.  I thought it might be better to give people
time to mull it over before embarking on such a potentially disruptive
change.

Maybe I'm being too cautious?

Does anyone think that giving XBEL its own project space is *not* a good
idea?


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From Alexandre.Fayolle at logilab.fr  Fri Feb 18 08:34:56 2005
From: Alexandre.Fayolle at logilab.fr (Alexandre)
Date: Fri Feb 18 08:34:58 2005
Subject: [XML-SIG] XBEL resource page updates
In-Reply-To: <1108683943.27858.71.camel@borgia>
References: <1106898215.8243.44.camel@borgia> <41FE8261.4020705@v.loewis.de>
	<1107537656.4527.74.camel@borgia>
	<200502171343.23559.junkc@fh-trier.de>
	<1108683943.27858.71.camel@borgia>
Message-ID: <20050218073456.GB7309@crater.logilab.fr>

On Thu, Feb 17, 2005 at 04:45:43PM -0700, Uche Ogbuji wrote:
> On Thu, 2005-02-17 at 13:43 +0100, Christian Junk wrote:
> > Am Freitag, 4. Februar 2005 18:20 schrieb Uche Ogbuji:
> > > [..]
> > > If this sounds good, I'll need some help getting it all set up.  My time
> > > is limited.  I'm OK making the basic SF project request, and some
> > > initial set-up.
> > 
> > Hi, there!
> > 
> > I would like to ask whether there has been any interim development. Can we help? What is
> > the next step?
> 
> Well, I posted the idea, and you and Martin responded positively.
> That's not (yet) an overwhelming endorsement, given the number of people
> I've seen post on XBEL.  I thought it might be better to give people
> time to mull it over before embarking on such a potentially disruptive
> change.
> 
> Maybe I'm being too cautious?
> 
> Does anyone think that giving XBEL its own project space is *not* a good
> idea?

I think it would be a *good* idea. 

-- 
Alexandre Fayolle                              LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
From benn at cenix-bioscience.com  Fri Feb 18 13:20:04 2005
From: benn at cenix-bioscience.com (Neil Benn)
Date: Fri Feb 18 13:19:47 2005
Subject: [XML-SIG] SAX Events to DOM Tree
Message-ID: <4215DD74.8020606@cenix-bioscience.com>

Hello,

    I have a couple of questions :

          I'm looking for some code which takes SAX events and converts
them to a DOM tree. The SAX events don't have a namespace declaration.
However, I'm lost: from this page
(http://pyxml.sourceforge.net/topics/howto/node10.html) I can see:

dom.ext.reader
    Classes for building DOM trees from various input sources: SAX1 and
    SAX2 parsers, htmllib, and directly using Expat.

However, running help from the command line brings back a reader which
states that it can only receive streams of strings (I would expect it to
implement ContentHandler).  I also found a class in the ActiveState
cookbook, but it only works with namespaced SAX events.  Before I go
about writing something, is there a class which can do what I need?

    The next bit is that the XMLGenerator only prints out code in one
huge contiguous sequence of chars. I would like to have something
that pretty-prints (with \t and \n) to the 'file-like' object (aka
output stream ;-)).  Again, this seems like a common thing, so I was gonna
check to see if there is already a class that does this.

    As you can see, I don't like writing code if it exists in a standard
distribution. I've searched for the API docs with Google (used 'pyxml
doc', 'pyxml api doc', 'pyxml api docs', 'pyxml docs') and found nada (only
howtos and tutorials on XML).

    Thanks, in advance for your help.

Cheers,

Neil

-- 

Neil Benn
Senior Automation Engineer
Cenix BioScience
BioInnovations Zentrum
Tatzberg 46
D-01307
Dresden
Germany

Tel : +49 (0)351 4173 154
e-mail : benn@cenix-bioscience.com
Cenix Website : http://www.cenix-bioscience.com

From michael.prilla at iaw.rub.de  Fri Feb 18 13:55:43 2005
From: michael.prilla at iaw.rub.de (Michael Prilla)
Date: Fri Feb 18 13:51:52 2005
Subject: [XML-SIG] Trouble installing PyXML 
Message-ID: <1A1B1BAC56DD014CBF1E23CA6FBF782F1F6A@exchge-imtm.IAW.RUHR-UNI-BOCHUM.DE>

Hi,

I'm getting several errors while installing PyXML. I tried different
versions of PyXML (0.8.1, 0.8.3, 0.8.4) to verify the problem but the
errors are always the same.

The first problem arises when I run 'python setup.py build', which
gives back:

File "sysconfig.py", line 172, in customize_compiler
    cc_cmd = cc + ' ' + opt
TypeError: cannot concatenate 'str' and 'NoneType' objects

I worked around this by checking that none of the parts is None. After
this part of the setup, the process starts gcc and stops with the
following message:

extensions/pyexpat.c:2065: warning: excess elements in struct initializer
extensions/pyexpat.c:2065: warning: (near initialization for `handler_info[21]')
extensions/pyexpat.c:2065: warning: excess elements in array initializer
extensions/pyexpat.c:2065: warning: (near initialization for `handler_info')
extensions/pyexpat.c:1998: error: storage size of `handler_info' isn't known
error: command 'gcc' failed with exit status 1

A few lines earlier, it produces several warnings and errors:

extensions/pyexpat.c:1664: warning: (near initialization for `Xmlparsetype')
extensions/pyexpat.c:1664: error: parse error before "xmlparse_setattr"
extensions/pyexpat.c:1665: error: `cmpfunc' undeclared here (not in a function)
extensions/pyexpat.c:1665: warning: excess elements in scalar initializer
extensions/pyexpat.c:1665: warning: (near initialization for `Xmlparsetype')
extensions/pyexpat.c:1665: error: parse error before numeric constant
extensions/pyexpat.c:1666: error: `reprfunc' undeclared here (not in a function)
extensions/pyexpat.c:1666: warning: excess elements in scalar initializer
extensions/pyexpat.c:1666: warning: (near initialization for `Xmlparsetype')
extensions/pyexpat.c:1666: error: parse error before numeric constant
extensions/pyexpat.c:1667: warning: excess elements in scalar initializer
This is the point where I can't get the installation any further. I'm
working on SuSE Linux 9.1 with gcc 3.3.4 and Python 2.3.3.

Does anyone have an idea how to get the installation working or if it
might be a gcc-compatibility issue?


-- 
Michael Prilla
www.imtm-iaw.rub.de
From brian at sweetapp.com  Fri Feb 18 15:06:34 2005
From: brian at sweetapp.com (Brian Quinlan)
Date: Fri Feb 18 15:06:44 2005
Subject: [XML-SIG] SAX Events to DOM Tree
In-Reply-To: <4215DD74.8020606@cenix-bioscience.com>
References: <4215DD74.8020606@cenix-bioscience.com>
Message-ID: <4215F66A.9050300@sweetapp.com>

Neil Benn wrote:
> Hello,
> 
>    I have a couple of questions :
> 
>          I'm looking for some code which takes SAX events and converts
> them to a DOM tree. The SAX events don't have a namespace declaration.
> However, I'm lost: from this page
> (http://pyxml.sourceforge.net/topics/howto/node10.html) I can see:
> 
> dom.ext.reader
>    Classes for building DOM trees from various input sources: SAX1 and
>    SAX2 parsers, htmllib, and directly using Expat.
> 
> However, running help from the command line brings back a reader which
> states that it can only receive streams of strings (I would expect it to
> implement ContentHandler).  I also found a class in the ActiveState
> cookbook, but it only works with namespaced SAX events.  Before I go
> about writing something, is there a class which can do what I need?

 >>> help('xml.dom.ext.reader.Sax')

This module might do what you want.
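
If that module does not fit, a handler for non-namespaced events is
short enough to sketch with the standard library alone. The following
is only an illustration, not PyXML code: DomBuilder is a hypothetical
name, it is written for current Python 3, and it assumes the events
arrive through the plain startElement/endElement/characters callbacks.
It builds an xml.dom.minidom tree.

```python
from xml.dom import minidom
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class DomBuilder(ContentHandler):
    """Hypothetical handler: builds a minidom Document from SAX events."""

    def startDocument(self):
        self.document = minidom.Document()
        self._stack = [self.document]   # currently open nodes
        self._text = []                 # buffered character data

    def _flush_text(self):
        # Merge buffered characters() calls into a single text node.
        # Text directly under the document node is not allowed, so skip it.
        if self._text and len(self._stack) > 1:
            node = self.document.createTextNode("".join(self._text))
            self._stack[-1].appendChild(node)
        self._text = []

    def startElement(self, name, attrs):
        self._flush_text()
        element = self.document.createElement(name)
        for key, value in attrs.items():
            element.setAttribute(key, value)
        self._stack[-1].appendChild(element)
        self._stack.append(element)

    def endElement(self, name):
        self._flush_text()
        self._stack.pop()

    def characters(self, content):
        self._text.append(content)


# Drive the handler by hand, the way a custom reader would.
builder = DomBuilder()
builder.startDocument()
builder.startElement("data", AttributesImpl({}))
builder.startElement("item", AttributesImpl({"id": "1"}))
builder.characters("hello")
builder.endElement("item")
builder.endElement("data")
builder.endDocument()
print(builder.document.documentElement.toxml())
```

The same handler can be passed to xml.sax.parse() or to any reader
that accepts a ContentHandler.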

>    The next bit is that the XMLGenerator only prints out code in one
> huge contiguous sequence of chars. I would like to have something
> that pretty-prints (with \t and \n) to the 'file-like' object (aka
> output stream ;-)).  Again, this seems like a common thing, so I was gonna
> check to see if there is already a class that does this.

I don't know anything about XMLGenerator, but the problem that you are
likely going to have is that the serializer doesn't know where whitespace
can be added without changing the semantics of your document.

For example, this:

<test><add>1</add><add>2</add><add>3</add></test>

and this:

<test>
    <add>1</add>
    <add>2</add>
    <add>3</add>
</test>

Would generate different DOMs (depending on the whitespace mode).
See here:
http://www.w3.org/TR/2000/REC-xml-20001006#sec-white-space
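
The difference is easy to observe with xml.dom.minidom from the
standard library, which keeps whitespace-only text nodes. A small
sketch (the variable names are only illustrative):

```python
from xml.dom import minidom

compact = minidom.parseString(
    "<test><add>1</add><add>2</add><add>3</add></test>"
)
indented = minidom.parseString(
    "<test>\n    <add>1</add>\n    <add>2</add>\n    <add>3</add>\n</test>"
)

# Three element children only.
compact_count = len(compact.documentElement.childNodes)
# Three elements plus four whitespace-only text nodes around them.
indented_count = len(indented.documentElement.childNodes)

print(compact_count, indented_count)  # prints "3 7"
```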

Maybe there is a flag to control this somewhere in the XMLGenerator 
(whatever that is) API.


Cheers,
Brian
From benn at cenix-bioscience.com  Fri Feb 18 16:15:47 2005
From: benn at cenix-bioscience.com (Neil Benn)
Date: Fri Feb 18 16:15:27 2005
Subject: [XML-SIG] XML stuff
Message-ID: <421606A3.2090102@cenix-bioscience.com>

Hello,

          Thanks for the response Brian:

---1
Cool - although I tested it straight off, binding to a Reader emitting 
start/end doc, start/end elements and no characters (both making a 
character call with an empty string and not making a character call at 
all).  Anyways, I get a traceback :

Traceback (most recent call last):
  File "CeLMA\Automation\Parsers\ParsingFramework.py", line 165, in ?
    objParser.parse(objTestFile)
  File "C:\Documents and Settings\benn.CENIX-SCIENCE\My Documents\svnfiles\CeLMA\Automation\Parsers\Implementation\HTDParser.py", line 85, in parse
    self.__startDoc()
  File "C:\Documents and Settings\benn.CENIX-SCIENCE\My Documents\svnfiles\CeLMA\Automation\Parsers\Implementation\HTDParser.py", line 193, in __startDoc
    self.__objHandler.startElement('data', AttributesImpl({}))
  File "C:\PROGRA~1\Python23\Lib\site-packages\_xmlplus\dom\ext\reader\Sax.py", line 73, in startElement
    self._completeTextNode()
  File "C:\PROGRA~1\Python23\Lib\site-packages\_xmlplus\dom\ext\reader\Sax.py", line 52, in _completeTextNode
    if self._currText:
AttributeError: XmlDomGenerator instance has no attribute '_currText'

    self.__startDoc() looks like

---
    def __startDoc(self):
        self.__objHandler.startDocument()
        self.__objHandler.startElement('data', AttributesImpl({}))
---

    How's that for a method!!  It looks to me like a characters problem,
but I can't call characters without calling startElement first.  When I get
time I'll dig around to look for a solution.  In the meantime, I've
written a simple version meself.

---2
    For the XMLGenerator, there is no flag in XMLGenerator that I can
find - it doesn't appear in the dir, and something like that would be in
the dir as I would need to access it.  I get the point about the
insignificant white space, and that is why a pretty print should be an
option.  Although in most cases people don't care about insignificant
whitespace (i.e. white space outside of an element); in fact I can't
think of a _sensible_ reason to care about insignificant whitespace -
can you (it's a Friday afternoon, go on, wonder away!)?

    Have a good weekend all.

Cheers,

Neil



From brian at sweetapp.com  Fri Feb 18 16:39:49 2005
From: brian at sweetapp.com (Brian Quinlan)
Date: Fri Feb 18 16:39:51 2005
Subject: [XML-SIG] XML stuff
In-Reply-To: <421606A3.2090102@cenix-bioscience.com>
References: <421606A3.2090102@cenix-bioscience.com>
Message-ID: <42160C45.7050405@sweetapp.com>

Neil Benn wrote:
>    For the XMLGenerator, there is no flag in XMLGenerator that I can
> find - it doesn't appear in the dir, and something like that would be in
> the dir as I would need to access it.  I get the point about the
> insignificant white space, and that is why a pretty print should be an
> option.  Although in most cases people don't care about insignificant
> whitespace (i.e. white space outside of an element); in fact I can't
> think of a _sensible_ reason to care about insignificant whitespace -
> can you (it's a Friday afternoon, go on, wonder away!)?

But all whitespace (that appears in the DOM) is in an element, at least 
the document element, so what whitespace do you consider insignificant? 
For example, in XHTML, these two are different:

<p>
	<i>Neil</i>
         <b>Benn</b>
</p>

And:

<p>
	<i>Neil</i><b>Benn</b>
</p>


Cheers,
Brian
From Uche.Ogbuji at fourthought.com  Fri Feb 18 19:13:22 2005
From: Uche.Ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Feb 18 19:13:43 2005
Subject: [XML-SIG] SAX Events to DOM Tree
In-Reply-To: <4215DD74.8020606@cenix-bioscience.com>
References: <4215DD74.8020606@cenix-bioscience.com>
Message-ID: <1108750403.16835.69.camel@borgia>

On Fri, 2005-02-18 at 13:20 +0100, Neil Benn wrote:
> Hello,
> 
>     I have a couple of questions :
> 
>           I'm looking for some code which takes SAX Events and converts 
> them to a DOM Tree

See

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/298343

Available with updates in Amara XML Toolkit [1]

If you don't specify any chunking rules, it turns all the SAX events into
one DOM document node.

[1] http://www.xml.com/pub/a/2005/01/19/amara.html


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From mzhangyh at yahoo.com  Sun Feb 20 00:21:47 2005
From: mzhangyh at yahoo.com (Michael Zhang)
Date: Sun Feb 20 00:21:50 2005
Subject: [XML-SIG] xml parsing error
Message-ID: <20050219232148.3080.qmail@web53709.mail.yahoo.com>

Hi,

When I used xml.sax to parse a document loaded from a
server, I got the following error message.  Could
anybody tell me what's wrong?

thanks,

  File "ShowAllData.py", line 143, in ?
    main(sys.argv)
  File "ShowAllData.py", line 117, in main
    win = MainWindow()
  File "ShowAllData.py", line 48, in __init__
    videoInfo = CaMLDocumentParser.getVideoInfo(GenericParser.parse(StringIO(result)))
  File "/home/vraid1/mzhang/CaMLServer3/lib/GenericParser.py", line 39, in parse
    xml.sax.parse (file, g)
  File "/usr/lib/python2.2/site-packages/_xmlplus/sax/__init__.py", line 31, in parse
    parser.parse(filename_or_stream)
  File "/usr/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.2/site-packages/_xmlplus/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 220, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python2.2/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:1:0: not well-formed (invalid token)


From and-xml at doxdesk.com  Sun Feb 20 01:32:02 2005
From: and-xml at doxdesk.com (Andrew Clover)
Date: Sun Feb 20 09:32:03 2005
Subject: [XML-SIG] xml parsing error
In-Reply-To: <20050219232148.3080.qmail@web53709.mail.yahoo.com>
References: <20050219232148.3080.qmail@web53709.mail.yahoo.com>
Message-ID: <4217DA82.6020201@doxdesk.com>

Michael Zhang <mzhangyh@yahoo.com> wrote:

> Could anybody tell what's wrong with that?

Not without seeing the file you're trying to parse. sax.handler says the 
document isn't well-formed - perhaps it's right?

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From mike at skew.org  Sun Feb 20 20:02:20 2005
From: mike at skew.org (Mike Brown)
Date: Sun Feb 20 20:02:23 2005
Subject: [XML-SIG] xml parsing error
In-Reply-To: <4217DA82.6020201@doxdesk.com>
Message-ID: <200502201902.j1KJ2Kq5001673@chilled.skew.org>

Andrew Clover wrote:
> Michael Zhang <mzhangyh@yahoo.com> wrote:
> 
> > Could anybody tell what's wrong with that?
> 
> Not without seeing the file you're trying to parse. sax.handler says the 
> document isn't well-formed - perhaps it's right?
> 

I think the error message said 1:0, which means it saw a problem at the very 
beginning of the document. Perhaps the file is empty or begins with something 
other than "<" or a BOM. He should check the file for extraneous whitespace at 
the top.
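
One way to check this is to catch the exception and look at the
reported position together with the first few bytes of the input. A
minimal sketch using xml.sax from a current Python 3 standard library;
describe_parse_failure is a hypothetical helper, and the sample input
(an HTTP status line left in front of the document) is only an assumed
example of leading junk, not the poster's actual data:

```python
import xml.sax
from xml.sax.handler import ContentHandler


def describe_parse_failure(data):
    """Return (line, column, leading bytes) if data fails to parse, else None."""
    try:
        xml.sax.parseString(data, ContentHandler())
    except xml.sax.SAXParseException as exc:
        return exc.getLineNumber(), exc.getColumnNumber(), data[:20]
    return None


# Assumed example: response headers were not stripped before parsing.
report = describe_parse_failure(b"HTTP/1.1 200 OK\r\n\r\n<a/>")
if report is not None:
    line, column, head = report
    print("parse failed at %d:%d, input starts with %r" % (line, column, head))
```

Printing the leading bytes with %r makes an empty string, stray
whitespace or a byte order mark visible at a glance.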