From uche.ogbuji@fourthought.com  Tue May  1 18:13:29 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 01 May 2001 11:13:29 -0600
Subject: [XML-SIG] Re: [4suite] PyChecker could help
References: <3aeee9f93d2396cb@amyris.wanadoo.fr> (added by amyris.wanadoo.fr)
Message-ID: <3AEEEEB9.29880F20@fourthought.com>

Sebastien Pierre wrote:

> Here are some errors that PyChecker has found with 4Suite 0.11:
> 
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:124 No attribute (documentElement) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:180 No attribute (documentElement) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:211 No attribute (implementation) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:242 No attribute (documentElement) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:251 No attribute (childNodes) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:299 No attribute (childNodes) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:299 No attribute (doctype) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/Document.py:299 No attribute (documentElement) found
> 
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/FtNode.py:135 No global (XML_NAMESPACE) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/FtNode.py:271 No attribute (firstChild) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/FtNode.py:345 self is not first method argument
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/FtNode.py:346 No global (self) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/FtNode.py:362 No attribute (ownerDocument) found
> /boot/home/config/lib/python2.0/site-packages/_xmlplus/dom/FtNode.py:372 No attribute (ownerDocument) found
> 
> Using this tool could help you find out some bugs in the 4Suite.
> PyChecker is available at <pychecker.sourceforge.net>.
> Cheers!

Thanks, but note that 4DOM is no longer part of 4Suite.  I'll try to
look into this before the PyXML 0.6.6 release.
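
The paired warnings at FtNode.py:345 and :346 ("self is not first method argument", then "No global (self) found") usually point at a method whose parameter list omits self. A minimal sketch of that pattern follows, assuming nothing about the real FtNode.py source; BrokenNode, FixedNode, and try_append are hypothetical names used only for illustration.

```python
# Hypothetical classes for illustration; this is not the 4DOM source.
class BrokenNode:
    def __init__(self):
        self.children = []

    # Bug: the first parameter is not 'self', so the instance is bound to
    # 'child' and the body refers to an undefined global named 'self'.
    def append(child):
        self.children.append(child)


class FixedNode:
    def __init__(self):
        self.children = []

    def append(self, child):
        self.children.append(child)
        return child


def try_append(node_class):
    """Return the name of the exception raised by append("x"), or None."""
    node = node_class()
    try:
        node.append("x")
    except (TypeError, NameError) as exc:
        return type(exc).__name__
    return None
```

Such a method typically fails only when it is first called, which is why a static checker can find it before any test does.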


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From rsalz@zolera.com  Wed May  2 18:16:51 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 02 May 2001 13:16:51 -0400
Subject: [XML-SIG] Proposing a web services SIG
Message-ID: <3AF04103.A7FA3F01@zolera.com>

I'd like to propose a new SIG, Web Services.  Web services uses XML and
related standards (schema, wsdl, soap, uddi) to provide a distributed
computing infrastructure.

There is a great deal of Python activity starting up here -- several
SOAP implementations, interop work, WSDL parsing, etc.  Much of the
information exchange has been late-night point-to-point email, and it's
time to provide a visible focal point for this activity.

Our feeling (a few of us have chatted about this) is that the web
services community generally takes Sax, DOM, etc., "for granted" and
that it makes more sense to create a new SIG rather than be part of
XML-SIG.  XML Schema is a likely area of overlap, and we'll work
together to handle that.

In terms of code, web pages, etc., we'd follow the (high) standards of
the XML Sig.

Comments, next steps?
	/r$


From Nicolas.Chauvat@logilab.fr  Wed May  2 18:36:06 2001
From: Nicolas.Chauvat@logilab.fr (Nicolas Chauvat)
Date: Wed, 2 May 2001 19:36:06 +0200 (CEST)
Subject: [XML-SIG] Proposing a web services SIG
In-Reply-To: <3AF04103.A7FA3F01@zolera.com>
Message-ID: <Pine.LNX.4.21.0105021934510.19401-100000@aries>

> Comments, next steps?

+1 for web-services-sig (and RDF tools in PyXML ;-)

-- 
Nicolas Chauvat

http://www.logilab.com - "Mais où est donc Ornicar ?" - LOGILAB, Paris (France)


From Mike.Olson@fourthought.com  Wed May  2 19:22:12 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Wed, 02 May 2001 12:22:12 -0600
Subject: [XML-SIG] Proposing a web services SIG
References: <Pine.LNX.4.21.0105021934510.19401-100000@aries>
Message-ID: <3AF05054.E803903D@FourThought.com>

Nicolas Chauvat wrote:
> 
> > Comments, next steps?
> 
> +1 for web-services-sig (and RDF tools in PyXML ;-)

+1 for me as well
Mike

> 
> --
> Nicolas Chauvat
> 
> http://www.logilab.com - "Mais où est donc Ornicar ?" - LOGILAB, Paris (France)

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Cayce@actzero.com  Wed May  2 19:26:40 2001
From: Cayce@actzero.com (Cayce Ullman)
Date: Wed, 2 May 2001 11:26:40 -0700
Subject: [XML-SIG] Proposing a web services SIG
Message-ID: <F0D64494733BD411BB9A00D0B74A02640F1EC3@208-177-157-194.actzero.com>

>> Comments, next steps? 
>+1 for web-services-sig (and RDF tools in PyXML ;-) 

I would like to second this motion as well.  I'm aware of 5 implementations
of SOAP in Python (2 of which were created in the month of April, one of
which was mine),  so there is clearly some interest in Python+WS.  Plus I
think some open collaboration could go a long way towards making Python a
language of choice for web services work.

Cayce


From uche.ogbuji@fourthought.com  Wed May  2 19:40:06 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 02 May 2001 12:40:06 -0600
Subject: [XML-SIG] Proposing a web services SIG
In-Reply-To: Message from Cayce Ullman <Cayce@actzero.com>
 of "Wed, 02 May 2001 11:26:40 PDT." <F0D64494733BD411BB9A00D0B74A02640F1EC3@208-177-157-194.actzero.com>
Message-ID: <200105021840.f42Ie6D21877@localhost.local>

> >> Comments, next steps? 
> >+1 for web-services-sig (and RDF tools in PyXML ;-) 
> 
> I would like to second this motion as well.  I'm aware of 5 implementations
> of SOAP in Python (2 of which were created in the month of April, one of
> which was mine),  so there is clearly some interest in Python+WS.  Plus I
> think some open collaboration could go a long way towards making Python a
> language of choice for web services work.

Well, all very well, and I can go either way on new SIG vs. just use XML-SIG, 
but does anyone know how to expeditiously go about creating a Python SIG?  I 
suppose it involves some magic incantations on the meta-SIG, but I don't know 
the current state-of-the-SIGS.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From guido@digicool.com  Wed May  2 20:41:48 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 02 May 2001 14:41:48 -0500
Subject: [XML-SIG] Re: [meta-sig] Proposing a web services SIG
In-Reply-To: Your message of "Wed, 02 May 2001 13:16:51 -0400."
 <3AF04103.A7FA3F01@zolera.com>
References: <3AF04103.A7FA3F01@zolera.com>
Message-ID: <200105021941.OAA03587@cj20424-a.reston1.va.home.com>

> I'd like to propose a new SIG, Web Services.  Web services uses XML and
> related standards (schema, wsdl, soap, uddi) to provide a distributed
> computing infrastructure.
> 
> There is a great deal of Python activity starting up here -- several
> SOAP implementations, interop work, WSDL parsing, etc.  Much of the
> information exchange has been late-night point-to-point email, and it's
> time to provide a visible focal point for this activity.
> 
> Our feeling (a few of us have chatted about this) is that the web
> services community generally takes Sax, DOM, etc., "for granted" and
> that it makes more sense to create a new SIG rather than be part of
> XML-SIG.  XML Schema is a likely area of overlap, and we'll work
> together to handle that.
> 
> In terms of code, web pages, etc., we'd follow the (high) standards of
> the XML Sig.
> 
> Comments, next steps?

Read http://www.python.org/sigs/guidelines.html (all of it!).

Basically, you need to appoint a volunteer, write a mission statement,
and circulate the draft mission statement on the meta-sig.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From rsalz@zolera.com  Wed May  2 20:23:15 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 02 May 2001 15:23:15 -0400
Subject: [XML-SIG] Re: [meta-sig] Proposing a web services SIG
References: <3AF04103.A7FA3F01@zolera.com> <200105021941.OAA03587@cj20424-a.reston1.va.home.com>
Message-ID: <3AF05EA3.71F8B4F1@zolera.com>

> Read http://www.python.org/sigs/guidelines.html (all of it!).

I did.  The instructions at the end were fairly casual, and I thought my
note was good enough, sorry.  Let me try again...

> Basically, you need to appoint a volunteer, write a mission statement,
> and circulate the draft mission statement on the meta-sig.

I'm volunteering to coordinate webservices-sig.

Short blurb: make it easy for python programmers to provide and use web
services.

Longer blurb: Web services uses SOAP, WSDL, UDDI, other standards to
provide a distributed component infrastructure. The webservices-sig is
focused on providing implementations of these standards so that Python
programmers can easily write and use web services (i.e., both clients
and servers -- the latter includes HTTPServer, but also other servers
such as Apache, Zope, etc.)

The initial goal of the SIG will be to develop freely-usable
implementations of SOAP, WSDL, and probably UDDI. Some coordination with
XML Sig will be necessary, for example, because WSDL uses XML Schema. We
will develop a framework for supporting multiple implementations.

Thanks.
	/r$


From jh@web.de  Wed May  2 22:04:04 2001
From: jh@web.de (Juergen Hermann)
Date: Wed, 02 May 2001 23:04:04 +0200
Subject: [XML-SIG] SOAP for Python
In-Reply-To: <F0D64494733BD411BB9A00D0B74A02640F1EC3@208-177-157-194.actzero.com>
Message-ID: <m14v3lX-007WdvC@smtp.web.de>

On Wed, 2 May 2001 11:26:40 -0700, Cayce Ullman wrote:

>I would like to second this motion as well.  I'm aware of 5 implementations
>of SOAP in Python (2 of which were created in the month of April, one of
>which was mine),  

Could you list those, together with a homepage URL? ;)


Ciao, Jürgen


From noreply@sourceforge.net  Wed May  2 23:31:23 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 02 May 2001 15:31:23 -0700
Subject: [XML-SIG] [ pyxml-Bugs-420882 ] no xpath, xslt install from CVS checkout
Message-ID: <E14v59X-0004vD-00@usw-sf-web2.sourceforge.net>

Bugs item #420882, was updated on 2001-05-02 15:31
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=420882&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Anderson (karlanderson)
Assigned to: Nobody/Anonymous (nobody)
Summary: no xpath, xslt install from CVS checkout

Initial Comment:
I installed a CVS checkout from an hour or so ago
into a test directory with setup.py:

python setup.py build
python setup.py install --prefix=[dir]

This didn't copy the xpath or xslt dirs into the
/lib/python1.5/site-packages/xml subdirectory of my
install dir.  Once I copied them manually xpath
worked.

I expected setup.py to use everything that was built;
am I doing something weird?
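
One way to confirm which packages were built but not installed is to compare the two trees directly. The sketch below assumes the usual distutils layout (a build library directory and a site-packages directory); packages_in and missing_after_install are hypothetical helper names, not part of setup.py or PyXML.

```python
import os


def packages_in(root):
    """Return relative paths of directories under root that hold __init__.py."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        if "__init__.py" in filenames:
            found.add(os.path.relpath(dirpath, root))
    return found


def missing_after_install(build_lib, installed_lib):
    """List packages present in the build tree but absent after install."""
    return sorted(packages_in(build_lib) - packages_in(installed_lib))
```

Called with the build tree and the install prefix's site-packages directory, it would list entries such as the xpath and xslt package directories if they were skipped.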



From Cayce@actzero.com  Thu May  3 01:17:02 2001
From: Cayce@actzero.com (Cayce Ullman)
Date: Wed, 2 May 2001 17:17:02 -0700
Subject: [XML-SIG] RE: SOAP for Python
Message-ID: <F0D64494733BD411BB9A00D0B74A02640F1EC5@208-177-157-194.actzero.com>

> >I would like to second this motion as well.  I'm aware of 5 implementations
> >of SOAP in Python (2 of which were created in the month of April, one of
> >which was mine),  
> 
> Could you list those, together with a homepage URL? ;)
> 
SOAP.py (mine) : http://www.actzero.com  the leader in terms of
interoperability and features (as far as I know)

SOAP.py (part of Scarab) : http://www.casbah.org hasn't moved for over a
year, at a glance looks fairly unusable.

soaplib.py : http://www.pythonware.com by Fredrik Lundh, much in the style
of xmlrpclib

SOAPy : http://soapy.sourceforge.net by Adam Elman, new client
implementation supports WSDL

FT : http://www.fourthought.com It was my understanding that Fourthought
also is working on an impl, correct me if I'm wrong Mike.


From rsalz@zolera.com  Thu May  3 01:39:34 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 02 May 2001 20:39:34 -0400
Subject: [XML-SIG] RE: SOAP for Python
References: <F0D64494733BD411BB9A00D0B74A02640F1EC5@208-177-157-194.actzero.com>
Message-ID: <3AF0A8C6.BD3C239F@zolera.com>

> FT : http://www.fourthought.com It was my understanding that Fourthought
> also is working on an impl, correct me if I'm wrong Mike.

I think he's at the same stage as I am -- discussion.


From uche.ogbuji@fourthought.com  Thu May  3 03:24:34 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 02 May 2001 20:24:34 -0600
Subject: [XML-SIG] RE: SOAP for Python
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Wed, 02 May 2001 20:39:34 EDT." <3AF0A8C6.BD3C239F@zolera.com>
Message-ID: <200105030224.f432OYX01370@localhost.local>

> > FT : http://www.fourthought.com It was my understanding that Fourthought
> > also is working on an impl, correct me if I'm wrong Mike.
> 
> I think he's at the same stage as I am -- discussion.

Nope.  Way past discussion.  4Suite Server 0.11 (alpha) features a SOAP server.

Examples here

http://www-106.ibm.com/developerworks/webservices/library/ws-pyth3/


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Mike.Olson@fourthought.com  Thu May  3 03:21:08 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Wed, 02 May 2001 20:21:08 -0600
Subject: [XML-SIG] RE: SOAP for Python
References: <F0D64494733BD411BB9A00D0B74A02640F1EC5@208-177-157-194.actzero.com>
Message-ID: <3AF0C094.3946A1AF@FourThought.com>

Cayce Ullman wrote:
> 
> 
> FT : http://www.fourthought.com It was my understanding that Fourthought
> also is working on an impl, correct me if I'm wrong Mike.

We have parts of an implementation but are looking to expand on it a lot
in the next month or so.

Mike


-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Thu May  3 04:11:25 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 02 May 2001 21:11:25 -0600
Subject: [XML-SIG] RE: SOAP for Python
In-Reply-To: Message from Mike Olson <Mike.Olson@fourthought.com>
 of "Wed, 02 May 2001 20:21:08 MDT." <3AF0C094.3946A1AF@FourThought.com>
Message-ID: <200105030311.f433BPV01587@localhost.local>

> Cayce Ullman wrote:
> > 
> > 
> > FT : http://www.fourthought.com It was my understanding that Fourthought
> > also is working on an impl, correct me if I'm wrong Mike.
> 
> We have parts of an implementation but are looking to expand on it a lot
> in the next month or so.

Ah.  Mike's more cautious than I.  I'll be explicit though: the only part 
we're "missing" is the SOAP serialization.  But as far as I'm concerned, we're 
not missing anything in that case.  The SOAP serialization, frankly, stinks.  
I've already spat my venom at whoever didn't rip section 5 out of the SOAP 
spec after a second reading, but we'll see how that works out.

Until then, I rely on the fact that section 5 is explicitly optional.  There 
is no requirement for a SOAP implementation to use the SOAP serialization.  
I'm actually more interested in writing an RDF serialization, and with some 
support, it's not inconceivable that such a thing would oust section 5 before 
XML Protocol emerges.

So, I disagree that 4SS has just parts of an implementation.  We have a SOAP 
server according to the spec.
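
To make the distinction concrete: a SOAP 1.1 envelope can carry any XML payload in its Body, and the section 5 encoding is signalled by an encodingStyle attribute rather than required. The sketch below uses only the standard library and is not 4Suite Server code; build_envelope and the urn:example:orders payload are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 namespace URIs.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"


def build_envelope(payload, use_section5=False):
    """Wrap an arbitrary XML payload element in a SOAP 1.1 Envelope/Body."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    if use_section5:
        # Declare that the payload follows the section 5 encoding rules.
        payload.set("{%s}encodingStyle" % SOAP_ENV, SOAP_ENC)
    body.append(payload)
    return envelope


# Hypothetical payload: plain XML, no section 5 encoding declared.
order = ET.Element("{urn:example:orders}order")
order.text = "42"
literal_envelope = build_envelope(order)
```

With use_section5 left at False, the result is still a well-formed SOAP envelope; it simply makes no claim about how the payload was serialized.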


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From rsalz@zolera.com  Thu May  3 04:56:51 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 02 May 2001 23:56:51 -0400
Subject: [XML-SIG] RE: SOAP for Python
References: <200105030311.f433BPV01587@localhost.local>
Message-ID: <3AF0D703.29636208@zolera.com>

> Until then, I rely on the fact that section 5 is explicitly optional.  There
> is no requirement for a SOAP implementation to use the SOAP serialization.

Technically right, but it would be *very* surprising and upsetting to
folks who naively used the 4SS implementation to talk to other web
services.  It might even cause them to spit venom at you.

> I'm actually more interested in writing an RDF serialization, and with some
> support, it's not inconceivable that such a thing would oust section 5 before
> XML Protocol emerges.

It's about as likely as someone accepting my DER encoding.
	/r$


From tpassin@home.com  Thu May  3 05:04:47 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Thu, 3 May 2001 00:04:47 -0400
Subject: [XML-SIG] Proposing a web services SIG
References: <3AF04103.A7FA3F01@zolera.com>
Message-ID: <003301c0d386$319fab80$7cac1218@reston1.va.home.com>

[Rich Salz]

> I'd like to propose a new SIG, Web Services.  Web services uses XML and
> related standards (schema, wsdl, soap, uddi) to provide a distributed
> computing infrastructure.
> 

I'd go +1 on this.

Cheers,

Tom P


From uche.ogbuji@fourthought.com  Thu May  3 05:23:42 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 02 May 2001 22:23:42 -0600
Subject: [XML-SIG] RE: SOAP for Python
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Wed, 02 May 2001 23:56:51 EDT." <3AF0D703.29636208@zolera.com>
Message-ID: <200105030423.f434Nga02058@localhost.local>

> > Until then, I rely on the fact that section 5 is explicitly optional.  There
> > is no requirement for a SOAP implementation to use the SOAP serialization.
> 
> Technically right, but it would be *very* surprising and upsetting to
> folks who naively used the 4SS implementation to talk to other web
> services.  It might even cause them to spit venom at you.

Usually that's when things become fun.

However, you'll have to explain yourself better.  What is this naivete you're 
talking about?  If they're using a "conformant" SOAP client, there should be 
little such "surprise".  And they certainly should not be upset.

Even Dave Reed of Microsoft at XML DevCon was very careful to point out that 
the success of SOAP interop would come with proper handling of SOAP's 
flexibility.  Check your assumptions at the door or prepare to crash and burn.

If the major champion of SOAP can say so, especially after cooking up five of 
their own SOAP implementations and having to (admittedly) force-feed 
themselves interop, I don't see how I can credit your idea that anyone should 
be surprised or upset working with a system that doesn't implement section 5.

> > I'm actually more interested in writing an RDF serialization, and with some
> > support, it's not inconceivable that such a thing would oust section 5 before
> > XML Protocol emerges.
> 
> It's about as likely as someone accepting my DER encoding.

If you think you know the shape of what will come from XP, I think you have 
another thought coming.  The politics massed within this group are 
probably even heavier than those of XML Schema, and indeed the XP WG is 
larger than the Schema WG.

I can lay a solid bet that you won't recognize a significant amount of XP from 
what you see in SOAP.

But then again, anyone who followed XML-RPC -> SOAP should realize this isn't 
much of a prediction.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Thu May  3 05:40:28 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 02 May 2001 22:40:28 -0600
Subject: [XML-SIG] RE: SOAP for Python
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Wed, 02 May 2001 23:56:51 EDT." <3AF0D703.29636208@zolera.com>
Message-ID: <200105030440.f434eSb02098@localhost.local>

> > Until then, I rely on the fact that section 5 is explicitly optional.  There
> > is no requirement for a SOAP implementation to use the SOAP serialization.
> 
> Technically right, but it would be *very* surprising and upsetting to
> folks who naively used the 4SS implementation to talk to other web
> services.  It might even cause them to spit venom at you.

After I sent my last message another thought struck me.  You use the term "Web 
services" above.  Probably I have to understand what you mean by that before I 
understand why you think it would be surprising and upsetting to have SOAP 
systems that don't implement section 5.

The only reason everyone would want to "just stick to section 5" is for 
"transparent" API-type calls.  RPC all over again.  Basically CORBA with 
SOAP/HTTP over the wire rather than CDR/IIOP.

But what on earth is the use of such a thing?  Why not just use CORBA or DCOM 
or RMI, all of which are vastly more efficient than SOAP and can claim more 
pedigree and interop?

The answer is simple: because such tightly-coupled systems do not survive the 
boundary from one business technology and process to another.  Crossing such a 
boundary requires loosely-coupled systems, and that is the only reason there 
is any relevance to the buzzword "Web services".

Successful Web services will be message-oriented, loosely coupled systems with 
a great deal of flexibility that is handled through metadata management.  
Whether you're in the ebXML camp or the UDDI camp, you had better be taking 
those tModels, WSDL bindings and CPPs seriously, because if you just blindly 
write code that assumes that, say, everyone uses SOAP serialization, you will 
be doing commerce with only a fraction of your brave new market.
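
A receiver written in this spirit would inspect what a message declares instead of assuming section 5. The sketch below is one possible shape for that check, using only the standard library; declared_encoding_styles, choose_decoder, and the three returned labels are hypothetical names, and treating an absent encodingStyle as "literal XML" is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 namespace URIs.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"
ENCODING_ATTR = "{%s}encodingStyle" % SOAP_ENV


def declared_encoding_styles(envelope_xml):
    """Collect every encodingStyle URI declared anywhere in the envelope."""
    root = ET.fromstring(envelope_xml)
    styles = set()
    for element in root.iter():
        value = element.get(ENCODING_ATTR)
        if value:
            # The attribute holds a whitespace-separated list of URIs.
            styles.update(value.split())
    return styles


def choose_decoder(envelope_xml):
    """Pick a decoding strategy from what the message itself declares."""
    styles = declared_encoding_styles(envelope_xml)
    if not styles:
        return "literal-xml"
    if SOAP_ENC in styles:
        return "soap-section-5"
    return "unsupported"
```

The "unsupported" branch is the point of the exercise: a peer using some other serialization is detected up front rather than silently misread.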

This is why it was utter silliness for section 5 not to have been broken out 
of SOAP transport into a separate spec.  It encourages people to wrongly 
assume that SOAP implies section 5, and thereby condemn themselves to 
reinventing the RPC wheel all over again.

And I'll note that I'm not alone in this sentiment.  In past SOAP debates on 
XML-DEV, no lesser figures than Tim Bray and David Megginson have expressed 
similar annoyance at the conflation of transport and content model that mars 
SOAP.

So do I think it's realistic that section 5 will be put in its place before XP 
emerges?  Absolutely.  And in the unlikely event that this doesn't happen, Web 
services will pretty much drown in its own unfulfilled promises.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Wed May  2 23:53:07 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 3 May 2001 00:53:07 +0200
Subject: [XML-SIG] Proposing a web services SIG
In-Reply-To: <200105021840.f42Ie6D21877@localhost.local> (message from Uche
 Ogbuji on Wed, 02 May 2001 12:40:06 -0600)
References: <200105021840.f42Ie6D21877@localhost.local>
Message-ID: <200105022253.f42Mr7B01762@mira.informatik.hu-berlin.de>

> Well, all very well, and I can go either way on new SIG vs. just use
> XML-SIG, but does anyone know how to expeditiously go about creating
> a Python SIG?  I suppose it involves some magic incantations on the
> meta-SIG, but I don't know the current state-of-the-SIGS.

I just asked to close three of them, so it is probably time to fill
the empty space :-)

In any case, I think Rich's proposal is still missing an expiration/review
date for the SIG. Traditionally, SIGs used to expire after one
(?) year (after which they could be extended), but with the little
review they get after that time, reviewing them every two years is
probably just as fine.

In any case, this is all meta-sig business.

Regards,
Martin

P.S. There is also the issue of the SIG web pages. I'm still looking
for comments on whether they ought to live in the Python CVS, or in a
separate SF project (with check-in permissions for all SIG
coordinators).


From noreply@sourceforge.net  Thu May  3 09:58:41 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 03 May 2001 01:58:41 -0700
Subject: [XML-SIG] [ pyxml-Bugs-420977 ] 4XSLT traceback
Message-ID: <E14vEwb-0000Th-00@usw-sf-web3.sourceforge.net>

Bugs item #420977, was updated on 2001-05-03 01:58
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=420977&group_id=6473

Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: 4XSLT traceback

Initial Comment:
Hi,

I get a traceback when trying to process an XSLT
generated by schematron. The XSLT is attached to this
bug report. It could be a problem with the schematron
itself.

The document on which the XSLT is applied is '<recipe dacy="300"/>'.

The traceback is:

alf@lapinot:~/schematron$ 4xslt test.xml recipe.xsl
Traceback (innermost last):
  File "/usr/bin/4xslt", line 5, in ?
    _4xslt.Run(sys.argv)
  File "/usr/lib/python1.5/site-packages/xml/xslt/_4xslt.py", line 113, in Run
    topLevelParams=top_level_params)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 150, in runUri
    writer, uri, outputStream)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 250, in execute
    self.applyTemplates(context, None)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 267, in applyTemplates
    found = sty.applyTemplates(context, mode, self, params)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Stylesheet.py", line 430, in applyTemplates
    patternInfo[PatternInfo.TEMPLATE].instantiate(context, processor, params)
  File "/usr/lib/python1.5/site-packages/xml/xslt/TemplateElement.py", line 114, in instantiate
    context = child.instantiate(context, processor)[0]
  File "/usr/lib/python1.5/site-packages/xml/xslt/ApplyTemplatesElement.py", line 93, in instantiate
    processor.applyTemplates(context, mode, params)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 271, in applyTemplates
    self.applyBuiltins(context, mode)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 284, in applyBuiltins
    self.applyTemplates(context, mode)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Processor.py", line 267, in applyTemplates
    found = sty.applyTemplates(context, mode, self, params)
  File "/usr/lib/python1.5/site-packages/xml/xslt/Stylesheet.py", line 430, in applyTemplates
    patternInfo[PatternInfo.TEMPLATE].instantiate(context, processor, params)
  File "/usr/lib/python1.5/site-packages/xml/xslt/TemplateElement.py", line 112, in instantiate
    new_level)
  File "/usr/lib/python1.5/site-packages/xml/xslt/ChooseElement.py", line 61, in instantiate
    context, chosen, rec_tpl_params = child.instantiate(context, processor, new_level)
  File "/usr/lib/python1.5/site-packages/xml/xslt/WhenElement.py", line 43, in instantiate
    result = self._expr.evaluate(context)
  File "/usr/lib/python1.5/site-packages/xml/xpath/ParsedExpr.py", line 369, in evaluate
    rt = Conversions.BooleanEvaluate(self._right, context)
  File "/usr/lib/python1.5/site-packages/xml/xpath/Conversions.py", line 33, in BooleanEvaluate
    rt = exp.evaluate(context)
  File "/usr/lib/python1.5/site-packages/xml/xpath/ParsedExpr.py", line 408, in evaluate
    lrt = self._left.evaluate(context)
  File "/usr/lib/python1.5/site-packages/xml/xpath/ParsedExpr.py", line 180, in evaluate
    return self._func(context, arg0)
  File "/usr/lib/python1.5/site-packages/xml/xpath/CoreFunctions.py", line 300, in Floor
    if int(number) == number:
TypeError: object can't be converted to int


This is with 4Suite-0.11a2.
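
The last frame shows Floor calling int() on a value that is not a number. One plausible reading, offered as a guess rather than a diagnosis, is that the argument reached Floor without first being converted the way XPath's number() would convert it. The sketch below shows that coercion step in isolation; to_number and xpath_floor are hypothetical names, a Python list stands in for a node-set, and none of this is the 4Suite CoreFunctions.py code.

```python
import math


def to_number(value):
    """Rough stand-in for XPath number(): booleans map to 1.0/0.0, numbers
    pass through, strings are parsed, and a list (standing in for a
    node-set) is reduced to its first item."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, list):
        value = value[0] if value else ""
    try:
        return float(str(value).strip())
    except ValueError:
        return float("nan")


def xpath_floor(value):
    """Coerce to a number first, then take the floor; NaN and infinities
    are returned unchanged."""
    number = to_number(value)
    if math.isnan(number) or math.isinf(number):
        return number
    return float(math.floor(number))
```

With the reported document, an attribute value such as "300" would be coerced to 300.0 before the floor is taken, instead of being handed to int() as a non-numeric object.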

Cheers

Alexandre



From noreply@sourceforge.net  Thu May  3 12:29:30 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 03 May 2001 04:29:30 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421001 ] Undefined symbol XML_SetEntityDeclHandle
Message-ID: <E14vHIY-0005kW-00@usw-sf-web2.sourceforge.net>

Bugs item #421001, was updated on 2001-05-03 04:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421001&group_id=6473

Category: expat
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Undefined symbol XML_SetEntityDeclHandle

Initial Comment:
On FreeBSD 4.2 i386, with Python 2.0, PyXML 0.6.5, 4Suite 0.11a2 and 4SS 0.11a2 I get the following error:

  File "/usr/local/bin/4ss", line 3, in ?
    from FtServer.Console import CommandLine
  File "/usr/local/lib/python2.0/site-packages/FtServer/Console/CommandLine.py", line 3, in ?
    from Commands import g_commands
  File "/usr/local/lib/python2.0/site-packages/FtServer/Console/Commands/__init__.py", line 2, in ?
    import Init
  File "/usr/local/lib/python2.0/site-packages/FtServer/Console/Commands/Init.py", line 15, in ?
    from FtServer.Core.Lib import ConfigFile
  File "/usr/local/lib/python2.0/site-packages/FtServer/Core/Lib/ConfigFile.py", line 2, in ?
    from Ft.Rdf.Serializers.Dom import Serializer
  File "/usr/local/lib/python2.0/site-packages/Ft/Rdf/Serializers/Dom.py", line 27, in ?
    from Ft.Lib import pDomlette
  File "/usr/local/lib/python2.0/site-packages/Ft/Lib/pDomlette.py", line 668, in ?
    from pDomletteReader import *
  File "/usr/local/lib/python2.0/site-packages/Ft/Lib/pDomletteReader.py", line 27, in ?
    from xml.parsers import expat
  File "/usr/local/lib/python2.0/site-packages/_xmlplus/parsers/expat.py", line 4, in ?
    from pyexpat import *
ImportError: /usr/local/lib/python2.0/site-packages/_xmlplus/parsers/pyexpat.so: Undefined symbol "XML_SetEntityDeclHandler"
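
The ImportError says the compiled pyexpat module expects the symbol XML_SetEntityDeclHandler, which the expat library found at load time does not provide; that would be consistent with an older or mismatched expat, though the report does not confirm it. On an installation where the import succeeds, a small check like the following exercises the same handler. It uses the standard library's xml.parsers.expat; entity_declarations is a hypothetical helper name.

```python
from xml.parsers import expat


def entity_declarations(document):
    """Parse document and return (name, value) for each entity declaration."""
    seen = []
    parser = expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        seen.append((name, value))

    # Setting this attribute is what relies on XML_SetEntityDeclHandler
    # in the underlying expat library.
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(document, True)
    return seen


# The expat version string the module was built against.
expat_version = expat.EXPAT_VERSION
```

If the import itself fails as in the report, the version string is unavailable, and rebuilding pyexpat against the expat actually installed is the usual next thing to try.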




From rsalz@zolera.com  Thu May  3 14:58:45 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 03 May 2001 09:58:45 -0400
Subject: [XML-SIG] RE: SOAP for Python
References: <200105030423.f434Nga02058@localhost.local>
Message-ID: <3AF16415.A4BB8CD1@zolera.com>

> What is this naivete you're
> talking about?

If you asked 100 people who were building SOAP applications
	Did you know we could both be compliant but use different data
	transfers and therefore be unable to interoperate?
I'll bet more than half would be surprised, and more than 80% would say
"yes, but doesn't everyone at least support the common scheme."

I agree WSDL is way important, which is one of the motivators for a
web-services SIG.

I disagree that Sec5's inefficiencies doom it to failure, and its
installed base will be enough to ensure its viability.  But that's a
simple bet whose answer we'll know in a couple of years.  Not worth
arguing over.
	/r$


From uche.ogbuji@fourthought.com  Thu May  3 15:27:02 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 03 May 2001 08:27:02 -0600
Subject: [XML-SIG] RE: SOAP for Python
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Thu, 03 May 2001 09:58:45 EDT." <3AF16415.A4BB8CD1@zolera.com>
Message-ID: <200105031427.f43ER2004499@localhost.local>

> > What is this naivete you're
> > talking about?
> 
> If you asked 100 people who were building SOAP applications
> 	Did you know we could both be compliant but use different data
> 	transfers and therefore be unable to interoperate?
> I'll bet more than half would be surprised, and more than 80% would say
> "yes, but doesn't everyone at least support the common scheme."

As you said, time will tell, but you were talking about Web services, not 
applications.  I thought this is what the entire thread was about.  I can 
assure you that I have spoken to/worked with quite a few in the nascent Web 
services space, and that most have learned not to take anything for granted, 
as long as it is conformant.

You'd be surprised how much SOAP work is proceeding without Section 5.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From rsalz@zolera.com  Thu May  3 15:31:17 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 03 May 2001 10:31:17 -0400
Subject: [XML-SIG] RE: SOAP for Python
References: <200105031427.f43ER2004499@localhost.local>
Message-ID: <3AF16BB5.117B7D45@zolera.com>

> You'd be surprised how much SOAP work is proceeding without Section 5.

Life surprises me.
:)
	/r$


From stuff4gary@hotmail.com  Thu May  3 20:48:34 2001
From: stuff4gary@hotmail.com (gary cor)
Date: Thu, 03 May 2001 19:48:34
Subject: [XML-SIG] Deleting and appending of a file, without reading into memory
Message-ID: <F222KA0tjpaELFU5oot00012cc4@hotmail.com>

I want to add some text onto the end of an XML file just before the closing 
tag but I don't want to read the whole file into memory as it is quite a 
large file. I am trying to do the following:

1. delete 14 characters off the end of the file (the closing tag)
2. add some new data text from a cgi script onto this
     ie - file.append(cgi_resxml)
3.  - then add back on the closing tag (14 character '</root>')
     ie - file.append('</root>')

I can manage (2.) and (3.) with no problems by opening the file handle with 
append access ('a'), but I can't work out how to do (1.) as well. Does this 
append function have a reverse function that I can use, or should I be doing 
this a different way?

Kind Regards

Gary


From uche.ogbuji@fourthought.com  Thu May  3 21:14:46 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 3 May 2001 14:14:46 -0600
Subject: [XML-SIG] ANN: 4Suite  and 4Suite Server 0.11
Message-ID: <200105032014.f43KEkv08659@localhost.local>

Fourthought, Inc. (http://Fourthought.com) announces the release of

                   4Suite 0.11 and 4Suite Server 0.11
                      ----------------------------
          Open source XML processing tools and an XML data server

                           http://4Suite.org
                  http://Fourthought.com/4SuiteServer


4Suite Server News
------------------

    Basically re-written from ground up.  CORBA is no longer required
    and is now just another way to access the server (along with HTTP,
    SOAP, WebDAV, Python API, etc).

    Many usability, documentation, performance and architectural
    improvements


4Suite News
-----------

    * Release 0.11.0 (Tag R20010501)
    * pDomlette: XInclude implemented directly into parse for efficiency
    * pDomlette: better modularized
    * cDomlette: memory leaks squashed
    * RDF: add command line
    * RDF: major serialization and deserialization fixes
    * RDF: Work access-control directly into RDF model
    * RDF: API tweaks: use user flags for query flexibility
    * XSLT: Many speedups
    * XSLT: xsl:variable and xsl:param conformance fixes
    * ODS: Many bug fixes in the DbmAdapter
    * Lib: Many bug fixes in the DbmDriver
    * Many misc optimizations and bug-fixes


4Suite is a collection of Python tools for XML processing and object
database management.  It provides support for XML parsing, several
transient and persistent DOM implementations, XPath expressions,
XPointer, XSLT transforms, XLink, RDF and ODMG object databases.

4Suite Server is a platform for XML processing.  It features an XML data
repository, metadata management, a rules-based engine, XSLT transforms,
XPath and RDF-based indexing and query, XLink resolution and many other
XML services.  It also provides transactions and access control features.
Along with basic console and command-line management, it supports remote,
cross-platform and cross-language access through CORBA, WebDAV,
HTTP and other request protocols.

4Suite Server is not meant to be a full-blown application server.  It
provides highly-specialized services for XML processing that can be used
with other application servers.

All the software is open-source and free to download.  Priority support
and customization are available from Fourthought, Inc.  For more
information, see http://FourThought.com, or contact
Fourthought at info@fourthought.com or +1 303 583 9900.

More info and Obtaining 4Suite and 4Suite Server
------------------------------------------------

Please see

        http://4Suite.org
        http://Fourthought.com/4SuiteServer

from where you can download source, Windows and Linux binaries.

4Suite is distributed under a license similar to that of the
Apache Web Server.


From akuchlin@mems-exchange.org  Thu May  3 21:19:19 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 3 May 2001 16:19:19 -0400
Subject: [XML-SIG] Deleting and appending of a file, without reading into memory
In-Reply-To: <F222KA0tjpaELFU5oot00012cc4@hotmail.com>; from stuff4gary@hotmail.com on Thu, May 03, 2001 at 07:48:34PM +0000
References: <F222KA0tjpaELFU5oot00012cc4@hotmail.com>
Message-ID: <20010503161919.A3785@ute.cnri.reston.va.us>

On Thu, May 03, 2001 at 07:48:34PM +0000, gary cor wrote:
>I want to add some text onto the end of an XML file just before the closing 
>tag but I don't want to read the whole file into memory as it is quite a 
>large file. I am trying to do the following:
>
>1. delete 14 characters off the end of the file (the closing tag)
  ...

This is fragile; what if there is trailing whitespace at the end of
the file?  What if the closing tag is written strangely, as '< /
closing >' or something like that?  

Now, what's the best way to do this?  You could write a simple SAX
handler where startElement() and characters() printed their input to a
file or to standard output, and then have an endElement() that outputs
a closing tag, first checking if it's the root element and inserting
the extra content.  Is there a better way?
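
A minimal sketch of that SAX approach, using only the standard xml.sax
module.  The class and function names are made up for illustration, and the
handler is deliberately simple: it drops comments, processing instructions
and the XML declaration, and it assumes the extra content is already
well-formed markup.  It streams, so memory use stays small, but it does
rewrite the whole document.

```python
import io
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class AppendBeforeRootEnd(xml.sax.ContentHandler):
    """Echo the document and insert extra markup before the root end tag."""

    def __init__(self, out, extra_xml):
        super().__init__()
        self.out = out              # text stream that receives the copy
        self.extra_xml = extra_xml  # pre-serialized markup to insert
        self.depth = 0              # number of currently open elements

    def startElement(self, name, attrs):
        self.depth += 1
        self.out.write("<" + name)
        for key, value in attrs.items():
            self.out.write(" %s=%s" % (key, quoteattr(value)))
        self.out.write(">")

    def characters(self, content):
        self.out.write(escape(content))

    def endElement(self, name):
        self.depth -= 1
        if self.depth == 0:
            # We are about to close the root element: insert the new content.
            self.out.write(self.extra_xml)
        self.out.write("</%s>" % name)


def append_to_root(source, out, extra_xml):
    """Copy the XML in `source` to `out`, adding `extra_xml` inside the root."""
    xml.sax.parse(source, AppendBeforeRootEnd(out, extra_xml))


if __name__ == "__main__":
    result = io.StringIO()
    append_to_root(io.BytesIO(b"<root><x>old</x></root>"), result, "<x>new</x>")
    print(result.getvalue())
```

Tracking the element depth avoids having to know the root element's name in
advance.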

--amk


From Mike.Olson@fourthought.com  Thu May  3 21:31:10 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Thu, 03 May 2001 14:31:10 -0600
Subject: [XML-SIG] Deleting and appending of a file, without reading into
 memory
References: <F222KA0tjpaELFU5oot00012cc4@hotmail.com> <20010503161919.A3785@ute.cnri.reston.va.us>
Message-ID: <3AF1C00E.83423203@FourThought.com>

Andrew Kuchling wrote:
> 
> On Thu, May 03, 2001 at 07:48:34PM +0000, gary cor wrote:
> >I want to add some text onto the end of an XML file just before the closing
> >tag but I don't want to read the whole file into memory as it is quite a
> >large file. I am trying to do the following:
> >
> >1. delete 14 characters off the end of the file (the closing tag)
>   ...
> 
> This is fragile; what if there is trailing whitespace at the end of
> the file?  What if the closing tag is written strangely, as '< /
> closing >' or something like that?
> 
> Now, what's the best way to do this?  You could write a simple SAX
> handler where startElement() and characters() printed their input to a
> file or to standard output, and then have an endElement() that outputs
> a closing tag, first checking if it's the root element and inserting
> the extra content.  Is there a better way?

If the doc is that big, what about breaking it into smaller docs and
using XInclude?

Then to add a new section, load the "hub" document (which will be pretty
small now) and add a new include tag.  Then write the new content to the
referenced file.
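
A rough sketch of the hub-and-parts layout, written against the standard
library's ElementTree and ElementInclude modules rather than 4Suite.  The
function names and file layout are assumptions for illustration only.

```python
import os
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_NS = "http://www.w3.org/2001/XInclude"


def add_section(hub_path, part_name, part_xml):
    """Write part_xml to its own file and reference it from the hub."""
    part_path = os.path.join(os.path.dirname(hub_path), part_name)
    with open(part_path, "w", encoding="utf-8") as f:
        f.write(part_xml)
    ET.register_namespace("xi", XI_NS)
    tree = ET.parse(hub_path)  # the hub stays small, so parsing it is cheap
    ET.SubElement(tree.getroot(), "{%s}include" % XI_NS, href=part_name)
    tree.write(hub_path, encoding="utf-8", xml_declaration=True)


def load_expanded(hub_path):
    """Parse the hub and expand its includes relative to the hub's directory."""
    base = os.path.dirname(hub_path)

    def loader(href, parse, encoding=None):
        return ElementInclude.default_loader(
            os.path.join(base, href), parse, encoding)

    root = ET.parse(hub_path).getroot()
    ElementInclude.include(root, loader=loader)
    return root
```

Only the hub and the new part are touched when a section is added; the
existing parts are never read or rewritten.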

Mike


> 
> --amk
> 
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Thu May  3 23:04:22 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 4 May 2001 00:04:22 +0200
Subject: [XML-SIG] Deleting and appending of a file, without reading into memory
In-Reply-To: <20010503161919.A3785@ute.cnri.reston.va.us> (message from Andrew
 Kuchling on Thu, 3 May 2001 16:19:19 -0400)
References: <F222KA0tjpaELFU5oot00012cc4@hotmail.com> <20010503161919.A3785@ute.cnri.reston.va.us>
Message-ID: <200105032204.f43M4M401839@mira.informatik.hu-berlin.de>

> This is fragile; what if there is trailing whitespace at the end of
> the file?  What if the closing tag is written strangely, as '< /
> closing >' or something like that?  

If this CGI script is the only application that ever modifies the
document, the approach seems fine to me - although it is certainly
questionable why to use XML in the first place, here.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Thu May  3 23:03:22 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 4 May 2001 00:03:22 +0200
Subject: [XML-SIG] Deleting and appending of a file, without reading into memory
In-Reply-To: <F222KA0tjpaELFU5oot00012cc4@hotmail.com>
 (stuff4gary@hotmail.com)
References: <F222KA0tjpaELFU5oot00012cc4@hotmail.com>
Message-ID: <200105032203.f43M3Mi01837@mira.informatik.hu-berlin.de>

> 1. delete 14 characters off the end of the file (the closing tag)
> 2. add some new data text from a cgi script onto this
>      ie - file.append(cgi_resxml)
> 3.  - then add back on the closing tag (14 character '</root>')
>      ie - file.append('</root>')
> 
> I can manage (2.) and (3.) with no problems by opening the file handle with 
> append access ('a'), but I can't work out how to do (1.) as well. Does this 
> append function have a reverse function that I can use, or should I be doing 
> this a different way?

What kind of file object do you have that has an append function?

I'd use f.seek to go 14 characters before the end, and start writing
there. Some operating systems don't even support truncation to a
certain size; they all support positioning to a given offset, though.
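
A minimal sketch of the seek-based approach in present-day Python.  The
function name and the closing tag are placeholders; the file is opened in
'r+b' mode because append mode ('a') typically forces every write to the end
of the file regardless of any seek.  The check before writing guards against
the trailing-whitespace case mentioned elsewhere in this thread.

```python
import os


def append_before_closing_tag(path, new_xml, closing_tag=b"</root>"):
    """Overwrite the closing tag in place with new_xml followed by the tag."""
    with open(path, "r+b") as f:  # read/write, binary, no truncation
        f.seek(-len(closing_tag), os.SEEK_END)
        start = f.tell()
        if f.read(len(closing_tag)) != closing_tag:
            raise ValueError("file does not end with the expected closing tag")
        f.seek(start)             # back to where the closing tag begins
        f.write(new_xml)          # overwrites the old closing tag
        f.write(closing_tag)      # and puts it back at the new end
```

Since the new content plus the tag is always at least as long as the tag it
overwrites, no truncation is needed.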

Regards,
Martin


From uche.ogbuji@fourthought.com  Fri May  4 02:42:31 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 03 May 2001 19:42:31 -0600
Subject: [XML-SIG] A bit o' challenge
Message-ID: <200105040142.f441gVD10270@localhost.local>

OK, so the conventional wisdom lately has been that the Java processors such 
as Xalan and Saxon cream 4XSLT for performance across the board.  Alexandre 
Fayolle said that he thought they were "orders of magnitude" faster.

Well, I know that one always does better in his own benchmarking, but I have 
been working with 4XSLT quite heavily in the time leading up to the 0.11 
release, and I'm having trouble crediting this impression.  4XSLT is to my 
observations (and measurements using the time command-line timer) a good 25% 
faster than Saxon and faster by an even greater proportion than Xalan for most 
small to medium tasks.

I have indeed noticed that on huge documents, such as the "Cemetary" benchmark 
(3MB source), Saxon 6.0.2 is up to 4 times faster than 4XSLT (similar for 
Xalan), but this is still not "orders of magnitude" faster, and this only 
seems to be true for the size and type of document I'd only expect to process 
in benchmarks.

Now one note: I *always* use cDomlette.  It is much faster than pDomlette, and 
that is why I've declared that I'll be working to make it the default in 
4Suite as of 0.11.1.  Once again, I encourage everyone to help shake out any 
remnant bugs in cDomlette.  See this posting for more info:

http://lists.fourthought.com/pipermail/4suite/2001-April/001780.html

So here's the bit o' challenge.  I'm looking for regular-sized, real-world 
transforms in which Saxon or Xalan smoke 4XSLT.  If you have such test cases, 
and can reliably reproduce 4XSLT's lassitude using cDomlette, please send it 
my way so I can have a look (and maybe find the performance bugs that I'm too 
close to see).
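
For anyone who wants to report numbers, a small timing harness along these
lines may help keep the comparison reproducible.  It is only a sketch: the
function name is invented, and the command lines for each processor are left
to the reader, since they depend on the local installation.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv several times and return the best wall-clock time in seconds."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    # Placeholder command; substitute the processor invocation under test.
    print(time_command([sys.executable, "-c", "pass"]))
```

Taking the best of several runs reduces noise from disk caching.  Note that
this measures the whole process, so interpreter or JVM startup is included.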

I'm also interested, of course, in hearing positive reports about 4XSLT's 
performance.

So I say 4XSLT is competitive, and as far as I can tell, is usually faster 
than the Java processors (though we can't touch MSXML yet).

P.S. What got me pondering was DataPower's benchmark, which showed 
4XSLT some 20 times slower than the group of Java processors.  The nonsense 
behind this I was able to grasp with one glance at their tortured "driver" for 
4XSLT.   I've made my complaints about their incompetence, but here's your 
chance to show I'm all wet.

Thanks, all.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From noreply@sourceforge.net  Fri May  4 03:04:21 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 03 May 2001 19:04:21 -0700
Subject: [XML-SIG] [ pyxml-Patches-421217 ] ImportError should be AttributeError
Message-ID: <E14vUxB-0005Dn-00@usw-sf-web3.sourceforge.net>

Patches item #421217, was updated on 2001-05-03 19:04
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=421217&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Anderson (karlanderson)
Assigned to: Nobody/Anonymous (nobody)
Summary: ImportError should be AttributeError

Initial Comment:
I'm running a recent CVS checkout of PyXML, and have
4Suite 0.11a2 installed (I haven't installed 4Suite
0.11, but it doesn't look to be different in this
regard).
 
I'm running Python 1.5.2 under Redhat 6.2.

I get an attribute error when I try to import xslt.
StylesheetReader.py seems to be catching ImportError
when it should catch AttributeError for me.

The intended import seems to be
Ft.Lib.Error.XML_PARSE_ERROR, not
Ft.Lib.XML_PARSE_ERROR.
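
The distinction can be illustrated with a stand-in module.  This is not the
StylesheetReader code; fake_lib and resolve_parse_error are hypothetical
names used only to show which exception is raised.

```python
import types

# Stand-in for a package that imports fine but lacks the expected name.
fake_lib = types.ModuleType("fake_lib")


def resolve_parse_error(module, fallback="fallback-error-code"):
    """Return module.XML_PARSE_ERROR, or fallback if it is unavailable."""
    try:
        return module.XML_PARSE_ERROR
    except (ImportError, AttributeError):
        # Looking up a missing name on an already-imported module raises
        # AttributeError, which `except ImportError` alone would not catch.
        return fallback


if __name__ == "__main__":
    print(resolve_parse_error(fake_lib))
```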

This happens before my patch:

>>> import sys
>>> sys.path.insert(0,
'/home/karl/zope/dist/xml/PyXML-cvs-install/lib/python1.5/site-packages')
# where I installed from CVS
>>> import xml
>>> from xml.xslt import Processor
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File
"/home/karl/zope/dist/xml/PyXML-cvs-install/lib/python1.5/site-packages/xml/xslt/Processor.py",
line 24, in ?
    from xml.xslt import StylesheetReader, ReleaseNode
  File
"/home/karl/zope/dist/xml/PyXML-cvs-install/lib/python1.5/site-packages/xml/xslt/StylesheetReader.py",
line 67, in ?
    XML_PARSE_ERROR = Ft.Lib.XML_PARSE_ERROR
AttributeError: XML_PARSE_ERROR


From cperez@zulunet.net  Fri May  4 18:21:51 2001
From: cperez@zulunet.net (Carlos Perez)
Date: Fri, 4 May 2001 13:21:51 -0400
Subject: [XML-SIG] Looking for XML to Python sequence code.
In-Reply-To: <E14vi0v-00016h-00@mail.python.org>
Message-ID: <001b01c0d4be$b5c2a140$fd0aa8c0@CPEREZ>

I'm looking for some Python code that converts XML to a Python native
sequence object.
Does anyone know where to get it?

Thanks in advance...


From dieter@handshake.de  Fri May  4 19:19:41 2001
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 4 May 2001 20:19:41 +0200 (CEST)
Subject: [XML-SIG] Re: [4suite] A bit o' challenge
In-Reply-To: <782485507@toto.iv>
Message-ID: <15090.62141.831609.704073@lindm.dm>

Uche Ogbuji writes:
 > ...
 > Well, I know that one always does better in his own benchmarking, but I have 
 > been working with 4XSLT quite heavily in the time leading up to the 0.11 
 > release, and I'm having trouble crediting this impression.  4XSLT is to my 
 > observations (and measurements using the time command-line timer) a good 25% 
 > faster than Saxon and faster by an even greater proportion than Xalan for most 
 > small to medium tasks.
When I used 4XSLT for the last time, it was version 0.9.

I transformed a 240 kb DocBook/XML file into HTML using Norman Walsh's
DocBook stylesheets.

4XSLT needed about 50 MB memory and about 30 min CPU time (slow
Pentium 100 MHZ with 64 MB main memory).

A colleague of mine used Saxon for his DocBook/XML documentation,
also with Norman Walsh's stylesheets. Runtime was in the order
of a minute. I should say, it was a very different machine (Sun E450
with 256MB memory).
But nevertheless, I expect that after normalization Saxon
was several times faster than 4XSLT.

I was especially horrified by the high memory requirements.
The mentioned document is one out of eight chapters of a book.
In the final production, the complete book must be processed
together (to get correct links, table of contents, indexes,...).
I fear, I would need 200 MB memory and several hours of processing
time ....

 > ....
 > So here's the bit o' challenge.  I'm looking for regular-sized, real-world 
 > transforms in which Saxon or Xalan smoke 4XSLT.  If you have such test cases, 
 > and can reliably reproduce 4XSLT's lassitude using cDomlette, please send it 
 > my way so I can have a look (and maybe find the performance bugs that I'm too 
 > close to see).
I will give it a try, when 0.11 is released and report back.


Dieter


From rsalz@zolera.com  Fri May  4 20:36:27 2001
From: rsalz@zolera.com (Rich Salz)
Date: Fri, 04 May 2001 15:36:27 -0400
Subject: [XML-SIG] xmlproc bug?
Message-ID: <3AF304BB.D5ECB468@zolera.com>

If you feed() a unicode string into an xmlproc parser, Python barfs at
line 234
     # ignore unusal byte orders 2143 and 3412
     elif new_data[:2] == '\xfe\xff':
         enc = "utf-16-be" # with BOM

because apparently it is trying to convert the string to unicode and
it's got 8bit characters.

Not sure what the right thing to do is.  Here's a three-line script that
shows the fault:

from xml.parsers.xmlproc import xmlproc
z = xmlproc.XMLProcessor()
z.feed(u'<foo/>')

	/r$


From larsga@garshol.priv.no  Fri May  4 21:22:07 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 04 May 2001 22:22:07 +0200
Subject: [XML-SIG] xmlproc bug?
In-Reply-To: <3AF304BB.D5ECB468@zolera.com>
References: <3AF304BB.D5ECB468@zolera.com>
Message-ID: <m38zkckgpc.fsf@lambda.garshol.priv.no>

* Rich Salz
|
| If you feed() a unicode string into an xmlproc parser, Python barfs at
| line 234
|      # ignore unusal byte orders 2143 and 3412
|      elif new_data[:2] == '\xfe\xff':
|          enc = "utf-16-be" # with BOM
| 
| because apparently it is trying to convert the string to unicode and
| it's got 8bit characters.

The problem here is that we are trying to autodetect the encoding of a
Unicode string, but a Unicode string is already in Unicode and so
needs no decoding.

You can solve this by setting the decoded parameter to feed to 1, but
it would be better if you did not have to.

Fixed it by doing the following:

Index: xml/parsers/xmlproc/xmlutils.py
===================================================================
RCS file: /cvsroot/pyxml/xml/xml/parsers/xmlproc/xmlutils.py,v
retrieving revision 1.16
diff -c -r1.16 xmlutils.py
***************
*** 285,290 ****
--- 285,295 ----
  
          new_data = new_data+self.encoded_data
          self.encoded_data = ""
+ 
+         if not decoded and using_unicode and \
+            type(new_data) == types.UnicodeType:
+             decoded = 1
+         
          if not decoded and not self.charset_converter:
              self.autodetect_encoding(new_data)
              # If this returns with no auto-detected encoding, i.e.  if

I need to check it first before committing it, but this should solve
the problem. (Am waiting for glibc to download, so that I can compile
Python 2.1, so that I can actually test this. The download is going
slowly, so I am posting before the commit.)
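
The idea behind the patch, restated as a standalone sketch in present-day
Python (where str plays the role of the Unicode string and bytes the role of
the 8-bit string).  This is not xmlproc code; to_text is a hypothetical
helper, and it only sniffs byte-order marks rather than doing full XML
encoding detection.

```python
import codecs


def to_text(chunk, declared_encoding=None):
    """Return chunk as str, sniffing a BOM only when the input is bytes."""
    if isinstance(chunk, str):
        return chunk  # already decoded: there is nothing to autodetect
    if declared_encoding:
        return chunk.decode(declared_encoding)
    for bom, enc in ((codecs.BOM_UTF8, "utf-8"),
                     (codecs.BOM_UTF16_BE, "utf-16-be"),
                     (codecs.BOM_UTF16_LE, "utf-16-le")):
        if chunk.startswith(bom):
            return chunk[len(bom):].decode(enc)
    return chunk.decode("utf-8")  # assumed default when no BOM is present
```

The type check comes first so that byte-oriented comparisons are never
applied to text that has already been decoded.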

--Lars M.


From noreply@sourceforge.net  Fri May  4 21:36:15 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 04 May 2001 13:36:15 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421488 ] xslt processor stylesheet reader error
Message-ID: <E14vmJD-0003qf-00@usw-sf-web3.sourceforge.net>

Bugs item #421488, was updated on 2001-05-04 13:36
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421488&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Anderson (karlanderson)
Assigned to: Nobody/Anonymous (nobody)
Summary: xslt processor stylesheet reader error

Initial Comment:
Can't append stylesheet.  Stylesheet reader wants to
call initParser().

I'm not giving the processor a reader, it's using the
default.

When I run without Ft installed, the reader is
MinidomReader, which doesn't define this.

When I run with Ft installed from 4Suite 0.11, the
reader is DomletteReader, which also gives this error.

initParser is defined on the pDomletteReader
ReaderMixin class, but not defined anywhere on a reader
that the processor gets by default, AFAICT.

>>> p = Processor.Processor()
p.appendStylesheetString(sheet_4)

>>> Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File
"/home/karl/zope/dist/xml/PyXML-cvs-install/lib/python1.5/site-packages/xml/xslt/Processor.py",
line 106, in appendStylesheetString
    sty = self._styReader.fromString(text, baseUri)
  File
"/home/karl/zope/dist/xml/PyXML-cvs-install/lib/python1.5/site-packages/xml/xslt/minisupport.py",
line 62, in fromString
    return self.fromStream(st, baseUri, ownerDoc,
stripElements)
  File
"/home/karl/zope/dist/xml/PyXML-cvs-install/lib/python1.5/site-packages/xml/xslt/StylesheetReader.py",
line 305, in fromStream
    self.initParser()
AttributeError: initParser



From uche.ogbuji@fourthought.com  Fri May  4 21:52:25 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Fri, 04 May 2001 14:52:25 -0600
Subject: [XML-SIG] Re: [4suite] A bit o' challenge
References: <15090.62141.831609.704073@lindm.dm>
Message-ID: <3AF31689.A8895445@fourthought.com>

Dieter Maurer wrote:
> 
> Uche Ogbuji writes:
>  > ...
>  > Well, I know that one always does better in his own benchmarking, but I have
>  > been working with 4XSLT quite heavily in the time leading up to the 0.11
>  > release, and I'm having trouble crediting this impression.  4XSLT is to my
>  > observations (and measurements using the time command-line timer) a good 25%
>  > faster than Saxon and faster by an even greater proportion than Xalan for most
>  > small to medium tasks.
> When I used 4XSLT for the last time, it was version 0.9.
> 
> I transformed a 240 kb DocBook/XML file into HTML using Norman Walsh's
> DocBook stylesheets.
> 
> 4XSLT needed about 50 MB memory and about 30 min CPU time (slow
> Pentium 100 MHZ with 64 MB main memory).

I did specifically mention working with cDomlette.  Is that what you
were using?

> A colleague of mine used Saxon for his DocBook/XML documentation,
> also with Norman Walsh's stylesheets. Runtime was in the order
> of a minute. I should say, it was a very different machine (Sun E450
> with 256MB memory).
> But nevertheless, I expect that after normalization Saxon
> was several times faster than 4XSLT.
> 
> I was especially horrified by the high memory requirements.
> The mentioned document is one out of eight chapters of a book.
> In the final production, the complete book must be processed
> together (to get correct links, table of contents, indexes,...).
> I fear, I would need 200 MB memory and several hours of processing
> time ....

cDomlette takes up about half as much memory as pDomlette.  Because it
uses string pooling, the actual proportion may be somewhat more or less
in some cases.

When I checked with the 3MB cemetary demo, 4XSLT+cDom 0.11a2 took up
42MB and Saxon 6.0.2 took up 33MB of RAM.

>  > ....
>  > So here's the bit o' challenge.  I'm looking for regular-sized, real-world
>  > transforms in which Saxon or Xalan smoke 4XSLT.  If you have such test cases,
>  > and can reliably reproduce 4XSLT's lassitude using cDomlette, please send it
>  > my way so I can have a look (and maybe find the performance bugs that I'm too
>  > close to see).
> I will give it a try, when 0.11 is released and report back.

0.11 was released yesterday.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Fri May  4 22:39:33 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 4 May 2001 23:39:33 +0200
Subject: [XML-SIG] xmlproc bug?
In-Reply-To: <3AF304BB.D5ECB468@zolera.com> (message from Rich Salz on Fri, 04
 May 2001 15:36:27 -0400)
References: <3AF304BB.D5ECB468@zolera.com>
Message-ID: <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de>

> If you feed() a unicode string into an xmlproc parser, Python barfs at
> line 234
>      # ignore unusal byte orders 2143 and 3412
>      elif new_data[:2] == '\xfe\xff':
>          enc = "utf-16-be" # with BOM
> 
> because apparently it is trying to convert the string to unicode and
> it's got 8bit characters.
> 
> Not sure what the right thing to do is.

My intuition is that feeding Unicode objects is an error, but that may
be debatable.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Fri May  4 23:28:08 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 5 May 2001 00:28:08 +0200
Subject: [XML-SIG] Looking for XML to Python sequence code.
In-Reply-To: <001b01c0d4be$b5c2a140$fd0aa8c0@CPEREZ>
References: <001b01c0d4be$b5c2a140$fd0aa8c0@CPEREZ>
Message-ID: <200105042228.f44MS8m02386@mira.informatik.hu-berlin.de>

> I'm looking for some Python code that converts XML to a Python native
> sequence object.
> Does anyone know where to get it?

Are you looking for a specific structure of the sequence? If not,
try

seq = open(filename).read()

seq will be a Python native sequence object representing the XML
document :-)

Regards,
Martin


From noreply@sourceforge.net  Sat May  5 02:06:10 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Fri, 04 May 2001 18:06:10 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires '' NSURI
Message-ID: <E14vqWQ-0003ET-00@usw-sf-web1.sourceforge.net>

Bugs item #421553, was updated on 2001-05-04 18:06
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421553&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Anderson (karlanderson)
Assigned to: Nobody/Anonymous (nobody)
Summary: stylesheet node reader requires '' NSURI

Initial Comment:

I'm unable to use ParsedXML's DOM as a stylesheet node,
and I think
it's because of a bug in StylesheetReader.py.

The problem is at StylesheetReader.py line 186:

        if not sheet.getAttributeNS('', 'version'):
            raise
XsltException(Error.STYLESHEET_MISSING_VERSION)

...where the NamespaceURI given to getAttributeNS is
''.  This is
supposed to find the namespace-free version attribute
of the
stylesheet documentElement, such as
"""
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
""".

ParsedXML's DOM builder gives this attribute a
NamespaceURI of None
when we parse.

I don't think that you can use the DOM methods to
create a node with a
NamespaceURI of "", since the NamespaceURI is supposed
to be a URI
reference.  Is the empty string a valid URI reference? 
Well, maybe -
the DOM level 2 rec says:
"""
Note that because the DOM does no lexical checking, the
empty string
will be treated as a real namespace URI in DOM Level 2
methods.
Applications must use the value null as the
namespaceURI parameter for
methods if they wish to have no namespace.
"""
But anyway, this indicates that when using DOM creation
methods, a
None should be used as the NamespaceURI for
namespaceless nodes such
as "version", and I think that the stylesheet reader
should accept
that.
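
A small sketch of a lookup that tolerates both spellings of "no namespace".
It uses the standard library's minidom as a stand-in for ParsedXML's DOM
(an assumption made for illustration), and stylesheet_version is a
hypothetical helper, not part of StylesheetReader.py.

```python
from xml.dom import minidom

SHEET = """<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0"/>"""


def stylesheet_version(sheet):
    """Return the unprefixed version attribute, trying None and then ''."""
    return (sheet.getAttributeNS(None, "version")
            or sheet.getAttributeNS("", "version"))


if __name__ == "__main__":
    doc = minidom.parseString(SHEET)
    print(stylesheet_version(doc.documentElement))
```

Trying None first follows the DOM Level 2 wording quoted above; the
fallback to '' keeps builders that use the empty string working.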



From larsga@garshol.priv.no  Sat May  5 10:26:45 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 05 May 2001 11:26:45 +0200
Subject: [XML-SIG] xmlproc bug?
In-Reply-To: <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de>
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de>
Message-ID: <m31yq4jgdm.fsf@lambda.garshol.priv.no>

* Martin v. Loewis
| 
| My intuition is that feeding Unicode objects is an error, but that may
| be debatable.

I see no reason why it should be. If the application is converting to
Unicode itself, or if it got the data from somewhere as Unicode, there
is no reason why it should not be allowed to parse those data.

--Lars M.


From martin@loewis.home.cs.tu-berlin.de  Sat May  5 14:12:01 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 5 May 2001 15:12:01 +0200
Subject: [XML-SIG] xmlproc bug?
In-Reply-To: <m31yq4jgdm.fsf@lambda.garshol.priv.no> (message from Lars Marius
 Garshol on 05 May 2001 11:26:45 +0200)
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no>
Message-ID: <200105051312.f45DC1401103@mira.informatik.hu-berlin.de>

> I see no reason why it should be. If the application is converting to
> Unicode itself, or if it got the data from somewhere as Unicode, there
> is no reason why it should not be allowed to parse those data.

I agree in principle. However, just allowing feed to be called with a
Unicode object is too permissive: What if you had previously called it
with a string?

So if this is allowed, care should be taken that a sensible thing
happens when somebody mixes byte and unicode strings (signalling a
fatal error might be sensible).
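
A minimal sketch of such a check, under stated assumptions: the class
ConsistentFeed is hypothetical and is not part of xmlproc, and it is
written for current Python, where the two kinds of input are bytes and
str (at the time of this thread they were str and unicode).  It wraps
any feed-style callable and raises an error as soon as the kinds are
mixed within one parse.

```python
class ConsistentFeed:
    """Wrap a feed(data) callable and reject mixed bytes/text input.

    Hypothetical helper for illustration; not part of xmlproc.
    """

    def __init__(self, feed):
        self._feed = feed   # the underlying parser's feed method
        self._kind = None   # becomes bytes or str on the first call

    def feed(self, data):
        if isinstance(data, (bytes, bytearray)):
            kind = bytes
        elif isinstance(data, str):
            kind = str
        else:
            raise TypeError("feed() expects a byte string or a text string")

        if self._kind is None:
            # The first chunk decides the kind for the whole parse.
            self._kind = kind
        elif kind is not self._kind:
            raise ValueError(
                "cannot mix byte strings and Unicode strings in one parse")
        self._feed(data)
```

Raising an exception is only one possible reading of "fatal error"; a
real parser would more likely report it through its error handler.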

Regards,
Martin


From tpassin@home.com  Sat May  5 17:11:46 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sat, 5 May 2001 12:11:46 -0400
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de>
Message-ID: <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com>

I've been able to get 4SuiteServer working on Windows98/Me, but it doesn't
quite work right out of the box as downloaded.  Here's what I needed to do
to get it working.

I did all the steps in the installation and quickstart guide.  Once it's
installed using the Windows installer, you set up your environment
variables, then you are told to run

4ss init

1) 4ss.bat is in the python\scripts directory, so you have to add it to your
path or have to be running in that directory.

2) The init command fails because the file "core.odl" is not installed into
the "generated" directory (or anywhere else) by the installer.  I downloaded
the source distribution, found the file, and copied it into the generated
directory.

Now init works.

3) init works, but when it asks if you want to wipe out the old data, it
wants you to answer "yes" or "no".  Most Windows users are used to being
able to answer 'y' or 'n' to those questions.  I did, and didn't even notice
that I hadn't literally done what the prompt said.  Very Unix-like.  Very
unforgiving.  This code should be changed to allow "y" and "n" as well.

4) The quick start guide has you run the script populate.py in the
python\docs\4SuiteServer-0.11\demo directory.  But it fails, looking for a
unix file, something like /etc/mime.types.  The script has a test for this
file and an except branch to run in case it doesn't exist (which it doesn't
on a Windows machine).  But the except branch incorrectly has a "raise"
statement which terminates the script.

Get rid of this line, which is line 66 of populate.py. Now the script runs.

5) At this point, populate installed its downloaded files but failed when it
tried to modify "docdefs".  It turns out you have to be running as superuser
to change docdefs.  The guide doesn't tell you, but implies that you should
have run as the new user it just had you create.  Otherwise, why create that
user just before running populate.py?

I deleted the whole "gems" container and went through the steps again as
superuser.

6) Then I tried to install and run the guestbook.  You have to run the
"bootstrap.py" script in the demo\GuestBook directory.  This failed.  It
turned out that you have to change to the GuestBook directory and run from
that, otherwise the script can't find the files it needs.

7) The Guestbook works until you try to submit the form for your first
guest.  Then it fails, but in a strange way.  With IE, I got an error
message saying it couldn't find the server or there was a DNS error.  This
must be an incorrect message since the form uses a relative path, but anyway
something isn't working that I haven't tracked down.

8) The docs give examples of looking at various properties by their path, as
in

4ss show acl /localhost/index.html

None of these commands have worked for me.  I had to remove the /localhost/
part.   My server was running at the time.

I think there was one more change I made to get init to work - there is a
path with a unix-style "/" hardcoded somewhere - but unfortunately I forget
just where and can't find it right now.  If this strikes, you should be able
to find it from the error message.
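
For what it's worth, the usual way to avoid a hardcoded separator is to
let os.path build the path.  A minimal sketch; resource_path is a
hypothetical helper, and the "generated"/"core.odl" names are only
borrowed from item 2 above as sample values, not taken from the 4Suite
source.

```python
import ntpath
import os
import posixpath


def resource_path(base_dir, *parts):
    """Build a path with the separator of the platform the code runs on."""
    return os.path.join(base_dir, *parts)


# os.path is ntpath on Windows and posixpath on Unix; calling the two
# modules directly shows what the same join yields on each platform.
windows_style = ntpath.join("generated", "core.odl")
unix_style = posixpath.join("generated", "core.odl")
portable = resource_path("generated", "core.odl")
```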

It runs now - I have it on port 8090 to avoid colliding with Zope in 8080.
Good luck!

Cheers,

Tom P


From Mike.Olson@fourthought.com  Sat May  5 23:43:44 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sat, 05 May 2001 16:43:44 -0600
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de> <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com>
Message-ID: <3AF48220.B9E97B23@FourThought.com>

"Thomas B. Passin" wrote:
> 

Thomas, thanks for all of the work.  I'm working today on getting these
straightened out.

> 
> 4ss init
> 
> 1) 4ss.bat is in the python\scripts directory, so you have to add it to your
> path or have to be running in that directory.

This is in the Windows Installation guide.

see

http://4suite.org/4Suite.org/documents/guides/4SuiteServer/Windows_Installation

Towards the end of the Installing 4SuiteServer section.

> 
> 2) The init command fails because the file "core.odl" is not installed into
> the "generated" directory (or anywhere else) by the installer.  I downloaded
> the source distribution, found the file, and copied into the generated
> directory.

This was a packaging bug.  We will be putting out new Windows packages
today.

> 
> Now init works.
> 
> 3) init works, but when it asks if you want to wipe out the old data, it
> wants you to answer "yes" or "no".  Most Windows users are used to being
> able to answer 'y' or 'n' to those questions.  I did, and didn't even notice
> that I hadn't literally done what the prompt said.  Very Unix-like.  Very
> unforgiving.  This code should be changed to allow "y" and "n" as well.

I'll make this more forgiving, and more informative.
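
A minimal sketch of what a more forgiving prompt could look like.  The
function names parse_yes_no and confirm are hypothetical; this is not
the code in 4ss, just one way to accept "y" and "n" alongside "yes" and
"no" and to re-ask on anything else.

```python
def parse_yes_no(answer):
    """Return True for yes, False for no, None if the answer is unclear."""
    normalized = answer.strip().lower()
    if normalized in ("y", "yes"):
        return True
    if normalized in ("n", "no"):
        return False
    return None


def confirm(prompt, read=input):
    """Keep asking until the user gives a recognisable answer."""
    while True:
        result = parse_yes_no(read(prompt + " [yes/no]: "))
        if result is not None:
            return result
        print("Please answer 'yes' or 'no' (or 'y' / 'n').")
```

The read parameter exists only so the prompt can be exercised without
a terminal; by default it reads from standard input.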

> 
> 4) The quick start guide has you run the script populate.py in the
> python\docs\4SuiteServer-0.11\demo directory.  But it fails, looking for a
> unix file, something like /etc/mime.types.  The script has a test for this
> file and an except branch to run in case it doesn't exist (which it doesn't
> on a Windows machine).  But the except branch incorrectly has a "raise"
> statement which terminates the script.
> 
> Get rid of this line, which is line 66 of populate.py. Now the script runs.

Fixed in CVS, thanks.
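
The actual change in CVS is not shown in this thread.  As a sketch of
the general idea only, assuming a hypothetical helper load_mime_types
(not the code in populate.py): read the Unix file when it exists and
fall back to the defaults of Python's mimetypes module when it does
not, instead of re-raising.

```python
import mimetypes


def load_mime_types(path="/etc/mime.types"):
    """Return an extension -> MIME type mapping, tolerating a missing file."""
    # Start from the defaults shipped with Python, e.g. ".html".
    types = dict(mimetypes.types_map)
    try:
        with open(path) as mime_file:
            lines = mime_file.readlines()
    except OSError:
        # No such file (typical on Windows): keep the defaults and
        # carry on rather than terminating the script.
        return types

    for line in lines:
        # Format: "type/subtype ext1 ext2 ...", with "#" comments.
        fields = line.split("#", 1)[0].split()
        if len(fields) > 1:
            for ext in fields[1:]:
                types["." + ext.lower()] = fields[0]
    return types
```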

> 
> 5) At this point, populate installed its downloaded files but failed when it
> tried to modify "docdefs".  It turns out you have to be running as superuser
> to change docdefs.  The guide doesn't tell you, but implies that you should
> have run as the new user it just had you create.  Otherwise, why create that
> user just before running populate.py?

I updated the docs to say that populate needs to be run as super user. 
I might change it so that any user can create a document definition
though.

> 
> 6) Then I tried to install and run the guestbook.  You have to run the
> "bootstrap.py" script in the demo\GuestBook directory.  This failed.  It
> turned out that you have to change to the GuestBook directory and run from
> that, otherwise the script can't find the files it needs.

I updated the README

> 
> 7) The Guestbook works until you try to submit the form for your first
> guest.  Then it fails, but in a strange way.  With IE, I got an error
> message saying it couldn't find the server or there was a DNS error.  This
> must be an incorrect message since the form uses a relative path, but anyway
> something isn't working that I haven't tracked down.

I'll have to look into this one a bit closer....

> 
> 8) The docs give examples of looking at various properties by their path, as
> in
> 
> 4ss show acl /localhost/index.html

Did you get an error of:
Uri /localhost/index.html, is unknown

> 
> None of these commands have worked for me.  I had to remove the /localhost/
> part.   My server was running at the time.

Did it work when you removed the localhost part?  Then you probably have
a document in your root called index.html.  Probably from the Guestbook
example.  That shouldn't put things in your root.


Thanks again for your help.  Hopefully it is getting easier to install.

Mike


-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From tpassin@home.com  Sun May  6 05:02:00 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sun, 6 May 2001 00:02:00 -0400
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de> <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com> <3AF48220.B9E97B23@FourThought.com>
Message-ID: <000c01c0d5e1$4d89ac80$7cac1218@reston1.va.home.com>

[Tom]
> >
> > 8) The docs give examples of looking at various properties by their
> > path, as in
> >
> > 4ss show acl /localhost/index.html
>

[Mike Olson]
> Did you get an error of:
> Uri /localhost/index.html, is unknown
>
Yes

[Tom]
> >
> > None of these commands have worked for me.  I had to remove the
> > /localhost/ part.  My server was running at the time.

[Mike]
> Did it work when you removed the localhost part?  Then you probably have
> a document in your root called index.html.  Probably from the Guestbook
> example.  That shouldn't put things in your root.
>
No, same message with or without /localhost.

Here are two screen captures:

D:>4ss show acl /localhost/gems/
"d:\program files\python\python" -c "from FtServer.Console import CommandLine; CommandLine.Run()" show acl /localhost/gems/
4SS User Name: dba
Uri /localhost/gems, is unknown


D:>4ss show acl gems/
"d:\program files\python\python" -c "from FtServer.Console import CommandLine; CommandLine.Run()" show acl gems/
4SS User Name: dba
Resource: gems/
----------

Read ACL: ['dba']
Write ACL: ['admin']
You can read this object
You can modify this object


As for an index.html in the root:

D:>4ss fetch document /localhost/index.html
"d:\program files\python\python" -c "from FtServer.Console import CommandLine; CommandLine.Run()" fetch document /localhost/index.html
4SS User Name: dba
Uri /localhost/index.html, is unknown


I just noticed that, at http://localhost:8090/ (my 4ss site), there is a
"folder" called localhost/.  Is that how it's supposed to be?  If so, I'd
suggest changing the name because it could get confused (by a user - me for
example!) with the "localhost" alias for 127.0.0.1.

Thanks for your help.

Tom P


From Mike.Olson@fourthought.com  Sun May  6 05:13:24 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sat, 05 May 2001 22:13:24 -0600
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de> <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com> <3AF48220.B9E97B23@FourThought.com> <000c01c0d5e1$4d89ac80$7cac1218@reston1.va.home.com>
Message-ID: <3AF4CF64.A0B32045@FourThought.com>

"Thomas B. Passin" wrote:
> 
> 
> 
> I just noticed that, at http://localhost:8090/ ( my 4ss site), there is a
> "folder" called localhost/.  Is that how it's supposed to be?  If so, I'd
> suggest changing the name because it could get confused (by a user - me for
> example!) with the "localhost" alias for 127.0.0.1.

Yes we need to change the default name of the SystemHost directory.  It
was less confusing when we put http in front of all of the URIs, now it
is just confusing.  I was thinking of calling it "etc" but I think
Windows folks might not like that.

Thoughts?


Mike


> 
> Thanks for your help.
> 
> Tom P
> 
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From tpassin@home.com  Sun May  6 05:26:01 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sun, 6 May 2001 00:26:01 -0400
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de> <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com> <3AF48220.B9E97B23@FourThought.com>
Message-ID: <001001c0d5e4$a896f8a0$7cac1218@reston1.va.home.com>

Another problem: the 4ss test_suite fails with this message:

D:>test.py
4SS User Name: dba
==== D:\Program Files\Python\Doc\4SuiteServer-0.11\test_suite ===
==== D:\Program Files\Python\Doc\4SuiteServer-0.11\test_suite\Core ===
Traceback (innermost last):
  File "D:\Program Files\Python\Doc\4SuiteServer-0.11\test_suite\test.py", line 29, in ?
    test(tester)
  File "D:\Program Files\Python\Doc\4SuiteServer-0.11\test_suite\test.py", line 18, in test
    m.test(tester)
  File "D:\Program Files\Python\Doc\4SuiteServer-0.11\test_suite\test.py", line 14, in test
    os.chdir(dir)
OSError: [Errno 2] No such file or directory: 'Core'


I ran this script from the test_suite directory.  Note that I added an extra
print statement to see what directory it couldn't find.  It looks as if the
test.py script calls itself the second time rather than calling the test.py
located in the test_suite\Core directory.  I'm sure this wasn't intended.

This would be a good time for me to put in a plug: scripts that depend
on knowing where other files are relative to themselves should detect
their own location.  You may have to make the script a module to do this
reliably (I'm not fully up on all the ins and outs, but if you do it right
then __file__ gives you the full path to the script).
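
A minimal sketch of that idea, written for current Python, where
__file__ is normally defined for scripts as well as for imported
modules.  The helpers script_dir and sibling are hypothetical names,
and the sys.argv[0] fallback is an assumption for environments where
__file__ is missing.

```python
import os
import sys


def script_dir():
    """Directory holding this script, independent of the current directory."""
    try:
        here = __file__
    except NameError:
        # Fallback for contexts where __file__ is not defined.
        here = sys.argv[0]
    return os.path.dirname(os.path.abspath(here))


def sibling(*parts):
    """Path of a file located relative to this script."""
    return os.path.join(script_dir(), *parts)
```

A test driver could then open sibling("Core", "test.py") directly
instead of relying on os.chdir and the directory it was started from.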

Cheers,

Tom P


From tpassin@home.com  Sun May  6 05:29:28 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sun, 6 May 2001 00:29:28 -0400
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de> <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com> <3AF48220.B9E97B23@FourThought.com> <000c01c0d5e1$4d89ac80$7cac1218@reston1.va.home.com> <3AF4CF64.A0B32045@FourThought.com>
Message-ID: <001701c0d5e5$23fa30c0$7cac1218@reston1.va.home.com>

[Mike Olson]
> "Thomas B. Passin" wrote:
> >
> > I just noticed that, at http://localhost:8090/ (my 4ss site), there is
> > a "folder" called localhost/.  Is that how it's supposed to be?  If so,
> > I'd suggest changing the name because it could get confused (by a user -
> > me for example!) with the "localhost" alias for 127.0.0.1.
>
> Yes we need to change the default name of the SystemHost directory.  It
> was less confusing when we put http in front of all of the URIs, now it
> is just confusing.  I was thinking of calling it "etc" but I think
> Windows folks might not like that.
>

[Tom]
Depends on what you intend it to be for.  It should have an evocative name.
I see the docdefs and acl stuff in mine.  How about sscfg?

Cheers,

Tom P


From Mike.Olson@fourthought.com  Sun May  6 09:11:06 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 06 May 2001 02:11:06 -0600
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de> <000b01c0d57e$15cfa820$7cac1218@reston1.va.home.com> <3AF48220.B9E97B23@FourThought.com> <000c01c0d5e1$4d89ac80$7cac1218@reston1.va.home.com> <3AF4CF64.A0B32045@FourThought.com> <001701c0d5e5$23fa30c0$7cac1218@reston1.va.home.com>
Message-ID: <3AF5071A.7AE1CA56@FourThought.com>

"Thomas B. Passin" wrote:
> >
> > Yes we need to change the default name of the SystemHost directory.  It
> > was less confusing when we put http in front of all of the URIs, now it
> > is just confusing.  I was thinking of calling it "etc" but I think
> > Windows folks might not like that.
> >
> 
> [Tom]
> Depends on what you intend it to be for.  It should have an evocative name.
> I see the docdefs and acl stuff in mine.  How about sscfg?

Maybe just system

Mike

> 
> Cheers,
> 
> Tom P
> 

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Sun May  6 14:50:42 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 06 May 2001 07:50:42 -0600
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
In-Reply-To: Message from Mike Olson <Mike.Olson@fourthought.com>
 of "Sun, 06 May 2001 02:11:06 MDT." <3AF5071A.7AE1CA56@FourThought.com>
Message-ID: <200105061350.f46Dog503966@localhost.local>

> "Thomas B. Passin" wrote:
> > >
> > > Yes we need to change the default name of the SystemHost directory.  It
> > > was less confusing when we put http in front of all of the URIs, now it
> > > is just confusing.  I was thinking of calling it "etc" but I think
> > > Windows folks might not like that.
> > >
> > 
> > [Tom]
> > Depends on what you intend it to be for.  It should have an evocative name.
> > I see the docdefs and acl stuff in mine.  How about sscfg?
> 
> Maybe just system

Well, I still think it should be configurable (as it used to be, if only 
through the host-name).  Don't forget our non-English-speaking friends, and 
others who might want a user folder called "system".

For a default, I favor "4sssystem", "sys4ss" or something like that which is 
unlikely to clash with user need.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From tpassin@home.com  Sun May  6 15:17:36 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sun, 6 May 2001 10:17:36 -0400
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <200105061350.f46Dog503966@localhost.local>
Message-ID: <000d01c0d637$4cfd7c00$7cac1218@reston1.va.home.com>

[Uche Ogbuji]
> Well, I still think it should be configurable (as it used to be, if only
> through the host-name).  Don't forget our non-English-speaking friends,
> and others who might want a user folder called "system".
>
> For a default, I favor "4sssystem", "sys4ss" or something like that which
> is unlikely to clash with user need.
>

Right on.

Tom P


From Mike.Olson@fourthought.com  Sun May  6 19:24:22 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 06 May 2001 12:24:22 -0600
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <200105061350.f46Dog503966@localhost.local>
Message-ID: <3AF596D6.5AACFD17@FourThought.com>

Uche Ogbuji wrote:
> 
> > "Thomas B. Passin" wrote:
> > > >
> > > > Yes we need to change the default name of the SystemHost directory.  It
> > > > was less confusing when we put http in front of all of the URIs, now it
> > > > is just confusing.  I was thinking of calling it "etc" but I think
> > > > Windows folks might not like that.
> > > >
> > >
> > > [Tom]
> > > Depends on what you intend it to be for.  It should have an evocative name.
> > > I see the docdefs and acl stuff in mine.  How about sscfg?
> >
> > Maybe just system
> 
> Well, I still think it should be configurable (as it used to be, if only
> through the host-name).  Don't forget our non-English-speaking friends, and
> others who might want a user folder called "system".

It would still be configurable through the SystemHost parameters.  Maybe
this should be renamed to the SystemContainer parameter.

> 
> For a default, I favor "4sssystem", "sys4ss" or something like that which is
> unlikely to clash with user need.

I like sys4ss then.

Mike

> 

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From tpassin@home.com  Sun May  6 20:05:01 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Sun, 6 May 2001 15:05:01 -0400
Subject: [XML-SIG] Getting 4SuiteServer 0.11 Working on Windows
References: <200105061350.f46Dog503966@localhost.local> <3AF596D6.5AACFD17@FourThought.com>
Message-ID: <002901c0d65f$73fbe8a0$7cac1218@reston1.va.home.com>

[Mike Olson]

> > For a default, I favor "4sssystem", "sys4ss" or something like that
> > which is unlikely to clash with user need.
>
> I like sys4ss then.
>
Suits me (or is that "4suites me"???)

Tom P


From noreply@sourceforge.net  Mon May  7 10:46:17 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Mon, 07 May 2001 02:46:17 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421978 ] pDomlette reader bug
Message-ID: <E14whar-0004Yn-00@usw-sf-web3.sourceforge.net>

Bugs item #421978, was updated on 2001-05-07 02:46
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421978&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: pDomlette reader bug

Initial Comment:
Hi,

I'm trying to build a pDomlette with a custom SAX
parser, and it looks like the provided handler expects
the parser to implement a SetBase method. I could not
find it in the SAX documentation.

Providing an empty SetBase() method leads to errors
when accessing parseFile() (instead of parse()), and
further errors in the except clause.

Did I miss something or is this a bug?

Alexandre Fayolle


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421978&group_id=6473


From larsga@garshol.priv.no  Mon May  7 13:27:47 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 07 May 2001 14:27:47 +0200
Subject: [XML-SIG] xmlproc bug?
In-Reply-To: <200105051312.f45DC1401103@mira.informatik.hu-berlin.de>
References: <3AF304BB.D5ECB468@zolera.com> <200105042139.f44LdXN01714@mira.informatik.hu-berlin.de> <m31yq4jgdm.fsf@lambda.garshol.priv.no> <200105051312.f45DC1401103@mira.informatik.hu-berlin.de>
Message-ID: <m37kztl4xo.fsf@lambda.garshol.priv.no>

* Lars Marius Garshol
|
| I see no reason why it should be. If the application is converting
| to Unicode itself, or if it got the data from somewhere as Unicode,
| there is no reason why it should not be allowed to parse those data.

* Martin v. Loewis
| 
| I agree in principle. However, just allowing feed to be called with a
| Unicode object is too permissive: What if you had previously called it
| with a string?

Good point. One should have to stick to either Unicode or byte strings
throughout a single parse.

Looking at the code I think it makes sense to require client code to
also be consistent in its use of the 'decoded' flag. That is, decoded
should always have the same value throughout an entire parse.
 
| So if this is allowed, 

It is allowed now, since I've committed my change.

| care should be taken that a sensible thing happens when somebody
| mixes byte and unicode strings (signalling a fatal error might be
| sensible).

I agree.

I am working on the modification now and will commit it shortly.

--Lars M.


From akrug@mps.de  Mon May  7 17:31:53 2001
From: akrug@mps.de (Arne Krug)
Date: Mon, 7 May 2001 18:31:53 +0200
Subject: [XML-SIG] sample code - msxml
Message-ID: <3AF6EA19.6350.2FF8E8@localhost>

Does anyone have sample code for using the 
SAXXMLReader of the Microsoft parser MSXML 
in Python?

arne

--- Arne Krug:                          ---
---            ufcx@rz.uni-karlsruhe.de --- 
---            akrug@mps.de             ---


From uche.ogbuji@fourthought.com  Tue May  8 00:16:34 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Mon, 07 May 2001 17:16:34 -0600
Subject: [XML-SIG] Curiouser and curiouser
Message-ID: <200105072316.f47NGYU31415@localhost.local>

Quote from anonymous source in http://xmlhack.com/read.php?item=1203

"The charter of the XML Protocols WG isn't to invent anything new."

I don't know how solid this particular source is, but this comment would seem to support Rich Salz's position in our debate from last week.

However, one of the secret weapons I had in that debate was that I'd happened to attend the W3C Web Services workshop last month, and I can certainly say that the above quote completely contradicts every sense I got from that meeting.

I think the politics of XML protocols and Web services will be white hot.  It might even be a bloodier battlefield than the notorious XML Schemas.  The camps appear to be roughly:

* Just use SOAP as-is and rubber-stamp WSDL and UDDI to boot (the camp that seems to be represented in the above quote)
* Take the good parts of SOAP, mix in a bit of "transactions" here, a dash of PKI there, a smidgen of EAI voodoo, and...
* This is EDI + Internet transport + XML payload + semantic Web, folks: quit reinventing wheels (the camp I occupy)

I think I can say firsthand that all camps have powerful adherents.

Don't ask me what the hell this means for Python efforts...


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From rsalz@zolera.com  Tue May  8 04:59:00 2001
From: rsalz@zolera.com (Rich Salz)
Date: Mon, 07 May 2001 23:59:00 -0400
Subject: [XML-SIG] Curiouser and curiouser
References: <200105072316.f47NGYU31415@localhost.local>
Message-ID: <3AF76F04.E38CC24D@zolera.com>

> "The charter of the XML Protocols WG isn't to invent anything new."

According to the XP charter
(http://www.w3.org/2000/09/XML-Protocol-Charter), "The Working group
shall start by developing a requirements document, and then evaluate the
technical solutions proposed in the SOAP/1.1 submission against these
requirements. If in this process the Working Group finds solutions that
are agreed to be improvements over solutions suggested by SOAP 1.1,
those improved solutions should be used."

Now I find that phrase "agreed to be improvements" rife with all sorts
of potential.  Certainly one could make a case that a new preferred
encoding that is non-interoperable with the deployed base of Sec 5
encodings is NOT an improvement, overall. :)

I knew you were at the WS workshop, and that I was basing my opinions
solely on the public record, but that's okay.  I've served my time in
standards activities and consortia, and I can hazard a guess as to what
will happen.  The same thing that always happens:  folks want holes put
in so they can plug in their own "embrace and extend" or "optimized"
version of the current protocol.  Well, since the encodings are
specified by namespace, the holes are already there. :)  So XP will
tighten up the wording, remove ambiguity, and not break interop.


> I think the politics of XML protocols and Web services will be white hot.

I don't disagree.

> The camps appear to be roughly:

Interesting analysis, thanks!

> * Just use SOAP as-is and rubber-stamp WSDL and UDDI to boot ...
> * Take the good parts of SOAP, mix in a bit of "transactions" here, a dash of PKI there, a smidgen of EAI voodoo, and...

These aren't mutually exclusive, since #2 is presumably a subset of #1.

As a security expert, I question the need for signed soap, especially in
the presence of actors.  I think applications will want to do their own
signing/encryption.

> * This is EDI + Internet transport + XML payload + semantic Web, folks: quit reinventing wheels (the camp I occupy)

I got a bit lost in your sentence syntax.  Can you explain what you mean
here?  Tnx.

> Don't ask me what the hell this means for Python efforts...

Quoting an old colleague "with freedom comes choices, and with choices
comes more lines of code." :)
	/r$


From laurent_fontanel@globalcrossing.com  Tue May  8 17:22:06 2001
From: laurent_fontanel@globalcrossing.com (Laurent Fontanel)
Date: Tue, 08 May 2001 12:22:06 -0400
Subject: [XML-SIG] Re: sample code - msxml
References: <E14x9v6-0002Vl-00@mail.python.org>
Message-ID: <3AF81D2E.9EF58A50@globalcrossing.com>


Arne,

I've never used MSXML to process XML SAX-style, but I've used it to apply XSL stylesheets.
It's really easy with the win32com interface:

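import win32com.client

def xml2htm(xmlFile):
   # Load the source document synchronously.  "async" is a reserved word
   # in current Python, so the COM property is set through setattr().
   source = win32com.client.Dispatch("Microsoft.xmldom")
   setattr(source, "async", False)
   source.load(xmlFile)

   # Load the stylesheet the same way.
   style = win32com.client.Dispatch("Microsoft.xmldom")
   setattr(style, "async", False)
   style.load("mystylesheet.xsl")

   # transformNode returns the result of the transformation as a string.
   return source.transformNode(style)

if __name__ == '__main__':
   print(xml2htm("myfile.xml"))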

Note also that after source.load(), you can manipulate the whole document tree using DOM calls,
which is pretty neat.

Hope this helps,

Laurent.



From karl@digicool.com  Tue May  8 21:12:23 2001
From: karl@digicool.com (Karl Anderson)
Date: 08 May 2001 13:12:23 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires '' NSURI
In-Reply-To: noreply@sourceforge.net's message of "Fri, 04 May 2001 18:06:10 -0700"
References: <E14vqWQ-0003ET-00@usw-sf-web1.sourceforge.net>
Message-ID: <m1itjb61nc.fsf@localhost.localdomain>

Can anyone shed some light on which DOM implementation is right here?
After parsing an attribute with no namespace prefix, what namespace
URIs should it be possible to retrieve that attribute with?

For example, after parsing "<spam version="1.0"/>" in a namespace
aware way, which should return "1.0":

getAttributeNS(None, 'version')
getAttributeNS('', 'version')

Only the URI of '' works for Domlette.  Only the URI of None works for
ParsedXML.  I think that ParsedXML's restriction is morally better
because of this line from the DOM rec:

> Note that because the DOM does no lexical checking, the empty string
> will be treated as a real namespace URI in DOM Level 2 methods.
> Applications must use the value null as the namespaceURI parameter for
> methods if they wish to have no namespace.

OTOH, I've lost arguments when it was pointed out that you don't have
to use DOM methods when you're parsing, and in fact can't parse
everything if you're restricted to them.  OTOH again, using None would
make parsing consistent with setting namespaceless names using DOM
methods.

ParsedXML doesn't work for the XSLT modules in the current PyXML
checkout because they use '' as the NSURI to use to retrieve NSless
attributes.

Should ParsedXML allow names parsed without a NS to be retrieved
with a NSURI of '' as well as None?  Should Domlette allow None?
Should None be used in getAttributeNS calls like these, regardless?
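
For comparison, here is a minimal sketch of how one other implementation
answers the question.  It uses the standard library's xml.dom.minidom, which
is neither Domlette nor ParsedXML, so it only illustrates the behaviour of a
builder that stores unprefixed attributes with a namespace URI of None.  The
variable names are made up for the illustration.

```python
from xml.dom import minidom

# Parse the example element; minidom's default builder is namespace-aware.
doc = minidom.parseString('<spam version="1.0"/>')
spam = doc.documentElement

# None means "no namespace" in the Python DOM binding.
with_none = spam.getAttributeNS(None, 'version')

# The empty string is treated as a distinct namespace URI, so nothing matches
# and minidom falls back to returning an empty string.
with_empty = spam.getAttributeNS('', 'version')

print(repr(with_none), repr(with_empty))
```

With minidom, only the lookup that passes None finds the attribute.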

noreply@sourceforge.net writes:

> Bugs item #421553, was updated on 2001-05-04 18:06
> You can respond by visiting: 
> http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421553&group_id=6473
> 
> Category: 4Suite
> Group: None
> Status: Open
> Resolution: None
> Priority: 5
> Submitted By: Karl Anderson (karlanderson)
> Assigned to: Nobody/Anonymous (nobody)
> Summary: stylesheet node reader requires '' NSURI
> 
> Initial Comment:
> 
> I'm unable to use ParsedXML's DOM as a stylesheet node, and I think
> it's because of a bug in StylesheetReader.py.
> 
> The problem is at StylesheetReader.py line 186:
> 
>         if not sheet.getAttributeNS('', 'version'):
>             raise XsltException(Error.STYLESHEET_MISSING_VERSION)
> 
> ...where the NamespaceURI given to getAttributeNS is ''.  This is
> supposed to find the namespace-free version attribute of the
> stylesheet documentElement, such as
> """
> <xsl:stylesheet
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   version="1.0">
> """.
> 
> ParsedXML's DOM builder gives this attribute a NamespaceURI of None
> when we parse.
> 
> I don't think that you can use the DOM methods to create a node with a
> NamespaceURI of "", since the NamespaceURI is supposed to be a URI
> reference.  Is the empty string a valid URI reference?  Well, maybe -
> the DOM level 2 rec says:
> """
> Note that because the DOM does no lexical checking, the empty string
> will be treated as a real namespace URI in DOM Level 2 methods.
> Applications must use the value null as the namespaceURI parameter for
> methods if they wish to have no namespace.
> """
> But anyway, this indicates that when using DOM creation methods, a
> None should be used as the NamespaceURI for namespaceless nodes such
> as "version", and I think that the stylesheet reader should accept
> that.
> 
> 
> ----------------------------------------------------------------------
> 
> You can respond by visiting: 
> http://sourceforge.net/tracker/?func=detail&atid=106473&aid=421553&group_id=6473
> 
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig

-- 
Karl Anderson                          karl@digicool.com


From fdrake@acm.org  Tue May  8 21:15:29 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 8 May 2001 16:15:29 -0400 (EDT)
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires '' NSURI
In-Reply-To: <m1itjb61nc.fsf@localhost.localdomain>
References: <E14vqWQ-0003ET-00@usw-sf-web1.sourceforge.net>
 <m1itjb61nc.fsf@localhost.localdomain>
Message-ID: <15096.21473.817904.541038@cj42289-a.reston1.va.home.com>

Karl Anderson writes:
 > Can anyone shine some light on which DOM implementation is right here?
 > After parsing an attribute with no namespace prefix, what namespace
 > URIs should it be possible to retrieve that attribute with?
 > 
 > For example, after parsing "<spam version="1.0"/>" in a namespace
 > aware way, which should return "1.0":
 > 
 > getAttributeNS(None, 'version')
 > getAttributeNS('', 'version')

  The former is correct according to past discussions in this mailing
list.

 > Only the URI of '' works for Domlette.  Only the URI of None works for
 > ParsedXML.  I think that ParsedXML's restriction is morally better
 > because of this line from the DOM rec:

  Domlette is broke!

 > OTOH, I've lost arguments when it was pointed out that you don't have
 > to use DOM methods when you're parsing, and in fact can't parse
 > everything if you're restricted to them.  OTOH again, using None would
 > make parsing consistent with setting namespaceless names using DOM
 > methods.

  Using None would be the right thing because that's the Python DOM
binding.

 > ParsedXML doesn't work for the XSLT modules in the current PyXML
 > checkout because they use '' as the NSURI to use to retrieve NSless
 > attributes.

  That stinks!

 > Should ParsedXML allow names parsed without a NS to be retrieved
 > with a NSURI of '' as well as None?  Should Domlette allow None?
 > Should None be used in getAttributeNS calls like these, regardless?

  Only None needs to be supported as an indication of "no namespace";
"" is different.  (And probably broken.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From uche.ogbuji@fourthought.com  Tue May  8 21:53:02 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 08 May 2001 14:53:02 -0600
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires
 '' NSURI
In-Reply-To: Message from "Fred L. Drake, Jr." <fdrake@acm.org>
 of "Tue, 08 May 2001 16:15:29 EDT." <15096.21473.817904.541038@cj42289-a.reston1.va.home.com>
Message-ID: <200105082053.f48Kr2K10012@localhost.local>

> 
> Karl Anderson writes:
>  > Can anyone shine some light on which DOM implementation is right here?
>  > After parsing an attribute with no namespace prefix, what namespace
>  > URIs should it be possible to retrieve that attribute with?
>  > 
>  > For example, after parsing "<spam version="1.0"/>" in a namespace
>  > aware way, which should return "1.0":
>  > 
>  > getAttributeNS(None, 'version')
>  > getAttributeNS('', 'version')
> 
>   The former is correct according to past discussions in this mailing
> list.

Yes, and I was a proponent of the former as well, but we just haven't had a 
chance to go through XSLT and make the needed changes.  It's on our to-do 
list, but any contributed patches can make this happen more quickly.  To be 
clear: changing pDomlette and cDomlette themselves is quite easy: it's 4XPath 
and 4XSLT that will eat up the sweat equity.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From dieter@handshake.de  Tue May  8 22:21:06 2001
From: dieter@handshake.de (Dieter Maurer)
Date: Tue, 8 May 2001 23:21:06 +0200 (CEST)
Subject: [XML-SIG] 4xslt: bug and patch: variable import order
Message-ID: <15096.25410.753829.204197@lindm.dm>

--Multipart_Tue_May__8_23:21:06_2001-1
Content-Type: text/plain; charset=US-ASCII

The XSLT spec specifies that definitions and template rules
in an importing stylesheet take precedence over those from
an imported stylesheet. This is essential for easy customization
of imported stylesheets.


"4xslt" implements this feature only partially:

   Top level variables in an importing stylesheet do not
   take precedence over imported ones.


The attached patch hopefully fixes the problem.
It ensures that variables in importing style sheets
take precedence over those defined in imported style sheets
and that all style sheets use the same top level variables.
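
The precedence rule the patch aims for can be sketched with plain
dictionaries.  This is only an illustration of the merge order, not 4XSLT
code: merge_var_bindings and the sample variable names are hypothetical.

```python
def merge_var_bindings(importing, imported_list):
    """Return one dict in which the importing stylesheet's variables win."""
    merged = {}
    # Imported stylesheets go in first, so they have the lowest precedence.
    for imported in imported_list:
        merged.update(imported)
    # The importing stylesheet is applied last and overrides any duplicates.
    merged.update(importing)
    return merged

main_vars = {'title': 'Custom title'}
base_vars = {'title': 'Default title', 'lang': 'en'}
merged = merge_var_bindings(main_vars, [base_vars])
print(merged)
```

Sharing the single merged dictionary between all stylesheets is what gives
them the same view of the top level variables.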


Dieter

--Multipart_Tue_May__8_23:21:06_2001-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="var_import_order.pat"
Content-Transfer-Encoding: 7bit

--- :Stylesheet.py	Thu May  3 01:29:05 2001
+++ Stylesheet.py	Tue May  8 23:19:29 2001
@@ -398,8 +398,16 @@
         self._primedContext = context
         #Note: key expressions can't have var refs, so we needn't worry about imports
         self._updateKeys(contextNode, processor)
+        # DM: imported variables have lower precedence than that from
+        #     the main style sheet.
+        d= {}
         for imp in self._imports:
-            self._primedContext.varBindings.update(imp.stylesheet._primedContext.varBindings)
+            d.update(imp.stylesheet._primedContext.varBindings)
+        d.update(self._primedContext.varBindings)
+        self._primedContext.varBindings= d
+        # DM: all use the same set of top level variables
+        for imp in self._imports:
+            imp.stylesheet._primedContext.varBindings= d
         return topLevelParams
 

--Multipart_Tue_May__8_23:21:06_2001-1--


From stuff4gary@hotmail.com  Tue May  8 23:52:28 2001
From: stuff4gary@hotmail.com (gary cor)
Date: Tue, 08 May 2001 22:52:28
Subject: [XML-SIG] What are the limits of soap and python?
Message-ID: <F97PiyAphwFutJLtosJ0000033f@hotmail.com>

I am pretty confused by the SOAP discussion (it doesn't have any connection 
with the Opera browser, does it?).

I imagine it is like OCX, Dynamic Data Exchange, Windows Script or 
AppleScript.  Can it push buttons on the system, fill out text fields with 
text, and automate work across applications?

Can I set up hot folders with it to send pictures through Photoshop, OCR 
and databases?  Is that even possible in Python?

Many thanks for any simple explanations!

Gary
_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.


From karl@digicool.com  Wed May  9 00:10:21 2001
From: karl@digicool.com (Karl Anderson)
Date: 08 May 2001 16:10:21 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires  '' NSURI
In-Reply-To: Uche Ogbuji's message of "Tue, 08 May 2001 14:53:02 -0600"
References: <200105082053.f48Kr2K10012@localhost.local>
Message-ID: <m18zk75teq.fsf@localhost.localdomain>

Uche Ogbuji <uche.ogbuji@fourthought.com> writes:

[*NS('', ...)]

> Yes, and I was a proponent of the former as well, but we just haven't had a 
> chance to go throughout XSLT and make the needed changes.  It's on our to-do 
> list, but any contributed patches can make this happen more quickly.  To be 
> clear: changing pDomlette and cDomlette themselves is quite easy: it's 4XPath 
> and 4XSLT that will eat up the sweat equity.

Well, a quick grep-find shows that they're all in XSLT, and they're
all getAttributeNS or setAttributeNS calls with actual empty strings,
nothing fancy.
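
A search like that can also be scripted.  The sketch below is a hypothetical
helper, not part of PyXML or 4Suite: it scans source text for
get/set/has/removeAttributeNS calls whose first argument is an empty string
literal.  It assumes the calls are written on one line with a literal '' or
"", as in the StylesheetReader.py example.

```python
import re

# Matches e.g. getAttributeNS('', 'version') or setAttributeNS("", ...):
# a quote character immediately followed by the same quote, then a comma.
EMPTY_NS_CALL = re.compile(
    r"""\b(?:get|set|has|remove)AttributeNS\(\s*(['"])\1\s*,""")

def find_empty_ns_calls(source_text):
    """Return (line_number, line) pairs for calls that pass '' as the NSURI."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), 1):
        if EMPTY_NS_CALL.search(line):
            hits.append((number, line.strip()))
    return hits

sample = (
    "if not sheet.getAttributeNS('', 'version'):\n"
    "    pass\n"
    "node.setAttributeNS(None, 'name', 'x')\n"
)
hits = find_empty_ns_calls(sample)
print(hits)
```

Calls that build the URI indirectly (through a variable or a constant) would
not be caught by this pattern and would still need a manual review.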

Are there tests for 4XSLT?  My install from a PyXML checkout didn't
install any, and I'm an XSLT newbie, so my testing is pretty limited.

I could supply patches, would they be useful without real testing at
this stage of 4XSLT development?

I'd love for this to be usable with our DOM.

-- 
Karl Anderson                          karl@digicool.com


From uche.ogbuji@fourthought.com  Wed May  9 00:18:30 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 08 May 2001 17:18:30 -0600
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires
 '' NSURI
In-Reply-To: Message from Karl Anderson <karl@digicool.com>
 of "08 May 2001 16:10:21 PDT." <m18zk75teq.fsf@localhost.localdomain>
Message-ID: <200105082318.f48NIVb19293@localhost.local>

> Uche Ogbuji <uche.ogbuji@fourthought.com> writes:
> 
> [*NS('', ...)]
> 
> > Yes, and I was a proponent of the former as well, but we just haven't had a 
> > chance to go throughout XSLT and make the needed changes.  It's on our to-do 
> > list, but any contributed patches can make this happen more quickly.  To be 
> > clear: changing pDomlette and cDomlette themselves is quite easy: it's 4XPath 
> > and 4XSLT that will eat up the sweat equity.
> 
> Well, a quick grep-find shows that they're all in XSLT, and they're
> all getAttributeNS or setAttributeNS calls with actual empty strings,
> nothing fancy.
> 
> Are there tests for 4XSLT?  My install from a PyXML checkout didn't
> install any, and I'm an XSLT newbie, so my testing is pretty limited.

The test suite is in the documentation directory (e.g. 
/usr/doc/4Suite-0.11.1a0/test_suite/4XSLT on my machine)

There are 160 test scripts, many of which have multiple tests each.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From noreply@sourceforge.net  Wed May  9 03:08:53 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 08 May 2001 19:08:53 -0700
Subject: [XML-SIG] [ pyxml-Bugs-422528 ] can't import xpath
Message-ID: <E14xJPJ-0002ik-00@usw-sf-web2.sourceforge.net>

Bugs item #422528, was updated on 2001-05-08 19:08
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=422528&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: can't import xpath

Initial Comment:
Installed PyXML 0.6.5 and 4Suite checkout with
setup.py install.

Can't import xpath, or run the test suites:

>>> import xml.xpath
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python1.5/site-packages/xml/xpath/__init__.py", line 107, in ?
    import Context, XPathParser
  File "/usr/lib/python1.5/site-packages/xml/xpath/Context.py", line 16, in ?
    import CoreFunctions
  File "/usr/lib/python1.5/site-packages/xml/xpath/CoreFunctions.py", line 18, in ?
    from xml.xpath import ExpandedNameWrapper
ImportError: cannot import name ExpandedNameWrapper


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=422528&group_id=6473


From karl@digicool.com  Wed May  9 03:27:56 2001
From: karl@digicool.com (Karl Anderson)
Date: 08 May 2001 19:27:56 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires  '' NSURI
In-Reply-To: Uche Ogbuji's message of "Tue, 08 May 2001 17:18:30 -0600"
References: <200105082318.f48NIVb19293@localhost.local>
Message-ID: <m1vgnb45oz.fsf@localhost.localdomain>

Uche Ogbuji <uche.ogbuji@fourthought.com> writes:

> > Are there tests for 4XSLT?  My install from a PyXML checkout didn't
> > install any, and I'm an XSLT newbie, so my testing is pretty limited.
> 
> The test suite is in the documentation directory (e.g. 
> /usr/doc/4Suite-0.11.1a0/test_suite/4XSLT on my machine)

Oh, there's my problem, I was using a PyXML checkout.

I admit that I'm unclear about the relationship between 4Suite and
PyXML - I thought that once a module was added to PyXML, that
checking out PyXML would give you sufficiently bleeding edge code to
develop with and prod for bugs.

I'm also looking for the most vanilla version that I can tell users to
install and use with our code, when appropriate.

Once a module from 4Suite is added to PyXML, is the PyXML version a
checkout from the 4Suite CVS tree?  Or is development moved to the
PyXML tree?

Why aren't the test suites part of PyXML?  Do they rely on more of
4Suite?

-- 
Karl Anderson                          karl@digicool.com


From Mike.Olson@fourthought.com  Wed May  9 05:31:58 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Tue, 08 May 2001 22:31:58 -0600
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires  ''
 NSURI
References: <200105082053.f48Kr2K10012@localhost.local> <m18zk75teq.fsf@localhost.localdomain>
Message-ID: <3AF8C83E.35A48C05@FourThought.com>

Karl Anderson wrote:
> 
> Uche Ogbuji <uche.ogbuji@fourthought.com> writes:
> 
> [*NS('', ...)]
> 
> > Yes, and I was a proponent of the former as well, but we just haven't had a
> > chance to go throughout XSLT and make the needed changes.  It's on our to-do
> > list, but any contributed patches can make this happen more quickly.  To be
> > clear: changing pDomlette and cDomlette themselves is quite easy: it's 4XPath
> > and 4XSLT that will eat up the sweat equity.
> 
> Well, a quick grep-find shows that they're all in XSLT, and they're
> all getAttributeNS or setAttributeNS calls with actual empty strings,
> nothing fancy.

Nope.  XPath uses these in the ParsedAxisSpecified, and in a decent handful
of functions.  You would need to fix these as well.

> 
> Are there tests for 4XSLT?  My install from a PyXML checkout didn't
> install any, and I'm an XSLT newbie, so my testing is pretty limited.

You might need to install 4Suite to get the tests but I'm not sure.

> 
> --
> Karl Anderson                          karl@digicool.com
> 
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Wed May  9 07:59:16 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 9 May 2001 08:59:16 +0200
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires  '' NSURI
In-Reply-To: <m18zk75teq.fsf@localhost.localdomain> (message from Karl
 Anderson on 08 May 2001 16:10:21 -0700)
References: <200105082053.f48Kr2K10012@localhost.local> <m18zk75teq.fsf@localhost.localdomain>
Message-ID: <200105090659.f496xGN00940@mira.informatik.hu-berlin.de>

> Well, a quick grep-find shows that they're all in XSLT, and they're
> all getAttributeNS or setAttributeNS calls with actual empty strings,
> nothing fancy.
> 
> Are there tests for 4XSLT?  My install from a PyXML checkout didn't
> install any, and I'm an XSLT newbie, so my testing is pretty limited.

I think a major source of confusion is the assumption that the xpath/xslt
directories, as checked out from PyXML CVS at the moment, are good for
any purpose. This is not the case: They don't work, and we know it.

If you want to *use* 4XSLT, you should install 4Suite, and not install
the xpath/xslt directories from PyXML (indeed, unless you modify
setup.py, they won't be installed).

> I could supply patches, would they be useful without real testing at
> this stage of 4XSLT development?

That said, if you want to contribute patches to make the xpath/xslt
packages useful, they are always appreciated. Of course, since you are
new to these packages, you might first want to look at how they are
supposed to function in 4Suite before fixing them in PyXML.

As for test suites: 4Suite does include test suites for these
packages.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Wed May  9 08:08:00 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 9 May 2001 09:08:00 +0200
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires  '' NSURI
In-Reply-To: <m1vgnb45oz.fsf@localhost.localdomain> (message from Karl
 Anderson on 08 May 2001 19:27:56 -0700)
References: <200105082318.f48NIVb19293@localhost.local> <m1vgnb45oz.fsf@localhost.localdomain>
Message-ID: <200105090708.f49780n00963@mira.informatik.hu-berlin.de>

> I admit that I'm unclear about the relationship between 4Suite and
> PyXML - I thought that once a module was added to PyXML, that
> checking out PyXML would give you sufficiently bleeding edge code to
> develop with and prod for bugs.

It is absolutely bleeding edge, yes, and bug reports are welcome.

However, until PyXML is released with these packages, you should not
assume that they actually work. Indeed, one possible scenario is that
the next PyXML release does *not* include these subdirectories.

> Once a module from 4Suite is added to PyXML, is the PyXML version a
> checkout from the 4Suite CVS tree?  Or is development moved to the
> PyXML tree?

Neither. 4XSLT uses a different XPath expression parser than the
copy in PyXML; the 4XSLT one is based on BisonGen/SWIG/bison/flex; the
PyXML one (dubbed PyXPath) uses Yapps/(s)re. The port to the other
parser, as well as other DOM implementations, is not complete.

> Why aren't the test suites part of PyXML?

A number of reasons. First of all, Fourthought has not contributed
them (although they might if asked). Then, the tests do require a
4Suite installation at the moment. Finally, the tests don't pass
without modifications; I'd like to minimize the necessary changes
before incorporating tests.

Regards,
Martin


From noreply@sourceforge.net  Wed May  9 14:41:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 09 May 2001 06:41:02 -0700
Subject: [XML-SIG] [ pyxml-Patches-422641 ] NameError in RilParserImp.py
Message-ID: <E14xUD8-0002Xq-00@usw-sf-web2.sourceforge.net>

Patches item #422641, was updated on 2001-05-09 06:41
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=422641&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Nobody/Anonymous (nobody)
Summary: NameError in RilParserImp.py

Initial Comment:
The parser uses undefined constants to report errors.
The attached patch adds definition of these constants
(and uses them properly).

Cheers 

Alexandre Fayolle

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=422641&group_id=6473


From uche.ogbuji@fourthought.com  Wed May  9 15:02:52 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 09 May 2001 08:02:52 -0600
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires
 '' NSURI
In-Reply-To: Message from "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
 of "Wed, 09 May 2001 09:08:00 +0200." <200105090708.f49780n00963@mira.informatik.hu-berlin.de>
Message-ID: <200105091402.f49E2qG06632@localhost.local>

> > Why aren't the test suites part of PyXML?
> 
> A number of reasons. First of all, Fourthought has not contributed
> them (although they might if asked).

Of course.

> Then, the tests do require a 4Suite installation at the moment.

We have discussed making them use PyUnit, but this, like all other such good 
intentions, is obstructed by time limitations.

> Finally, the tests don't pass
> without modifications; I'd like to minimize the necessary changes
> before incorporating tests.

We're hammering at the test suites all the while to fix and tweak them into 
submission.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Wed May  9 15:13:42 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 09 May 2001 08:13:42 -0600
Subject: [XML-SIG] Curiouser and curiouser
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Mon, 07 May 2001 23:59:00 EDT." <3AF76F04.E38CC24D@zolera.com>
Message-ID: <200105091413.f49EDgM06672@localhost.local>

> > The camps appear to be roughly:
> 
> Interesting analysis, thanks!
> 
> > * Just use SOAP as-is and rubber-stamp WSDL and UDDI to boot ...
> > * Take the good parts of SOAP, mix in a bit of "transactions" here, a dash of PKI there, a smidgen of EAI voodoo, and...
> 
> These aren't mutually exclusive, since #2 is presumably a subset of #1.

No.  Some want to change SOAP, which is different from #1.  Also while the #1 
folks want to call it a day when WSDL and UDDI are stabilized, the #2 folk 
want much more.

> > * This is EDI + Internet transport + XML payload + semantic Web, folks: quit reinventing wheels (the camp I occupy)
> 
> I got a bit lost in your sentence syntax.  Can you explain what you mean
> here?  Tnx.

Basically:

* Take the business process, internationalization and authority-of-record work 
hammered out in EDI.
* Use Internet transport (HTTP/SMTP) rather than VAN/BBS.
* Use XML as the payload for human readability, inexpensive app integration 
and extensibility.
* Use a unified structured meta-data model for description and modeling.

I think this is the most attainable and viable approach to XML-based business 
transactions, mostly because it avoids reinventing wheels as much as possible.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From rsalz@zolera.com  Wed May  9 15:39:47 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 09 May 2001 10:39:47 -0400
Subject: [XML-SIG] Curiouser and curiouser
References: <200105091413.f49EDgM06672@localhost.local>
Message-ID: <3AF956B3.1146CC7C@zolera.com>

> * take the business process, internationalization and authority-of-record work
> hammered out in EDI.
> * Use Internet transport (HTTP/SMTP) rather than VAN/BBS, Use XML as the
> payload for human readibility, inexpensive app integration and extensibility
> * Use a unified structured meta-data model for decription and modeling.

So you must be a big fan of ebXML.

Me, too. :)


From uche.ogbuji@fourthought.com  Wed May  9 16:18:38 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Wed, 09 May 2001 09:18:38 -0600
Subject: [XML-SIG] Curiouser and curiouser
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Wed, 09 May 2001 10:39:47 EDT." <3AF956B3.1146CC7C@zolera.com>
Message-ID: <200105091518.f49FIcg07346@localhost.local>

> > * take the business process, internationalization and authority-of-record work
> > hammered out in EDI.
> > * Use Internet transport (HTTP/SMTP) rather than VAN/BBS, Use XML as the
> > payload for human readibility, inexpensive app integration and extensibility
> > * Use a unified structured meta-data model for decription and modeling.
> 
> So you must be a big fan of ebXML.
> 
> Me, too. :)

You got it.  I should clarify, rather, that I prefer the ebXML camp to the 
WSDL/UDDI camp, because the ebXML folks are standing on the shoulders of the 
EDI giants.  I have a 
*great* deal of respect for EDI in general, and I think that the main problem 
with it was the unfortunate power that the main VANs such as GEIS, Sterling 
and Harbinger acquired, which strangled innovation and evolution.

I think that the UDDI camp's insistence on reinventing it all is horrid form, 
and at the WSWS it looked to me as if the impetus behind reinventing it all 
was for each vendor to make as much of a land grab as possible on 
B2B-next-generation.  I have no problem with the profit motive, but hypocrisy 
sucks.

BTW, any luck on setting up that Web services SIG?  We're pretty close to 
off-topic in this discussion, but I'd like it to continue, especially with 
regard to coordinating Python efforts in Web services.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From rsalz@zolera.com  Wed May  9 16:44:05 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 09 May 2001 11:44:05 -0400
Subject: [XML-SIG] Curiouser and curiouser
References: <200105091518.f49FIcg07346@localhost.local>
Message-ID: <3AF965C5.73BF4B@zolera.com>

> BTW, any luck on setting up that Web services SIG?  We're pretty close to
> off-topic in this discussion, but I'd like it to continue, especially with
> regard to coordingating Python efforts in Web services.

Sending a "can we create it now" note to the meta-sig was on my todo
list.  I'll send it now.
	/r$


From Alexandre.Fayolle@logilab.fr  Wed May  9 16:56:58 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 9 May 2001 17:56:58 +0200 (CEST)
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
Message-ID: <Pine.LNX.4.21.0105091746560.12943-100000@leo.logilab.fr>

Hello,

I would appreciate it if someone could provide information about the Sax/Sax2
interface in pyxml (or provide a pointer to some documentation).

Specifically, my understanding is that the prototype of the startElement
method of the ContentHandler interface in Sax2 is supposed to take 4
arguments (nsUri, localName, qName, attributes). However, in
xml.sax.handler, ContentHandler's startElement method has the same
prototype as xml.sax.saxlib's DocumentHandler (which should be used with a
SAX 1 parser), i.e. name, attributes.

I'm trying to write a parser for a non-xml document, that should behave as
a sax parser for the external world, especially to the various DOM reader
classes available around here. Some of these seem to be expecting calls to
startElementNS (I'm thinking specifically of FT's pDomletteReader), with
a signature similar to Java's SAX2 ContentHandler.startElement method.

Any help appreciated.

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From larsga@garshol.priv.no  Wed May  9 17:23:57 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 09 May 2001 18:23:57 +0200
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <Pine.LNX.4.21.0105091746560.12943-100000@leo.logilab.fr>
References: <Pine.LNX.4.21.0105091746560.12943-100000@leo.logilab.fr>
Message-ID: <m31ypy4hk2.fsf@lambda.garshol.priv.no>

* Alexandre Fayolle
|
| Specifically, my understanding is that the prototype of the startElement
| method of the ContentHandler interface in Sax2 is supposed to take 4
| arguments (nsUri, localName, qName, attributes). 

This is not correct. In SAX 2.0 there are two startElement methods:

  startElement(name, attributes)
  startElementNS(name, qname, attributes)

In the latter, name is a (nsuri, localname) tuple.
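
A minimal sketch of the difference, using the xml.sax package from the
standard library.  The Recorder class, the record() helper and the sample
document are made up for the illustration; the parser feature and the two
callbacks are the ones described above.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

class Recorder(ContentHandler):
    """Records which start-element callback the parser used."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: name is a plain string.
        self.events.append(('startElement', name))

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: name is (nsuri, localname).
        self.events.append(('startElementNS', name))

def record(xml_bytes, namespaces):
    """Parse xml_bytes with namespace processing on or off."""
    handler = Recorder()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespaces)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.events

DOC = b'<doc xmlns="urn:example"><item/></doc>'
plain_events = record(DOC, False)
ns_events = record(DOC, True)
print(plain_events)
print(ns_events)
```

With the feature off only startElement is called; with it on, only
startElementNS is called and each name arrives as a (nsuri, localname) tuple.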

| However, in xml.sax.handler, ContentHandler's startElement method
| has the same prototype as xml.sax.saxlib's DocumentHandler (which
| should be used with a SAX 1 parser), i.e. name, attributes.
 
That is correct. This is used when the XML processor is not in
namespace mode.

| I'm trying to write a parser for a non-xml document, that should behave as
| a sax parser for the external world, especially to the various DOM reader
| classes available around here. Some of these seem to be expecting calls to
| startElementNS (I'm thinking specifically of FT's pDomletteReader), with
| a signature similar to Java's SAX2 ContentHandler.startElement method.

A good DOM builder should accept calls to both startElement and
startElementNS. It should also require applications to be consistent
and only use one or the other throughout a single document.
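
One possible reading of that rule, as a sketch only: ConsistentBuilder below
is a hypothetical handler skeleton (it builds no DOM nodes, it just records
names) that accepts either callback style but raises an error if both are
used within one document.  Whether a real builder ought to be this strict is
a design choice, not something this sketch settles.

```python
from xml.sax.handler import ContentHandler

class ConsistentBuilder(ContentHandler):
    """Accepts startElement or startElementNS, but not a mix of both."""

    def startDocument(self):
        self._mode = None   # becomes 'plain' or 'ns' at the first element
        self.names = []

    def _check_mode(self, mode):
        if self._mode is None:
            self._mode = mode
        elif self._mode != mode:
            raise ValueError('startElement and startElementNS were mixed')

    def startElement(self, name, attrs):
        self._check_mode('plain')
        # Store plain names in the same (nsuri, localname) shape.
        self.names.append((None, name))

    def startElementNS(self, name, qname, attrs):
        self._check_mode('ns')
        self.names.append(name)

builder = ConsistentBuilder()
builder.startDocument()
builder.startElementNS(('urn:example', 'doc'), 'doc', {})
try:
    builder.startElement('item', {})
    mixed_rejected = False
except ValueError:
    mixed_rejected = True
print(mixed_rejected, builder.names)
```

Resetting the mode in startDocument lets the same builder instance be reused
for a later document that uses the other callback style.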

I hope this helps.

--Lars M.


From noreply@sourceforge.net  Wed May  9 17:29:00 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 09 May 2001 09:29:00 -0700
Subject: [XML-SIG] [ pyxml-Patches-422689 ] RIL parser fixes
Message-ID: <E14xWpg-0004xU-00@usw-sf-web2.sourceforge.net>

Patches item #422689, was updated on 2001-05-09 09:28
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=422689&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Alexandre Fayolle (afayolle)
Assigned to: Nobody/Anonymous (nobody)
Summary: RIL parser fixes

Initial Comment:
The RilParser class makes some strange calls to
construct new Predicate classes, some of which do not
exist. 

Here's an attempt to fix this. Not much tested. Please
examine carefully before applying. (the diff is against
a version patched with the patch I submitted earlier
today).
Cheers Alexandre Fayolle

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=422689&group_id=6473


From fdrake@acm.org  Wed May  9 17:35:23 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 9 May 2001 12:35:23 -0400 (EDT)
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <m31ypy4hk2.fsf@lambda.garshol.priv.no>
References: <Pine.LNX.4.21.0105091746560.12943-100000@leo.logilab.fr>
 <m31ypy4hk2.fsf@lambda.garshol.priv.no>
Message-ID: <15097.29131.401159.457645@cj42289-a.reston1.va.home.com>

Lars Marius Garshol writes:
 > A good DOM builder should accept calls to both startElement and
 > startElementNS. It should also require applications to be consistent
 > and only use one or the other throughout a single document.

  This is not clear; does the DOM specification indicate that only one
or the other can be used?  I think it seems very careful to indicate
that both can be used, as long as expectations are limited.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From Alexandre.Fayolle@logilab.fr  Wed May  9 17:47:38 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 9 May 2001 18:47:38 +0200 (CEST)
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <m31ypy4hk2.fsf@lambda.garshol.priv.no>
Message-ID: <Pine.LNX.4.21.0105091842450.12943-100000@leo.logilab.fr>

On 9 May 2001, Lars Marius Garshol wrote:

> 
> * Alexandre Fayolle
> |
> | Specifically, my understanding is that the prototype of the startElement
> | method of the ContentHandler interface in Sax2 is supposed to take 4
> | arguments (nsUri, localName, qName, attributes). 
> 
> This is not correct. In SAX 2.0 there are two startElement methods:
> 
>   startElement(name, attributes)
>   startElementNS(name, qname, attributes)
> 
> In the latter, name is a (nsuri, localname) tuple.

I use http://www.megginson.com/SAX/Java/Javadoc/ as a reference. In this
documentation, the ContentHandler interface has no startElementNS method,
only startElement(java.lang.String namespaceURI, java.lang.String
localName, java.lang.String qName, Attributes atts). 

If I'm not using the right reference, could someone please give me a
pointer to the right one. Otherwise, I do not understand where the
startElementNS method comes from. 

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From fdrake@acm.org  Wed May  9 17:59:03 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 9 May 2001 12:59:03 -0400 (EDT)
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <Pine.LNX.4.21.0105091842450.12943-100000@leo.logilab.fr>
References: <m31ypy4hk2.fsf@lambda.garshol.priv.no>
 <Pine.LNX.4.21.0105091842450.12943-100000@leo.logilab.fr>
Message-ID: <15097.30551.770652.554305@cj42289-a.reston1.va.home.com>

Alexandre Fayolle writes:
 > If I'm not using the right reference, could someone please give me a
 > pointer to the right one. Otherwise, I do not understand where the
 > startElementNS method comes from. 

  Documentation for the Python SAX2 bindings is given in the Python
Library Reference:

	http://www.python.org/doc/current/lib/markup.html


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From Alexandre.Fayolle@logilab.fr  Wed May  9 18:17:18 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 9 May 2001 19:17:18 +0200 (CEST)
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <15097.30551.770652.554305@cj42289-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.21.0105091913530.12943-100000@leo.logilab.fr>

On Wed, 9 May 2001, Fred L. Drake, Jr. wrote:

>   Documentation for the Python SAX2 bindings is given in the Python
> Library Reference:
> 
> 	http://www.python.org/doc/current/lib/markup.html

Thanks. This is what I was looking for. (I'm still stuck with Python 1.5.2,
and this is not part of the Python docs I'm used to reading daily).

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From larsga@garshol.priv.no  Wed May  9 18:19:00 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 09 May 2001 19:19:00 +0200
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <15097.29131.401159.457645@cj42289-a.reston1.va.home.com>
References: <Pine.LNX.4.21.0105091746560.12943-100000@leo.logilab.fr> 	<m31ypy4hk2.fsf@lambda.garshol.priv.no> <15097.29131.401159.457645@cj42289-a.reston1.va.home.com>
Message-ID: <m3r8xy30fv.fsf@lambda.garshol.priv.no>

* Lars Marius Garshol
|
| A good DOM builder should accept calls to both startElement and
| startElementNS. It should also require applications to be consistent
| and only use one or the other throughout a single document.

* Fred L. Drake, Jr.
| 
| This is not clear; does the DOM specification indicate that only one
| or the other can be used? I think it seems very careful to indicate
| that both can be used, as long as expectations are limited.

You are right about that, but SAX makes it clear that you must
consistently use either startElement or startElementNS, so this is a
SAX 2.0 issue more than a DOM issue. I wouldn't get too upset if the
DOM doesn't check this, though.

--Lars M.


From karl@digicool.com  Wed May  9 19:23:32 2001
From: karl@digicool.com (Karl Anderson)
Date: 09 May 2001 11:23:32 -0700
Subject: [XML-SIG] [ pyxml-Bugs-421553 ] stylesheet node reader requires  '' NSURI
In-Reply-To: "Martin v. Loewis"'s message of "Wed, 9 May 2001 09:08:00 +0200"
References: <200105082318.f48NIVb19293@localhost.local> <m1vgnb45oz.fsf@localhost.localdomain> <200105090708.f49780n00963@mira.informatik.hu-berlin.de>
Message-ID: <m1r8xy4c0r.fsf@localhost.localdomain>

Martin v. Loewis <martin@loewis.home.cs.tu-berlin.de> writes:

> However, until PyXML is released with these packages, you should not
> assume that they actually work. Indeed, one possible scenario is that
> the next PyXML release does *not* include these subdirectories.

Yeah, CVS checkouts and all, I know :)  FYI, my motivation is estimating
the chances of a stable release in the near future that works with
ParsedXML.

-- 
Karl Anderson                          karl@digicool.com


From noreply@sourceforge.net  Wed May  9 23:41:27 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 09 May 2001 15:41:27 -0700
Subject: [XML-SIG] [ pyxml-Patches-422801 ] CoreFunctions misusing ExpandedName
Message-ID: <E14xce7-0005a4-00@usw-sf-web1.sourceforge.net>

Patches item #422801, was updated on 2001-05-09 15:41
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=422801&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Anderson (karlanderson)
Assigned to: Nobody/Anonymous (nobody)
Summary: CoreFunctions misusing ExpandedName

Initial Comment:
Using 4Suite cvs checkout.

xpath.CoreFunctions.py was looking at the wrong attrs
of ExpandedName.ExpandedName, causing tests to fail. 
This patch uses the attrs in ExpandedName.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=422801&group_id=6473


From fdrake@acm.org  Thu May 10 03:03:52 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 9 May 2001 22:03:52 -0400 (EDT)
Subject: [XML-SIG] clarification request about Sax/Sax2 mappings
In-Reply-To: <m3r8xy30fv.fsf@lambda.garshol.priv.no>
References: <Pine.LNX.4.21.0105091746560.12943-100000@leo.logilab.fr>
 <m31ypy4hk2.fsf@lambda.garshol.priv.no>
 <15097.29131.401159.457645@cj42289-a.reston1.va.home.com>
 <m3r8xy30fv.fsf@lambda.garshol.priv.no>
Message-ID: <15097.63240.342240.594303@cj42289-a.reston1.va.home.com>

Lars Marius Garshol writes:
 > You are right about that, but SAX makes it clear that you must
 > consistently use either startElement or startElementNS, so this is a
 > SAX 2.0 issue more than a DOM issue. I wouldn't get too upset if the
 > DOM doesn't check this, though.

  I'm not going to worry about it, either.  I think there are two
problems here:  that the Namespaces in XML specification is poorly
written and does not cover everything it should (interaction with DTDs
being a major issue in my book, though part of that may be a lack of
clarity in the text rather than the issues not having been
approached), and the conflation of NS and non-NS documents in the
DOM.
  But neither of those issues is directly related to the Python
bindings for the APIs, so I guess we've strayed a little.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From martin@loewis.home.cs.tu-berlin.de  Thu May 10 06:56:45 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 10 May 2001 07:56:45 +0200
Subject: [XML-SIG] What are the limits of soap and python?
In-Reply-To: <F97PiyAphwFutJLtosJ0000033f@hotmail.com>
 (stuff4gary@hotmail.com)
References: <F97PiyAphwFutJLtosJ0000033f@hotmail.com>
Message-ID: <200105100556.f4A5ujB01569@mira.informatik.hu-berlin.de>

> I am pretty confused by the SOAP discussion (it hasn't any connection with 
> the operas browser movement has it!!).

I don't know what the operas browser movement is, but I guess it has
no connection to SOAP, no.

> I imagine it is like OCX, Dynamic Data Exchange, Windows Script, applescript 
> !!

Not really. SOAP messages are typically received by Web servers; I
don't think anybody uses them to control applications on the same
machine.

> Can it push buttons on the system and fill out text fields, with
> text and automate through applications?

SOAP, on its own, is just a protocol for access to objects. Whether
the objects, when accessed, fill out text fields - that is up to the
objects being accessed. You cannot push buttons using SOAP.

> Can I set it up hot folders with it to send pictures through
> photoshop, OCR and databases??

I'm not sure what a hot folder is, but I guess photoshop would not
react to or emit SOAP messages; nor do I know any database system that
supports SOAP directly (although many web servers may give indirect
access to a database through SOAP).

> is that even possible in python?

Doing all these things is possible in Python, I believe - but you'd
have to do them without SOAP.

Regards,
Martin


From mike@pdc.kth.se  Thu May 10 13:25:41 2001
From: mike@pdc.kth.se (Mike Hammill)
Date: Thu, 10 May 2001 14:25:41 +0200
Subject: [XML-SIG] Help with removeChild()
Message-ID: <200105101225.f4ACPN8181353@ratatosk.pdc.kth.se>

Dear xml-sig,

I hope someone with a bit more experience can help me.  I'm trying to use
xml.dom.minidom to clean up an XML file.  In brief, how does one walk through
the DOM tree and remove certain children using recursion?  My attempt walks
the tree, but some children are skipped.  I believe this is because when
children are removed, it is not reflected in the calling code's list of
children.  Here is a simplified version of the problem.

XML file:
<slideshow>
<a>
</a>
<b>
</b>
    <c>
    </c>
    <d>
    </d>
<e>
</e>
<f></f>
</slideshow>

I would like to get rid of any element that has no attributes and whose text
content is just whitespace, tabs, or linefeeds.  I wrote a little tree walker
that reduces the above to:

<?xml version="1.0" ?>
<slideshow><a/><b/><c/><d/><e/><f/></slideshow>

So far, so good.  When I apply the following code, however, the result is:
<?xml version="1.0" ?>
<slideshow><b/><d/><f/></slideshow>

That is, only elements a, c, and e are eliminated.  The code is:

def trim_dom_more(node):
    if node.hasChildNodes():
        for child in node.childNodes:
            trim_dom_more(child)
    else:
        if node.nodeType == node.ELEMENT_NODE:
            if (not node.hasAttributes()) and (not node.hasChildNodes()):
                node.parentNode.removeChild(node)

I think I understand that the problem is that node.childNodes gets evaluated 
and put on the stack, but then after the removeChild, this stacked list is not 
re-evaluated so not all children are iterated through.  But how to solve that?

Any advice welcome!
Thanks
Mike


From tpassin@home.com  Thu May 10 13:59:19 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Thu, 10 May 2001 08:59:19 -0400
Subject: [XML-SIG] Help with removeChild()
References: <200105101225.f4ACPN8181353@ratatosk.pdc.kth.se>
Message-ID: <000a01c0d951$076f7fe0$7cac1218@reston1.va.home.com>

I think your problem is the inverse- the childNodes list ***is*** getting
updated by the DOM after each removal.

[Mike Hammill]

> That is only elements a, c, and e are eliminated.  The code is:
>
> def trim_dom_more(node):
>     if node.hasChildNodes():
>         for child in node.childNodes:
>             trim_dom_more(child)
>     else:
>         if node.nodeType == node.ELEMENT_NODE:
>             if (not node.hasAttributes()) and (not node.hasChildNodes()):
>                 node.parentNode.removeChild(node)
>
> I think I understand that the problem is that node.childNodes gets evaluated
> and put on the stack, but then after the removeChild, this stacked list is not
> re-evaluated so not all children are iterated through.  But how to solve that?

 Try this:

 def trim_dom_more(node):
     if node.hasChildNodes():
         children=node.childNodes[:]
         for child in children:
             trim_dom_more(child)

Now you are iterating through a static copy of the list.  It wouldn't work
if the child nodes could get changed by another thread, but I don't suppose
that's going to happen here.

Or you could do

while node.hasChildNodes():
    trim_dom_more(node.childNodes[0])

That would execute slower, though.  But it wouldn't get fooled by any other
activity in the DOM.
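
[Editorial sketch, not part of the original message.] A self-contained
version of the snapshot approach, written against xml.dom.minidom in a
current Python. The snapshot flag and the trimmed() helper are
illustrative additions, there only so that the original live-list
behaviour can be reproduced next to the fix.

```python
from xml.dom import minidom, Node


def trim_dom_more(node, snapshot=True):
    """Remove elements that have neither attributes nor children."""
    if node.hasChildNodes():
        # list(...) copies the live child list, so removing a child
        # cannot shift the sequence that is being iterated over.
        children = list(node.childNodes) if snapshot else node.childNodes
        for child in children:
            trim_dom_more(child, snapshot)
    elif node.nodeType == Node.ELEMENT_NODE and not node.hasAttributes():
        node.parentNode.removeChild(node)


def trimmed(xml_text, snapshot=True):
    """Parse xml_text, trim it, and return the root element as a string."""
    doc = minidom.parseString(xml_text)
    trim_dom_more(doc, snapshot)
    return doc.documentElement.toxml()


SOURCE = "<slideshow><a/><b/><c/><d/><e/><f/></slideshow>"

fixed = trimmed(SOURCE)                   # iterates over a copy
skipped = trimmed(SOURCE, snapshot=False) # iterates over the live list
```

With the copy, all six empty elements go; with the live list, every
second one survives, as in the original report. Note that the while-loop
variant above only terminates if the first child is eventually removed;
a text node or an element with attributes at index 0 would keep it
looping.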

Cheers,

Tom P


From noreply@sourceforge.net  Thu May 10 14:53:47 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 10 May 2001 06:53:47 -0700
Subject: [XML-SIG] [ pyxml-Bugs-423027 ] startElementNS bug in pDomletteReader
Message-ID: <E14xqt1-0004Kp-00@usw-sf-web2.sourceforge.net>

Bugs item #423027, was updated on 2001-05-10 06:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=423027&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Logilab (ornicar)
Assigned to: Nobody/Anonymous (nobody)
Summary: startElementNS bug in pDomletteReader

Initial Comment:
Hi,
I've been trying to make a custom Sax parser work using
the startElementNS() method... No way, this function
needs some updates, and I don't exactly know how to fix
it. In fact endElementNS() tries to pop elements from
internal stacks which have not been pushed in before,
especially namespaces...

Cheers,

       Bruno Van Frachem, Logilab.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=423027&group_id=6473


From mike@pdc.kth.se  Thu May 10 15:03:40 2001
From: mike@pdc.kth.se (Michael Hammill)
Date: Thu, 10 May 2001 16:03:40 +0200
Subject: [XML-SIG] Help with removeChild()
In-Reply-To: <000a01c0d951$076f7fe0$7cac1218@reston1.va.home.com>
References: <200105101225.f4ACPN8181353@ratatosk.pdc.kth.se>
Message-ID: <5.1.0.14.2.20010510155431.02d383d0@localhost>

Dear Thomas,

Your solution below works great!   I have discovered something else quite 
instructive (at least to me).  When I first saw your solution, I thought 
"oh, I've tried that already".  Silly of me.  What I had tried was not 
exactly the same, but seemingly close.  I had set children = 
node.childNodes *without* the final '[:]'.  In testing the solution below, 
I found that if the [:] is left out, the result is the same as I got before 
(an incorrect trimming); however, with the [:] it works fine.  I'm sorry if 
this is a newbie kind of confusion.  I had always thought "list" was 
equivalent to "list[:]", but apparently not.
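
[Editorial sketch, not part of the original message.] The difference
between a second name for a list and a slice copy, using a plain Python
list with made-up contents:

```python
items = ["a", "b", "c"]

alias = items        # a second name for the same list object
snapshot = items[:]  # a new list holding the same elements

items.remove("a")    # mutate the original list in place

same_object = alias is items  # the alias sees the removal
alias_view = list(alias)
snapshot_view = list(snapshot)  # the copy still holds all three items
```

Assigning node.childNodes to another name behaves like alias here, which
is why the removal still disturbed the loop.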

Thank you again!
Mike

[...]
>  Try this:
>
>  def trim_dom_more(node):
>      if node.hasChildNodes():
>          children=node.childNodes[:]
>          for child in children:
>              trim_dom_more(child)
>
>Now you are iterating through a static copy of the list.  It wouldn't work
>if the child nodes could get changed by another thread, but I don't suppose
>that's going to happen here.
[...]


From noreply@sourceforge.net  Thu May 10 17:44:33 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 10 May 2001 09:44:33 -0700
Subject: [XML-SIG] [ pyxml-Bugs-423086 ] xml.xpath cannot be imported
Message-ID: <E14xtYH-0004wb-00@usw-sf-web3.sourceforge.net>

Bugs item #423086, was updated on 2001-05-10 09:44
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=423086&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 9
Submitted By: Lars Marius Garshol (larsga)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.xpath cannot be imported

Initial Comment:
When importing xml.xpath, xml.xpath.Conversions gets sucked in, and that attempts to import xml.utils.boolean, which does not exist. The result is that any attempt to import xml.xpath fails. Did someone forget to commit something?

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=423086&group_id=6473


From martin@loewis.home.cs.tu-berlin.de  Thu May 10 18:08:35 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 10 May 2001 19:08:35 +0200
Subject: [XML-SIG] [ pyxml-Bugs-423086 ] xml.xpath cannot be imported
In-Reply-To: <E14xtYH-0004wb-00@usw-sf-web3.sourceforge.net>
 (noreply@sourceforge.net)
References: <E14xtYH-0004wb-00@usw-sf-web3.sourceforge.net>
Message-ID: <200105101708.f4AH8Zu01871@mira.informatik.hu-berlin.de>

> When importing xml.xpath, xml.xpath.Conversions gets sucked in, and
> that attempts to import xml.utils.boolean, which does not exist. The
> result is that any attempt to import xml.xpath fails. Did someone
> forget to commit something?

xml.utils.boolean should be compiled from extensions/boolean.c, and
installed in xml/utils. Did you perform a 'setup.py install', and it
still did not work?

Regards,
Martin


From larsga@garshol.priv.no  Thu May 10 18:20:45 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 10 May 2001 19:20:45 +0200
Subject: [XML-SIG] [ pyxml-Bugs-423086 ] xml.xpath cannot be imported
In-Reply-To: <200105101708.f4AH8Zu01871@mira.informatik.hu-berlin.de>
References: <E14xtYH-0004wb-00@usw-sf-web3.sourceforge.net> <200105101708.f4AH8Zu01871@mira.informatik.hu-berlin.de>
Message-ID: <m3bsp15dea.fsf@lambda.garshol.priv.no>

* Martin v. Loewis
| 
| xml.utils.boolean should be compiled from extensions/boolean.c, and
| installed in xml/utils. Did you perform a 'setup.py install', and it
| still did not work?

Arrrghh! No, I was so thick-headed I didn't even think of that. Sorry.
Will close the bug now.

--Lars M.


From noreply@sourceforge.net  Thu May 10 19:53:02 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 10 May 2001 11:53:02 -0700
Subject: [XML-SIG] [ pyxml-Patches-423122 ] xml.sax.writer places chardata in tags
Message-ID: <E14xvYc-0000GL-00@usw-sf-web1.sourceforge.net>

Patches item #423122, was updated on 2001-05-10 11:53
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=423122&group_id=6473

Category: sax
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Lars Marius Garshol (larsga)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: xml.sax.writer places chardata in tags

Initial Comment:
writer produces output of the form

<doccontent/>

where the element name was 'doc' and the character
data 'content'.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=423122&group_id=6473


From stuff4gary@hotmail.com  Fri May 11 00:13:44 2001
From: stuff4gary@hotmail.com (gary cor)
Date: Thu, 10 May 2001 23:13:44
Subject: [XML-SIG] XForms and SVG support in python?
Message-ID: <F16332ctFfj8ZMGLoU200002f3f@hotmail.com>

Dear All,

I am working part-time at a publishers where we do magazines in DTP packages 
- Quark 5.0, Illustrator 9.0 and Photoshop 6.0 can now export as SVG for the 
web (replacing EPS format which we currently use and has never been 
supported on the web!!).  Soon they are adding Xform fields for SVG as well!
I am wondering whether anyone could foresee any problems or opportunities 
using the FieldStorage() class from Python's cgi module to process XForms 
data, or in changing any parts of SVG on the fly, e.g. boxes in our advert 
sections?

Gary

PS  I found some good tutorials for XML etc. and programming at 
http://www.w3schools.com - I am a bit disappointed :-( they had nothing on 
python, is python obscure?

_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.


From michael.clark@ntlworld.com  Sat May 12 00:06:20 2001
From: michael.clark@ntlworld.com (michael.clark)
Date: Sat, 12 May 2001 00:06:20 +0100
Subject: [XML-SIG] FREE SMS Messaging Web Service
Message-ID: <000001c0da6e$fe533ea0$ec07ff3e@clarks>

For those who are interested in SOAP web services,
I've just located this new web service. It seems to be
the first commercial one I've come across that's
a) working & b) actually useful.

You can send SMS messages to supposedly any
mobile phone in the world, free of charge! We've tried
it and so far we've sent messages to people in USA,
UK and ASIA, pretty neat we thought!

http://www.salcentral.com/help/smsreg.htm

Mark


From greg.simmons@ntlworld.com  Sun May 13 09:53:15 2001
From: greg.simmons@ntlworld.com (Greg Simmons)
Date: Sun, 13 May 2001 09:53:15 +0100
Subject: [XML-SIG] SOAP SMS Messaging Web Services for FREE
Message-ID: <001d01c0db8a$27d1bb00$e08f69d5@clarks>

For those who are interested in SOAP web services,
I've just located this new web service. It seems to be
the first commercial one I've come across that's
a) working & b) actually useful.

You can send SMS messages to supposedly any
mobile phone in the world, free of charge! We've tried
it and so far we've sent messages to people in USA,
UK and ASIA, pretty neat we thought!

http://www.salcentral.com/help/smsreg.htm

Greg.


From uche.ogbuji@fourthought.com  Sun May 13 14:36:42 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 07:36:42 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
Message-ID: <3AFE8DEA.F92054CB@fourthought.com>

Just a note: I'm guessing you intended to x-post this to xml-sig, not
python-dev.  I've changed the headers.

"Martin v. Loewis" wrote:
> 
> Currently, 4XSLT has a dependency on the DOM implementation in terms
> of memory management (among other dependencies). I'd like to reduce
> this dependency, by providing a centralized function that knows how to
> release nodes.
> 
> In PyXML, I currently use
> 
> # Define ReleaseNode in a DOM-independent way
> import xml.dom.ext
> import xml.dom.minidom
> def _releasenode(n):
>     if isinstance(n, xml.dom.minidom.Node):
>         n.unlink()
>     else:
>         xml.dom.ext.ReleaseNode(n)
> 
> try:
>     from Ft.Lib import pDomlette
>     def ReleaseNode(n):
>         if isinstance(n, pDomlette.Node):
>             pDomlette.ReleaseNode(n)
>         else:
>             _releasenode(n)
>     _XsltElementBase = pDomlette.Element
> except ImportError:
>     ReleaseNode = _releasenode
>     from minisupport import _XsltElementBase

Wouldn't it be better to make up a Reader class for minidom which
implements a releaseNode method similar to what you have above?  The
idea behind the reader architecture is to manage such things.

There might be some places in 4XSLT that don't properly call releaseNode
on the reader instance itself, but I'd rather fix them to do so.

What's "minisupport" and "_XsltElementBase"?


> This code knows how to release minidom, 4DOM, and pDomlette nodes, and
> supports installations without 4Suite (i.e. without pDomlette). I've
> put this into xslt/__init__.py, so that all callers of
> Ft.Lib.pDomlette.ReleaseNode now need to call xml.xslt.ReleaseNode.
> If desired, I could produce a patch against the public Ft CVS.
> 
> As a slightly independent question, such a function also ought to
> support DOM implementations not known to it; I'm thinking in
> particular of the Zope DOMs. I'd like to hear proposals on how such an
> interface should work; I see three options:
> 
> a) it is an operation on the document node (or any node), as in minidom.
> b) it is an operation on the DOM implementation (almost as in 4Suite;
>    you'd need to navigate from the node to the implementation, then
>    you'd need a well-known operation on the implementation)
> c) the code assumes that no release activity is necessary for unknown
>    DOMs, effectively believing in reference counting, garbage collection,
>    acquisition, and other black art.

Maybe we need a general Reader class for unknown DOM classes.  This
would require the unification of DOM factories we were discussing a few
months ago, but the releaseNode method could just be a NOP, i.e. your
(c) option.
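
[Editorial sketch, not part of the original message.] One way to read
option (c) is a release function that uses a node's own cleanup method
when it has one and otherwise does nothing. The release_node() name and
its boolean return value are hypothetical; unlink() is the minidom
method mentioned earlier in the thread. This is not the 4XSLT or PyXML
implementation.

```python
from xml.dom import minidom


def release_node(node):
    """Release node if it knows how; otherwise do nothing (option c).

    Returns True when an unlink() method was found and called.
    """
    unlink = getattr(node, "unlink", None)
    if callable(unlink):
        unlink()  # minidom-style explicit cleanup
        return True
    # Unknown DOM: assume garbage collection takes care of it.
    return False


doc = minidom.parseString("<a><b/></a>")
released = release_node(doc)      # minidom document: unlink() is called
ignored = release_node(object())  # unknown object: treated as a no-op
```

Duck typing on the method name avoids importing every DOM
implementation, at the cost of trusting that unlink() means the same
thing everywhere.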

> Any comments appreciated, in particular
> 1. from the Ft maintainers on introducing xml.xslt.ReleaseNode, and
> 2. from authors of other DOMs on a general memory management API for
>    Python DOM.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Sun May 13 15:41:25 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 13 May 2001 16:41:25 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFE8DEA.F92054CB@fourthought.com> (message from Uche Ogbuji on
 Sun, 13 May 2001 07:36:42 -0600)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com>
Message-ID: <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>

[yes, I indeed meant to cross-post to xml-sig]

> Wouldn't it be better to make up a Reader class for minidom which
> implements a releaseNode method similar to what you have above?  The
> idea behind the reader architecture is to manage such things.

How would that work? Assume there was a reader class for minidom, and
the XSLT runtime had a node object. How can you release the node?

Or do you need to know the reader class which originally created that
node as well? That would be not so good: the node might not have been
created by a reader at all, as it might have come directly from the
DOM implementation.

> There might be some places in 4XSLT that don't properly call releaseNode
> on the reader instance itself, but I'd rather fix them to do so.

There are a number of those. Grepping for ReleaseNode in the public CVS
gives

Processor.py:            pDomlette.ReleaseNode(rtfRoot)
Processor.py:            xml.dom.ext.ReleaseNode(rtfRoot)
Processor.py:            pDomlette.ReleaseNode(self._dummyDoc)
Stylesheet.py:            pDomlette.ReleaseNode(imp.stylesheet.ownerDocument)
StylesheetReader.py:        pDomlette.ReleaseNode(inc)
StylesheetReader.py:        pDomlette.ReleaseNode(sheet.ownerDocument)
StylesheetReader.py:                pDomlette.ReleaseNode(inc)
XsltContext.py:            pDomlette.ReleaseNode(doc)
XsltContext.py:                    pDomlette.ReleaseNode(rtf)
XsltContext.py:                    xml.dom.ext.ReleaseNode(rtf)

> What's "minisupport" and "_XsltElementBase"?

minisupport is an emulation of pDomlette equivalents as used by 4XSLT,
implemented using pDomlette. There are various pieces that I found
necessary: readers, ReaderBase, and Element. The latter is there to
support pickling, and to support the __init__ signature expected from
XsltElement.

> Maybe we need a general Reader class for unknown DOM classes.  This
> would require the unification of DOM factories we were discussing a few
> months ago, but the releaseNode method could just be a NOP, i.e. your
> (c) option.

I don't recall that discussion. Your comment seems to imply a
relationship between a DOM implementation and a Reader class, which I
can't find in the 4Suite code. What am I missing?

Regards,
Martin


From uche.ogbuji@fourthought.com  Sun May 13 18:48:28 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 11:48:28 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
Message-ID: <3AFEC8EC.D6CFC2F2@fourthought.com>

I see what you mean.  I was thinking about running 4XSLT on non-domlette
source nodes.

I'm guessing you've been working on code to allow XsltElements and
result-tree fragments to use minidom, so you're talking about calls to
releaseNode that handle these things.

Well, I think the best solution to this, rather than making a universal
ReleaseNode function, is to generalize the Reader architecture into a
general factory that can read, initialize and dispose of nodes.  This
could be a Python DOM standard binding extension to DOMImplementation.
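
As a rough illustration of a factory that can both create and dispose
of nodes, here is a sketch built on minidom.  The class
MinidomNodeFactory and its method names read, create and dispose are
assumptions for illustration, not a proposed standard binding; the
minidom calls themselves are real.

```python
from xml.dom import minidom


class MinidomNodeFactory:
    """Hypothetical factory pairing creation with disposal for minidom."""

    implementation = minidom.getDOMImplementation()

    def read(self, text):
        # Build a tree from serialized XML.
        return minidom.parseString(text)

    def create(self, qualified_name):
        # Build an empty document directly, without any reader involved.
        return self.implementation.createDocument(None, qualified_name, None)

    def dispose(self, node):
        # The factory that made the node knows how to destroy it.
        node.unlink()


factory = MinidomNodeFactory()
parsed = factory.read("<root><child/></root>")
fresh = factory.create("root")
root_names = (parsed.documentElement.tagName, fresh.documentElement.tagName)
factory.dispose(parsed)
factory.dispose(fresh)
```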

The earlier conversation I alluded to is the DOMImplementationFactory
discussion.  If the DOMImplementation gets some standard add-ons, then
this can be used to determine the destruction mechanism in the general
case.

http://mail.python.org/pipermail/xml-sig/2001-February/004508.html

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Mike.Olson@fourthought.com  Sun May 13 19:19:06 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 13 May 2001 12:19:06 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com>
Message-ID: <3AFED01A.3FF0E6F9@FourThought.com>

Uche Ogbuji wrote:
> 
> Wouldn't it be better to make up a Reader class for minidom which
> implements a releaseNode method similar to what you have above?  The
> idea behind the reader architecture is to manage such things.

The thing I don't like about the reader, is that you need to pass it
around or store it in order to call the correct release.  We could get
around this by having each node store a reference to its reader when it
is created.

node.reader.releaseNode(node)
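
A minimal sketch of the back-reference idea, assuming a reader that
tags every node it builds.  TrackingReader, fromString and the reader
attribute are hypothetical names; the tree itself comes from minidom.

```python
from xml.dom import minidom


class TrackingReader:
    """Hypothetical reader that tags every node with a back-reference."""

    def fromString(self, text):
        doc = minidom.parseString(text)
        self._tag(doc)
        return doc

    def _tag(self, node):
        # Store the reader on the node so callers need not pass it around.
        node.reader = self
        for child in node.childNodes:
            self._tag(child)

    def releaseNode(self, node):
        node.unlink()


doc = TrackingReader().fromString("<a><b/></a>")
b = doc.documentElement.firstChild
reader_of_b = b.reader          # recovered from the node itself
b.reader.releaseNode(doc)       # no separate reader variable needed
```

Nodes created directly through the DOM implementation would carry no
such attribute, so callers would still need a fallback for them.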

> 
> There might be some places in 4XSLT that don't properly call releaseNode
> on the reader instance itself, but I'd rather fix them to do so.

Stylesheet nodes are the big ones (off the top of my head), because we
don't keep track of which reader the stylesheet was created with, so we
always call pDomlette.ReleaseNode.


-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Mike.Olson@fourthought.com  Sun May 13 19:27:00 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 13 May 2001 12:27:00 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
Message-ID: <3AFED1F4.C11668EF@FourThought.com>

"Martin v. Loewis" wrote:
> 
> [yes, I indeed meant to cross-post to xml-sig]
> 
> > Wouldn't it be better to make up a Reader class for minidom which
> > implements a releaseNode method similar to what you have above?  The
> > idea behind the reader architecture is to manage such things.
> 
> How would that work? Assume there was a reader class for minidom, and
> the XSLT runtime had a node object. How can you release the node?
> 
> Or do you need to know the reader class which originally created that
> node as well? That would be not so good: the node might not have been
> created by a reader at all, as it might have come directly from the
> DOM implementation.


This is why I vote for putting the releaseNode function either on the
implementation or on the node itself.  Readers are great as an abstract
way of creating a DOM (at least until we all support Level III), but
without a relationship between a node instance and its reader they
don't work very well for releasing nodes.  I also had not thought of
your point about nodes created without a reader, but it is a good one.

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Sun May 13 20:02:20 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 13 May 2001 21:02:20 +0200
Subject: [XML-SIG] Disentangling StylesheetReader from Ft.Lib
Message-ID: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de>

I've tried to update my 4XSLT port to use the 4Suite 0.11 code base,
only to discover that the StylesheetReader class is now much more
strongly connected to Ft.Lib than before, in particular to classes from
pDomletteReader, and their specific instance attributes.

I took the approach of providing alternative base classes to the ones
provided by pDomlette, but that soon became a disaster since none of
the minidom/pulldom classes bear any relationship to how the
PyExpatReader and Handler classes work.

I'd still like to pursue my attempt of integrating 4XSLT to work without
Ft.Lib, and pDomlette in particular, but I'd need some advice here.  I
feel that I am missing some grand picture in all these classes, and how
they are connected. It seems that the authors of the code lose track,
too, with code duplication all over the place.

So my question is: Is all this complexity really necessary? Would it
be possible to simplify things by breaking down processing in multiple
processing steps? It seems to me that all StylesheetReader does is to
create a DOM tree, except that it creates StylesheetElement nodes
where a normal DOM build would create Element nodes. If this is really
all it does, I could propose some dramatic code reduction.

Any proposals are welcome.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Sun May 13 20:04:39 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 13 May 2001 21:04:39 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFEC8EC.D6CFC2F2@fourthought.com> (message from Uche Ogbuji on
 Sun, 13 May 2001 11:48:28 -0600)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com>
Message-ID: <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de>

> Well, I think the best solution to this, rather than making a universal
> ReleaseNode function, is to generalize the Reader architecture into a
> general factory that can read, initialize and dispose of nodes.  This
> could be a Python DOM standard binding extension to DOMImplementation.

That is a solution that I could easily accept; it would take some time
until all relevant implementations support the method, though, and
we'd need a name for it.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Sun May 13 20:12:39 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 13 May 2001 21:12:39 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFED01A.3FF0E6F9@FourThought.com> (message from Mike Olson on
 Sun, 13 May 2001 12:19:06 -0600)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <3AFED01A.3FF0E6F9@FourThought.com>
Message-ID: <200105131912.f4DJCdj14251@mira.informatik.hu-berlin.de>

> The thing I don't like about the reader, is that you need to pass it
> around or store it in order to call the correct release.  We could get
> around this by having each node store a reference to its reader when it
> is created.

With regard to the reader, I'd also like to point you to the level 3
load-store interfaces,

http://www.w3.org/TR/2001/WD-DOM-Level-3-CMLS-20010419/load-save.html

where they have a DOMBuilder interface. So while your Reader interface
is fine as Ft-provided API, I think the DOMBuilder interface has a
higher chance of getting accepted widely.

Regards,
Martin


From uche.ogbuji@fourthought.com  Sun May 13 20:31:18 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 13:31:18 -0600
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de>
Message-ID: <3AFEE106.4C99F9FD@fourthought.com>

"Martin v. Loewis" wrote:
> 
> I've tried to update my 4XSLT port to use the 4Suite 0.11 code base,
> only to discover that the StylesheetReader class is now much more
> strongly connected to Ft.Lib than before, in particular to classes from
> pDomletteReader, and their specific instance attributes.

This is to provide shared code, which, oddly enough, you advocate
below.  Some of the routines could indeed be moved into a generic
handler that goes into xml.utils.

> I took the approach of providing alternative base classes to the ones
> provided by pDomlette, but that soon became a disaster since none of
> the minidom/pulldom classes bear any relationship to how the
> PyExpatReader and Handler classes work.

This could all be helped by using mix-in classes in xml.utils.  Note
that I mean *real* mix-in classes, that is, classes that provide
implementation but not interface (a disturbing chunk of the Python
community seems to think that mixing in is just plain old inheritance).
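
To make the distinction concrete, here is a small sketch of a mix-in
that supplies shared prefix-to-URI bookkeeping without defining any
reader interface of its own.  All class and method names are
hypothetical and do not come from Ft.Lib or PyXML.

```python
class NamespaceMapMixin:
    """Hypothetical mix-in: supplies behaviour, defines no reader interface."""

    def init_namespaces(self):
        self._ns_stack = [{"xml": "http://www.w3.org/XML/1998/namespace"}]

    def push_namespaces(self, declared):
        # Each element opens a new scope that inherits the enclosing one.
        scope = dict(self._ns_stack[-1])
        scope.update(declared)
        self._ns_stack.append(scope)

    def pop_namespaces(self):
        self._ns_stack.pop()

    def resolve_prefix(self, prefix):
        return self._ns_stack[-1].get(prefix)


class ExpatStyleReader(NamespaceMapMixin):
    def __init__(self):
        self.init_namespaces()


class SaxStyleReader(NamespaceMapMixin):
    def __init__(self):
        self.init_namespaces()


reader = ExpatStyleReader()
reader.push_namespaces({"xsl": "http://www.w3.org/1999/XSL/Transform"})
inside = reader.resolve_prefix("xsl")
reader.pop_namespaces()
outside = reader.resolve_prefix("xsl")
```

Both reader classes reuse the same implementation, yet neither is
substitutable for the other through the mix-in alone.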

> I'd still like to pursue my attempt of integrating 4XSLT to work without
> Ft.Lib, and pDomlette in particular, but I'd need some advice here.  I
> feel that I am missing some grand picture in all these classes, and how
> they are connected. It seems that the authors of the code lose track,
> too, with code duplication all over the place.

Of course: the code is not all polished, but I must note that what you
complained about in your first paragraph was actually a step that
eliminated a *great* deal of duplicated code from StylesheetReader.

The solution is to move the common code somewhere accessible from PyXML.

> So my question is: Is all this complexity really necessary? Would it
> be possible to simplify things by breaking down processing in multiple
> processing steps? It seems to me that all StylesheetReader does is to
> create a DOM tree, except that it creates StylesheetElement nodes
> where a normal DOM build would create Element nodes.

Wow.  I'd count this a huge oversimplification.  The Stylesheet reader
does a great deal that most readers needn't worry about, as I'd think
would be obvious from a glance at the code.

> If this is really
> all it does, I could propose some dramatic code reduction.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Sun May 13 20:33:26 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 13:33:26 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com> <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de>
Message-ID: <3AFEE186.99AFB1EF@fourthought.com>

"Martin v. Loewis" wrote:
> 
> > Well, I think the best solution to this, rather than making a universal
> > ReleaseNode function, is to generalize the Reader architecture into a
> > general factory that can read, initialize and dispose of nodes.  This
> > could be a Python DOM standard binding extension to DOMImplementation.
> 
> That is a solution that I could easily accept; it would take some time
> until all relevant implementations support the method, though, and
> we'd need a name for it.

I'd favor cleanUp().

And I'm not worried that implementations would need to catch up.  The
desire for 4XSLT interop will accelerate this work.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Sun May 13 20:34:08 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 13:34:08 -0600
Subject: [XML-SIG] [Fwd: [4suite] ReleaseNode interface in 4XSLT]
Message-ID: <3AFEE1B0.5461D63D@fourthought.com>

This is a multi-part message in MIME format.
--------------28F27A9C4716E5DCE6516942
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python
--------------28F27A9C4716E5DCE6516942
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Message-Id: <200105131908.f4DJ8lh14249@mira.informatik.hu-berlin.de>
From: "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
To: Mike.Olson@fourthought.com
CC: 4suite@fourthought.com, python-dev@python.org
In-reply-to: <3AFECF52.FF7E9B26@FourThought.com> (message from Mike Olson on
	Sun, 13 May 2001 12:15:46 -0600)
Subject: Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFECF52.FF7E9B26@FourThought.com>
Date: Sun, 13 May 2001 21:08:47 +0200

> What if we put these on the implementation, that or came up with a
> standard interface on the node.  Then, every DOM imp that wants to be
> compatible with xpath/xslt needs to support this interface?
> 
> 
> node.ownerDocument.implementation.releaseNode(node)
> 
> or
> 
> node.py_unlink()

releaseNode sounds good to me; it is unlikely that W3C would give an
operation that name but a different meaning. Any objections?

Regards,
Martin

--------------28F27A9C4716E5DCE6516942--


From uche.ogbuji@fourthought.com  Sun May 13 20:36:40 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 13:36:40 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <3AFED01A.3FF0E6F9@FourThought.com> <200105131912.f4DJCdj14251@mira.informatik.hu-berlin.de>
Message-ID: <3AFEE248.CA8C2BC4@fourthought.com>

"Martin v. Loewis" wrote:
> 
> > The thing I don't like about the reader, is that you need to pass it
> > around or store it in order to call the correct release.  We could get
> > around this by having each node store a reference to its reader when it
> > is created.
> 
> With regard to the reader, I'd also like to point you to the level 3
> load-store interfaces,
> 
> http://www.w3.org/TR/2001/WD-DOM-Level-3-CMLS-20010419/load-save.html
> 
> where they have a DOMBuilder interface. So while your Reader interface
> is fine as Ft-provided API, I think the DOMBuilder interface has a
> higher chance of getting accepted widely.

I'm quite familiar with DOM Level 3, but the Reader architecture
predates this, and there is no immediate prospect of time to move to the
Level 3 interfaces.  Perhaps in a month or two.  Of course, this could
be accelerated by contributions.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From martin@loewis.home.cs.tu-berlin.de  Sun May 13 21:17:22 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 13 May 2001 22:17:22 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFEE186.99AFB1EF@fourthought.com> (message from Uche Ogbuji on
 Sun, 13 May 2001 13:33:26 -0600)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com> <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de> <3AFEE186.99AFB1EF@fourthought.com>
Message-ID: <200105132017.f4DKHMC20333@mira.informatik.hu-berlin.de>

> I'd favor cleanUp().

On the node, or on the DOM implementation?

Martin


From rsalz@zolera.com  Mon May 14 01:39:48 2001
From: rsalz@zolera.com (Rich Salz)
Date: Sun, 13 May 2001 20:39:48 -0400
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <3AFED01A.3FF0E6F9@FourThought.com>
Message-ID: <3AFF2954.32ACAD38@zolera.com>

> The thing I don't like about the reader, is that you need to pass it
> around or store it in order to call the correct release.  We could get
> around this by having each node store a reference to its reader when it
> is created.

I'm in favor of this for exactly this reason.

Since Python doesn't allow tilde in method names ~Node is out, so I'd go
along with releaseNode() as suggested elsewhere. :)
	/r$


From Mike.Olson@fourthought.com  Mon May 14 02:05:48 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 13 May 2001 19:05:48 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com> <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de> <3AFEE186.99AFB1EF@fourthought.com> <200105132017.f4DKHMC20333@mira.informatik.hu-berlin.de>
Message-ID: <3AFF2F6C.B1350B6D@FourThought.com>

"Martin v. Loewis" wrote:
> 
> > I'd favor cleanUp().
> 
> On the node, or on the DOM implementation?

I'm in favor of putting it on the node.  It would be a lot easier to
access.  If it were on the implementation, you would need more logic to
release an arbitrary node, as only the document has the implementation
reference (and documents don't have an owner document).
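
The extra logic can be sketched as follows, using minidom as the
example DOM.  The helper name implementation_of is hypothetical;
ownerDocument, nodeType and the document's implementation attribute
are standard DOM properties that minidom provides.

```python
from xml.dom import minidom


def implementation_of(node):
    """Return the DOMImplementation behind any node (hypothetical helper)."""
    # A Document has no ownerDocument, so it must be treated specially.
    doc = node if node.nodeType == node.DOCUMENT_NODE else node.ownerDocument
    return doc.implementation


doc = minidom.parseString("<a><b/></a>")
elem = doc.documentElement.firstChild
same_impl = implementation_of(elem) is implementation_of(doc)
doc_owner = doc.ownerDocument   # None in minidom
doc.unlink()
```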

Mike


> 
> Martin

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Mike.Olson@fourthought.com  Mon May 14 02:14:17 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 13 May 2001 19:14:17 -0600
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de>
Message-ID: <3AFF3169.29F2B6C8@FourThought.com>

"Martin v. Loewis" wrote:
> 
> I've tried to update my 4XSLT port to use the 4Suite 0.11 code base,
> only to discover that the StylesheetReader class is now much more
> strongly connected to Ft.Lib than before, in particular to classes from
> pDomletteReader, and their specific instance attributes.

I was just in there as well and was quite surprised at how complex the
code has become.  I thought of doing some work on it but figured, it
ain't broke.....

My thought was that the implementation should be able to handle it;
then there would be one reader.  All of the code in the Stylesheet
Reader would be handled in StylesheetDocument.createElement, or at
least the majority of it.  I haven't looked too closely to see if this
is 100% feasible, though.

> 
> I took the approach of providing alternative base classes to the ones
> provided by pDomlette, but that soon became a disaster since none of
> the minidom/pulldom classes bear any relationship to how the
> PyExpatReader and Handler classes work.

Is pDomlette the only import from Ft.Lib?  If so, why not move pDomlette
into xml.utils?  Better yet, let's merge pDomlette and minidom so there
is only one domlette.  pDomlette has greatly outgrown its original
purpose, so I have no problem with moving it into XML-Sig.

> 
> I'd still like to pursue my attempt of integrating 4XSLT to work without
> Ft.Lib, and pDomlette in particular, but I'd need some advice here.  I
> feel that I am missing some grand picture in all these classes, and how
> they are connected. It seems that the authors of the code lose track,
> too, with code duplication all over the place.

I agree.  There was a lot of redundant code when I last looked into it.
I think there should be one xml-sig "reader" that works off a
DOMImplementation to create actual instances.  One thing to note is
that this would slow things down.  One big speed increase pDomlette
gives us by having its own reader is that it can create elements
directly and does not have to use the createElementNS interface.  The
problem with the interface is that we have to do a
"prefix + ':' + localName" just to satisfy the interface (and then the
function itself does a string.split(qname, ':')).  Not really a
time-consuming process, but when you call it 10000 times it adds up.
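
The round trip can be sketched like this.  The two functions are
stand-ins written for illustration, not the pDomlette or PyXML code,
and they return plain tuples instead of element objects.

```python
def create_element_ns(namespace_uri, qname):
    """Stand-in for a createElementNS that must split the qualified name."""
    prefix, _, local_name = qname.rpartition(":")
    return (namespace_uri, prefix or None, local_name)


def create_direct(namespace_uri, prefix, local_name):
    """Stand-in for a reader that builds the element directly."""
    return (namespace_uri, prefix, local_name)


XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
prefix, local_name = "xsl", "template"   # as delivered by the parser

# Via the DOM interface: join, then split again inside the callee.
via_interface = create_element_ns(XSLT_NS, prefix + ":" + local_name)
# Direct construction: the parts are used as they are.
direct = create_direct(XSLT_NS, prefix, local_name)
```

Both paths yield the same element data; the interface path simply
pays for one string concatenation and one split per element.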


> 
> So my question is: Is all this complexity really necessary? Would it
> be possible to simplify things by breaking down processing in multiple
> processing steps? It seems to me that all StylesheetReader does is to
> create a DOM tree, except that it creates StylesheetElement nodes
> where a normal DOM build would create Element nodes. If this is really
> all it does, I could propose some dramatic code reduction.

It also does validation, processing of include and import elements,
namespace aliasing, extension element processing, and more.

Though like I said, I think this could be handled in a createElementNS
of a StylesheetDocument class.
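
A minimal sketch of that idea on top of minidom, assuming the document
class alone decides which element class to instantiate.  The classes
StylesheetDocument and StylesheetElement below are hypothetical
stand-ins, not the 4XSLT classes, and they omit validation,
include/import handling and the other work listed above.

```python
from xml.dom import minidom

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"


class StylesheetElement(minidom.Element):
    """Hypothetical element class for nodes in the XSLT namespace."""


class StylesheetDocument(minidom.Document):
    """Hypothetical document whose factory method picks the node class."""

    def createElementNS(self, namespaceURI, qualifiedName):
        if namespaceURI != XSLT_NS:
            # Everything else is built as an ordinary Element.
            return super().createElementNS(namespaceURI, qualifiedName)
        prefix = qualifiedName.rpartition(":")[0] or None
        element = StylesheetElement(qualifiedName, namespaceURI, prefix)
        element.ownerDocument = self
        return element


doc = StylesheetDocument()
template = doc.createElementNS(XSLT_NS, "xsl:template")
literal = doc.createElementNS(None, "p")
kinds = (type(template).__name__, type(literal).__name__)
```

A single generic reader could then build stylesheets by calling
createElementNS on such a document, at the cost of the qualified-name
round trip discussed earlier.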


Mike

> 
> Any proposals are welcome.
> 
> Regards,
> Martin
> 

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Mike.Olson@fourthought.com  Mon May 14 02:20:48 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Sun, 13 May 2001 19:20:48 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <3AFED01A.3FF0E6F9@FourThought.com> <200105131912.f4DJCdj14251@mira.informatik.hu-berlin.de>
Message-ID: <3AFF32F0.8AAAED0C@FourThought.com>

"Martin v. Loewis" wrote:
> 
> > The thing I don't like about the reader, is that you need to pass it
> > around or store it in order to call the correct release.  We could get
> > around this by having each node store a reference to its reader when it
> > is created.
> 
> With regard to the reader, I'd also like to point you to the level 3
> load-store interfaces,
> 
> http://www.w3.org/TR/2001/WD-DOM-Level-3-CMLS-20010419/load-save.html
> 
> where they have a DOMBuilder interface. So while your Reader interface
> is fine as Ft-provided API, I think the DOMBuilder interface has a
> higher chance of getting accepted widely.

Agreed, same with xml.dom.ext.Print.  In fact, all of the stuff in
xml.dom.ext was originally put there as "stuff the W3C will add
eventually", mainly the reader and printer interfaces.  Back when it
was only Level I, there were functions in the ext directory to get a
node's namespace URI, prefix, and local name.  We moved to Level II and
those were no longer needed.  I think the same should happen with the
printers and readers.

However, are we ready to move to Level III?  Is Level III ready to be
moved to?  I don't think anyone here (at FT) will have much time to
work on it for a month or two.  We are really trying to get 1.0 out.
4Suite has been in beta for 3 years as of June 1 :)

This isn't to say that someone else can't do it, and we'll help when
and where we can.

Mike

> 
> Regards,
> Martin

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Mon May 14 02:57:53 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 19:57:53 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com> <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de> <3AFEE186.99AFB1EF@fourthought.com> <200105132017.f4DKHMC20333@mira.informatik.hu-berlin.de>
Message-ID: <3AFF3BA1.DB51A55A@fourthought.com>

"Martin v. Loewis" wrote:
> 
> > I'd favor cleanUp().
> 
> On the node, or on the DOM implementation?

DOMImplementation.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Mon May 14 02:59:44 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 19:59:44 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com> <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de> <3AFEE186.99AFB1EF@fourthought.com> <200105132017.f4DKHMC20333@mira.informatik.hu-berlin.de> <3AFF2F6C.B1350B6D@FourThought.com>
Message-ID: <3AFF3C10.32E79FDA@fourthought.com>

Mike Olson wrote:
> 
> "Martin v. Loewis" wrote:
> >
> > > I'd favor cleanUp().
> >
> > On the node, or on the DOM implementation?
> 
> I'm in favor of putting it on the node.  It would be a lot easier to
> access.  If it were on the implementation, you would need more logic to
> release an arbitrary node, as only the document has the implementation
> reference (and documents don't have an owner document).

Fine with me.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Mon May 14 03:10:09 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 20:10:09 -0600
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com>
Message-ID: <3AFF3E81.7473BD6C@fourthought.com>

Mike Olson wrote:
> 
> "Martin v. Loewis" wrote:
> >
> > I've tried to update my 4XSLT port to use the 4Suite 0.11 code base,
> > only to discover that the StylesheetReader class is now much more
> > strongly connected to Ft.Lib than before, in particular to classes from
> > pDomletteReader, and their specific instance attributes.
> 
> I was just in there as well and quite surprised how complex the code has
> become.  I thought of doing some work on it but figured, it ain't
> broke.....

This is a false impression.  The code is actually quite a bit simpler
than it was before.  In the past, we had the code for mapping prefixes
to NSUris repeated in pDomlette/PyExpat, pDomlette/SAX and
StylesheetReader.  Now it's in a single place.  There are many other
places where code is now shared where before it was duplicated.

It certainly needs a lot of polish still: the main problem is that all
the reader systems have evolved separately, and mix-in based
implementation merging is probably the best solution.

> My thoughts were that the implementation should be able to handle it,
> then there would be one reader.  All of the code in the Stylesheet Reader
> would be handled in StylesheetDocument.createElement, or at least the
> majority of it.  I haven't looked too closely to see if this is 100%
> feasible, though.

I don't favor this.  I think tight coupling with the parse mechanism is
important for efficiency.  It would be better to have a separate
fall-back Stylesheet Reader that did things through the DOM interface
only (although I'm not sure what this would buy us since the same amount
of work would then need to be done in the DOM implementation).

> > I took the approach of providing alternative base classes to the ones
> > provided by pDomlette, but that soon became a disaster since none of
> > the minidom/pulldom classes bear any relationship to how the
> > PyExpatReader and Handler classes work.
> 
> Is pDomlette the only import from Ft.Lib?  If so, why not move pDomlette
> into xml.utils?  Better yet, let's merge pDomlette and minidom so there
> is only one domlette.  pDomlette has greatly outgrown its original
> purpose so I have no problem with moving it into XML-Sig.

I disagree with the idea of merging pDomlette and minidom, but I have no
problem moving pDomlette to xml.utils.

> > I'd still like to pursue my attempt of integrating 4XSLT to work without
> > Ft.Lib, and pDomlette in particular, but I'd need some advice here.  I
> > feel that I miss some grand picture in all these classes, and how they
> > are connected. It seems that the authors of the code lose track, too,
> > with code duplication all over the place.
> 
> I agree.  There was a lot of redundant code when I looked into it last.
> I think there should be one xml-sig "reader" that works off a
> DOMImplementation to create actual instances.

Disagree.  See above.  Things can be parameterized more using DOMImp,
but not at the parser interface level.

> Some things to note are that this would slow things down.  One big speed
> increase that pDomlette gives us by having its own reader is that it can
> create elements directly and not have to use the createElementNS
> interface.  The problem with the interface is that we have to do a
> "prefix + ':' + localName" just to satisfy the interface (and then the
> function itself does a string.split(qname, ':')).  Not really a
> time-consuming process, but when you call it 10000 times it adds up.

There's more to it than just this.  There is a lot about the DOM factory
interfaces that is very inefficient.
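
The round trip Mike describes is easy to see with any DOM.  A small
sketch follows, using xml.dom.minidom as a stand-in (this is not the
pDomlette code, and the variable names are only illustrative): the
caller joins prefix and local name only so that the factory can take
them apart again.

```python
from xml.dom import minidom

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

doc = minidom.Document()

# A namespace-aware parser already has the two parts separately...
prefix, local_name = "xsl", "template"

# ...but the DOM factory interface wants one qualified name,
qname = prefix + ":" + local_name
elem = doc.createElementNS(XSL_NS, qname)

# which the implementation then splits back into its parts.
parts = (elem.prefix, elem.localName)
```

A reader that constructs element objects directly can skip both the
concatenation and the split, which is the saving under discussion.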


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uogbuji@fourthought.com  Mon May 14 04:31:14 2001
From: uogbuji@fourthought.com (Uche Ogbuji)
Date: Sun, 13 May 2001 21:31:14 -0600
Subject: [XML-SIG] [Python-Dev] Re: [4suite] ReleaseNode interface in 4XSLT (fwd)
Message-ID: <200105140331.f4E3VEt12406@localhost.local>

------- Forwarded Message

Message-ID: <3AFF2E8B.31B9ED97@FourThought.com>
From: Mike Olson <Mike.Olson@fourthought.com>
Organization: FourThought, Inc
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.17-14 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
CC: 4suite@fourthought.com, python-dev@python.org
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFECF52.FF7E9B26@FourThought.com> <200105131908.f4DJ8lh14249@mira.informatik.hu-berlin.de>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: [Python-Dev] Re: [4suite] ReleaseNode interface in 4XSLT
Date: Sun, 13 May 2001 19:02:03 -0600

"Martin v. Loewis" wrote:
> 
> > What if we put these on the implementation, that or came up with a
> > standard interface on the node.  Then, every DOM imp that wants to be
> > compatible with xpath/xslt needs to support this interface?
> >
> >
> > node.ownerDocument.implementation.releaseNode(node)
> >
> > or
> >
> > node.py_unlink()
> 
> releaseNode sounds good to me; it is unlikely that W3C would give an
> operation that name but a different meaning. Any objections?


Should we standardize all of the python xml extensions with a py
prefix?  pyReleaseNode or py_releaseNode?  Then we will never have to
worry about a name clash.

Mike
> 
> Regards,
> Martin

- -- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python



------- End of Forwarded Message


From martin@loewis.home.cs.tu-berlin.de  Mon May 14 06:42:58 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 14 May 2001 07:42:58 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFF32F0.8AAAED0C@FourThought.com> (message from Mike Olson on
 Sun, 13 May 2001 19:20:48 -0600)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <3AFED01A.3FF0E6F9@FourThought.com> <200105131912.f4DJCdj14251@mira.informatik.hu-berlin.de> <3AFF32F0.8AAAED0C@FourThought.com>
Message-ID: <200105140542.f4E5gwX01307@mira.informatik.hu-berlin.de>

> However, are we ready to move to level III?  Is level III ready to be
> moved to?

No, and no. I would not actively change or drop existing code until
DOM Level 3 is almost finished (proposed recommendation, or some
such). It's just a thing to take into consideration when designing new
code.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon May 14 06:39:34 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 14 May 2001 07:39:34 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFF2F6C.B1350B6D@FourThought.com> (message from Mike Olson on
 Sun, 13 May 2001 19:05:48 -0600)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFE8DEA.F92054CB@fourthought.com> <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de> <3AFEC8EC.D6CFC2F2@fourthought.com> <200105131904.f4DJ4d114214@mira.informatik.hu-berlin.de> <3AFEE186.99AFB1EF@fourthought.com> <200105132017.f4DKHMC20333@mira.informatik.hu-berlin.de> <3AFF2F6C.B1350B6D@FourThought.com>
Message-ID: <200105140539.f4E5dYx01305@mira.informatik.hu-berlin.de>

> > 
> > > I'd favor cleanUp().
> > 
> > On the node, or on the DOM implementation?
> 
> I'm in favor of putting it on the node.  It would be a lot easier to
> access.  If it were on the implementation, you would need more logic to
> release an arbitrary node, as only the document has the implementation
> reference (and documents don't have an owner document).

In that case, I'd prefer unlink, since this is what is already
documented for minidom.
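
For reference, a short sketch of what the documented minidom call does
to a small tree (the sample document is made up for the illustration):

```python
from xml.dom import minidom

doc = minidom.parseString("<root><item>text</item></root>")
root = doc.documentElement
item = root.firstChild

# unlink() walks the tree and clears the parent, sibling and child
# links, so the nodes no longer reference each other.
doc.unlink()

still_linked = (
    doc.documentElement is not None
    or root.parentNode is not None
    or item.parentNode is not None
)
```

After the call the tree is no longer usable as a document, so it should
only be applied to trees that nobody else still needs.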

Regards,
Martin


From uche.ogbuji@fourthought.com  Mon May 14 08:06:00 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Mon, 14 May 2001 01:06:00 -0600
Subject: [XML-SIG] [Fwd: [4suite] ReleaseNode interface in 4XSLT]
Message-ID: <3AFF83D8.AFF83E34@fourthought.com>

This is a multi-part message in MIME format.
--------------8D5F80E05CA0787A819F3271
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python
--------------8D5F80E05CA0787A819F3271
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Message-Id: <200105140538.f4E5cOb01301@mira.informatik.hu-berlin.de>
From: "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
To: Mike.Olson@fourthought.com
CC: 4suite@fourthought.com, python-dev@python.org
In-reply-to: <3AFF2E8B.31B9ED97@FourThought.com> (message from Mike Olson on
	Sun, 13 May 2001 19:02:03 -0600)
Subject: Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de> <3AFECF52.FF7E9B26@FourThought.com> <200105131908.f4DJ8lh14249@mira.informatik.hu-berlin.de> <3AFF2E8B.31B9ED97@FourThought.com>
User-Agent: REMI/1.14.2 (=?ISO-8859-4?Q?Hokuhoku-=D2shima?=) Chao/1.14.1
 (=?ISO-8859-4?Q?Rokujiz=F2?=) APEL/10.2 Emacs/20.7 (i386-suse-linux)
 MULE/4.0 (HANANOEN)
MIME-Version: 1.0 (generated by REMI 1.14.2 - =?ISO-8859-4?Q?=22Hokuhoku-=D2?=
 =?ISO-8859-4?Q?shima=22?=)
Content-Type: text/plain; charset=US-ASCII
Date: Mon, 14 May 2001 07:38:24 +0200

> Should we standardize all of the python xml extensions with a py
> prefix?  pyReleaseNode or py_releaseNode?  Then we will never have to
> worry about a name clash.

IMO, no. The entire interface together is the Python DOM mapping. In
the unlikely event of a name clash, we could still decide to rename
the DOM function, or find some other magic (e.g. overloading on the
argument count).

Regards,
Martin


--------------8D5F80E05CA0787A819F3271--


From martin@loewis.home.cs.tu-berlin.de  Mon May 14 08:26:46 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 14 May 2001 09:26:46 +0200
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: <3AFEE106.4C99F9FD@fourthought.com> (message from Uche Ogbuji on
 Sun, 13 May 2001 13:31:18 -0600)
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFEE106.4C99F9FD@fourthought.com>
Message-ID: <200105140726.f4E7QkI01878@mira.informatik.hu-berlin.de>

>> It seems to me that all StylesheetReader does is to
>> create a DOM tree, except that it creates StylesheetElement nodes
>> where a normal DOM build would create Element nodes.

> Wow.  I'd count this a huge oversimplification.  The Stylesheet reader
> does a great deal that most readers needn't worry about, as I'd think
> would be obvious from a glance at the code.

I'd like to discuss specific aspects, then. Looking at the current
public CVS, I see:

fromStream: duplicates ReaderMixin.fromStream, then adds call to
            sheet.setup(), and some error handling

initParser: duplicates PyExpatReader.initParser. It uses
            Utf8OnlyHandler sometimes, but I could not find that class.

_completeTextNode: creates LiteralText instead of Text nodes. Also does
            not deal with top_node, but I'm not sure whether this is on
            purpose

_initializeSheet: has no equivalent elsewhere
_handleExtUris:   has no equivalent elsewhere

processingInstruction: Does *not* create PI nodes
comment:               Likewise

startElement: great similarities with Handler.startElement. The significant
              differences seem to be:
              - creates element nodes based on g_mappings[nsuri][localname],
                extension tables, or creates LiteralElement
              - processes xsl:include somehow (?)
              - passes attributes through _handleExtUris for xsl:stylesheet

endElement: great overlap with Handler.endElement; I could not tell
            whether differences are on purpose or by mistake

characters: does not deal with _includeDepth and force8Bit (again, this
            might be by mistake)

Did I miss aspects of the functionality relevant to proper operation
of the StylesheetReader?

So all in all, it still seems to me that the essential difference is
what nodes are created; the control logic and parsing data structures
seem to be duplicates of the code found in the handler.

That, in turn, suggests that using a standard DOM builder with a
different DOM implementation would achieve the same effect.
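
To make the suggestion concrete, here is a rough sketch of a generic
builder whose only stylesheet-specific part is a lookup table.  It is
written against the standard SAX2 interfaces rather than the 4Suite
classes; TreeBuilder, LiteralElement, TemplateElement, MAPPINGS and
build are all hypothetical names, with MAPPINGS only in the spirit of
g_mappings[nsuri][localname].

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


class LiteralElement:
    """Fallback node type for elements without a dedicated class."""

    def __init__(self, nsuri, local):
        self.nsuri, self.local, self.children = nsuri, local, []


class TemplateElement(LiteralElement):
    """Stand-in for a class implementing an XSLT instruction."""


# Hypothetical table mapping (namespace URI, local name) to node classes.
MAPPINGS = {XSL_NS: {"template": TemplateElement}}


class TreeBuilder(ContentHandler):
    """Generic control logic; the node classes come from the table."""

    def __init__(self, mappings):
        super().__init__()
        self.mappings = mappings
        self.root = None
        self.stack = []

    def startElementNS(self, name, qname, attrs):
        nsuri, local = name
        cls = self.mappings.get(nsuri, {}).get(local, LiteralElement)
        node = cls(nsuri, local)
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def endElementNS(self, name, qname):
        self.stack.pop()


def build(xml_bytes, mappings=MAPPINGS):
    """Parse xml_bytes and return the root of the resulting tree."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    builder = TreeBuilder(mappings)
    parser.setContentHandler(builder)
    parser.parse(io.BytesIO(xml_bytes))
    return builder.root
```

Whether such a table could also carry the include handling and
extension URI processing listed above is exactly the open question.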

Regards,
Martin


From fdrake@acm.org  Mon May 14 15:08:44 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 14 May 2001 10:08:44 -0400 (EDT)
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFEE248.CA8C2BC4@fourthought.com>
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <3AFED01A.3FF0E6F9@FourThought.com>
 <200105131912.f4DJCdj14251@mira.informatik.hu-berlin.de>
 <3AFEE248.CA8C2BC4@fourthought.com>
Message-ID: <15103.59116.344325.572131@cj42289-a.reston1.va.home.com>

Uche Ogbuji writes:
 > I'm quite familiar with DOM Level 3, but the Reader architecture
 > predates this, and there is no immediate prospect of time to move to the
 > Level 3 interfaces.  Perhaps in a month or two.  Of course, this could
 > be accelerated by contributions.

  Parsed XML is already starting to support the Level 3 interfaces,
most interestingly, the Load portion of the Load/Save "feature".  (I
just haven't had time to spend on the Save portion.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From fdrake@acm.org  Mon May 14 19:11:01 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 14 May 2001 14:11:01 -0400 (EDT)
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFF32F0.8AAAED0C@FourThought.com>
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <3AFED01A.3FF0E6F9@FourThought.com>
 <200105131912.f4DJCdj14251@mira.informatik.hu-berlin.de>
 <3AFF32F0.8AAAED0C@FourThought.com>
Message-ID: <15104.8117.108130.195638@cj42289-a.reston1.va.home.com>

Mike Olson writes:
 > However, are we ready to move to level III?  Is level III ready to be
 > moved to?

  I agree with Martin on this:  it's not ready.  The "Load"
specification is pretty reasonable, but it's still fairly preliminary
as well.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From fdrake@acm.org  Mon May 14 19:24:17 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 14 May 2001 14:24:17 -0400 (EDT)
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3AFED1F4.C11668EF@FourThought.com>
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
 <3AFED1F4.C11668EF@FourThought.com>
Message-ID: <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>

Mike Olson writes:
 > This is why I vote for either the implementation has the releaseNode
 > function, or the node itself.

  Putting such a method on the node makes the most sense, if the
method makes sense at all.  This allows different classes within an
implementation to do the right thing without the dispatching overhead,
and makes the most sense for implementations which can be subclassed.
  I am a little concerned about the method, however, because I see two
different possibilities.  One is the "I don't need you anymore; don't
bother me" option (equivalent to DECREF), and the other is "Break all
your internal links and die", equivalent to the minidom .unlink()
method.  From the discussion so far, I'm getting the sense that the
latter is what is being discussed, and this is not always
appropriate.  To build DOM trees to use with the XPath/XSLT engines,
would I need to provide an empty .releaseNode(), since the DOM trees
are persistent and have lifetimes far beyond the individual use for
them with a specific transformation?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From jmurray@agyinc.com  Mon May 14 19:22:05 2001
From: jmurray@agyinc.com (Joe Murray)
Date: Mon, 14 May 2001 11:22:05 -0700
Subject: [XML-SIG] building XML docs using ?
Message-ID: <3B00224D.AFB2057D@agyinc.com>

Dear All,

I am converting many large "legacy" text files to XML.  Some of the
original text files are upwards of 100 MB.  What is the most efficient,
using the speed/memory metrics, way to convert these text files to XML?

Currently, I parse through the text files and create a DOM Document
representation.  However, the time and memory expenditure for conversion
is huge, using either xml.dom.minidom or xml.dom.  Here's an example of
what I do:

----------

# import stuff
from xml.dom.minidom import Document

# create doc and documentElement node
doc = Document()
docelement = doc.appendChild(...)
f = open(...)
..
while 1:
    
    # get data from file
    line = f.readline()
    if not line:
        break
    line = line.strip()
    data = line.split(...)
    
    # create a new element node using data from file
    node = doc.createElement(...)
    node.setAttribute(...)
    node.appendChild(...)
    docelement.appendChild(node)
...

----------

Should I forgo the ease of using the DOM objects and simply output
"hand-generated" markup?  I was doing this previously; it's
efficient, but definitely not as nice/clean as it could be...

So basically, is there a lightweight XML module which provides for (as a
graphics programmer would say) "immediate mode" output, with as nice an
interface as the DOM modules?  Oh, and BTW, can XML solve all my
problems???  ;-) 

Thanks much,

joe

-- 
Joseph Murray
Bioinformatics Specialist, AGY Therapeutics
290 Utah Avenue, South San Francisco, CA 94080
(650) 228-1146


From fdrake@acm.org  Mon May 14 20:23:57 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 14 May 2001 15:23:57 -0400 (EDT)
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <3B00224D.AFB2057D@agyinc.com>
References: <3B00224D.AFB2057D@agyinc.com>
Message-ID: <15104.12493.360699.521399@cj42289-a.reston1.va.home.com>

Joe Murray writes:
 > Currently, I parse through the text files and create a DOM Document
 > representation.  However, the time and memory expenditure for conversion
 > is huge, using either xml.dom.minidom or xml.dom.  Here's an example of
 > what I do:

  Instead of building a DOM tree, send events to a SAX output
generator.  This avoids keeping your entire document in memory.  The
xml.sax.writer module provides this, and there may be others.  (Be
sure to get the xml.sax.writer from CVS though; I just fixed a really
stupid bug...)

 > ----------
 > 
 > # import stuff
 > from xml.dom.minidom import Document
 > 
 > # create doc and documentElement node
 > doc = Document()
 > docelement = doc.appendChild(...)
 > f = open(...)
 > ..
 > while 1:
 >     
 >     # get data from file
 >     line = f.readline()
 >     if not line:
 >         break
 >     line = line.strip()
 >     data = line.split(...)
 >     
 >     # create a new element node using data from file
 >     node = doc.createElement(...)
 >     node.setAttribute(...)
 >     node.appendChild(...)
 >     docelement.appendChild(node)

  This would end up looking more like:

        writer = xml.sax.writer.XmlWriter(f)
        while 1:
            # get data from file
            ...

            # write new element to output:
            writer.startElement("item", {"attr": value})
            writer.characters(data)
            writer.endElement("item")
            writer.characters("\n")  # record separator, unless you're
                                     # using the PrettyPrinter version

        f.close()

 > So basically, is there a lightweight XML module which provides for (as a
 > graphics programmer would say) "immediate mode" output, with as nice an
 > interface as the DOM modules?  Oh, and BTW, can XML solve all my
 > problems???  ;-) 

  XML is an acronym, and as everyone knows, acronyms solve problems.
All of them.  So, yes, life will be perfect with your new-found TLA.  ;)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From martin@loewis.home.cs.tu-berlin.de  Mon May 14 21:19:42 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 14 May 2001 22:19:42 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <3B00224D.AFB2057D@agyinc.com> (message from Joe Murray on Mon,
 14 May 2001 11:22:05 -0700)
References: <3B00224D.AFB2057D@agyinc.com>
Message-ID: <200105142019.f4EKJgR05670@mira.informatik.hu-berlin.de>

> I am converting many large "legacy" text files to XML.  Some of the
> original text files are upwards of 100 MB.  What is the most efficient,
> using the speed/memory metrics, way to convert these text files to XML?

The less markup, the less the memory overhead, and the faster the
processing. So if you have a plain text file with contents XXX, the
most efficient XML document you could get (from the viewpoint of
parsing speed) is

<plaintext>
XXX
</plaintext>

Provided there is no markup in XXX, this is also the smallest XML
document storing all bytes of XXX :-)

> Currently, I parse through the text files and create a DOM Document
> representation.

Ah, so you are apparently bound by some DTD. In that case, it very
much depends on how complex the transformation is.

>     node = doc.createElement(...)
>     node.setAttribute(...)
>     node.appendChild(...)
>     docelement.appendChild(node)

So you create one element per line, in a single pass over the file?
That is quite a simple conversion procedure.

> Should I forgo the ease of using the DOM objects and simply output
> "hand-generated" markup?  

Yes, definitely.

> I was doing this previously, it's efficient, but definitely not as
> nice/clean as it could be...

Why is that? If you create the right template for a single line, e.g.

template = '<elem attr1="%d" attr2="%s">%s</elem>'

then a simple print statement would suffice to fill out this template.
This also makes for a nice separation of structure and content.
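
One caveat with hand-generated markup is escaping.  A minimal sketch,
assuming the attribute and text values may contain markup characters;
the render helper and the sample values are made up, while escape and
quoteattr come from xml.sax.saxutils.

```python
from xml.sax.saxutils import escape, quoteattr

# quoteattr() supplies the surrounding quotes itself, so the template
# leaves them out.
TEMPLATE = "<elem attr1=%s attr2=%s>%s</elem>"


def render(number, name, text):
    """Fill the template, escaping attribute values and character data."""
    return TEMPLATE % (quoteattr(str(number)), quoteattr(name), escape(text))


line = render(1, "a&b", "x < y")
```

Without the escaping step, a stray "<" or "&" in the input would produce
a document that is not well-formed.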

> So basically, is there a lightweight XML module which provides for (as a
> graphics programmer would say) "immediate mode" output, with as nice an
> interface as the DOM modules?  

You could use the SAX interfaces, essentially implementing a Reader
class, and using an xml.sax.saxutils.XMLGenerator as the content
handler.  Then, you'd do proper startElement and endElement calls; the
XMLGenerator will do immediate output.

> Oh, and BTW, can XML solve all my problems???  ;-)

Almost. To get rich quick, you still need to write chain letters :-)

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon May 14 21:21:31 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 14 May 2001 22:21:31 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <15104.12493.360699.521399@cj42289-a.reston1.va.home.com>
 (fdrake@acm.org)
References: <3B00224D.AFB2057D@agyinc.com> <15104.12493.360699.521399@cj42289-a.reston1.va.home.com>
Message-ID: <200105142021.f4EKLVb05674@mira.informatik.hu-berlin.de>

>   This would end up looking more like:
> 
>         writer = xml.sax.writer.XmlWriter(f)

That's a SAX1 class, right? The SAX2 class is
xml.sax.saxutils.XMLGenerator.
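
A minimal sketch of the same loop with the SAX2 class.  The input format
(one tab-separated key/value pair per line), the element names and the
convert function are assumptions made for the illustration, not taken
from Joe's files.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def convert(lines, out):
    """Stream one <item> element per input line to the file-like `out`."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("records", AttributesImpl({}))
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, value = line.split("\t", 1)
        gen.characters("\n")
        gen.startElement("item", AttributesImpl({"id": key}))
        gen.characters(value)      # escaped by the generator
        gen.endElement("item")
    gen.characters("\n")
    gen.endElement("records")
    gen.endDocument()


buf = io.StringIO()
convert(["a\tx < y\n", "b\tz\n"], buf)
result = buf.getvalue()
```

Nothing is kept in memory beyond the current line, so the cost no longer
grows with the size of the input file.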

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Mon May 14 21:09:50 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 14 May 2001 22:09:50 +0200
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>
 (fdrake@acm.org)
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
 <3AFED1F4.C11668EF@FourThought.com> <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>
Message-ID: <200105142009.f4EK9oP05647@mira.informatik.hu-berlin.de>

>   Putting such a method on the node makes the most sense, if the
> method makes sense at all.  This allows different classes within an
> implementation to do the right thing without the dispatching overhead,
> and makes the most sense for implementations which can be subclassed.

I agree. Making it a non-method is a suggestion you might get from a
C++ programmer; the C++ equivalent, "delete this;", is bad style since
you might run a method of the object that is being destroyed. Of
course, in Python, this is not a problem.


>   I am a little concerned about the method, however, because I see two
> different possibilities.  One is the "I don't need you anymore; don't
> bother me" option (equivalent to DECREF), and the other is "Break all
> your internal links and die", equivalent to the minidom .unlink()
> method.

I can't understand the value of the first option. If you don't need an
Element or a document anymore which somebody else might be holding
onto, you can just drop it, right?

> From the discussion so far, I'm getting the sense that the
> latter is what is being discussed, and this is not always
> appropriate.  To build DOM trees to use with the XPath/XSLT engines,
> would I need to provide an empty .releaseNode(), since the DOM trees
> are persistent and have lifetimes far beyond the individual use for
> them with a specific transformation?

Not necessarily. Currently, 4XSLT uses ReleaseNode e.g. to release a
style sheet, in a data flow:
- read the style sheet using the StylesheetReader from an XML document
  (i.e. a byte stream)
- process the style sheet
- release it

Another application is with result tree fragments: when instantiating
an element, nodes get cloned over and over, and temporary results need
to be released.

There may be also cases where 4XSLT releases elements it did not
create; I'd consider that a bug.

I don't think we should introduce explicit reference counters for
documents or some such; we should strive for less memory management,
not more.

Regards,
Martin


From fdrake@acm.org  Mon May 14 21:26:24 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 14 May 2001 16:26:24 -0400 (EDT)
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <200105142021.f4EKLVb05674@mira.informatik.hu-berlin.de>
References: <3B00224D.AFB2057D@agyinc.com>
 <15104.12493.360699.521399@cj42289-a.reston1.va.home.com>
 <200105142021.f4EKLVb05674@mira.informatik.hu-berlin.de>
Message-ID: <15104.16240.140204.456352@cj42289-a.reston1.va.home.com>

Martin v. Loewis writes:
 > That's a SAX1 class, right? The SAX2 class is
 > xml.sax.saxutils.XMLGenerator.

  That's right.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From fdrake@acm.org  Mon May 14 21:37:09 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 14 May 2001 16:37:09 -0400 (EDT)
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <200105142009.f4EK9oP05647@mira.informatik.hu-berlin.de>
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
 <3AFED1F4.C11668EF@FourThought.com>
 <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>
 <200105142009.f4EK9oP05647@mira.informatik.hu-berlin.de>
Message-ID: <15104.16885.755115.164847@cj42289-a.reston1.va.home.com>

Martin v. Loewis writes:
 > I can't understand the value of the first option. If you don't need an
 > Element or a document anymore which somebody else might be holding
 > onto, you can just drop it, right?

  You can't do that in minidom without requiring cyclic GC, and that's
not available for all projects thanks to users of legacy Python
versions.  I'm really learning to dislike Python 1.5.2.  ;-(
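
To make the cycle problem concrete, here is a minimal sketch using the
standard-library xml.dom.minidom under a modern Python 3 (an
assumption; the thread concerns much older releases).  Parent and
child nodes refer to each other, so without a cycle collector dropping
the document does not free it; unlink() breaks those links explicitly.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><child/></root>")
root = doc.documentElement
child = root.firstChild

# Each node points at its parent and each parent at its children:
# that mutual reference is the cycle that plain reference counting
# cannot reclaim.
cycle_before = child.parentNode is root and root.childNodes[0] is child

# unlink() walks the tree and clears those references.
doc.unlink()
parent_after = child.parentNode
children_after = len(root.childNodes)

print(cycle_before, parent_after, children_after)
```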

 > Not necessarily. Currently, 4XSLT uses ReleaseNode e.g. to release a
 > style sheet, in a data flow:
 > - read the style sheet using the StylesheetReader from an XML document
 >   (i.e. a byte stream)
 > - process the style sheet
 > - release it
 > 
 > Another application is with result tree fragments: when instantiating
 > an element, nodes get cloned over and over, and temporary results need
 > to be released.

  OK, this makes sense.  As long as it only releases nodes that it
creates and does not use as part of the result, that's fine.  As long
as I can create a stylesheet and store it as a persistent object,
create and store a bunch of documents, and then process them over &
over without damaging them, and make the results persistent and usable
in the same fashion, I'm happy.  ;-)

 > There may also be cases where 4XSLT releases elements it did not
 > create; I'd consider that a bug.

  Agreed!

 > I don't think we should introduce explicit reference counters for
 > documents or some such; we should strive for less memory management,
 > not more.

  Agreed as well.  If we can rely on GC, then I'm all for it.  I just
wanted to be sure that we were clear on the semantics of
.releaseNode(), since it has a large potential for disaster.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From larsga@garshol.priv.no  Mon May 14 22:57:13 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 14 May 2001 23:57:13 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <3B00224D.AFB2057D@agyinc.com>
References: <3B00224D.AFB2057D@agyinc.com>
Message-ID: <m37kzjsifa.fsf@lambda.garshol.priv.no>

* Joe Murray
| 
| So basically, is there a lightweight XML module which provides for
| (as a graphics programmer would say) "immediate mode" output, with
| as nice an interface as the DOM modules?

As Martin says SAX has the advantage that it does not store the entire
document in memory and so can be used to write applications that
operate with a fixed amount of memory (more or less). Unless your
document structure is too complex I would go for this.

minidom also has mechanisms that can be used to build only parts of
the tree at a time and throw them away afterwards. This may or may not
work for your processing. These mechanisms are not documented, either,
so it may be tricky to get them to work.
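
One mechanism of this kind in today's standard library is
xml.dom.pulldom; that this is the mechanism meant above is an
assumption, since the message does not name it.  The sketch below
expands only the elements of interest into small minidom subtrees and
lets each one go before moving on.  The function name iter_elements
and the sample element names are hypothetical.

```python
from xml.dom import pulldom

def iter_elements(xml_text, tag):
    """Yield the serialized form of each <tag> element, one at a time."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Build the subtree for this element only; the rest of the
            # document is never held as a tree.
            events.expandNode(node)
            yield node.toxml()
            # The subtree becomes garbage once the caller drops it.

for fragment in iter_elements("<r><item>a</item><skip/><item>b</item></r>",
                              "item"):
    print(fragment)
```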

Pyxie also has support for building partial trees and discarding them
as you go. As an additional benefit it has an API that, IMHO, is far
nicer than the DOM API. It's unlikely to be very fast, though.

| Oh, and BTW, can XML solve all my problems???  ;-)

I'm afraid not. You'll need topic maps for that... :-)

--Lars M.


From tpassin@home.com  Mon May 14 23:38:20 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Mon, 14 May 2001 18:38:20 -0400
Subject: [XML-SIG] building XML docs using ?
References: <3B00224D.AFB2057D@agyinc.com> <m37kzjsifa.fsf@lambda.garshol.priv.no>
Message-ID: <003a01c0dcc6$94210080$7cac1218@reston1.va.home.com>

[Lars Marius Garshol]

> 
> * Joe Murray
> ...
> > Oh, and BTW, can XML solve all my problems???  ;-)
> 
> I'm afraid not. You'll need topic maps for that... :-)
> 
Hey, the man needs speed here .... :-)

Tom P


From Mike.Olson@fourthought.com  Tue May 15 05:44:51 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Mon, 14 May 2001 22:44:51 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
 <3AFED1F4.C11668EF@FourThought.com> <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>
Message-ID: <3B00B443.18219553@FourThought.com>

"Fred L. Drake, Jr." wrote:
> 
> Mike Olson writes:
>  > This is why I vote for either the implementation has the releaseNode
>  > function, or the node itself.
> 
>   I am a little concerned about the method, however, because I see two
> different possibilities.  One is the "I don't need you anymore; don't
> bother me" option (equivalent to DECREF), and the other is "Break all
> your internal links and die", equivalent to the minidom .unlink()
> method.  From the discussion so far, I'm getting the sense that the
> latter is what is being discussed, and this is not always
> appropriate.  To build DOM trees to use with the XPath/XSLT engines,
> would I need to provide an empty .releaseNode(), since the DOM trees
> are persistent and have lifetimes far beyond the individual use for
> them with a specific transformation?

It depends on the interface into the XSLT/XPath engine.  The way
4XSLT/4XPath is set up, if you pass us a DOM node to process, we won't
touch it.  It is your DOM node, your job to release it.  However, if you
call appendStylesheetUri (as an example) we create a DOM node, and we
will release it when processing is done.  Currently, you can call
"setDocumentReader" on the 4XSLT processor to use anything that conforms
to the Reader interface when fromUri, fromString, fromStream are
called.  We then call the corresponding releaseNode on the document
reader to free the DOM tree when we are done with it.
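
The ownership rule can be sketched roughly as follows.  This is not
the 4XSLT API: SketchProcessor and its method names are hypothetical,
minidom stands in for the DOM implementation, and unlink() stands in
for releaseNode.  The only point is that nodes the caller passes in
are borrowed, while nodes the processor parses itself are released
when processing ends.

```python
from xml.dom import minidom

class SketchProcessor:
    """Hypothetical stand-in for a processor; not the 4XSLT interface."""

    def __init__(self):
        self._sheets = []  # list of (document, owned_by_us) pairs

    def append_stylesheet_node(self, document):
        # The caller built this tree, so the caller releases it.
        self._sheets.append((document, False))

    def append_stylesheet_string(self, text):
        # We build this tree, so we release it after processing.
        self._sheets.append((minidom.parseString(text), True))

    def run(self):
        try:
            # Placeholder for real processing: report the root names.
            return [doc.documentElement.tagName for doc, _ in self._sheets]
        finally:
            for doc, owned in self._sheets:
                if owned:
                    doc.unlink()
            self._sheets = []

caller_doc = minidom.parseString("<a><b/></a>")
processor = SketchProcessor()
processor.append_stylesheet_node(caller_doc)
processor.append_stylesheet_string("<s/>")
names = processor.run()
print(names, caller_doc.documentElement.firstChild.tagName)
```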

So, I guess I still see plenty of cases where "unlink" makes sense. 
When would you want to use the DECREF equiv.?

Mike


> 
>   -Fred
> 
> --
> Fred L. Drake, Jr.  <fdrake at acm.org>
> PythonLabs at Digital Creations

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From larsga@garshol.priv.no  Tue May 15 08:17:10 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 15 May 2001 09:17:10 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <003a01c0dcc6$94210080$7cac1218@reston1.va.home.com>
References: <3B00224D.AFB2057D@agyinc.com> <m37kzjsifa.fsf@lambda.garshol.priv.no> <003a01c0dcc6$94210080$7cac1218@reston1.va.home.com>
Message-ID: <m34run3wuh.fsf@lambda.garshol.priv.no>

* Lars Marius Garshol
|
| I'm afraid not. You'll need topic maps for that... :-)

* Thomas B. Passin
|
| Hey, the man needs speed here .... :-)

SMOO. :-)

--Lars M.


From rsalz@zolera.com  Tue May 15 15:02:21 2001
From: rsalz@zolera.com (Rich Salz)
Date: Tue, 15 May 2001 10:02:21 -0400
Subject: [XML-SIG] Parsing namespace attributes (e.g., xml.dom.ext.GetAllNs)
Message-ID: <3B0136ED.EC1EE700@zolera.com>

According to my reading of the namespace spec, "xmlns" is not a
namespace identifier, but is instead just lexically significant.  Yet
xml.dom (cf Document.py and ext/__init__.py) treats it as if it were a
namespace, and uses it to find namespace nodes.  Is that just an
implementation technique?

Where is the "xmlns" defined in a W3 recommendation?  For example, in
dom/__init__.py:
	XMLNS_NAMESPACE = "http://www.w3.org/2000/xmlns/"
I can't find that value in W3C docs -- what am I missing?

I'm asking for a couple of reasons.  First, I might be missing something
on the specs.  Second, I need to add this to xml/ns.py if it's really
there, and third, it seems that if I'm right, then there's a
(minor/obscure) bug.
	<tns:foo xmlns:tns="uri:zolera.com" xmlns="uri.zolera.com"
	xmlns:foo="http://www.w3.org/2000/xmlns/">
		<bar foo:tns="uri:example.com">
			<tns:testit>value</tns:testit>
		</bar>
	</tns:foo>

What namespace is "testit" really in?  I believe uri:zolera.com
	/r$


From fdrake@acm.org  Tue May 15 15:06:01 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 15 May 2001 10:06:01 -0400 (EDT)
Subject: [XML-SIG] Parsing namespace attributes (e.g., xml.dom.ext.GetAllNs)
In-Reply-To: <3B0136ED.EC1EE700@zolera.com>
References: <3B0136ED.EC1EE700@zolera.com>
Message-ID: <15105.14281.196876.100997@cj42289-a.reston1.va.home.com>

Rich Salz writes:
 > Where is the "xmlns" defined in a W3 recommendation?  For example, in
 > dom/__init__.py:
 > 	XMLNS_NAMESPACE = "http://www.w3.org/2000/xmlns/"
 > I can't find that value in W3C docs -- what am I missing?

  AFAICR, this is noted in the DOM Level 2 specification, with a note
that it was an oversight in the Namespaces in XML recommendation that
the W3C intends to correct in some future version.  I haven't checked
the errata for the Namespaces recommendation, however.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From rsalz@zolera.com  Tue May 15 15:35:34 2001
From: rsalz@zolera.com (Rich Salz)
Date: Tue, 15 May 2001 10:35:34 -0400
Subject: [XML-SIG] Parsing namespace attributes (e.g., xml.dom.ext.GetAllNs)
References: <3B0136ED.EC1EE700@zolera.com> <15105.14281.196876.100997@cj42289-a.reston1.va.home.com>
Message-ID: <3B013EB6.A9EE8032@zolera.com>

>   AFAICR, this is noted in the DOM Level 2 specification

Aha, found it.

"Note: In the DOM, all namespace declaration attributes are by
definition bound to the namespace URI: "http://www.w3.org/2000/xmlns/".
These are the attributes whose namespace prefix or qualified name is
"xmlns". Although, at the time of writing, this is not part of the XML
Namespaces specification [Namespaces], it is planned to be incorporated
in a future revision."
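
A small check of how this looks from Python, assuming the
standard-library xml.dom.minidom of a modern Python 3 (not the PyXML
code discussed in this thread).  The document is a cut-down variant of
the earlier example; it leaves out the attempt to bind a prefix to the
xmlns URI itself.

```python
from xml.dom import XMLNS_NAMESPACE, minidom

SOURCE = ('<tns:foo xmlns:tns="uri:zolera.com">'
          '<tns:testit>value</tns:testit></tns:foo>')

doc = minidom.parseString(SOURCE)
root = doc.documentElement

# The declaration is exposed as an attribute node whose namespace URI
# is the one quoted in the DOM Level 2 note.
declaration = root.getAttributeNode("xmlns:tns")
testit = root.firstChild

print(declaration.namespaceURI)
print(testit.namespaceURI, testit.localName)
```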

I won't hold my breath waiting for a revision of the XML Namespace spec,
which seems pretty clear that xmlns is lexical, so I'd anticipate a
fight. :)

Thanks.
	/r$


From fdrake@acm.org  Tue May 15 16:40:26 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 15 May 2001 11:40:26 -0400 (EDT)
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: <3B00B443.18219553@FourThought.com>
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
 <3AFED1F4.C11668EF@FourThought.com>
 <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>
 <3B00B443.18219553@FourThought.com>
Message-ID: <15105.19946.344121.580203@cj42289-a.reston1.va.home.com>

Mike Olson writes:
 > It depends on the interface into the XSLT/XPath engine.  The way
 > 4XSLT/4XPath is set up, if you pass us a DOM node to process, we won't
 > touch it.  It is your DOM node, your job to release it.  However, if you
 > call appendStylesheetUri (as an example) we create a DOM node, and we
 > will release it when processing is done.  Currently, you can call
 > "setDocumentReader" on the 4XSLT processor to use anything that conforms
 > to the Reader interface when fromUri, fromString, fromStream are
 > called.  We then call the corresponding releaseNode on the document
 > reader to free the DOM tree when we are done with it.

  This sounds pretty reasonable to me.

 > So, I guess I still see plenty of cases where "unlink" makes sense. 
 > When would you want to use the DECREF equiv.?

  If you're using something that isn't GC friendly, such as minidom,
you need explicit incref/decref machinery to be able to discard the
document when it is no longer being used.  This is less of an issue
with the cycle detector introduced in "modern" Python releases, but is
still a real problem with Python 1.5.2.  And there are still a fair
number of users of the older version, for a variety of reasons.
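
A minimal sketch of what such incref/decref machinery could look like,
assuming minidom as the DOM and a wrapper class (SharedDocument, a
hypothetical name) that counts users and only unlinks the tree when
the last one lets go.  It is an illustration of the idea, not a
proposal for the actual interface.

```python
from xml.dom import minidom

class SharedDocument:
    """Hypothetical user-counting wrapper around a minidom document."""

    def __init__(self, document):
        self.document = document
        self._users = 1  # the creator holds the first reference

    def acquire(self):
        if self.document is None:
            raise ValueError("document has already been destroyed")
        self._users += 1
        return self

    def release(self):
        if self._users <= 0:
            raise ValueError("released more often than acquired")
        self._users -= 1
        if self._users == 0:
            # Last user is gone: break the cycles so that plain
            # reference counting can reclaim the nodes.
            self.document.unlink()
            self.document = None

shared = SharedDocument(minidom.parseString("<a><b/></a>"))
root = shared.document.documentElement
shared.acquire()                      # a second user appears
shared.release()                      # one user done; tree still intact
intact_after_first = root.parentNode is not None
shared.release()                      # last user done; tree is unlinked
print(intact_after_first, root.parentNode, shared.document)
```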


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From Mike.Olson@fourthought.com  Tue May 15 16:48:22 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Tue, 15 May 2001 09:48:22 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
References: <200105131208.f4DC82o11349@mira.informatik.hu-berlin.de>
 <3AFE8DEA.F92054CB@fourthought.com>
 <200105131441.f4DEfPe12921@mira.informatik.hu-berlin.de>
 <3AFED1F4.C11668EF@FourThought.com>
 <15104.8913.239603.628509@cj42289-a.reston1.va.home.com>
 <3B00B443.18219553@FourThought.com> <15105.19946.344121.580203@cj42289-a.reston1.va.home.com>
Message-ID: <3B014FC6.FCE8CE4F@FourThought.com>

"Fred L. Drake, Jr." wrote:
> 
> 
>   If you're using something that isn't GC friendly, such as minidom,
> you need explicit incref/decref machinery to be able to discard the
> document when it is no longer being used.  This is less of an issue
> with the cycle detector introduced in "modern" Python releases, but is
> still a real problem with Python 1.5.2.  And there are still a fair
> number of users of the older version, for a variety of reasons.


So you're saying a smarter unlink: either flag that I am no longer using
this document, or completely destroy it if I was the last external
reference to the document.  I think I see what you're saying.

Mike


> 
>   -Fred
> 
> --
> Fred L. Drake, Jr.  <fdrake at acm.org>
> PythonLabs at Digital Creations

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From noreply@sourceforge.net  Tue May 15 17:24:17 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Tue, 15 May 2001 09:24:17 -0700
Subject: [XML-SIG] [ pyxml-Bugs-424260 ] error importing Xhtml2HtmlPrinter
Message-ID: <E14zhcP-0001Ak-00@usw-sf-web2.sourceforge.net>

Bugs item #424260, was updated on 2001-05-15 09:24
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=424260&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: error importing Xhtml2HtmlPrinter

Initial Comment:
>>> import xml.dom.ext.XHtml2HtmlPrinter
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python1.5/site-packages/xml/dom/ext/XHtml2HtmlPrinter.py", line 3, in ?
    from xml.dom.html import HTML_FORBIDDEN_END, XHTML_NAMESPACE
ImportError: cannot import name XHTML_NAMESPACE


Patch for the bug is:

--- XHtml2HtmlPrinter.py        Tue Apr 24 20:31:42 2001
+++ /home/alf/XHtml2HtmlPrinter.py      Tue May 15 18:18:18 2001
@@ -1,6 +1,7 @@
 import string
 import Printer
-from xml.dom.html import HTML_FORBIDDEN_END, XHTML_NAMESPACE
+from xml.dom.html import HTML_FORBIDDEN_END
+from xml.dom import XHTML_NAMESPACE
 
 class HtmlDocType:
     name = 'HTML'


Cheers 

Alexandre Fayolle
(I could not log in, because there seems to be some
problem with SF and their ssl server)



From Alexandre.Fayolle@logilab.fr  Tue May 15 17:41:33 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Tue, 15 May 2001 18:41:33 +0200 (CEST)
Subject: [XML-SIG] Python newbie question
Message-ID: <Pine.LNX.4.21.0105151838070.9347-100000@orion.logilab.fr>

Hi there,

I really feel dumb for asking this... Well here comes anyway.

In xml.dom.ext.Xhtml2HtmlPrinter, there's the following statement:

import Printer

There's also a file called Printer in xml/dom/ext, but xml/dom/ext is not,
as far as I know, in my PYTHONPATH. So how does this work (a pointer to
the right page of TFM is fine by me)?

TIA

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From fdrake@acm.org  Tue May 15 18:06:36 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 15 May 2001 13:06:36 -0400 (EDT)
Subject: [XML-SIG] pyexpat interface issue
Message-ID: <15105.25116.646987.317835@cj42289-a.reston1.va.home.com>

  The pyexpat module defines two wrappers for handlers which are
expected to return integers (NotStandaloneHandler and
ExternalEntityRefHandler).  What stands out about these handlers is
that Expat is expecting a return value (the others have void
returns).  The wrappers will propagate an exception if one is raised
by the Python handler implementation, but then assume that the return
value is actually an integer.  They use PyInt_AsLong() to convert the
return value to an integer, but don't check the return value:  if
PyInt_AsLong() returns -1 and PyErr_Occurred() is non-NULL, a
TypeError was raised by PyInt_AsLong() because the value passed to it
was not an integer object.  The -1 will be passed to Expat, which will
happily continue parsing since it expects a false value to tell it to
stop parsing.  This has been this way for a while.
  Should the documentation for these interfaces be modified to reflect
this (strange) behavior, with some code cleanup to avoid having unused
exception state lying around (which *can* show up later in unrelated
code), or should the implementation be fixed to propagate the
exception, or something else?  I'm concerned that changing the actual
behavior will adversely affect existing code that uses pyexpat.
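
For reference, a handler that stays clear of the problem simply
returns an explicit integer.  The sketch below assumes the
xml.parsers.expat module of a modern Python 3, where the documented
contract is that a NotStandaloneHandler returning 0 makes the parse
fail; the function name parse_with_policy and the sample document are
hypothetical.

```python
import xml.parsers.expat as expat

# A document with an external subset and no standalone="yes".
DOC = '<!DOCTYPE r SYSTEM "r.dtd"><r/>'

def parse_with_policy(text, allow_non_standalone):
    """Parse text; return (handler calls, error message or None)."""
    calls = []

    def not_standalone():
        calls.append("called")
        # Always return a real integer: nonzero continues, 0 stops.
        return 1 if allow_non_standalone else 0

    parser = expat.ParserCreate()
    parser.NotStandaloneHandler = not_standalone
    try:
        parser.Parse(text, True)
    except expat.ExpatError as exc:
        return calls, expat.ErrorString(exc.code)
    return calls, None

print(parse_with_policy(DOC, True))
print(parse_with_policy(DOC, False))
```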


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From jmurray@agyinc.com  Tue May 15 18:17:41 2001
From: jmurray@agyinc.com (Joe Murray)
Date: Tue, 15 May 2001 10:17:41 -0700
Subject: [XML-SIG] building XML docs using ?
References: <3B00224D.AFB2057D@agyinc.com> <m37kzjsifa.fsf@lambda.garshol.priv.no> <003a01c0dcc6$94210080$7cac1218@reston1.va.home.com>
Message-ID: <3B0164B5.BFB9EB46@agyinc.com>

Thanks to everyone for their helpful responses.  And to probe even
further, into this technology that will "solve all my problems"...

"Thomas B. Passin" wrote:
> 
> [Lars Marius Garshol]
> >
> > * Joe Murray
> > ...
> > > Oh, and BTW, can XML solve all my problems???  ;-)
> >
> > I'm afraid not. You'll need topic maps for that... :-)
> >
> Hey, the man needs speed here .... :-)


So, with regard to speed, is there an XSLT processor (python or not)
which takes a SAX-like event-driven approach to transforming XML?  I know
this doesn't deal fully with the dynamicity of an XSL doc, but it would
be useful.  I checked some old xml-dev, xml-sig... I can't vouch for the
people who were discussing such a processor and given the fact that most
of the posts were circa 1999... I couldn't find a straightforward
answer.  Does Sablotron support this?  It seems as if the Oracle XML
parsers packages do... but after some surfin', I ain't certain...


"Martin v. Loewis" wrote:
> > Should I forgo the ease of using the DOM objects by simply generating
> > outputting "hand-generated" markup?
> 
> Yes, definitely.
> 
> > I was doing this previously, it's efficient, but definitely not as
> > nice/clean as it could be...
> 
> Why is that? If you create the right template for a single line, e.g.
> 
> template = '<elem attr1="%d" attr2="%s">%s</elem>'
> 
> then a simple print statement would suffice to fill out this template.
> This also makes a nice separation of structure and content.


Indeed, this is the route I have gone.  I'm using
xml.sax.saxutils.escape, a handy function, in lieu of the SAX writer
interfaces.
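
A minimal sketch of that approach, using escape for character data
and its companion quoteattr for attribute values (both live in
xml.sax.saxutils in a modern Python 3).  The element and attribute
names follow the template quoted above; render_elem is a hypothetical
helper name.

```python
from xml.sax.saxutils import escape, quoteattr

# quoteattr() supplies the surrounding quotes itself, so the template
# leaves them out.
TEMPLATE = "<elem attr1=%s attr2=%s>%s</elem>"

def render_elem(attr1, attr2, text):
    """Fill the template, escaping markup characters in all three slots."""
    return TEMPLATE % (quoteattr(str(attr1)), quoteattr(attr2), escape(text))

print(render_elem(7, 'a"b', "x < y & z"))
```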

All you guys are a helpful bunch!  

Regards,

joe

-- 
Joseph Murray
Bioinformatics Specialist, AGY Therapeutics
290 Utah Avenue, South San Francisco, CA 94080
(650) 228-1146


From larsga@garshol.priv.no  Tue May 15 18:35:29 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 15 May 2001 19:35:29 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <3B0164B5.BFB9EB46@agyinc.com>
References: <3B00224D.AFB2057D@agyinc.com> <m37kzjsifa.fsf@lambda.garshol.priv.no> <003a01c0dcc6$94210080$7cac1218@reston1.va.home.com> <3B0164B5.BFB9EB46@agyinc.com>
Message-ID: <m38zjya526.fsf@lambda.garshol.priv.no>

* Joe Murray
| 
| So, with regard to speed, is there an XSLT processor (python or not)
| which takes a SAX-like event-driven approach to transforming XML?  

Currently there is not, and part of the reason for that is that some
parts of XSLT require the entire document to be available to the
processor at the same time. If you use only a subset of XSLT one can
use an event-based approach, but currently nobody has implemented
anything like this.
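
To show what an event-based transformation looks like for a very
restricted case, here is a sketch built on the SAX filter classes in
a modern Python 3 standard library.  It is not XSLT and implements no
part of it; it only renames elements as events stream past, which is
the kind of rule that needs no access to the rest of the document.
RenameFilter and transform are hypothetical names.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

class RenameFilter(XMLFilterBase):
    """Rename elements according to a mapping while events pass through."""

    def __init__(self, parent, mapping):
        super().__init__(parent)
        self._mapping = mapping

    def startElement(self, name, attrs):
        super().startElement(self._mapping.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._mapping.get(name, name))

def transform(xml_text, mapping):
    out = io.StringIO()
    reader = RenameFilter(xml.sax.make_parser(), mapping)
    # Events flow parser -> filter -> generator; no tree is built.
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.StringIO(xml_text))
    return out.getvalue()

print(transform("<doc><name>Hi</name></doc>", {"name": "title"}))
```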

However, SAXON has some extensions that can enable you to build only
parts of the tree at a time. This puts some constraints on what you
are able to do, but you may be able to live with it.

| Does Sablotron support this?  

It does not.

| It seems as if the Oracle XML parsers packages do... but after some
| surfin', I ain't certain...

I don't think they do, though there is a chance that I might be wrong.
You should in any case distinguish carefully between XML parsers
(nearly all of which have event-based interfaces) and XSLT engines. 

--Lars M.


From martin@loewis.home.cs.tu-berlin.de  Tue May 15 19:43:04 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 15 May 2001 20:43:04 +0200
Subject: [XML-SIG] Python newbie question
In-Reply-To: <Pine.LNX.4.21.0105151838070.9347-100000@orion.logilab.fr>
 (message from Alexandre Fayolle on Tue, 15 May 2001 18:41:33 +0200
 (CEST))
References: <Pine.LNX.4.21.0105151838070.9347-100000@orion.logilab.fr>
Message-ID: <200105151843.f4FIh4Z01461@mira.informatik.hu-berlin.de>

> There's also a file called Printer in xml/dom/ext, but xml/dom/ext is not,
> as far as I know, in my PYTHONPATH. So how does this work (a pointer to
> the right page of TFM is fine by me)?

I don't think the package import procedure is documented anywhere; the
best you can get is

http://www.python.org/doc/essays/packages.html

For your specific question, see Intra-package References.
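
A small self-contained illustration, with the caveat that it targets
a modern Python 3, where the sibling import has to be written
explicitly as "from . import Printer".  In the Python versions current
at the time of this thread, a bare "import Printer" inside a package
module searched the package's own directory first, which is why the
statement in question works without xml/dom/ext being on the path.
The package name demo_ext and its contents are hypothetical.

```python
import importlib
import os
import sys
import tempfile

def build_demo_package(root):
    """Write a tiny package with two sibling modules under root."""
    pkg = os.path.join(root, "demo_ext")
    os.mkdir(pkg)
    files = {
        "__init__.py": "",
        "Printer.py": "NAME = 'Printer'\n",
        # The sibling is found relative to the package, not sys.path.
        "Xhtml2HtmlPrinter.py": "from . import Printer\n"
                                "SIBLING = Printer.NAME\n",
    }
    for filename, source in files.items():
        with open(os.path.join(pkg, filename), "w") as handle:
            handle.write(source)

root = tempfile.mkdtemp()
build_demo_package(root)
sys.path.insert(0, root)  # only the parent directory is on the path
module = importlib.import_module("demo_ext.Xhtml2HtmlPrinter")
print(module.SIBLING)
```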

Regards,
Martin


From tpassin@home.com  Wed May 16 00:55:15 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Tue, 15 May 2001 19:55:15 -0400
Subject: [XML-SIG] building XML docs using ?
References: <3B00224D.AFB2057D@agyinc.com> <m37kzjsifa.fsf@lambda.garshol.priv.no> <003a01c0dcc6$94210080$7cac1218@reston1.va.home.com> <3B0164B5.BFB9EB46@agyinc.com>
Message-ID: <002801c0dd9a$7d011960$7cac1218@reston1.va.home.com>

[Joe Murray]
> So, with regard to speed, is there an XSLT processor (python or not)
> which takes a SAX-like event-driven approach to transforming XML?  I know
> this doesn't deal fully with the dynamicity of an XSL doc, but it would
> be useful.  I checked some old xml-dev, xml-sig... I can't vouch for the
> people who were discussing such a processor and given the fact that most
> of the posts were circa 1999... I couldn't find a straightforward
> answer.  Does Sablotron support this?  It seems as if the Oracle XML
> parsers packages do... but after some surfin', I ain't certain...
>
>

Some processors can do lazy evaluation and thereby avoid computing branches
that aren't used in a particular transformation.  I'm pretty sure Xalan does
this, and I think Saxon can be asked to. Of course, if your source document
and transform need to pull together nodes from all parts of the document,
this won't help.

Otherwise, some processors can ingest the xml via SAX as well as the/a DOM,
but they then build their own DOM model.

Cheers,

Tom P


From Alexandre.Fayolle@logilab.fr  Wed May 16 10:06:03 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 16 May 2001 11:06:03 +0200 (CEST)
Subject: [XML-SIG] Python newbie question
In-Reply-To: <200105151843.f4FIh4Z01461@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.21.0105161103550.10884-100000@orion.logilab.fr>

On Tue, 15 May 2001, Martin v. Loewis wrote:

> > There's also a file called Printer in xml/dom/ext, but xml/dom/ext is not,
> > as far as I know, in my PYTHONPATH. So how does this work (a pointer to
> > the right page of TFM is fine by me)?
> 
> I don't think the package import procedure is documented anywhere; the
> best you can get is
> 
> http://www.python.org/doc/essays/packages.html

Thanks for this very interesting pointer. It clarifies a number of notions
for which I only had an intuitive grasp.

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From uche.ogbuji@fourthought.com  Thu May 17 07:59:15 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 17 May 2001 00:59:15 -0600
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: Message from "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
 of "Mon, 14 May 2001 09:26:46 +0200." <200105140726.f4E7QkI01878@mira.informatik.hu-berlin.de>
Message-ID: <200105170659.f4H6xFF13604@localhost.local>

> >> It seems to me that all StylesheetReader does is to
> >> create a DOM tree, except that it creates StylesheetElement nodes
> >> where a normal DOM build would create Element nodes.
> 
> > Wow.  I'd count this a huge oversimplification.  The Stylesheet reader
> > does a great deal that most readers needn't worry about, as I'd think
> > would be obvious from a glance at the code.
> 
> I'd like to discuss specific aspects, then. Looking at the current
> public CVS, I see:
> 
> fromStream: duplicates ReaderMixin.fromStream, then adds call to
>             sheet.setup(), and some error handling
> 
> initParser: duplicates PyExpatReader.initParser. It uses
>             Utf8OnlyHandler sometimes, but I could not find that class.
> 
> _completeTextNode: creates LiteralText instead of Text nodes. Also does
>             not deal with top_node, but I'm not sure whether this is on
>             purpose
> 
> _initializeSheet: has no equivalent elsewhere
> _handleExtUris:   has no equivalent elsewhere
> 
> processingInstruction: Does *not* create PI nodes
> comment:               Likewise
> 
> startElement: great similarities with Handler.startElement. The significant
>               differences seem to be:
>               - creates element nodes based on g_mappings[nsuri][localname],
>                 extension tables, or creates LiteralElement
>               - processes xsl:include somehow (?)
>               - passes attributes through _handleExtUris for xsl:stylesheet
> 
> endElement: great overload with Handler.endElement; I could not tell
>             whether differences are on purpose or by mistake
> 
> characters: does not deal with _includeDepth and force8Bit (again, this
>             might be by mistake)
> 
> Did I miss aspects of the functionality relevant to proper operation
> of the StylesheetReader?
> 
> So all in all, it still seems to me that the essential difference is
> what nodes are created; the control logic and parsing data structures
> seem to be duplicates of the code found in the handler.
> 
> That, in turn, suggests that using a standard DOM builder with a
> different DOM implementation would achieve the same effect.

There is a lot of state that the StylesheetReader manages that other readers 
don't.

This would be very cumbersome to shoe-horn into a standard DOM reader.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Thu May 17 08:03:07 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 17 May 2001 01:03:07 -0600
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: Message from Joe Murray <jmurray@agyinc.com>
 of "Mon, 14 May 2001 11:22:05 PDT." <3B00224D.AFB2057D@agyinc.com>
Message-ID: <200105170703.f4H738F13626@localhost.local>

> Dear All,
> 
> I am converting many large "legacy" text files to XML.  Some of the
> original text files are upwards of 100 MB.  What is the most efficient,
> using the speed/memory metrics, way to convert these text files to XML?

1) Using SAX
2) Cutting the output docs to reasonable size
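
Both points can be combined in a short sketch: stream the records out
through the SAX2 XMLGenerator and start a new output document every N
records.  This assumes a modern Python 3; the element names, the
chunk size and the open_output callback are hypothetical, and the
input is taken to be an iterable of text lines.

```python
import io
from itertools import islice
from xml.sax.saxutils import XMLGenerator

def chunked(iterable, size):
    """Yield lists of at most size items without reading everything."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def write_chunks(lines, size, open_output):
    """Write each chunk as its own XML document; return the count."""
    count = 0
    for index, chunk in enumerate(chunked(lines, size)):
        out = open_output(index)  # caller decides where output goes
        gen = XMLGenerator(out, encoding="utf-8")
        gen.startDocument()
        gen.startElement("lines", {})
        for line in chunk:
            gen.startElement("line", {})
            gen.characters(line)  # markup characters are escaped
            gen.endElement("line")
        gen.endElement("lines")
        gen.endDocument()
        count += 1
    return count

outputs = {}

def open_output(index):
    outputs[index] = io.StringIO()
    return outputs[index]

total = write_chunks(["a", "b", "c < d"], 2, open_output)
print(total, outputs[1].getvalue())
```

In real use open_output would open a numbered file rather than an
in-memory buffer, and the caller would close it after each chunk.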

I can guarantee you that you want nothing to do with XML files in the
hundreds of MB.  You don't even want them in the MB, period.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Thu May 17 08:04:48 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 17 May 2001 01:04:48 -0600
Subject: [XML-SIG] Re: [4suite] ReleaseNode interface in 4XSLT
In-Reply-To: Message from "Fred L. Drake, Jr." <fdrake@acm.org>
 of "Mon, 14 May 2001 16:37:09 EDT." <15104.16885.755115.164847@cj42289-a.reston1.va.home.com>
Message-ID: <200105170704.f4H74mr13635@localhost.local>

>  > There may also be cases where 4XSLT releases elements it did not
>  > create; I'd consider that a bug.
> 
>   Agreed!

I'm not aware of any such case.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From tony.mcdonald@ncl.ac.uk  Thu May 17 08:18:29 2001
From: tony.mcdonald@ncl.ac.uk (Tony McDonald)
Date: Thu, 17 May 2001 08:18:29 +0100
Subject: [XML-SIG] Advice needed: RTF->XML conversions
Message-ID: <B72939D4.81BA%tony.mcdonald@ncl.ac.uk>

Hi all,
I'm currently using Omnimark to convert RTF files into a usable form of XML,
ready for uploading into our SQL database.

Omnimark is no longer free, so this means I can't pass on our software to
other HE institutions in the UK.

Can anyone suggest some (preferably python based) tools I can use to get
from Word RTF (or even, gasp, the 'XML' Word 2001 expels as its HTML pages)
to an XML form?

If someone has written something that takes that (dreadful) 'XML' output
that Word 2001 outputs and cleans it up into valid XML that would be a great
start for me.

Many thanks
Tone.
-- 
Dr Tony McDonald,  Assistant Director, FMCC, http://www.fmcc.org.uk/
The Medical School, Newcastle University Tel: +44 191 243 6140
A Zope list for UK HE/FE  http://www.fmcc.org.uk/mailman/listinfo/zope


From Alexandre.Fayolle@logilab.fr  Thu May 17 08:49:08 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Thu, 17 May 2001 09:49:08 +0200 (CEST)
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <B72939D4.81BA%tony.mcdonald@ncl.ac.uk>
Message-ID: <Pine.LNX.4.21.0105170945050.11584-100000@leo.logilab.fr>

On Thu, 17 May 2001, Tony McDonald wrote:

> Can anyone suggest some (preferably python based) tools I can use to get
> from Word RTF (or even, gasp, the 'XML' Word 2001 expels as its HTML pages)
> to an XML form?
> 
> If someone has written something that takes that (dreadful) 'XML' output
> that Word 2001 outputs and cleans it up into valid XML that would be a great
> start for me.

I don't have a coded solution, but if I were to do such thing, I'd use the
Automation interface of Word together with python's COM interface on
windows to have Word parse the document for me using the various iterators
available in the Word Document interface and building my own XML. 

This can be very simple if your document only uses the basic styles in
word (title 1, text body, toc... [I don't know the english names, only
guessing here]), or dreadful if your document features images, tables,
floating text sections, etc.

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From tony.mcdonald@ncl.ac.uk  Thu May 17 09:14:34 2001
From: tony.mcdonald@ncl.ac.uk (Tony McDonald)
Date: Thu, 17 May 2001 09:14:34 +0100
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <Pine.LNX.4.21.0105170945050.11584-100000@leo.logilab.fr>
Message-ID: <B72946F8.81CA%tony.mcdonald@ncl.ac.uk>

On 17/5/01 8:49 am, "Alexandre Fayolle" <Alexandre.Fayolle@logilab.fr>
wrote:

> On Thu, 17 May 2001, Tony McDonald wrote:
> 
>> Can anyone suggest some (preferably python based) tools I can use to get
>> from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML pages)
>> to an XML form?
>> 
>> If someone has written something that takes that (dreadful) 'XML' output
>> that Word 2001 outputs and cleans it up into valid XML that would be a great
>> start for me.
> 
> I don't have a coded solution, but if I were to do such thing, I'd use the
> Automation interface of Word together with python's COM interface on
> windows to have Word parse the document for me using the various iterators
> available in the Word Document interface and building my own XML.
> 

We have very little experience of doing things this way - we're a Unix and
Zope shop and try not to get too involved with the inner workings of
Microsoft software (if at all possible).

> This can be very simple if your document only uses the basic styles in
> word (title 1, text body, toc... [I don't know the english names, only
> guessing here]), or dreadful if your document features images, tables,
> floating text sections, etc.
> 
> Alexandre Fayolle

Thanks for the advice Alexandre, but it's the latter case I'm afraid :(

Our documents have tables, images, superscripts/subscripts, greek characters
(ie simple formulas), page breaks and more besides.

Cheers
Tone.


From Mike.Olson@fourthought.com  Thu May 17 09:24:37 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Thu, 17 May 2001 02:24:37 -0600
Subject: [XML-SIG] Advice needed: RTF->XML conversions
References: <B72946F8.81CA%tony.mcdonald@ncl.ac.uk>
Message-ID: <3B038AC5.B205328F@FourThought.com>

Tony McDonald wrote:
> 
> On 17/5/01 8:49 am, "Alexandre Fayolle" <Alexandre.Fayolle@logilab.fr>
> wrote:
> 
> > On Thu, 17 May 2001, Tony McDonald wrote:
> >
> >> Can anyone suggest some (preferably python based) tools I can use to get
> >> from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML pages)
> >> to an XML form?

Can you send me a sample of the Word XML output, and the format you're
looking for?  You can probably do it with a stylesheet, as long as what
Word spits out really is XML.

Mike


-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From larsga@garshol.priv.no  Thu May 17 09:45:05 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 17 May 2001 10:45:05 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <200105170703.f4H738F13626@localhost.local>
References: <200105170703.f4H738F13626@localhost.local>
Message-ID: <m3eltotlda.fsf@lambda.garshol.priv.no>

* Uche Ogbuji
| 
| I can guarantee you you want nothing to do with XML files in the
| hundreds of MB.  You don't even want them in the MB, period.

Why ever not? I've worked with lots of XML files of that size over the
last years and see nothing wrong with that. If the amount of data you
need to move around or work with is large, then your XML documents
will be large.

I see no reason why this should be considered somehow suspect or wrong.
If you use SAX there is really no reason why you shouldn't be able to
handle such documents.
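
A minimal sketch of the SAX approach, written for a current Python 3 with only
the standard library. The handler class, function name and sample document are
made up for illustration. It counts elements as they stream past, so memory
use should stay roughly constant however large the input is:

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Count elements by name without building a tree in memory."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per start tag, in document order.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(stream):
    """Parse a file-like object incrementally and return {name: count}."""
    handler = ElementCounter()
    xml.sax.parse(stream, handler)
    return handler.counts


# A tiny stand-in for a very large file; a real run would pass an open file.
sample = io.BytesIO(b"<log><entry/><entry/><entry/></log>")
counts = count_elements(sample)
print(counts)
```

The same handler works unchanged on a file opened with open(path, "rb").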

--Lars M.


From larsga@garshol.priv.no  Thu May 17 09:48:20 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 17 May 2001 10:48:20 +0200
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <B72939D4.81BA%tony.mcdonald@ncl.ac.uk>
References: <B72939D4.81BA%tony.mcdonald@ncl.ac.uk>
Message-ID: <m3d798tl7v.fsf@lambda.garshol.priv.no>

* Tony McDonald
|
| Can anyone suggest some (preferably python based) tools I can use to get
| from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML pages)
| to an XML form?

These are the only ones I know of:
  <URL: http://www.garshol.priv.no/download/xmltools/prod/RTF2XML.html >
  <URL: http://www.garshol.priv.no/download/xmltools/prod/Majix.html >

--Lars M.


From mal@lemburg.com  Thu May 17 11:12:40 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 17 May 2001 12:12:40 +0200
Subject: [XML-SIG] Advice needed: RTF->XML conversions
References: <B72939D4.81BA%tony.mcdonald@ncl.ac.uk> <m3d798tl7v.fsf@lambda.garshol.priv.no>
Message-ID: <3B03A418.5871B67@lemburg.com>

Lars Marius Garshol wrote:
> 
> * Tony McDonald
> |
> | Can anyone suggest some (preferably python based) tools I can use to get
> | from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML pages)
> | to an XML form?
> 
> These are the only ones I know of:
>   <URL: http://www.garshol.priv.no/download/xmltools/prod/RTF2XML.html >
>   <URL: http://www.garshol.priv.no/download/xmltools/prod/Majix.html >

If you want to invest some time, you may want to look at the
RTF.py example in mxTextTools (see Python Software link below)
and extend it to whatever you need as basis for generating XML
from the RTF input.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From DShriyash@pun.cognizant.com  Thu May 17 11:35:30 2001
From: DShriyash@pun.cognizant.com (Shriyash, Divekar (CTS))
Date: Thu, 17 May 2001 16:05:30 +0530
Subject: [XML-SIG] Small problem in XML parsing
Message-ID: <49532EE860A3D411812A00508B690B29FBC264@ctsinpunsxua>

This is a multi-part message in MIME format.
--------------InterScan_NT_MIME_Boundary
Content-Type: text/plain;
	charset="iso-8859-1"

Hi Folks,

I have a small problem with XML parsing.

I wish to append a new element to my XML file without recreating the
existing elements.
The usual approach is to first remove all the existing tags and then
recreate the required elements with 'document.createElement'.

My requirement is to point to an already existing element and append a new
child to it.

e.g.
<security-role-assignment>
	<role-name> New Role</role-name>
	<principal-name> abc </principal-name>
	<principal-name> def </principal-name>
	---------
	---------
</security-role-assignment>

Here, the <principal-name> <value> </principal-name> entries will keep being added.

I wish to point to the already existing '<role-name> New Role</role-name>'
element and then append new principals to it.
XML does not differentiate between the <role-name> and <principal-name> tags.

This may look like a very simple problem, but it is costing us quite a bit of effort.

We would be very happy if anybody could shed some light on it.

Thanx in advance

Regards
Shri
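
One way this could be approached, sketched with Python 3 and the standard
library's xml.dom.minidom rather than any particular PyXML release. The helper
name add_principal and the sample values are assumptions. The existing
elements are left in place: only the one new <principal-name> is created and
appended to the parent of the matching <role-name>.

```python
from xml.dom import minidom

SOURCE = """<security-role-assignment>
  <role-name>New Role</role-name>
  <principal-name>abc</principal-name>
  <principal-name>def</principal-name>
</security-role-assignment>"""


def add_principal(doc, role, principal):
    """Append a <principal-name> next to the <role-name> whose text is `role`."""
    for role_node in doc.getElementsByTagName("role-name"):
        text = "".join(
            child.data for child in role_node.childNodes
            if child.nodeType == child.TEXT_NODE
        )
        if text.strip() == role:
            new_node = doc.createElement("principal-name")
            new_node.appendChild(doc.createTextNode(principal))
            # The existing siblings are untouched; the new node goes last.
            role_node.parentNode.appendChild(new_node)
            return True
    return False


doc = minidom.parseString(SOURCE)
added = add_principal(doc, "New Role", "ghi")
names = [n.firstChild.data for n in doc.getElementsByTagName("principal-name")]
print(names)
```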


--------------InterScan_NT_MIME_Boundary--


From tony.mcdonald@ncl.ac.uk  Thu May 17 12:05:39 2001
From: tony.mcdonald@ncl.ac.uk (Tony McDonald)
Date: Thu, 17 May 2001 12:05:39 +0100
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <m3d798tl7v.fsf@lambda.garshol.priv.no>
Message-ID: <B7296CAA.81FE%tony.mcdonald@ncl.ac.uk>

On 17/5/01 9:48 am, "Lars Marius Garshol" <larsga@garshol.priv.no> wrote:

> 
> * Tony McDonald
> |
> | Can anyone suggest some (preferably python based) tools I can use to get
> | from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML pages)
> | to an XML form?
> 
> These are the only ones I know of:
> <URL: http://www.garshol.priv.no/download/xmltools/prod/RTF2XML.html >
> <URL: http://www.garshol.priv.no/download/xmltools/prod/Majix.html >
> 
> --Lars M.
> 

Thanks for that Lars,
However, the first program is based on Omnimark (it's actually what I'm
using now), and the second is a Java based program, and I *think* the java
program I've mentioned in my other post (wh2fo) does a good enough job to
get initially to XML.

I still need to do my other machinations on the resultant XML however.

Thanks
Tone.


From tony.mcdonald@ncl.ac.uk  Thu May 17 12:05:39 2001
From: tony.mcdonald@ncl.ac.uk (Tony McDonald)
Date: Thu, 17 May 2001 12:05:39 +0100
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <3B038AC5.B205328F@FourThought.com>
Message-ID: <B7296E06.81FE%tony.mcdonald@ncl.ac.uk>

On 17/5/01 9:24 am, "Mike Olson" <Mike.Olson@fourthought.com> wrote:

> Tony McDonald wrote:
>> 
>> On 17/5/01 8:49 am, "Alexandre Fayolle" <Alexandre.Fayolle@logilab.fr>
>> wrote:
>> 
>>> On Thu, 17 May 2001, Tony McDonald wrote:
>>> 
>>>> Can anyone suggest some (preferably python based) tools I can use to get
>>>> from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML
>>>> pages)
>>>> to an XML form?
> 
> Can you send me a sample of the word XML output, and the format your
> looking for.  You can probably do it with a stylesheet as long as what
> word spits out really is XML.
> 
> Mike
> 

Thanks for the offer Mike - I *was* under the impression that what word spat
out was not real XML, but I found this (sorry, Java) based program;
http://www-uk.hpl.hp.com/people/fabgia/wh2fo/wh2fo.html
which generates XML and XSL from the html files that word 2000 generates.

Frankly, I'm amazed, as I thought that constructs such as

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta name=Title content="TRAUMA &amp; BURNS">
<meta name=Keywords content="">
<meta http-equiv=Content-Type content="text/html; charset=macintosh">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<link rel=File-List href="hepatology%20resource%20day_f_files/filelist.xml">
<link rel=Edit-Time-Data
href="hepatology%20resource%20day_f_files/editdata.mso">
<link rel=OLE-Object-Data
href="hepatology%20resource%20day_f_files/oledata.mso">

where attributes aren't quoted or, if they are, they're quoted with " or '
inconsistently, were very bad XML. I guess I was wrong.

I still need to do some work with the XML that the above program uses and
would like to use Python for that as I'm *far* more comfortable with it than
java.

If I've read you right, are you saying I could apply a stylesheet to this
XML to get to my output XML, which is then OK for my (finally!) Python-based
sgmlop processor that makes SQL? If so, I'll be very happy indeed!

Essentially I need to 'stack' the headings in the original document so that
this;

Heading 1 "Title"
    heading 2 "Overview"
        heading 3 "Core Content"
    heading 2 "Theme 1"

Goes to
<topic type="heading1" content="Title">
 <topic type="heading2" content="Overview">
  <topic type="heading3" content="Core Content">
  </topic>
 </topic>
 <topic type="heading2" content="Theme 1">
 </topic>
</topic>
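
The "stacking" step can also be done directly in Python. The following is a
rough sketch, assuming the headings have already been extracted as
(level, title) pairs. The function name nest_headings and the wrapper element
<topics> are invented for the example. A stack of currently open topics
decides where each new heading is attached.

```python
from xml.dom import minidom


def nest_headings(headings):
    """Turn a flat list of (level, title) pairs into nested <topic> elements."""
    doc = minidom.Document()
    root = doc.createElement("topics")  # hypothetical wrapper element
    doc.appendChild(root)
    stack = [(0, root)]  # (heading level, element) for each open topic
    for level, title in headings:
        # Close every open topic at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        topic = doc.createElement("topic")
        topic.setAttribute("type", "heading%d" % level)
        topic.setAttribute("content", title)
        stack[-1][1].appendChild(topic)
        stack.append((level, topic))
    return doc


doc = nest_headings([
    (1, "Title"),
    (2, "Overview"),
    (3, "Core Content"),
    (2, "Theme 1"),
])
print(doc.documentElement.toprettyxml(indent=" "))
```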

If you're saying that I can use XSL stylesheets to get this to work, then I
need to do some reading!

Thanks for the comments,
Tone.


From tony.mcdonald@ncl.ac.uk  Thu May 17 12:05:40 2001
From: tony.mcdonald@ncl.ac.uk (Tony McDonald)
Date: Thu, 17 May 2001 12:05:40 +0100
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <3B03A418.5871B67@lemburg.com>
Message-ID: <B7296E65.8200%tony.mcdonald@ncl.ac.uk>

On 17/5/01 11:12 am, "M.-A. Lemburg" <mal@lemburg.com> wrote:

> Lars Marius Garshol wrote:
>> 
>> * Tony McDonald
>> |
>> | Can anyone suggest some (preferably python based) tools I can use to get
>> | from Word RTF (or even, gasp, the 'XML' Word 2001 expels as it's HTML pages)
>> | to an XML form?
>> 
>> These are the only ones I know of:
>>   <URL: http://www.garshol.priv.no/download/xmltools/prod/RTF2XML.html >
>>   <URL: http://www.garshol.priv.no/download/xmltools/prod/Majix.html >
> 
> If you want to invest some time, you may want to look at the
> RTF.py example in mxTextTools (see Python Software link below)
> and extend it to whatever you need as basis for generating XML
> from the RTF input.

Thanks for the pointer Marc,
I did look at the RTF.py files a while back, but at the time I was ok with
Omnimark and the code was a bit over my head, so I had to put it on the back
burner.

Cheers
Tone.


From larsga@garshol.priv.no  Thu May 17 12:18:21 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 17 May 2001 13:18:21 +0200
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <B7296CAA.81FE%tony.mcdonald@ncl.ac.uk>
References: <B7296CAA.81FE%tony.mcdonald@ncl.ac.uk>
Message-ID: <m3wv7grzpe.fsf@lambda.garshol.priv.no>

* Tony McDonald
| 
| However, the first program is based on Omnimark (it's actually what I'm
| using now),

Uh - sorry, should have seen that.

| and the second is a Java based program, and I *think* the java
| program I've mentioned in my other post (wh2fo) does a good enough
| job to get initially to XML.
 
Thanks for that pointer. I've put it in the inbox to my site.

| I still need to do my other machinations on the resultant XML however.

Well, that's an ordinary XML processing job, so Python should have all
the tools you need for that task. 

--Lars M.


From tpassin@home.com  Thu May 17 14:51:39 2001
From: tpassin@home.com (Thomas B. Passin)
Date: Thu, 17 May 2001 09:51:39 -0400
Subject: [XML-SIG] Advice needed: RTF->XML conversions
References: <B72939D4.81BA%tony.mcdonald@ncl.ac.uk>
Message-ID: <002801c0ded8$810ba4a0$7cac1218@reston1.va.home.com>

[Tony McDonald]
>
> If someone has written something that takes that (dreadful) 'XML' output
> that Word 2001 outputs and cleans it up into valid XML that would be a great
> start for me.
>
HTML-tidy has an option to clean up Word 2000 xml.  You can get it from the
W3C site, or in a GUI editor, as part of HTML-kit (free), from
www.chami.org.

Cheers,

Tom P


From rsalz@zolera.com  Thu May 17 14:57:08 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 17 May 2001 09:57:08 -0400
Subject: [XML-SIG] XML Canonicalization
Message-ID: <3B03D8B4.9108432D@zolera.com>

This is a multi-part message in MIME format.
--------------4C8E83122B2EF8C15F82E15C
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Someone had asked for code to do XML C14N (canonicalization) a couple of
weeks ago.  I finally got around to cleaning up my code; it's attached.

I would be more than happy to add this to PyXML if there's interest. 
Since it operates on DOM nodes, perhaps xml.dom.utils ?  I'd probably
also need to upgrade the documentation -- the docstrings in the code
should tell you all you need.

Hope this helps -- looking forward to feedback.
	/r$
--------------4C8E83122B2EF8C15F82E15C
Content-Type: text/plain; charset=us-ascii;
 name="c14n.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="c14n.py"

#! /usr/bin/env python
'''XML C14N

Perform XML Canonicalization.  Not fully conformant to the spec
in a couple of ways (mostly minor):
    Comments are always stripped
    Whitespace preservation/stripping not totally correct
    Processing Instruction nodes aren't handled
    The nodeset must start with an element and includes all descendants
Fixing the last one would be non-trivial.
'''

_copyright = '''Copyright 2001, Zolera Systems Inc.  All Rights Reserved.
Distributed under the terms of the Python 2.0 Copyright.'''

from xml.dom import Node
import re
import StringIO

_attrs = lambda E: E._get_attributes() or []
_children = lambda E: E._get_childNodes() or []
_sorter = lambda n1, n2: cmp(n1._get_nodeName(), n2._get_nodeName())
xmlns_base = "http://www.w3.org/2000/xmlns/"

class _implementation:

    # Handlers for each node, by node type.
    handlers = {}

    # pattern/replacement list for whitespace stripping.
    repats = (
	( re.compile(r'^[ \t]+', re.MULTILINE), '' ),
	( re.compile(r'[ \t]+$', re.MULTILINE), '' ),
	( re.compile(r'[\r\n]+'), '\n' ),
    )

    def __init__(self, node, write, nsdict={}, stripspace=0):
	'''Create and run the implementation.'''
	if node._get_nodeType() != Node.ELEMENT_NODE:
	    raise TypeError, 'Non-element node'
	self.write, self.ns_stack, self.stripspace = \
		write, [nsdict], stripspace
	self._do_element(node)
	self.ns_stack.pop()

    def _do_text(self, node):
	'Output a text node in canonical form.'
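	# Escape "&" first so the "&" introduced by later replacements
	# (such as "&#xD;") is not escaped a second time.
	s = node._get_data() \
		.replace("&", "&amp;") \
		.replace("<", "&lt;") \
		.replace(">", "&gt;") \
		.replace("\015", "&#xD;")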
	if self.stripspace:
	    for pat,repl in _implementation.repats:
		s = re.sub(pat, repl, s)
	if s: self.write(s)
    handlers[Node.TEXT_NODE] =_do_text
    handlers[Node.CDATA_SECTION_NODE] =_do_text

    def _do_pi(self, node):
	'Output a processing instruction in canonical form.'
	pass	# XXX
    handlers[Node.PROCESSING_INSTRUCTION_NODE] =_do_pi

    def _do_comment(self, node):
	'Output a comment node in canonical form.'
	pass	# XXX
    handlers[Node.COMMENT_NODE] =_do_comment

    def _do_attr(self, n, value):
	'Output an attribute in canonical form.'
	W = self.write
	W(' ')
	W(n)
	W('="')
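	# Tab, newline and carriage return become complete hexadecimal
	# character references ("&#x9;", "&#xA;", "&#xD;").
	s = value \
	    .replace("&", "&amp;") \
	    .replace("<", "&lt;") \
	    .replace('"', '&quot;') \
	    .replace('\011', '&#x9;') \
	    .replace('\012', '&#xA;') \
	    .replace('\015', '&#xD;')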
	W(s)
	W('"')

    def _do_element(self, node):
	'Output an element (and its children) in canonical form.'
	name = node._get_nodeName()
	parent_ns = self.ns_stack[-1]
	my_ns = { 'xmlns': parent_ns.get('xmlns', '') }
	W = self.write
	W('<')
	W(name)

	# Divide attributes to NS definitions and others.
	nsnodes, others = [], []
	for a in _attrs(node):
	    if a._get_namespaceURI() == xmlns_base:
		nsnodes.append(a)
	    else:
		others.append(a)

	# Namespace attributes: update dictionary; if not already
	# in parent, output it.
	nsnodes.sort(_sorter)
	for a in nsnodes:
	    n = a._get_nodeName()
	    if n == "xmlns:":
		key, n = "", "xmlns"
	    else:
		key = a._get_localName()
	    v = my_ns[key] = a._get_nodeValue()
	    pval = parent_ns.get(key, None)
	    if v != pval: self._do_attr(n, v)

	# Other attributes: sort and output.
	others.sort(_sorter)
	for a in others:
	    self._do_attr(a._get_nodeName(), a._get_value())
	W('>')

	self.ns_stack.append(my_ns)
	for c in _children(node):
	    handler = _implementation.handlers.get(c._get_nodeType(), None)
	    if handler: handler(self, c)
	self.ns_stack.pop()
	W('</%s>' % (name,))
    handlers[Node.ELEMENT_NODE] =_do_element

def XMLC14N(node, output=None, **kw):
    '''Canonicalize a DOM element node and everything underneath it.
    Return the text; if output is specified then output.write will
    be called to output the text and the return value will be None.
    Keyword parameters:
	stripspace -- remove extra (almost all) whitespace from text nodes
	nsdict -- a dictionary of prefix/uri namespace entries assumed
	    to exist in the surrounding context.
    '''

    if output:
	s = None
    else:
	output = s = StringIO.StringIO()

    _implementation(node,
	output.write,
	stripspace=kw.get('stripspace', 0),
	nsdict=kw.get('nsdict', {})
    )
    if s: return (s.getvalue(), s.close())[0]
    return None
    if s == None: return None
    ret = s.getvalue()
    s.close()
    return ret

if __name__ == '__main__':
    text = '''<SOAP-ENV:Envelope
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:spare='foo'
      SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
	<SOAP-ENV:Body xmlns='test-uri'><?MYPI spenser?>
	    <Price xsi:type='xsd:integer'>34</Price>	<!-- 0 -->
	    <SOAP-ENC:byte>44</SOAP-ENC:byte>	<!-- 1 -->
	    <Name>This is the name</Name>	<!-- 2 -->
	    <n2><![CDATA[<greeting>Hello</greeting>]]></n2> <!-- 3 -->
	    <n3 href='#zzz' xsi:type='SOAP-ENC:string'/>		<!-- 4 -->
	    <n64>a GVsbG8=</n64>		<!-- 5 -->
	    <SOAP-ENC:string>Red</SOAP-ENC:string>	<!-- 6 -->
	    <a2 href='#tri2'/>		<!-- 7 -->
	    <a2 xmlns:f='z' xmlns:aa='zz'><i xmlns:f='z'>12</i><t>rich salz</t></a2> <!-- 8 -->
	    <xsd:hexBinary>3F2041</xsd:hexBinary> <!-- 9 -->
	    <nullint xsi:nil='1'/> <!-- 10 -->
	</SOAP-ENV:Body>
      <z xmlns='myns' id='zzz'>The value of n3</z>
      <zz xmlns:spare='foo' xmlns='myns2' id='tri2'><inner>content</inner></zz>
    </SOAP-ENV:Envelope>'''

    print _copyright
    from xml.dom.ext.reader import PyExpat
    reader = PyExpat.Reader()
    dom = reader.fromString(text)
    for e in _children(dom):
	if e._get_nodeType() != Node.ELEMENT_NODE: continue
	for ee in _children(e):
	    if ee._get_nodeType() != Node.ELEMENT_NODE: continue
	    print '\n', '=' * 60
	    print XMLC14N(ee, nsdict={'spare':'foo'}, stripspace=1)
	    print '-' * 60
	    print XMLC14N(ee, stripspace=0)
	    print '=' * 60

--------------4C8E83122B2EF8C15F82E15C--
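
For reference, Canonical XML prescribes specific escaping for text nodes and
attribute values. Below is a small Python 3 sketch of just those rules. The
function names are invented here and are not part of the attached module or
of PyXML. The ampersand has to be replaced first, otherwise the "&" in the
later replacements would itself be escaped.

```python
def c14n_escape_text(data):
    """Escape character data as Canonical XML serializes text nodes."""
    return (data.replace("&", "&amp;")   # must come before the others
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\r", "&#xD;"))


def c14n_escape_attr(value):
    """Escape an attribute value as Canonical XML serializes it."""
    return (value.replace("&", "&amp;")  # must come before the others
                 .replace("<", "&lt;")
                 .replace('"', "&quot;")
                 .replace("\t", "&#x9;")
                 .replace("\n", "&#xA;")
                 .replace("\r", "&#xD;"))


print(c14n_escape_text("a & b < c\r"))
print(c14n_escape_attr('say "hi"\t&\n'))
```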


From rsalz@zolera.com  Thu May 17 15:13:48 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 17 May 2001 10:13:48 -0400
Subject: [XML-SIG] XML Canonicalization
References: <3B03D8B4.9108432D@zolera.com>
Message-ID: <3B03DC9C.D12A6B91@zolera.com>

Oops.  I didn't save-file in the other window before I sent...

> def XMLC14N(node, output=None, **kw):
...
>     if s: return (s.getvalue(), s.close())[0]
>     return None
>     if s == None: return None		**
>     ret = s.getvalue()		**
>     s.close()				**
>     return ret			**

Obviously those last four lines can be deleted.


From uche.ogbuji@fourthought.com  Thu May 17 18:09:28 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 17 May 2001 11:09:28 -0600
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: Message from Lars Marius Garshol <larsga@garshol.priv.no>
 of "17 May 2001 10:45:05 +0200." <m3eltotlda.fsf@lambda.garshol.priv.no>
Message-ID: <200105171709.f4HH9SX17328@localhost.local>

> 
> * Uche Ogbuji
> | 
> | I can guarantee you you want nothing to do with XML files in the
> | hundreds of MB.  You don't even want them in the MB, period.
> 
> Why ever not? I've worked with lots of XML files of that size over the
> last years and see nothing wrong with that. If the amount of data you
> need to move around or work with is large, then your XML documents
> will be large.
> 
> I see no reason why this should be considered somehow suspect or wrong.
> If you use SAX there is really no reason why you shouldn't be able to
> handle such documents.

Why not?  Because most XML handling tools are not very scalable, XSLT being 
the foremost example.

Also because XML eliminates the need, which I think quite unnecessary, of 
storing mountains of data in a single file.  Inclusion, transclusion, other 
linking mechanisms, and many tools are available for breaking XML into 
manageable packets.

So, in my opinion, it's suspect *and* wrong to be dealing with 100MB XML 
files.  Opinions of others might vary, of course.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Thu May 17 18:13:06 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 17 May 2001 11:13:06 -0600
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: Message from Rich Salz <rsalz@zolera.com>
 of "Thu, 17 May 2001 09:57:08 EDT." <3B03D8B4.9108432D@zolera.com>
Message-ID: <200105171713.f4HHD6Z17352@localhost.local>

> Someone had asked for code to do XML C14N (canonicalization) a couple of
> weeks ago.  I finally got around to cleaning up my code; it's attached.
> 
> I would be more than happy to add this to PyXML if there's interest. 
> Since it operates on DOM nodes, perhaps xml.dom.utils ?  I'd probably
> also need to upgrade the documentation -- the docstrings in the code
> should tell you all you need.

Brilliant!  I heartily vote for its inclusion in PyXML.




From martin@loewis.home.cs.tu-berlin.de  Thu May 17 19:15:13 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 17 May 2001 20:15:13 +0200
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <3B038AC5.B205328F@FourThought.com> (message from Mike Olson on
 Thu, 17 May 2001 02:24:37 -0600)
References: <B72946F8.81CA%tony.mcdonald@ncl.ac.uk> <3B038AC5.B205328F@FourThought.com>
Message-ID: <200105171815.f4HIFDF01101@mira.informatik.hu-berlin.de>

> Can you send me a sample of the word XML output, and the format your
> looking for.  You can probably do it with a stylesheet as long as what
> word spits out really is XML.

It isn't. Most notably, attribute values are not enclosed in quotes.
I found that sgmlop can parse what word produces, though.
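
sgmlop itself is not shown here. As a stand-in, the sketch below uses the
html.parser module from the Python 3 standard library, which is similarly
tolerant of unquoted attribute values. The class name AttributeCollector is an
assumption for the example, and the input is one of the <meta> lines from the
Word output quoted earlier in the thread.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect (tag, attributes) pairs from markup that is not well-formed XML."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.seen.append((tag, dict(attrs)))


collector = AttributeCollector()
collector.feed('<meta name=Generator content="Microsoft Word 9">')
collector.close()
print(collector.seen)
```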

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Thu May 17 20:06:53 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 17 May 2001 21:06:53 +0200
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <200105171713.f4HHD6Z17352@localhost.local> (message from Uche
 Ogbuji on Thu, 17 May 2001 11:13:06 -0600)
References: <200105171713.f4HHD6Z17352@localhost.local>
Message-ID: <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de>

> Brilliant!  I heartily vote for its inclusion in PyXML.

It's fine with me, too. Rich, could you please check it in?

Thanks,
Martin


From rsalz@zolera.com  Thu May 17 20:20:05 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 17 May 2001 15:20:05 -0400
Subject: [XML-SIG] XML Canonicalization
References: <200105171713.f4HHD6Z17352@localhost.local> <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de>
Message-ID: <3B042465.1DCA826D@zolera.com>

> Rich, could you please check it in?


Gladly.  Just tell me where (xml.dom.utils?) and where the docs are that
I should update.
	/r$


From uche.ogbuji@fourthought.com  Thu May 17 20:27:31 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 17 May 2001 13:27:31 -0600
Subject: [XML-SIG] XML Canonicalization
References: <200105171713.f4HHD6Z17352@localhost.local> <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de>
Message-ID: <3B042623.157DD7F1@fourthought.com>

"Martin v. Loewis" wrote:
> 
> > Brilliant!  I heartily vote for its inclusion in PyXML.
> 
> It's fine with me, too. Rich, could you please check it in?

Rich did ask about the best place to put it.

He suggested xml.dom.utils, but I wonder if there's any prospect of
generalizing it so that it would work with SAX streams.  Based on his
DOM ops, I guess probably not.

So maybe xml.dom.ext.c14n

I think this will be handy for RDF (parseType="literal", ya know).




From chris@hddesign.com  Thu May 17 21:01:47 2001
From: chris@hddesign.com (Chris Meyers)
Date: Thu, 17 May 2001 15:01:47 -0500
Subject: [XML-SIG] newbie question
Message-ID: <20010517150147.A5471@hddesign.com>

Ok I have been looking at PyXML for a couple of days now, and I still can't really find a good example of the basic stuff I need to do. I want to read in an XML file, traverse the tree and pull out information. For example I would like to go through this xml:

<?xml version="1.0" encoding="UTF-8"?>
<report>
	<data>
		<rec>
			<fld id="1">123</fld>
			<fld id="2">John></fld>
			<fld id="3">Smith></fld>
		</rec>
	</data>
</report>

>From this xml I would like to pull out the id attributes and the values from the <fld> elements. I can do this in jython with jdom easily enough, but I need to use python for my current application

If someone could point me in the right direction as to where to look to find an example similar to what I am trying to do, I would really appreciate it.

Thanks,
Chris


From martin@loewis.home.cs.tu-berlin.de  Thu May 17 21:12:36 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 17 May 2001 22:12:36 +0200
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <3B042623.157DD7F1@fourthought.com> (message from Uche Ogbuji on
 Thu, 17 May 2001 13:27:31 -0600)
References: <200105171713.f4HHD6Z17352@localhost.local> <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de> <3B042623.157DD7F1@fourthought.com>
Message-ID: <200105172012.f4HKCaR02192@mira.informatik.hu-berlin.de>

> He suggested xml.dom.utils, but I wonder if there's any prospect of
> generalizing it so that it would work with SAX streams.  Based on his
> DOM ops, I guess probably not.
> 
> So maybe xml.dom.ext.c14n

xml.dom.ext sounds better than xml.dom.utils, since I dislike packages
with only a single module, and because it is also an extension.

I'm not sure whether people can make sense out of c14n - I certainly
couldn't, although it is a cute name. 'normalize' would not be
appropriate, would it?

Regards,
Martin


From Mike.Olson@fourthought.com  Thu May 17 23:18:52 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Thu, 17 May 2001 16:18:52 -0600
Subject: [XML-SIG] newbie question
References: <20010517150147.A5471@hddesign.com>
Message-ID: <3B044E4C.37A2F38C@FourThought.com>

Chris Meyers wrote:
> 
> Ok I have been looking at PyXML for a couple of days now, and I still can't really find a good example of the basic stuff I need to do. I want to read in an XML file, traverse the tree and pull out information. For example I would like to go through this xml:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <report>
>         <data>
>                 <rec>
>                         <fld id="1">123</fld>
>                         <fld id="2">John></fld>
>                         <fld id="3">Smith></fld>
>                 </rec>
>         </data>
> </report>

There are a couple of ways:

1.  Use DOM

from xml.dom.ext.reader import PyExpat
reader = PyExpat.Reader()

# XML_SRC is a string holding the XML document shown above
dom = reader.fromString(XML_SRC)

flds = dom.documentElement.getElementsByTagName('fld')

for fld in flds:
    print fld.getAttribute('id')
    print fld.firstChild.data


2.  Use XPath

from xml import xpath
from xml.dom.ext.reader import PyExpat
reader = PyExpat.Reader()

# XML_SRC is a string holding the XML document shown above
dom = reader.fromString(XML_SRC)

flds = xpath.Evaluate('//fld', contextNode=dom)

for fld in flds:
    print fld.getAttribute('id')
    print fld.firstChild.data
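
For anyone reading along without PyXML installed, the same two approaches can be sketched with the standard library, which follows the same Python DOM mapping. This is only a rough equivalent, not the PyExpat reader used above: xml.dom.minidom and xml.etree.ElementTree stand in for it, and XML_SRC, fields_via_dom and fields_via_path are names made up for the illustration. The sample data is the report from the question with the stray '>' characters dropped.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Assumed sample document, modelled on the report quoted above.
XML_SRC = """<report>
  <data>
    <rec>
      <fld id="1">123</fld>
      <fld id="2">John</fld>
      <fld id="3">Smith</fld>
    </rec>
  </data>
</report>"""


def fields_via_dom(xml_src):
    """Return (id, text) pairs for every <fld>, using the DOM API."""
    dom = minidom.parseString(xml_src)
    flds = dom.documentElement.getElementsByTagName("fld")
    return [(fld.getAttribute("id"), fld.firstChild.data) for fld in flds]


def fields_via_path(xml_src):
    """Return the same pairs using ElementTree's limited XPath subset."""
    root = ET.fromstring(xml_src)
    return [(fld.get("id"), fld.text) for fld in root.findall(".//fld")]


if __name__ == "__main__":
    for fld_id, value in fields_via_dom(XML_SRC):
        print(fld_id, value)
```

Both helpers return the id attribute and the text of each fld element in document order, so either one can be swapped in depending on which API you find easier to read.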


Mike


> 
> From this xml I would like to pull out the id attributes and the values from the <fld> elements. I can do this in jython with jdom easily enough, but I need to use python for my current application
> 
> If someone could point me in the right direction as to where to look to find an example similar to what I am trying to do, I would really appreciate it.
> 
> Thanks,
> Chris
> 
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://mail.python.org/mailman/listinfo/xml-sig

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From chris@hddesign.com  Thu May 17 23:47:42 2001
From: chris@hddesign.com (Chris Meyers)
Date: Thu, 17 May 2001 17:47:42 -0500
Subject: [XML-SIG] newbie question
In-Reply-To: <3B044E4C.37A2F38C@FourThought.com>; from Mike.Olson@fourthought.com on Thu, May 17, 2001 at 04:18:52PM -0600
References: <20010517150147.A5471@hddesign.com> <3B044E4C.37A2F38C@FourThought.com>
Message-ID: <20010517174742.A5790@hddesign.com>

Thanks a lot, that did the trick.

Chris

-- 
Chris Meyers
7941 Tree Lane Suite 200
Madison WI 53717


From jsydik@virtualparadigm.com  Fri May 18 00:14:30 2001
From: jsydik@virtualparadigm.com (Jeremy J. Sydik)
Date: Thu, 17 May 2001 18:14:30 -0500
Subject: [XML-SIG] Advice needed: RTF->XML conversions
In-Reply-To: <200105171815.f4HIFDF01101@mira.informatik.hu-berlin.de>
Message-ID: <MMEHLOIJDENFKMFKBPHEOEKECDAA.jsydik@virtualparadigm.com>

---------------------------------------------------------------------------
Martin is right.  The Office/Word 'XML' can be a difficult thing to work
with.  It's been a while since I've thought about it, but you will probably
need to account for the following:

	* Not all attributes are quoted.
	* Singleton tags aren't closed.  (This can be dealt with fairly
		easily, however: it's simply the 'standard' singleton HTML
		tags that occur this way, such as br and img.)
	* There are a few Microsoft namespaces to deal with, as well as
		special tags.  The documentation for these is found in:
		http://msdn.microsoft.com/library/officedev/ofxml2k/ofhtml9.exe
		The primary ones you'll probably encounter are o: and w:
	* Also described in this document are
			<!--[if condition]>...<![endif]-->
		and
			<![if condition]>...<![endif]>
		pairs.  These break most SGML and XML implementations.  (It
		would be good to think of a regex solution, since you'll
		probably need one to properly quote the attributes anyway.)

Once those issues are addressed, you SHOULD have valid XML.  If you don't,
chances are you haven't hit everything in this list :)
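
A regex pass along those lines might look roughly like the sketch below. It is untested against real Word output and rests on several assumptions: attribute values never contain '<' or '>', the singleton tags are limited to the handful listed in SINGLETON, and the o:/w: namespaces are declared elsewhere. All names (tidy_word_html and the pattern constants) are made up for the example.

```python
import re

# Downlevel-hidden blocks: <!--[if ...]> ... <![endif]--> (dropped whole).
COND_COMMENT = re.compile(r"<!--\[if[^\]]*\]>.*?<!?\[endif\]-->", re.S)
# Downlevel-revealed markers: <![if ...]> and <![endif]> (content is kept).
COND_MARKER = re.compile(r"<!\[(?:if[^\]]*|endif)\]>")
# Any start tag; assumes no '<' or '>' inside attribute values.
START_TAG = re.compile(r"<[A-Za-z][^<>]*>")
# One attribute: name=value, where the value is quoted or bare.
ATTR = re.compile(r"""(\s[\w:.-]+)=("[^"]*"|'[^']*'|[^\s"'<>]+)""")
# The usual HTML singleton tags, with or without a trailing slash.
SINGLETON = re.compile(r"<(br|hr|img|meta|link|input)\b([^<>]*?)\s*/?>", re.I)


def _quote_attr(match):
    """Wrap a bare attribute value in double quotes; leave quoted ones alone."""
    name, value = match.group(1), match.group(2)
    if value[0] in "\"'":
        return match.group(0)
    return '%s="%s"' % (name, value)


def tidy_word_html(text):
    """Apply the clean-up steps from the list above, in order."""
    text = COND_COMMENT.sub("", text)
    text = COND_MARKER.sub("", text)
    text = START_TAG.sub(lambda m: ATTR.sub(_quote_attr, m.group(0)), text)
    text = SINGLETON.sub(r"<\1\2/>", text)
    return text


SAMPLE = ("<p class=MsoNormal>Hello<br clear=all>"
          "<!--[if gte mso 9]><xml>junk</xml><![endif]-->"
          "<![if !supportLists]>kept<![endif]></p>")

if __name__ == "__main__":
    print(tidy_word_html(SAMPLE))
```

Feeding the result to an XML parser is a quick way to find out whether anything on the list was missed for a particular document.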

	Good Luck,
		Jeremy
-----Original Message-----
From: xml-sig-admin@python.org [mailto:xml-sig-admin@python.org]On
Behalf Of Martin v. Loewis
Sent: Thursday, May 17, 2001 1:15 PM
To: Mike.Olson@fourthought.com
Cc: tony.mcdonald@ncl.ac.uk; Alexandre.Fayolle@logilab.fr;
xml-sig@python.org
Subject: Re: [XML-SIG] Advice needed: RTF->XML conversions


> Can you send me a sample of the Word XML output, and the format you're
> looking for?  You can probably do it with a stylesheet as long as what
> Word spits out really is XML.

It isn't. Most notably, attribute values are not enclosed in quotes.
I found that sgmlop can parse what Word produces, though.

Regards,
Martin



From rsalz@zolera.com  Fri May 18 01:09:59 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 17 May 2001 20:09:59 -0400
Subject: [XML-SIG] newbie question
References: <20010517150147.A5471@hddesign.com>
Message-ID: <3B046857.2D18B6B4@zolera.com>

Mike's already posted a solution.

I've found the code in dom.ext useful for examples.
	/r$


From rsalz@zolera.com  Fri May 18 01:40:34 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 17 May 2001 20:40:34 -0400
Subject: [XML-SIG] XML Canonicalization
References: <200105171713.f4HHD6Z17352@localhost.local> <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de> <3B042623.157DD7F1@fourthought.com> <200105172012.f4HKCaR02192@mira.informatik.hu-berlin.de>
Message-ID: <3B046F82.3F306701@zolera.com>

> xml.dom.ext sounds better than xml.dom.utils, since I dislike packages
> with only a single module

Me too.

> and because it is also an extension.

I think it's a matter of very detailed use of English. :)  I view it as
a utility.  But it doesn't matter.

> I'm not sure whether people can make sense out of c14n - I certainly
> couldn't, although it is a cute name. 'normalize' would not be
> appropriate, would it?

No, the proper term really is canonicalization.  I agree, the name is
somewhat cute, but within the community C14N is as well-known as I18N.

How about
	from xml.dom.ext import Canonicalize
and in ext/__init__.py I add
	from c14n import Canonicalize

So the filename is c14n.py, but the exported name is more user-friendly.
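
To show why 'normalize' would be the wrong word: canonicalization makes two serializations of the same logical document compare equal byte for byte, which is what signing and digesting need. The sketch below is not the module under discussion; it uses the canonicalize() function that, as far as I know, ships in xml.etree.ElementTree from Python 3.8 on and implements C14N 2.0. DOC_A, DOC_B and same_canonical_form are names invented for the example.

```python
from xml.etree.ElementTree import canonicalize

# Two spellings of the same element: different attribute order,
# quoting style, spacing and empty-element syntax.
DOC_A = '<rec  b="2"   a="1"/>'
DOC_B = "<rec a='1' b='2'></rec>"


def same_canonical_form(first, second):
    """True if both XML strings serialize to the same canonical form."""
    return canonicalize(first) == canonicalize(second)


if __name__ == "__main__":
    print(same_canonical_form(DOC_A, DOC_B))
```

A DOM normalize() call, by contrast, only merges adjacent text nodes and says nothing about how the tree is written out.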


From martin@loewis.home.cs.tu-berlin.de  Thu May 17 22:36:38 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 17 May 2001 23:36:38 +0200
Subject: [XML-SIG] newbie question
In-Reply-To: <20010517150147.A5471@hddesign.com> (message from Chris Meyers on
 Thu, 17 May 2001 15:01:47 -0500)
References: <20010517150147.A5471@hddesign.com>
Message-ID: <200105172136.f4HLacH02948@mira.informatik.hu-berlin.de>

> From this xml I would like to pull out the id attributes and the
> values from the <fld> elements. I can do this in jython with jdom
> easily enough, but I need to use python for my current application

In PyXML, it works mostly the same way. The only different thing is
how to obtain a DOM Document; you use xml.dom.ext.reader.Sax2.FromXml*
for this. Once you have a DOM tree, you proceed just as with jython,
i.e.  using getElementsByTagName, etc.

You probably need to be aware of the Python DOM mapping, see

http://www.python.org/doc/current/lib/module-xml.dom.html

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Fri May 18 05:08:54 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 18 May 2001 06:08:54 +0200
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <3B046F82.3F306701@zolera.com> (message from Rich Salz on Thu, 17
 May 2001 20:40:34 -0400)
References: <200105171713.f4HHD6Z17352@localhost.local> <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de> <3B042623.157DD7F1@fourthought.com> <200105172012.f4HKCaR02192@mira.informatik.hu-berlin.de> <3B046F82.3F306701@zolera.com>
Message-ID: <200105180408.f4I48s000954@mira.informatik.hu-berlin.de>

> How about
> 	from xml.dom.ext import Canonicalize
> and in ext/__init__.py I add
> 	from c14n import Canonicalize
> 
> So the filename is c14n.py, but the exported name is more user-friendly.

That sounds good.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Fri May 18 05:17:00 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 18 May 2001 06:17:00 +0200
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <3B042465.1DCA826D@zolera.com> (message from Rich Salz on Thu, 17
 May 2001 15:20:05 -0400)
References: <200105171713.f4HHD6Z17352@localhost.local> <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de> <3B042465.1DCA826D@zolera.com>
Message-ID: <200105180417.f4I4H0p00981@mira.informatik.hu-berlin.de>

> Gladly.  Just tell me where (xml.dom.utils?) and where are the docs that
> I should update.

As for the docs, it would be IMO best to put a
\section{xml.dom.ext.c14n} into doc/xml-ref.tex. You'll notice that
much of the content of that file is outdated. Since updating the
documentation consists of removing most of the stuff first, adding new
sections contributes to that update process.

Regards,
Martin


From rsalz@zolera.com  Fri May 18 13:42:53 2001
From: rsalz@zolera.com (Rich Salz)
Date: Fri, 18 May 2001 08:42:53 -0400
Subject: [XML-SIG] newbie question
References: <20010517150147.A5471@hddesign.com> <200105172136.f4HLacH02948@mira.informatik.hu-berlin.de>
Message-ID: <3B0518CD.79A4D3D9@zolera.com>

> You probably need to be aware of the Python DOM mapping, see
> 
> http://www.python.org/doc/current/lib/module-xml.dom.html

That brings up a question I meant to ask last week.

What's better, the "raw" mapping documented above, or the Corba-style
mapping? That is, self.nodeType or self._get_nodeType() ?

I am mainly interested to know which is most portable across Python DOM
implementations, but I also care a bit about efficiency.

Since Python has documented its own DOM interface, having an official
Corba->Python mapping doesn't matter all that much to me, although it is
convenient to be able to read Corba IDL and write Python without any
intermediate docs.
	/r$


From fdrake@acm.org  Fri May 18 15:22:23 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 18 May 2001 10:22:23 -0400 (EDT)
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <3B046F82.3F306701@zolera.com>
References: <200105171713.f4HHD6Z17352@localhost.local>
 <200105171906.f4HJ6rZ01295@mira.informatik.hu-berlin.de>
 <3B042623.157DD7F1@fourthought.com>
 <200105172012.f4HKCaR02192@mira.informatik.hu-berlin.de>
 <3B046F82.3F306701@zolera.com>
Message-ID: <15109.12319.311051.900182@cj42289-a.reston1.va.home.com>

Rich Salz writes:
 > How about
 > 	from xml.dom.ext import Canonicalize
 > and in ext/__init__.py I add
 > 	from c14n import Canonicalize

  How about calling the module "canon":

	from xml.dom.ext import canon

        def main():
            ... = canon.Canonicalize(...)


  -Fred


-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From martin@loewis.home.cs.tu-berlin.de  Fri May 18 21:52:46 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 18 May 2001 22:52:46 +0200
Subject: [XML-SIG] newbie question
In-Reply-To: <3B0518CD.79A4D3D9@zolera.com> (message from Rich Salz on Fri, 18
 May 2001 08:42:53 -0400)
References: <20010517150147.A5471@hddesign.com> <200105172136.f4HLacH02948@mira.informatik.hu-berlin.de> <3B0518CD.79A4D3D9@zolera.com>
Message-ID: <200105182052.f4IKqkF01843@mira.informatik.hu-berlin.de>

> What's better, the "raw" mapping documented above, or the Corba-style
> mapping? That is, self.nodeType or self._get_nodeType() ?
> 
> I am mainly interested to know which is most portable across Python DOM
> implementations, but I also care a bit about efficiency.

It's mainly a matter of personal taste. Some people believe in
accessor functions, some in attributes.

If you want to care about portability and speed, you should use
attributes. Whether you go through __getattr__ or not varies depending
on DOM implementation and attribute; most attributes will be directly
available, though.
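
A small sketch of what attribute-first access can look like in code that has to run against more than one DOM. It assumes nothing beyond the documented Python DOM mapping; dom_get and element_names are made-up helper names, and the "_get_" fallback is only there for an implementation that exposes accessors alone, which I have not seen in practice.

```python
from xml.dom import Node, minidom


def dom_get(node, name):
    """Read a DOM attribute, falling back to a CORBA-style accessor."""
    try:
        return getattr(node, name)
    except AttributeError:
        return getattr(node, "_get_" + name)()


def element_names(node):
    """Collect element names in document order using only dom_get()."""
    names = []
    if dom_get(node, "nodeType") == Node.ELEMENT_NODE:
        names.append(dom_get(node, "nodeName"))
    for child in dom_get(node, "childNodes"):
        names.extend(element_names(child))
    return names


if __name__ == "__main__":
    doc = minidom.parseString("<a><b/><c>text</c></a>")
    print(element_names(doc))
```

With minidom the fallback branch is never taken, so the cost is one plain attribute lookup per access.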

Regards,
Martin


From tony.mcdonald@ncl.ac.uk  Sun May 20 10:17:20 2001
From: tony.mcdonald@ncl.ac.uk (Tony McDonald)
Date: Sun, 20 May 2001 10:17:20 +0100
Subject: [XML-SIG] Problems with 'multiple definitions'
Message-ID: <B72D4A2F.8440%tony.mcdonald@ncl.ac.uk>

Hi all,
This isn't strictly an XML thing, but as the packages I really want to use
are the XML ones, I thought the group might be able to help.

I'm working with python2.1 and MacOS X and compiling up packages such as
PyXML and 4Suite (although this happens with packages such as MySQLdb
too).

I use the standard procedure to build and install these packages, ie

% python2.1 setup.py install

But, when I test out 4Suite (for example), ie

% cd /usr/local/doc/4Suite-0.11/test_suite/4XSLT
% python2.1 basic_test.py

I get this;

dyld: python2.1 multiple definitions of symbol _XML_DefaultCurrent
python2.1 definition of _XML_DefaultCurrent
/usr/local/lib/python2.1/site-packages/Ft/Lib/cDomlettec.so definition
of _XML_DefaultCurrent

I get similar errors with other packages such as PyXML and MySQLdb.

I've managed to install MySQLdb by stripping out an offending symbol
from libmysqlclient.a, but surely there's a cleaner way of doing this?
Is there some compiler flag I can set that gets around this?

The python is a pre-compiled version from http://tony.lownds.com/macosx/

any help would be appreciated, this effectively stops me using any
compiled modules under MacOS X (which is, in almost all other respects,
excellent!).

TIA
tone
-- 
Dr Tony McDonald,  Assistant Director, FMCC, http://www.fmcc.org.uk/
The Medical School, Newcastle University Tel: +44 191 243 6140
A Zope list for UK HE/FE  http://www.fmcc.org.uk/mailman/listinfo/zope


From karl@digicool.com  Tue May 22 00:38:55 2001
From: karl@digicool.com (Karl Anderson)
Date: 21 May 2001 16:38:55 -0700
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: Mike Olson's message of "Sun, 13 May 2001 19:14:17 -0600"
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com>
Message-ID: <m1r8xixofk.fsf@localhost.localdomain>

Mike Olson <Mike.Olson@fourthought.com> writes:

> "Martin v. Loewis" wrote:
> > 
> > I've tried to update my 4XSLT port to use the 4Suite 0.11 code base,
> > only to discover that the StylesheetReader class is now much more
> > strongly connected to Ft.Lib than before, in particular to classes from
> > pDomletteReader, and their specific instance attributes.
> 
> I was just in there as well and quite surprised how complex the code has
> become.  I thought of doing some work on it but figured, it ain't
> broke.....

> Is pDomlette the only import from Ft.Lib?  If so, why not move pDomlette
> into xml.utils?  Better yet, let's merge pDomlette and minidom so there
> is only one domlette.  pDomlette has greatly outgrown its original
> purpose so I have no problems with moving it into XML-Sig.

If you're suggesting that the DOMs should be consolidated so that
tools like PyXML's XSLT could support only that DOM, I hope you'll
reconsider.  I'd like Zope's DOM to be usable by PyXML's XSLT and
XPath implementations.

There are some hurdles to this, though.  The tests are only usable
with 4Suite, which makes it harder to find inconsistencies.
Submitting patches to 4Suite's implementations wouldn't be helpful for
my goals, because 4Suite's XSLT and XPath processors have become more
reliant on its particular DOM since these modules were forked to
PyXML.

-- 
Karl Anderson                          karl@digicool.com


From Mike.Olson@fourthought.com  Tue May 22 00:57:53 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Mon, 21 May 2001 17:57:53 -0600
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com> <m1r8xixofk.fsf@localhost.localdomain>
Message-ID: <3B09AB81.910B27F2@FourThought.com>

Karl Anderson wrote:
> 
> Mike Olson <Mike.Olson@fourthought.com> writes:
> 
> > "Martin v. Loewis" wrote:
> > >
> > > I've tried to update my 4XSLT port to use the 4Suite 0.11 code base,
> > > only to discover that the StylesheetReader class is now much more
> > > strongly connected to Ft.Lib than before, in particular to classes from
> > > pDomletteReader, and their specific instance attributes.
> >
> > I was just in there as well and quite surprised how complex the code has
> > become.  I thought of doing some work on it but figured, it ain't
> > broke.....
> 
> > Is pDomlette the only import from Ft.Lib?  If so, why not move pDomlette
> > into xml.utils?  Better yet, let's merge pDomlette and minidom so there
> > is only one domlette.  pDomlette has greatly outgrown its original
> > purpose so I have no problems with moving it into XML-Sig.
> 
> If you're suggesting that the DOMs should be consolidated so that
> tools like PyXML's XSLT could support only that DOM, I hope you'll
> reconsider.  I'd like Zope's DOM to be usable by PyXML's XSLT and
> XPath implementations.

Not at all.  I was suggesting that both miniDOM and pDomlette are
lightweight Python DOM implementations and I don't think we need two of
them.  If Zope's DOM supports the Python DOM interface, then it should
work in xslt/xpath.  If not, it is a bug in xslt/xpath.

However, I don't know if this will always be the case.  4XSLT is about
to get a _big_ rewrite and we might not support a "runNode" interface
anymore.  If we do, it will probably not be the most efficient way to use
4xslt as we will have to translate from DOM into the internal data
structure.


> 
> There are some hurdles to this, though.  The tests are only usable
> with 4Suite, which makes it harder to find inconsistencies.
> Submitting patches to 4Suite's implementations wouldn't be helpful for
> my goals, because 4Suite's XSLT and XPath processors have become more
> reliant on its particular DOM since these modules were forked to
> PyXML.

Actually, the tests would be easy to fix to use another DOM (though I'm
not sure how you would do it in Zope, as I ran into many hurdles
executing ZDOM outside of the Zope environment).  However, to do this,
edit the file test_harness.py.  It is used by every 4XSLT test script.
Either add a test for ParsedXML, or replace all of the existing tests
with a ParsedXML test.  Then just run test.py and all of the 4XSLT tests
will use ParsedXML.


I don't understand the more reliant part.  How have we become more
reliant?  Are you talking about the fact that Martin did a lot of work
when he first moved 4XSLT into PyXML to disentangle 4XSLT from Ft.Lib?
Then it's not really more reliant, just not ported yet.

FYI to all, we will be synching 4XSLT with Martin's changes in the near
future.


> 
> --
> Karl Anderson                          karl@digicool.com

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From karl@digicool.com  Tue May 22 03:07:25 2001
From: karl@digicool.com (Karl Anderson)
Date: 21 May 2001 19:07:25 -0700
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: Mike Olson's message of "Mon, 21 May 2001 17:57:53 -0600"
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com> <m1r8xixofk.fsf@localhost.localdomain> <3B09AB81.910B27F2@FourThought.com>
Message-ID: <m166euxhk2.fsf@localhost.localdomain>

Mike Olson <Mike.Olson@fourthought.com> writes:

> Karl Anderson wrote:
> > 
> > There are some hurdles to this, though.  The tests are only usable
> > with 4Suite, which makes it harder to find inconsistencies.
> > Submitting patches to 4Suite's implementations wouldn't be helpful for
> > my goals, because 4Suite's XSLT and XPath processors have become more
> > reliant on its particular DOM since these modules were forked to
> > PyXML.
> 
> Actually, the tests would be easy to fix to use another DOM, (though I'm
> not sure how you would do it in Zope as I ran into many hurdles
> executing ZDOM outside of the Zope environment).

I don't know that ZDOM is a good measure of usefulness with other
DOMs - I haven't really looked at it, much less tested it.  Right now
I'm concentrating on ParsedXML's DOM.

For a simple example of using PyXML's XPath with ParsedXML:
http://www.zope.org/Wikis/DevSite/Projects/ParsedXML/ParsedXMLWith4XPath
You do need a Zope installation with ParsedXML, although you don't
need to actually run Zope :)

If you want to use ParsedXML to test usability with other DOM
implementations, I'd be glad to help.

>  However, to do this,
> edit the file test_harness.py.  It is used by every 4XSLT test script.
> Either add a test for ParsedXML, or replace all of the existing tests
> with a ParsedXML test.  Then just run test.py and all of the 4XSLT tests
> will use ParsedXML.

Thanks, I'll look into this when I can.

> I don't understand the more reliant part.  How have we become more
> reliant?  Are you talking about the fact that Martin did a lot of work
> when he first moved 4XSLT into PyXML to disentangle 4XSLT from Ft.Lib?
> Then it's not really more reliant, just not ported yet.

Perhaps I misread the CVS histories.  I was looking into how PyXML and
4Suite depended on the included DOM implementations, and I thought
that 4XPath was copied over to PyXML, and that after that updates to
4Suite's tree made it dependent on its DOM.  But looking again (I was
running into trouble with XPath/Conversions.py), there seems to have
been some syncing and stuff going on, I'd have to do some work to
convince myself that I was correct.

-- 
Karl Anderson                          karl@digicool.com


From martin@loewis.home.cs.tu-berlin.de  Tue May 22 06:15:11 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 22 May 2001 07:15:11 +0200
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: <m166euxhk2.fsf@localhost.localdomain> (message from Karl
 Anderson on 21 May 2001 19:07:25 -0700)
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com> <m1r8xixofk.fsf@localhost.localdomain> <3B09AB81.910B27F2@FourThought.com> <m166euxhk2.fsf@localhost.localdomain>
Message-ID: <200105220515.f4M5FBi00961@mira.informatik.hu-berlin.de>

> Perhaps I misread the CVS histories.  I was looking into how PyXML and
> 4Suite depended on the included DOM implementations, and I thought
> that 4XPath was copied over to PyXML, and that after that updates to
> 4Suite's tree made it dependent on its DOM.  But looking again (I was
> running into trouble with XPath/Conversions.py), there seems to have
> been some syncing and stuff going on, I'd have to do some work to
> convince myself that I was correct.

Before I first checked 4XPath/4XSLT into PyXML, I had already
significantly modified it; see README.4XPath for an outline of the
changes.

Some of these changes have been integrated into 4Suite. To continue to
keep the two branches similar, I've now integrated the changes of
4Suite 0.11 into PyXML. I have not yet modified them to work
stand-alone, since I got stuck updating the Stylesheet reader.  I
think I will write a new stylesheet reader from scratch which only
uses a SAX DOM builder and a DOM implementation, but I haven't started
with that yet.

Regards,
Martin


From sam@webslingerZ.com  Tue May 22 14:55:13 2001
From: sam@webslingerZ.com (Sam Brauer)
Date: Tue, 22 May 2001 09:55:13 -0400 (EDT)
Subject: [XML-SIG] ANN: new release of maki
In-Reply-To: <E150mi0-0006yC-00@mail.python.org>
Message-ID: <Pine.LNX.4.31.0105220933220.1926-100000@localhost.localdomain>

I've released a new version of maki at http://maki.sourceforge.net

maki is a mod_python handler which uses various 4Suite components to serve
XML with Apache.  It allows a web developer to specify processing rules
based on path-matching regular expressions.  Each rule describes a
pipeline with any number of XSLT steps and/or custom processing steps.  A
processor that evaluates embedded Python source to dynamically modify the
document is included.  maki also supports time-based caching of output.
Also included are two example "logicsheets": one that adds HTTP request
data to the document and another that executes SQL queries and creates
elements from the results.
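
To give a feel for the rule-based dispatch described above, here is a toy sketch. It is not maki's configuration format or API: RULES, select_pipeline and run_pipeline are hypothetical names, and plain string functions stand in for the XSLT and custom processing steps.

```python
import re

# Hypothetical rule table: (path pattern, list of processing steps).
# The first matching rule wins; each step takes and returns a document.
RULES = [
    (re.compile(r"^/reports/.*\.xml$"), [str.strip, str.upper]),
    (re.compile(r"\.xml$"), [str.strip]),
]


def select_pipeline(path, rules=RULES):
    """Return the steps of the first rule whose pattern matches the path."""
    for pattern, steps in rules:
        if pattern.search(path):
            return steps
    return []


def run_pipeline(path, document, rules=RULES):
    """Feed the document through each step of the selected pipeline."""
    for step in select_pipeline(path, rules):
        document = step(document)
    return document


if __name__ == "__main__":
    print(run_pipeline("/reports/q1.xml", "  <a/> "))
```

A path that matches no rule falls through with the document unchanged.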

The overall functionality is similar (though intentionally not identical)
to Cocoon.

For more info, please take a look at the online documentation at
http://maki.sourceforge.net/manual/

Thank you,
Sam

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sam Brauer : sbrauer@users.sourceforge.net


From karl@digicool.com  Tue May 22 20:05:36 2001
From: karl@digicool.com (Karl Anderson)
Date: 22 May 2001 12:05:36 -0700
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: "Martin v. Loewis"'s message of "Tue, 22 May 2001 07:15:11 +0200"
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com> <m1r8xixofk.fsf@localhost.localdomain> <3B09AB81.910B27F2@FourThought.com> <m166euxhk2.fsf@localhost.localdomain> <200105220515.f4M5FBi00961@mira.informatik.hu-berlin.de>
Message-ID: <m1ofslw6f3.fsf@localhost.localdomain>

Martin v. Loewis <martin@loewis.home.cs.tu-berlin.de> writes:

> > Perhaps I misread the CVS histories.  I was looking into how PyXML and
> > 4Suite depended on the included DOM implementations, and I thought
> > that 4XPath was copied over to PyXML, and that after that updates to
> > 4Suite's tree made it dependent on its DOM.  But looking again (I was
> > running into trouble with XPath/Conversions.py), there seems to have
> > been some syncing and stuff going on, I'd have to do some work to
> > convince myself that I was correct.
> 
> Before I first checked 4XPath/4XSLT into PyXML, I had already
> significantly modified it; see README.4XPath for an outline of the
> changes.

Thanks for clearing that up.

> Some of these changes have been integrated into 4Suite. To continue to
> keep the two branches similar, I've now integrated the changes of
> 4Suite 0.11 into PyXML. I have not yet modified them to work
> stand-alone, since I got stuck updating the Stylesheet reader.  I
> think I will write a new stylesheet reader from scratch which only
> uses a SAX DOM builder and a DOM implementation, but I haven't started
> with that yet.

Just to be clear, is PyXML's XSLT intended to work with already
created DOM trees as well?

-- 
Karl Anderson                          karl@digicool.com


From Mike.Olson@fourthought.com  Tue May 22 21:43:33 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Tue, 22 May 2001 14:43:33 -0600
Subject: [XML-SIG] ANN: 4Suite and 4SuiteServer 0.11.1 release canidate 1
Message-ID: <3B0ACF75.9694866B@FourThought.com>

All,

Here is the first release candidate for our 0.11.1 release.  It has a
handful of new features and many bug fixes.  Please give it a try
as we try to work out the documentation and packaging bugs for the
0.11.1 final release (expected later this week).

Please see http://4Suite.org/download.html for the packages.

4Suite new features:

  Pure-Python parser for XSLT, XPath, and XPointer
  Support for Unicode in the C-based XSLT, XPath, and XPointer parsers
  ODS dictionaries and type definitions
  ODS bug fixes and optimizations
  
 
4Suite Server new features:

  FTP server
  text indexing using swish
  more CORBA support
  Backup and Restore command line tools
  Better security
  True access control lists


Mike  

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From uche.ogbuji@fourthought.com  Tue May 22 21:54:49 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 22 May 2001 14:54:49 -0600
Subject: [XML-SIG] Re: [4suite] ANN: 4Suite and 4SuiteServer 0.11.1 release canidate 1
References: <3B0ACF75.9694866B@FourThought.com>
Message-ID: <3B0AD219.CB493E17@fourthought.com>

Mike Olson wrote:

> 4Suite Server new features
> 
>   FTP server
>   text indexing using swish
>   more CORBA support
>   Backup and Restore command line tools
>   Better security
>   True access control lists

One note.  The new 4SS requires a re-initialization of all databases. 
We specifically made backup and restore facilities a priority for this
release so that in future, we will provide a smooth migration path
whenever new releases break data.

Hopefully no one has accumulated irreplaceable data in 4SS yet, and if
you have, let us know and we should be able to help with the migration.

Migration testing will be a standard part of every 4SS release from now
on, so you needn't fear for your data in future.

I apologize for any inconvenience.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From mnot@mnot.net  Tue May 22 23:06:41 2001
From: mnot@mnot.net (Mark Nottingham)
Date: Tue, 22 May 2001 15:06:41 -0700
Subject: [XML-SIG] XML and Unicode
Message-ID: <20010522150638.C22396@mnot.net>

How does one detect the charset used in an XML document from a SAX2
parser (PyXML 0.6.5)?

Also, if I have an XML document encoded ISO-8859-1 (and properly
identified), should I have a reasonable expectation that the output
of a SAX processor, post- .encode('utf-8'), should be correct if
viewed in a Web browser with UTF-8 selected as a character encoding?
In other words, is the post-parse unicode string a neutral
representation of the 8859-x string, which can then be encoded as
utf-8? Or, is it in the charset of the original XML document (my
testing seems to indicate the latter - what was an 8859 character in
the original text does not successfully come out the other side)?

(Sorry if this is obtuse - just getting into i18n, and Python docs
are thin on the ground)

-- 
Mark Nottingham
http://www.mnot.net/


From mal@lemburg.com  Tue May 22 23:38:34 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 23 May 2001 00:38:34 +0200
Subject: [XML-SIG] XML and Unicode
References: <20010522150638.C22396@mnot.net>
Message-ID: <3B0AEA6A.9CCD2A1F@lemburg.com>

Mark Nottingham wrote:
> 
> How does one detect the charset used in an XML document from a SAX2
> parser (PyXML 0.6.5)?
> 
> Also, if I have an XML document encoded ISO-8859-1 (and properly
> identified), should I have a reasonable expectation that the output
> of a SAX processor, post- .encode('utf-8'), should be correct if
> viewed in a Web browser with UTF-8 selected as a character encoding?

This should work...

> In other words, is the post-parse unicode string a neutral
> representation of the 8859-x string, which can then be encoded as
> utf-8?

Unicode is encoding neutral in the sense that it provides
space for the characters of most scripts. If the parser returns
Unicode, then you can encode it as UTF-8 and have the original
contents of the attribute/element represented as UTF-8 string.
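
A minimal sketch of that round trip, written for current Python 3 rather
than the Python 2.0 used in this thread. The sample text is an assumption
chosen for illustration, not data from the original post.

```python
# Bytes as they would sit in an ISO-8859-1 encoded file: 0xE9 is e-acute.
latin1_bytes = b"caf\xe9"

# Decoding yields a Unicode string that no longer depends on the source encoding.
text = latin1_bytes.decode("iso-8859-1")

# The same Unicode string can be serialized in any encoding that covers it.
utf8_bytes = text.encode("utf-8")

print(repr(text))
print(repr(utf8_bytes))
```

The byte sequences differ between the two encodings, but both stand for
the same Unicode string.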

> Or, is it in the charset of the original XML document (my
> testing seems to indicate the latter - what was an 8859 character in
> the original text does not successfully come out the other side)?
> 
> (Sorry if this is obtuse - just getting into i18n, and Python docs
> are thin on the ground)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From jeremy.kloth@fourthought.com  Tue May 22 23:53:45 2001
From: jeremy.kloth@fourthought.com (Jeremy J Kloth)
Date: Tue, 22 May 2001 16:53:45 -0600
Subject: [XML-SIG] New parsers in 4XPath and 4XSLT
Message-ID: <003101c0e312$0e4519e0$f803a8c0@dhcp.fourthought.comfourthought.com>

The new generated parsers in XPath and XSLT are now created in a more
factory-ish method.  The parsers are now referenced from:
xml.(xpath|xslt).parser   This allows for the changing of parsers easily.
To create a runtime parser, call parser.new().  And to parse expressions
simply use the parse() method on the created object.

Hopefully this change will help ease the integration into PyXML.
--
Jeremy Kloth                        Consultant
jeremy.kloth@fourthought.com        (303)583-9900 x 105
Fourthought, Inc.                   http://www.fourthought.com
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From mnot@mnot.net  Wed May 23 03:33:18 2001
From: mnot@mnot.net (Mark Nottingham)
Date: Tue, 22 May 2001 19:33:18 -0700
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <3B0AEA6A.9CCD2A1F@lemburg.com>; from mal@lemburg.com on Wed, May 23, 2001 at 12:38:34AM +0200
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com>
Message-ID: <20010522193314.E22396@mnot.net>

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline


OK, so I'm not getting something then. The attached test script (and
data file) is the problem pared down - if u'string' is a neutral
encoding, and .encode('utf-8') generates a utf-8 encoded string of
that encoding, then the utf-8.html output file should display
correctly; however, it doesn't, while the latin-1 output does
(because the input is latin-1).

It seems like the XML parser isn't converting the ISO-8859-1 to
Unicode; does this make sense?

Thanks,


On Wed, May 23, 2001 at 12:38:34AM +0200, M.-A. Lemburg wrote:
> Mark Nottingham wrote:
> > 
> > How does one detect the charset used in an XML document from a SAX2
> > parser (PyXML 0.6.5)?
> > 
> > Also, if I have an XML document encoded ISO-8859-1 (and properly
> > identified), should I have a reasonable expectation that the output
> > of a SAX processor, post- .encode('utf-8'), should be correct if
> > viewed in a Web browser with UTF-8 selected as a character encoding?
> 
> This should work...
> 
> > In other words, is the post-parse unicode string a neutral
> > representation of the 8859-x string, which can then be encoded as
> > utf-8?
> 
> Unicode is encoding neutral in the sense that it provides
> space for the characters of most scripts. If the parser returns
> Unicode, then you can encode it as UTF-8 and have the original
> contents of the attribute/element represented as UTF-8 string.
> 
> > Or, is it in the charset of the original XML document (my
> > testing seems to indicate the latter - what was an 8859 character in
> > the original text does not successfully come out the other side)?
> > 
> > (Sorry if this is obtuse - just getting into i18n, and Python docs
> > are thin on the ground)
> 
> -- 
> Marc-Andre Lemburg
> CEO eGenix.com Software GmbH
> ______________________________________________________________________
> Company & Consulting:                           http://www.egenix.com/
> Python Software:                        http://www.lemburg.com/python/

-- 
Mark Nottingham
http://www.mnot.net/

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="testuni.py"

#!/usr/bin/env python2.0

from xml import sax
import string

def run(i, e):
	dh = Parser()
	p = sax.sax2exts.make_parser()
	p.setContentHandler(dh)
	p.setFeature(sax.handler.feature_namespaces, 1)
	p.parse(i + '.xml')
	content = dh.content.encode(e)
	file = open(e + ".html", 'w')
	file.write(template % (e, content))
	file.close()

class Parser(sax.handler.ContentHandler):
	def __init__(self):
		self._tmp_buf = ''
		self.content = None
	
	def startElementNS(self, name, qname, attrs):
		pass
	
	def endElementNS(self, name, qname):
		if name[1] == 'content':
			self.content = string.strip(self._tmp_buf)
		
	def characters(self, content):
		self._tmp_buf = self._tmp_buf + content
		

template = """\
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=%s">
</head>
<body>
<p>%s</p>
</body>
</html>
"""

if __name__ == '__main__':
	run('ISO-8859-1', 'UTF-8')
	run('ISO-8859-1', 'Latin-1')

--jI8keyz6grp/JLjh
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: attachment; filename="ISO-8859-1.xml"
Content-Transfer-Encoding: 8bit

<?xml version="1.0" encoding="ISO-8859-1" ?>
<content>Net 21 � The Survivors</content>

--jI8keyz6grp/JLjh--


From mal@lemburg.com  Wed May 23 08:38:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 23 May 2001 09:38:14 +0200
Subject: [XML-SIG] XML and Unicode
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net>
Message-ID: <3B0B68E6.9CBF7689@lemburg.com>

Mark Nottingham wrote:
> 
> OK, so I'm not getting something then. The attached test script (and
> data file) is the problem pared down - if u'string' is a neutral
> encoding, and .encode('utf-8') generates a utf-8 encoded string of
> that encoding, then the utf-8.html output file should display
> correctly; however, it doesn't, while the latin-1 output does
> (because the input is latin-1).
> 
> It seems like the XML parser isn't converting the ISO-8859-1 to
> Unicode; does this make sense?

That's a possibility (even though I don't see any funny characters
in your example XML file); looking through the pyexpat.c code
it seems as if the parser assumes that the XML file is encoded 
as UTF-8 -- at least all Unicode conversions are done using UTF-8.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/


From hansv@net4all.be  Wed May 23 08:44:20 2001
From: hansv@net4all.be (Hans verschooten)
Date: Wed, 23 May 2001 09:44:20 +0200
Subject: [XML-SIG] HTML parsing on Python 2.1
Message-ID: <B73136F3.6A02%hansv@net4all.be>

--MS_Mac_OE_3073455860_75874_MIME_Part
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit

Hi,

I am using a freshly installed MacPython 2.1 and would like to know what I
should install extra to use the following script:

[uogbuji@borgia one-offs]$ cat html-to-xhtml-converter.py
import sys
from xml.dom.ext.reader import HtmlLib
import xml.dom.ext

#set up a re-usable reader object
reader = HtmlLib.Reader()

#parse HTML from file or URI given on command line.  Return the DOM document
doc = reader.fromUri(sys.argv[1])

#Just for kicks, write it out as XHTML, i.e. all lowercase, XML syntax for
#empty tags, all attributes with given value, etc.

xml.dom.ext.XHtmlPrettyPrint(doc)

Could anybody point me in the right direction? I tried installing PyXML
but keep getting end-of-line errors. After trying to correct these I keep
running into errors like: ReleaseNode not found; HtmlLib has no module named
Reader.

Any help as to how and what should be installed on MacPython 2.1 would be
greatly appreciated.

Hans



--MS_Mac_OE_3073455860_75874_MIME_Part--


From Alexandre.Fayolle@logilab.fr  Wed May 23 10:57:19 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Wed, 23 May 2001 11:57:19 +0200 (CEST)
Subject: [XML-SIG] ANN: Narval 1.0
Message-ID: <Pine.LNX.4.21.0105231156470.1970-100000@orion.logilab.fr>

Logilab (www.logilab.com) announces the release of

	Narval 1.0

	GPL'd Intelligent Personal Assistant Framework
	
	http://www.logilab.org/narval


News
----

    The engine is now stable as it has been working nicely for the past three
    months. It's also much faster.

    The Horn GUI features lots of usability improvements.

    The infopal application (available separately) is now usable.


Description
-----------

Narval is a framework (language + interpreter + GUI/IDE) dedicated to the
setting up of intelligent personal assistants (IPAs).

An Intelligent Personal Assistant is a companion that will help you in your daily
work in the information world. It runs on your machine or on a remote server,
and you can communicate with it via all standard means (email, web, telnet,
phone, specific GUI, etc). It executes recipes (sequences of actions) you wrote,
to perform a wide range of tasks, such as prepare your morning newspaper, help
you surf the web by filtering out junk ads, keep searching the web day after day
for things you want, participate in on-line auctions, learn your interests and
bring you back valuable information, take care of repetitive chores, answer
e-mail, negotiate the date and time of a meeting, and much more... It is easy to
extend the built-in action library by writing new actions in Python.

Infopal, your information pal, is a Narval application that implements part of
the above, but Narval makes it easy for you to set up new assistants. Other
applications will soon be available from Logilab.

Logilab S.A. is a French company that specializes in the fields of artificial
intelligence, knowledge management, data analysis and natural language
processing.


More info
---------

Please see

        http://www.logilab.org/narval
        http://www.logilab.com
	http://www.logilab.fr

or contact	contact@logilab.fr

-- 
Alexandre Fayolle

http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From stuff4gary@hotmail.com  Wed May 23 14:40:41 2001
From: stuff4gary@hotmail.com (gary cor)
Date: Wed, 23 May 2001 13:40:41
Subject: [XML-SIG] XLST - Can't show JPEG image from XML abstraction to rendition
Message-ID: <F231IWOZmurKcDXUjRr00001ff4@hotmail.com>

Hi,

I hope someone can help!  I have set up some XSL files which use XSLT
methods to produce tables of information about images, which works great
using just the MSXML 4.0 parser with Explorer 5.5.  However, I can't get the
cells which are supposed to show my image thumbnails to display any images
at all (the transformations for the tables won't work when they have my
XLink for them in the XML).

****    IN XSL  *****
<xsl:template match="image">
<xsl:value-of select="picture"/> etc.
</xsl:template>

****    IN XML  *****
<picture xlink:form="simple"
href="imageLibrary/Sky.jpg" show"embed"
actuate="auto">

I would be grateful if anyone has any suggestions on how I should go about
including these images.

Kind Regards

Gary C

PS  Also, in a few months I would like to include SVG images for some
illustrations that I have... My understanding is that I will have
to use the svg.htc for Explorer 5.5 and the <object> tag. Does anyone know
whether it is possible to use the same method for both images and SVG? Is
it easier for me to do with a Python parser than the Microsoft one?

_________________________________________________________________________
Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.


From mnot@mnot.net  Wed May 23 16:46:25 2001
From: mnot@mnot.net (Mark Nottingham)
Date: Wed, 23 May 2001 08:46:25 -0700
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <3B0B68E6.9CBF7689@lemburg.com>; from mal@lemburg.com on Wed, May 23, 2001 at 09:38:14AM +0200
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net> <3B0B68E6.9CBF7689@lemburg.com>
Message-ID: <20010523084622.A25059@mnot.net>

It's the em dash in the middle. If true, this behaviour would be a
bug, no? Is there any kind of workaround possible (such as detecting
the encoding of the XML file outside of the parser and .encode()ing
to suit)?

Thanks again,


On Wed, May 23, 2001 at 09:38:14AM +0200, M.-A. Lemburg wrote:
> Mark Nottingham wrote:
> > 
> > OK, so I'm not getting something then. The attached test script (and
> > data file) is the problem pared down - if u'string' is a neutral
> > encoding, and .encode('utf-8') generates a utf-8 encoded string of
> > that encoding, then the utf-8.html output file should display
> > correctly; however, it doesn't, while the latin-1 output does
> > (because the input is latin-1).
> > 
> > It seems like the XML parser isn't converting the ISO-8859-1 to
> > Unicode; does this make sense?
> 
> That's a possibility (even though I don't see any funny characters
> in your example XML file); looking through the pyexpat.c code
> it seems as if the parser assumes that the XML file is encoded 
> as UTF-8 -- at least all Unicode conversions are done using UTF-8.
> 
> -- 
> Marc-Andre Lemburg
> CEO eGenix.com Software GmbH
> ______________________________________________________________________
> Company & Consulting:                           http://www.egenix.com/
> Python Software:                        http://www.lemburg.com/python/

-- 
Mark Nottingham
http://www.mnot.net/


From rsalz@zolera.com  Wed May 23 17:10:39 2001
From: rsalz@zolera.com (Rich Salz)
Date: Wed, 23 May 2001 12:10:39 -0400
Subject: [XML-SIG] Web services non-SIG
Message-ID: <3B0BE0FF.56CAB4FC@zolera.com>

Guido is unconvinced of the longer-term viability of a separate Web
Services SIG, and since we have no desire to add to his administrivia,
for now we're going to use a SourceForge project.

In particular, the "pywebsvcs-talk" mailing list is intended for
discussion of Python and Web Services.  To join, visit
	http://lists.sourceforge.net/lists/listinfo/pywebsvcs-talk

The pywebsvcs project also has a developer's mailing list, and hopefully
will soon have a CVS tree with <gasp>sources.

If nobody objects, I'll add a link to the pywebsvcs project in the pyxml
web page (htdocs/links.h it seems). 
	/r$


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 19:28:21 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 23 May 2001 20:28:21 +0200
Subject: [XML-SIG] Re: [4suite] Disentangling StylesheetReader from Ft.Lib
In-Reply-To: <m1ofslw6f3.fsf@localhost.localdomain> (message from Karl
 Anderson on 22 May 2001 12:05:36 -0700)
References: <200105131902.f4DJ2KK14103@mira.informatik.hu-berlin.de> <3AFF3169.29F2B6C8@FourThought.com> <m1r8xixofk.fsf@localhost.localdomain> <3B09AB81.910B27F2@FourThought.com> <m166euxhk2.fsf@localhost.localdomain> <200105220515.f4M5FBi00961@mira.informatik.hu-berlin.de> <m1ofslw6f3.fsf@localhost.localdomain>
Message-ID: <200105231828.f4NISLP01544@mira.informatik.hu-berlin.de>

> Just to be clear, is PyXML's XSLT intended to work with already
> created DOM trees as well?

My immediate target is to make it work with minidom, without
pDomlette.  That will initially be tested by reading both the
stylesheet and the document through a parser, but I can't see anything
preventing usage of pre-loaded trees for the document or the
stylesheet.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 21:01:50 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 23 May 2001 22:01:50 +0200
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <20010522150638.C22396@mnot.net> (message from Mark Nottingham on
 Tue, 22 May 2001 15:06:41 -0700)
References: <20010522150638.C22396@mnot.net>
Message-ID: <200105232001.f4NK1ot02120@mira.informatik.hu-berlin.de>

> How does one detect the charset used in an XML document from a SAX2
> parser (PyXML 0.6.5)?

That is not supported in SAX. The underlying parser may expose this
information; but that is of course parser dependent.
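
Since SAX itself does not report the encoding, one workaround is to read
the declaration before handing the document to the parser. The following
is only a sketch in current Python 3: the helper name
sniff_declared_encoding and its default value are assumptions made for
illustration, and it handles only ASCII-compatible encodings (it does not
look for a byte order mark or UTF-16 input).

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the start of the data.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)

def sniff_declared_encoding(data, default="utf-8"):
    """Return the encoding named in the XML declaration, or `default` if none."""
    match = _DECL.match(data)
    if match is None:
        return default
    return match.group(1).decode("ascii")

sample = b'<?xml version="1.0" encoding="ISO-8859-1" ?>\n<content>x</content>'
declared = sniff_declared_encoding(sample)
print(declared)
```

This reports what the document claims about itself. It cannot tell whether
the bytes really are in that encoding.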

> Also, if I have an XML document encoded ISO-8859-1 (and properly
> identified), should I have a reasonable expectation that the output
> of a SAX processor, post- .encode('utf-8'), should be correct if
> viewed in a Web browser with UTF-8 selected as a character encoding?

Not necessarily. If the document was a HTML document, and if it
has a

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

line, then the browser has to decide whether it believes the XML header
or the Content-Type. It would normally use the content type, which
would be incorrect.

If there is no incorrect character set information in the output
document, then a receiver should display it properly.
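
One way to avoid a mismatch is to generate the charset declaration and the
output bytes from the same value. A small sketch in current Python 3,
modeled on the page template in the test script earlier in this thread:
the function name render_page is an assumption, and the text is inserted
without HTML escaping to keep the example short.

```python
TEMPLATE = """\
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=%s">
</head>
<body>
<p>%s</p>
</body>
</html>
"""

def render_page(text, encoding):
    """Encode the page with the same charset that its META line declares."""
    return (TEMPLATE % (encoding, text)).encode(encoding)

page = render_page("caf\u00e9", "utf-8")
print(page.decode("utf-8"))
```

Because a single argument drives both the META line and the call to
encode(), the declared and the actual encoding cannot drift apart.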

Of course, whether a Web browser can "correctly" display arbitrary XML
documents is a different question.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 21:04:11 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 23 May 2001 22:04:11 +0200
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <20010522193314.E22396@mnot.net> (message from Mark Nottingham on
 Tue, 22 May 2001 19:33:18 -0700)
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net>
Message-ID: <200105232004.f4NK4Bo02122@mira.informatik.hu-berlin.de>

> It seems like the XML parser isn't converting the ISO-8859-1 to
> Unicode; does this make sense?

As others have explained, your document is really Windows CP 1252, not
ISO 8859 1 encoded.

If you consider the document as ISO-8859-1, then the parser *will*
convert it correctly.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 21:15:06 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 23 May 2001 22:15:06 +0200
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <20010523084622.A25059@mnot.net> (message from Mark Nottingham on
 Wed, 23 May 2001 08:46:25 -0700)
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net> <3B0B68E6.9CBF7689@lemburg.com> <20010523084622.A25059@mnot.net>
Message-ID: <200105232015.f4NKF6I02205@mira.informatik.hu-berlin.de>

> > That's a possibility (even though I don't see any funny characters
> > in your example XML file); looking through the pyexpat.c code
> > it seems as if the parser assumes that the XML file is encoded 
> > as UTF-8 -- at least all Unicode conversions are done using UTF-8.
> > 
> It's the em dash in the middle. If true, this behaviour would be a
> bug, no?

It would be a bug, but pyexpat works correctly. expat indeed does
guarantee that all text is UTF-8, because it converts the file from
any input encoding to UTF-8 before passing it to the application.
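
This can be checked with a character that really is part of ISO-8859-1.
The sketch below uses the xml.sax module of current Python 3 (which sits
on top of expat); the handler class name and the sample document are
assumptions made for illustration.

```python
import xml.sax

class TextCollector(xml.sax.ContentHandler):
    """Concatenate all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)

# 0xE9 is e-acute in ISO-8859-1; the declaration tells the parser how to decode it.
document = b'<?xml version="1.0" encoding="ISO-8859-1"?><content>caf\xe9</content>'

handler = TextCollector()
xml.sax.parseString(document, handler)
parsed_text = "".join(handler.chunks)

print(repr(parsed_text))
print(repr(parsed_text.encode("utf-8")))
```

The handler receives Unicode text, so encoding it afterwards as UTF-8
gives the expected two-byte sequence for the accented letter.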

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 21:29:06 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 23 May 2001 22:29:06 +0200
Subject: [XML-SIG] New parsers in 4XPath and 4XSLT
In-Reply-To: <003101c0e312$0e4519e0$f803a8c0@dhcp.fourthought.comfourthought.com>
 (jeremy.kloth@fourthought.com)
References: <003101c0e312$0e4519e0$f803a8c0@dhcp.fourthought.comfourthought.com>
Message-ID: <200105232029.f4NKT6O02302@mira.informatik.hu-berlin.de>

> The new generated parsers in XPath and XSLT are now created in a more
> factory-ish method.  The parsers are now referenced from:
> xml.(xpath|xslt).parser   This allows for the changing of parsers easily.
> To create a runtime parser, call parser.new().  And to parse expressions
> simply use the parse() method on the created object.

Great! I hope I can look into that shortly.

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 21:28:32 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 23 May 2001 22:28:32 +0200
Subject: [XML-SIG] HTML parsing on Python 2.1
In-Reply-To: <B73136F3.6A02%hansv@net4all.be> (message from Hans verschooten
 on Wed, 23 May 2001 09:44:20 +0200)
References: <B73136F3.6A02%hansv@net4all.be>
Message-ID: <200105232028.f4NKSWW02300@mira.informatik.hu-berlin.de>

> I am using a freshly installed MacPython 2.1 and would like to know
> what I should install extra to use the following script:

It works fine for me with Python 2.1 on Linux, using PyXML 0.6.5(+).

> If anybody could point me in the right direction, If tried
> installing PyXML but keep getting end-of line errors. After trying
> to correct these I keep running into errors like, ReleaseNode not
> found; HtmlLib has no module named Reader.

That is quite unspecific: what exactly did you try, and what exactly
happened?

Regards,
Martin


From mnot@mnot.net  Wed May 23 21:44:23 2001
From: mnot@mnot.net (Mark Nottingham)
Date: Wed, 23 May 2001 13:44:23 -0700
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <200105232015.f4NKF6I02205@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Wed, May 23, 2001 at 10:15:06PM +0200
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net> <3B0B68E6.9CBF7689@lemburg.com> <20010523084622.A25059@mnot.net> <200105232015.f4NKF6I02205@mira.informatik.hu-berlin.de>
Message-ID: <20010523134419.A4434@mnot.net>

Martin,

Thanks. If that's the case, what's happening here (see test script)?
The source text, when written directly to HTML and identified as
ISO-8859-1, displays correctly. When parsed by pyexpat, the resulting
unicode string, after .encode('UTF-8') and inclusion in HTML identified
as UTF-8, does not display correctly.

I'm not sure I understand your previous message - no one has suggested
that it's Windows CP 1252 (although I may have missed messages), and
I'm not sure what you mean by 'consider the document as ISO-8859-1';
I'm feeding a document into an XML parser with encoding="ISO-8859-1",
and getting unicode strings out of it. What mechanism do I have to
consider it as having a particular encoding, beyond the XML
declaration? I've been given the impression that unicode strings are
encoding-neutral.

Cheers & thanks,


On Wed, May 23, 2001 at 10:15:06PM +0200, Martin v. Loewis wrote:
> > > That's a possibility (even though I don't see any funny
> > > characters in your example XML file); looking through the
> > > pyexpat.c code it seems as if the parser assumes that the XML
> > > file is encoded as UTF-8 -- at least all Unicode conversions
> > > are done using UTF-8.
> > > 
> > It's the em dash in the middle. If true, this behaviour would be
> > a bug, no?
> 
> It would be a bug, but pyexpat works correctly. expat indeed does
> guarantee that all text is UTF-8, because it converts the file from
> any input encoding to UTF-8 before passing it to the application.
> 
> Regards,
> Martin
> 


On Wed, May 23, 2001 at 10:04:11PM +0200, Martin v. Loewis wrote:
> > It seems like the XML parser isn't converting the ISO-8859-1 to
> > Unicode; does this make sense?
> 
> As others have explained, your document is really Windows CP 1252,
> not ISO 8859 1 encoded.
> 
> If you consider the document as ISO-8859-1, then the parser *will*
> convert it correctly.
> 
> Regards,
> Martin

-- 
Mark Nottingham
http://www.mnot.net/


From martin@loewis.home.cs.tu-berlin.de  Wed May 23 23:37:03 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 24 May 2001 00:37:03 +0200
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <20010523134419.A4434@mnot.net> (message from Mark Nottingham on
 Wed, 23 May 2001 13:44:23 -0700)
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net> <3B0B68E6.9CBF7689@lemburg.com> <20010523084622.A25059@mnot.net> <200105232015.f4NKF6I02205@mira.informatik.hu-berlin.de> <20010523134419.A4434@mnot.net>
Message-ID: <200105232237.f4NMb3p03391@mira.informatik.hu-berlin.de>

> I'm not sure I understand your previous message - no one has suggested
> that it's Windows CP 1252 (although I may have missed messages), and
> I'm not sure what you mean by 'consider the document as ISO-8859-1';
> I'm feeding a document into an XML parser with encoding="ISO-8859-1",
> and getting unicode strings out of it. 

There simply is no em-dash in ISO-8859-1; this is a Microsoft
invention.  Microsoft organizes character sets in code pages (an idea
taken from IBM). For Code Page 1252, we have the character assignments

<-N>                   /x96   <U2013> EN DASH
<-M>                   /x97   <U2014> EM DASH

So the characters '\x96' and '\x97', when interpreted as CP 1252,
identify EN DASH and EM DASH, respectively.

In ISO 8859-1, these characters have the meanings

<SG>                   /x96   <U0096> START OF GUARDED AREA (SPA)
<EG>                   /x97   <U0097> END OF GUARDED AREA (EPA)

As you can see, they are considered control characters in ISO-8859-1.
So if you want the character to be treated as EM DASH, you should
identify the character set as CP 1252, not ISO-8859-1.

Doing so, in turn, will result in the Unicode characters U+2013 and
U+2014 being used, instead of the Unicode characters U+0096 and U+0097
(which identify control characters).
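
The difference is easy to see by decoding the same two bytes both ways.
This is a sketch in current Python 3, using only the standard codecs and
the unicodedata module:

```python
import unicodedata

raw = b"\x96\x97"

as_cp1252 = raw.decode("cp1252")      # Windows code page 1252
as_latin1 = raw.decode("iso-8859-1")  # ISO-8859-1

# Code points produced by each interpretation of the same bytes.
print([hex(ord(ch)) for ch in as_cp1252])
print([hex(ord(ch)) for ch in as_latin1])

# Punctuation under CP 1252, C1 control characters under ISO-8859-1.
print([unicodedata.name(ch) for ch in as_cp1252])
print([unicodedata.category(ch) for ch in as_latin1])
```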

Now, assuming that you correctly identify your character set, XML
parsers may refuse your document in case they don't know what cp-1252
is. Even if that succeeds, converting the resulting Unicode strings to
ISO-8859-1 will fail, as EM DASH has no representation in that
character set. Of course, conversion into UTF-8 will succeed in any
case, since all Unicode characters are representable in UTF-8.
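
A short sketch of both outcomes in current Python 3. The fallback to a
numeric character reference at the end is an addition for illustration,
not something proposed in this thread.

```python
em_dash = "\u2014"

# UTF-8 can represent every Unicode character.
dash_utf8 = em_dash.encode("utf-8")

# ISO-8859-1 has no slot for EM DASH, so a strict encode raises.
try:
    em_dash.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# For XML or HTML output, a numeric character reference is one way around that.
dash_charref = em_dash.encode("iso-8859-1", errors="xmlcharrefreplace")

print(dash_utf8, latin1_ok, dash_charref)
```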

> What mechanism do I have to consider it as having a particular
> encoding, beyond the XML declaration?

Sorry, I cannot understand this question; please rephrase.

> I've been given the impression that unicode strings are
> encoding-neutral.

That impression is correct. Unfortunately, byte-oriented files are not
encoding-neutral, so when you read or write from/to a byte stream, you
have to know its encoding.
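
As a sketch of what goes wrong when the assumed encoding does not match
the bytes on disk (current Python 3; the file name is an arbitrary
assumption, and the sample text follows the test document from this
thread):

```python
import os
import tempfile

original = "Net 21 \u2014 The Survivors"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # Writing turns the Unicode string into bytes using a specific encoding.
    with open(path, "w", encoding="cp1252") as f:
        f.write(original)

    # The file itself is just bytes; nothing in it records the encoding.
    with open(path, "rb") as f:
        raw_bytes = f.read()

read_as_cp1252 = raw_bytes.decode("cp1252")
read_as_latin1 = raw_bytes.decode("iso-8859-1")

print(read_as_cp1252 == original)
print(read_as_latin1 == original)
```

Decoding with the wrong encoding raises no error here. It silently yields
a control character where the dash was, which matches the symptom
described in this thread.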

Regards,
Martin

P.S. If you have a browser that displays '\x96' as EN DASH even if the
encoding is ISO-8859-1, this browser is broken - it should treat the
character as START OF GUARDED AREA. I could not figure out what the
exact meaning of this character is, something along the lines: text
between SPA and EPA is "guarded", i.e. it cannot be edited or cleared.
I doubt any browser implements that.


From mnot@mnot.net  Wed May 23 23:55:02 2001
From: mnot@mnot.net (Mark Nottingham)
Date: Wed, 23 May 2001 15:55:02 -0700
Subject: [XML-SIG] XML and Unicode
In-Reply-To: <200105232237.f4NMb3p03391@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Thu, May 24, 2001 at 12:37:03AM +0200
References: <20010522150638.C22396@mnot.net> <3B0AEA6A.9CCD2A1F@lemburg.com> <20010522193314.E22396@mnot.net> <3B0B68E6.9CBF7689@lemburg.com> <20010523084622.A25059@mnot.net> <200105232015.f4NKF6I02205@mira.informatik.hu-berlin.de> <20010523134419.A4434@mnot.net> <200105232237.f4NMb3p03391@mira.informatik.hu-berlin.de>
Message-ID: <20010523155458.C4434@mnot.net>

On Thu, May 24, 2001 at 12:37:03AM +0200, Martin v. Loewis wrote:

> There simply is no em-dash in ISO-8859-1; this is a Microsoft
> invention.  Microsoft organizes character sets in code pages (an idea
> taken from IBM). For Code Page 1252, we have the character assignments
[...]
> P.S. If you have a browser that displays '\x96' as EN DASH even if the
> encoding is ISO-8859-1, this browser is broken - it should treat the
> character as START OF GUARDED AREA. 

Ah! That explains it. Thank you very much. Both IE and Mozilla
display this character as an em dash when the encoding is set to
ISO-8859-1 (and a few others). Very confusing.

Thanks again, 


-- 
Mark Nottingham
http://www.mnot.net/


From DKGunter@lbl.gov  Thu May 24 02:46:54 2001
From: DKGunter@lbl.gov (Dan Gunter)
Date: Wed, 23 May 2001 18:46:54 -0700
Subject: [XML-SIG] PythonWorks SOAP
Message-ID: <3B0C680E.54971ECB@lbl.gov>

I have been using PythonWorks' soaplib.py in a project, and although
it is not bad I was hoping that some of the rough edges would get
polished in the next release. But the next (0.9) version does not seem
forthcoming. My question is: does anyone know when this might happen
and/or what SOAP library is being more actively worked on? Thanks in
advance,

-- 
# 
# Dan Gunter
# http://www-didc.lbl.gov/~dang/
#


From sallyd@internationalexhibits.com  Thu May 24 15:19:37 2001
From: sallyd@internationalexhibits.com (Sally Daugherty)
Date: Thu, 24 May 2001 07:19:37 -0700
Subject: [XML-SIG] European display solutions that can reduce the impact of the rising gasoline prices.
Message-ID: <01C0E424.808F3B80@tc03-20-204.tscnet.net>

International Exhibits Inc. is performing a beta test on an e-mail
marketing campaign to promote our product lines.  The intent is to
provide an unobtrusive, cost-effective and environmentally friendly
marketing campaign (compared to bulk mailings and fax-grams that kill
trees).  We would appreciate your thoughts on our approach.  If you want
to be removed from our data base please reply with the word "remove" in
the subject line.  Our hope is that you will visit our web site at
http://www.internationalexhibits.com.

Note:  International Exhibits manufactures 7 product lines and is a
distributor for another 20 product lines.  I added an attachment on
several new European Product lines that will be shortly introduced on
our web site.  We believe that these items will be a cost-effective
solution to the rising gasoline prices.

If you project any display needs please feel free to contact me at
www.internatinalexhibits.com or by telephone at (360)769-9726.

Warm Regards,

Sally Daugherty
General Manager
International Exhibits, Inc.


From doc@sympatico.ca  Thu May 24 18:25:21 2001
From: doc@sympatico.ca (DOC)
Date: Thu, 24 May 2001 10:25:21 -0700
Subject: [XML-SIG] PythonWorks SOAP
References: <3B0C680E.54971ECB@lbl.gov>
Message-ID: <004501c0e476$9c7597c0$20afd1d8@c5y3j01>

What do you need? Perhaps I can help.

I have been playing around with XML recently and am looking
for something more substantial to work on.

DOC


----- Original Message ----- 
From: "Dan Gunter" <dkgunter@lbl.gov>
To: <xml-sig@python.org>
Sent: Wednesday, May 23, 2001 6:46 PM
Subject: [XML-SIG] PythonWorks SOAP


> I have been using PythonWorks' soaplib.py in a project, and although
> it is not bad I was hoping that some of the rough edges would get
> polished in the next release. But the next (0.9) version does not seem
> forthcoming. My question is: does anyone know when this might happen
> and/or what SOAP library is being more actively worked on? Thanks in
> advance,
> 
> -- 
> # 
> # Dan Gunter
> # http://www-didc.lbl.gov/~dang/
> #


From eliot@isogen.com  Thu May 24 15:22:06 2001
From: eliot@isogen.com (W. Eliot Kimber)
Date: Thu, 24 May 2001 09:22:06 -0500
Subject: [XML-SIG] Messengers in DOM and XSLT Processors
Message-ID: <3B0D190E.A4FD12F1@isogen.com>

Using the framework provided with James Clark's SP parser, GroveMinder
(a commercial grove and HyTime implementation sold by Epremis
(www.epremis.com)) provides a very handy messenger facility where you
pass in a callback that takes a structured message as input. Using this
an application can collect messages and do something cool with them. We
use GroveMinder in our distributed link management system and use its
messenger facility. We also use the Python DOM and 4Suite XSLT processor
to do server-side processing and we need to be able to capture messages
and return them to the client. We already have a general messenger
framework in our client and server code. I need to add support for
messengers to the Python DOM and XSLT processors.

Before I dive into this--is this something that's already there and I
just haven't noticed it (I can't claim to have studied every line of
code in detail) or can anyone offer any tips on how to proceed or things
to avoid? 

We would, of course, be contributing any messenger support we added back
to the project (and I'm still working on completing my DOM
fixes/enhancements and packing those up as patches--I should be able to
get that together by the end of next week as it's finally become a
priority here).

Thanks,

Eliot
-- 
. . . . . . . . . . . . . . . . . . . . . . . .

W. Eliot Kimber | Lead Brain

1016 La Posada Dr. | Suite 240 | Austin TX  78752
    T 512.656.4139 |  F 512.419.1860 | eliot@isogen.com

w w w . d a t a c h a n n e l . c o m


From rsalz@zolera.com  Thu May 24 16:07:47 2001
From: rsalz@zolera.com (Rich Salz)
Date: Thu, 24 May 2001 11:07:47 -0400
Subject: [XML-SIG] PythonWorks SOAP
References: <3B0C680E.54971ECB@lbl.gov> <004501c0e476$9c7597c0$20afd1d8@c5y3j01>
Message-ID: <3B0D23C3.A8310894@zolera.com>

> I have been using PythonWorks' soaplib.py in a project, and although
> it is not bad I was hoping that some of the rough edges would get
> polished in the next release. But the next (0.9) version does not seem
> forthcoming. My question is: does anyone know when this might happen
> and/or what SOAP library is being more actively worked on? Thanks in
> advance,

There are a couple of python SOAP projects that have more active
development right now.  Look at SOAP.py (www.actzero.com), SOAPy
(soapy.sf.net), and shortly after the weekend ZSI (web location not
yet).  4thought has a SOAP implementation used with their RDF stuff, I
recall.

You might also want to go to pywebsvcs.sf.net, the home of a
just-starting group of folks in the python and web services area.
	/r$


From uche.ogbuji@fourthought.com  Thu May 24 19:02:54 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Thu, 24 May 2001 12:02:54 -0600
Subject: [XML-SIG] Messengers in DOM and XSLT Processors
In-Reply-To: Message from "W. Eliot Kimber" <eliot@isogen.com>
 of "Thu, 24 May 2001 09:22:06 CDT." <3B0D190E.A4FD12F1@isogen.com>
Message-ID: <200105241802.f4OI2sM06596@localhost.local>

> Using the framework provided with James Clark's SP parser, GroveMinder
> (a commercial grove and HyTime implementation sold by Epremis
> (www.epremis.com)) provides a very handy messenger facility where you
> pass in a callback that takes a structured message as input. Using this
> an application can collect messages and do something cool with them. We
> use GroveMinder in our distributed link management system and use its
> messenger facility. We also use the Python DOM and 4Suite XSLT processor
> to do server-side processing and we need to be able to capture messages
> and return them to the client. We already have a general messenger
> framework in our client and server code. I need to add support for
> messengers to the Python DOM and XSLT processors.
> 
> Before I dive into this--is this something that's already there and I
> just haven't noticed it (I can't claim to have studied every line of
> code in detail) or can anyone offer any tips on how to proceed or things
> to avoid? 

I think you'd be a pioneer on this one, but I do appreciate your interest in 
taking a few arrows.  I think that decoupled access to DOM and 4XSLT would be 
very useful in general.


> We would, of course, be contributing any messenger support we added back
> to the project (and I'm still working on completing my DOM
> fixes/enhancements and packing those up as patches--I should be able to
> get that together by the end of next week as it's finally become a
> priority here).

Which DOM are you using?  4DOM? minidom? other?


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From eliot@isogen.com  Thu May 24 19:09:39 2001
From: eliot@isogen.com (W. Eliot Kimber)
Date: Thu, 24 May 2001 13:09:39 -0500
Subject: [XML-SIG] Messengers in DOM and XSLT Processors
References: <200105241802.f4OI2sM06596@localhost.local>
Message-ID: <3B0D4E63.991D3699@isogen.com>

Uche Ogbuji wrote:

> > We would, of course, be contributing any messenger support we added back
> > to the project (and I'm still working on completing my DOM
> > fixes/enhancements and packing those up as patches--I should be able to
> > get that together by the end of next week as it's finally become a
> > priority here).
> 
> Which DOM are you using?  4DOM? minidom? other?

4DOM, as far as I know.

Cheers,

E.

-- 
. . . . . . . . . . . . . . . . . . . . . . . .

W. Eliot Kimber | Lead Brain

1016 La Posada Dr. | Suite 240 | Austin TX  78752
    T 512.656.4139 |  F 512.419.1860 | eliot@isogen.com

w w w . d a t a c h a n n e l . c o m


From amorgan@mitre.org  Thu May 24 23:39:21 2001
From: amorgan@mitre.org (Alex Morgan)
Date: Thu, 24 May 2001 18:39:21 -0400
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
Message-ID: <3B0D8D99.B94E7542@mitre.org>

When an xml.parsers.expat parser handles CDATA with an '&lt;' in it, it
turns this into a '<' when it processes it.  How can I stop this
behavior?  

I apologize if this was discussed recently in the mailing list or is in
the documentation.  I have looked in both areas, but may have missed it.

Thanks,

-- 
-Alex Morgan		Homepage:   http://pubpages.unh.edu/~amorgan
			AIM login:  HomeySage
			Phone:	    (781) 271-6306
			Office:     3K-136, 202 Burlington Rd, Bedford, MA


From martin@loewis.home.cs.tu-berlin.de  Fri May 25 01:06:35 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 25 May 2001 02:06:35 +0200
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <3B0D8D99.B94E7542@mitre.org> (message from Alex Morgan on Thu,
 24 May 2001 18:39:21 -0400)
References: <3B0D8D99.B94E7542@mitre.org>
Message-ID: <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>

> When an xml.parsers.expat parser handles CDATA with an '&lt;' in it, it
> turns this into a '<' when it processes it.

It does not do this for me, using PyXML 0.6.5 on Linux. Can you give a
specific example where markup in a CDATA section is interpreted?

Regards,
Martin


From jh@web.de  Fri May 25 02:17:26 2001
From: jh@web.de (Juergen Hermann)
Date: Fri, 25 May 2001 03:17:26 +0200
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
Message-ID: <m1536Cs-007Yb7C@smtp.web.de>

On Fri, 25 May 2001 02:06:35 +0200, Martin v. Loewis wrote:

>> When an xml.parsers.expat parser handles CDATA with an '&lt;' in it, it
>> turns this into a '<' when it processes it.
>
>It does not do this for me, using PyXML 0.6.5 on Linux. Can you give a
>specific example where markup in a CDATA section is interpreted?

Also, is Alex talking about a CDATA section, or is he mixing up PCDATA with
CDATA?

<t>PCDATA &lt;</t>

<t><![CDATA[CDATA <]]></t>
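
To illustrate the two forms above, here is a minimal sketch of what a
character data handler receives for each. It assumes a current
Python 3 interpreter rather than the Python 2.0-era PyXML discussed in
this thread, and the helper name char_data_of is made up for the
example.

```python
import xml.parsers.expat


def char_data_of(document):
    """Return all text reported to CharacterDataHandler, joined together."""
    parts = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = parts.append
    parser.Parse(document, True)
    return "".join(parts)


# Escaped '<' in ordinary parsed character data (PCDATA).
pcdata = char_data_of("<t>PCDATA &lt;</t>")

# Literal '<' inside a CDATA section.
cdata = char_data_of("<t><![CDATA[CDATA <]]></t>")

print(repr(pcdata))
print(repr(cdata))
```

In both cases the handler sees a plain '<'; the difference only exists
in the markup.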


Ciao, Jürgen


From fdrake@acm.org  Fri May 25 05:13:46 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 00:13:46 -0400 (EDT)
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <m1536Cs-007Yb7C@smtp.web.de>
References: <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
 <m1536Cs-007Yb7C@smtp.web.de>
Message-ID: <15117.56314.662015.891593@cj42289-a.reston1.va.home.com>


Juergen Hermann writes:
 > On Fri, 25 May 2001 02:06:35 +0200, Martin v. Loewis wrote:
 > >It does not do this for me, using PyXML 0.6.5 on Linux. Can you give a
 > >specific example where markup in a CDATA section is interpreted?
 > 
 > Also, is Alex talking about a CDATA section, or is he mixing up PCDATA with
 > CDATA?
 > 
 > <t>PCDATA &lt;</t>
 > 
 > <t><![CDATA[CDATA <]]></t>

  Or a CDATA attribute value?  (My first guess.)

	<t attr="The &lt; symbol is cool!" />


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From Alexandre.Fayolle@logilab.fr  Fri May 25 08:34:38 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Fri, 25 May 2001 09:34:38 +0200 (CEST)
Subject: [XML-SIG] external entities and CDATA sections
Message-ID: <Pine.LNX.4.21.0105250930240.6432-100000@orion.logilab.fr>

Hello,

While writing some documentation, I wanted to include some python code in
a docbook document. My first thought was using an external entity
referencing the source file. However, the code has some integer
comparison code, and features a couple of '<' characters, so it should be
set in a CDATA section for proper handling. This in turn prevents the
resolution of the external entity. 

How would the XML experts on the list tackle this?

TIA

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From rsalz@zolera.com  Fri May 25 14:00:19 2001
From: rsalz@zolera.com (Rich Salz)
Date: Fri, 25 May 2001 09:00:19 -0400
Subject: [XML-SIG] cStringIO
Message-ID: <3B0E5763.FC2ED68E@zolera.com>

Are there any guidelines as to when it isn't safe to use cStringIO?  As
long as everyone sticks to UTF-8 it should work, right?  I know that
incoming XML might have some other encoding, but if I'm using SAX or
DOM, it will have been converted, right?
	/r$


From fdrake@acm.org  Fri May 25 15:27:58 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 10:27:58 -0400 (EDT)
Subject: [XML-SIG] cStringIO
In-Reply-To: <3B0E5763.FC2ED68E@zolera.com>
References: <3B0E5763.FC2ED68E@zolera.com>
Message-ID: <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>


Rich Salz writes:
 > Are there any guidelines as to when it isn't safe to use cStringIO?  As
 > long as everyone sticks to UTF-8 it should work, right?  I know that
 > incoming XML might have some other encoding, but if I'm using SAX or
 > DOM, it will have been converted, right?

  cStringIO works with 8-bit strings, regardless of the encoding.  It
does not work with non-ASCII Unicode strings.  Fixing that is on my
plate, but I don't have time allotted for it yet.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From uche.ogbuji@fourthought.com  Fri May 25 16:02:09 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Fri, 25 May 2001 09:02:09 -0600
Subject: [XML-SIG] external entities and CDATA sections
In-Reply-To: Message from Alexandre Fayolle <Alexandre.Fayolle@logilab.fr>
 of "Fri, 25 May 2001 09:34:38 +0200." <Pine.LNX.4.21.0105250930240.6432-100000@orion.logilab.fr>
Message-ID: <200105251502.f4PF29313415@localhost.local>

> Hello,
> 
> While writing some documentation, I wanted to include some python code in
> a docbook document. My first thought was using an external entity
> referencing the source file. However, the code has some integer
> comparison code, and features a couple of '<' characters, so it should be
> set in a CDATA section for proper handling. This in turn prevents the
> resolution of the external entity. 
> 
> How would the XML experts on the list tackle this?

Unfortunately there is no easy solution.  You'd have to wrap all the entities 
in

]]>&entity;<![CDATA[

None of the other potential solutions, XInclude, XLink of embed type, etc., 
would help here.

Of course you can always not use CDATA sections and just &lt; escape what you 
need to.
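
The CDATA mechanics behind that wrapping can be sketched as follows.
This is only an illustration under stated assumptions: it uses a
current Python 3 interpreter, an internal entity named "code" standing
in for the external one, and a made-up helper text_of. An entity
reference inside a CDATA section is left as literal text, while closing
and reopening the section around the reference lets the parser expand
it.

```python
import xml.parsers.expat


def text_of(document):
    """Collect the character data the parser reports for a document."""
    parts = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = parts.append
    parser.Parse(document, True)
    return "".join(parts)


# Hypothetical internal entity used in place of an external one.
dtd = '<!DOCTYPE d [<!ENTITY code "x = 1">]>'

# Reference inside the CDATA section: not expanded, kept as literal text.
inside = text_of(dtd + "<d><![CDATA[if a < b: &code;]]></d>")

# CDATA closed before the reference and reopened after it: expanded.
split = text_of(dtd + "<d><![CDATA[if a < b: ]]>&code;<![CDATA[ # done]]></d>")

print(repr(inside))
print(repr(split))
```

Note that the replacement text of the entity is still parsed as markup,
so any '<' inside the included file itself would still need escaping.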


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From Alexandre.Fayolle@logilab.fr  Fri May 25 16:09:56 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Fri, 25 May 2001 17:09:56 +0200 (CEST)
Subject: [XML-SIG] external entities and CDATA sections
In-Reply-To: <200105251502.f4PF29313415@localhost.local>
Message-ID: <Pine.LNX.4.21.0105251707560.8076-100000@leo.logilab.fr>

On Fri, 25 May 2001, Uche Ogbuji wrote:

> Of course you can always not use CDATA sections and just &lt; escape what you 
> need to.

The idea was using the code 'as is' in the documentation to avoid
maintaining both the escaped and runnable versions. I'll go on using a
small script to escape the examples when generating the documentation.
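
Such an escaping step could look roughly like the sketch below. It
assumes a current Python 3 interpreter and the standard
xml.sax.saxutils.escape function; the function name to_programlisting
and the choice of a DocBook <programlisting> wrapper are assumptions
made for this example, not the actual script.

```python
from xml.sax.saxutils import escape


def to_programlisting(source_text):
    """Wrap raw source code in a DocBook <programlisting> element.

    escape() replaces '&', '<' and '>' with the matching entity
    references, so the code can be pasted into the document as is.
    """
    return "<programlisting>" + escape(source_text) + "</programlisting>"


# Hypothetical code sample containing characters that need escaping.
sample = "if a < b and c > d:\n    print(a & b)\n"
listing = to_programlisting(sample)
print(listing)
```

The runnable source stays the single maintained copy; the escaped form
is regenerated whenever the documentation is built.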

Thanks.

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From fdrake@acm.org  Fri May 25 16:55:36 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 11:55:36 -0400 (EDT)
Subject: [XML-SIG] cStringIO
In-Reply-To: <3B0E7D16.3FA06CE8@zolera.com>
References: <3B0E5763.FC2ED68E@zolera.com>
 <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
 <3B0E7D16.3FA06CE8@zolera.com>
Message-ID: <15118.32888.681428.716667@cj42289-a.reston1.va.home.com>

Rich Salz writes:
 > Sorry, I don't understand.  What's a non-ASCII unicode string?
 > Something with the high-bit on?  If so, then doesn't httplib.py
 > have a problem using cStringIO ?

  Yes; any Unicode string that contains non-ASCII characters can't be
converted to an 8-bit string correctly since the ASCII encoding is
used by default (and there's no way to tell cStringIO to use a
different encoding).
  Why would httplib have a problem with cStringIO?  Pulling data over
a socket always yields 8-bit strings, which work just fine with
cStringIO regardless of the high bit.  The problems with the cStringIO
and Unicode are based entirely on the implicit conversion of the
Unicode to ASCII, not the 8th bit per se.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From rsalz@zolera.com  Fri May 25 16:41:10 2001
From: rsalz@zolera.com (Rich Salz)
Date: Fri, 25 May 2001 11:41:10 -0400
Subject: [XML-SIG] cStringIO
References: <3B0E5763.FC2ED68E@zolera.com> <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
Message-ID: <3B0E7D16.3FA06CE8@zolera.com>

>   cStringIO works with 8-bit strings, regardless of the encoding.  It
> does not work with non-ASCII Unicode strings.  Fixing that is on my
> plate, but I don't have time allotted for it yet.

Sorry, I don't understand.  What's a non-ASCII unicode string?
Something with the high-bit on?  If so, then doesn't httplib.py
have a problem using cStringIO ?

tnx.
	/r$
PS:  #define UNLESS ... ?  Someone has a PERL sense of humor. :)
	/r$


From amorgan@mitre.org  Fri May 25 17:19:30 2001
From: amorgan@mitre.org (Alex Morgan)
Date: Fri, 25 May 2001 12:19:30 -0400
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
References: <3B0D8D99.B94E7542@mitre.org> <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
Message-ID: <3B0E8612.45BAF00D@mitre.org>

An example of the behavior I am talking about is input that includes the
following:

'<reference> Morse &amp; Feshbach </reference>'

With a CDATA handler:

'def char_data(data):
	print data'

Will return 'Morse & Feshbach', when I would like it to return the
original string, as is.


"Martin v. Loewis" wrote:
> 
> > When an xml.parsers.expat parser handles CDATA with an '&lt;' in it, it
> > turns this into a '<' when it processes it.
> 
> It does not do this for me, using PyXML 0.6.5 on Linux. Can you give a
> specific example where markup in a CDATA section is interpreted?
> 
> Regards,
> Martin

-- 
-Alex Morgan		Homepage:   http://pubpages.unh.edu/~amorgan
			AIM login:  HomeySage
			Phone:	    (781) 271-6306
			Office:     3K-136, 202 Burlington Rd, Bedford, MA


From rsalz@zolera.com  Fri May 25 17:24:08 2001
From: rsalz@zolera.com (Rich Salz)
Date: Fri, 25 May 2001 12:24:08 -0400
Subject: [XML-SIG] cStringIO
References: <3B0E5763.FC2ED68E@zolera.com>
 <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
 <3B0E7D16.3FA06CE8@zolera.com> <15118.32888.681428.716667@cj42289-a.reston1.va.home.com>
Message-ID: <3B0E8728.365E67B@zolera.com>

It looks to me (from skimming the code in cStringIO.c), that the code
is 8bit transparent.  I thought UTF-8 made all multi-byte values have
the 8th bit on.  So, if I'm using cStringIO I should be okay, if I'm
just using cStringIO to transport data, or maybe do readline or
similar.  Once I need to look at individual characters, I'm hosed.  But
if I want to collect the value of a bunch of TEXT_NODE elements and
output them, won't that work?
	/r$


From fdrake@acm.org  Fri May 25 17:29:31 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 12:29:31 -0400 (EDT)
Subject: [XML-SIG] cStringIO
In-Reply-To: <3B0E8728.365E67B@zolera.com>
References: <3B0E5763.FC2ED68E@zolera.com>
 <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
 <3B0E7D16.3FA06CE8@zolera.com>
 <15118.32888.681428.716667@cj42289-a.reston1.va.home.com>
 <3B0E8728.365E67B@zolera.com>
Message-ID: <15118.34923.55835.44275@cj42289-a.reston1.va.home.com>


Rich Salz writes:
 > It looks to me (from skimming the code in cStringIO.c), that the code
 > is 8bit transparent.  I thought UTF-8 made all multi-byte values have
 > the 8th bit on.  So, if I'm using cStringIO I should be okay, if I'm
 > just using cStringIO to transport data, or maybe do readline or

  That's correct.

 > similar.  Once I need to look at individual characters, I'm hosed.  But
 > if I want to collect the value of a bunch of TEXT_NODE elements and
 > output them, won't that work?

  The *only* problem involves Unicode objects, not Unicode data
encoded in 8-bit strings.  So if your TEXT_NODE objects actually
contain 8-bit strings, it'll work just fine.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From fdrake@acm.org  Fri May 25 17:32:50 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 12:32:50 -0400 (EDT)
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <3B0E8612.45BAF00D@mitre.org>
References: <3B0D8D99.B94E7542@mitre.org>
 <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
 <3B0E8612.45BAF00D@mitre.org>
Message-ID: <15118.35122.442238.574572@cj42289-a.reston1.va.home.com>


Alex Morgan writes:
 > Will return 'Morse & Feshbach', when I would like it to return the
 > original string, as is.

  Set the DefaultHandler attribute of the parser object; it will be
called with the unexpanded entity reference as a string '&amp;'.
Depending on what other handlers you set, it may get other things as
well, but always in the marked-up form rather than the interpreted
form.
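
A minimal sketch of this, assuming a current Python 3 interpreter (the
thread itself concerns Python 2.0-era PyXML). No other handlers are set
here, so the default handler receives every piece of the document in
its marked-up form:

```python
import xml.parsers.expat

document = "<reference> Morse &amp; Feshbach </reference>"

chunks = []
parser = xml.parsers.expat.ParserCreate()
# With no CharacterDataHandler set, the text and the '&amp;' reference
# are both passed to the default handler exactly as written.
parser.DefaultHandler = chunks.append
parser.Parse(document, True)

raw = "".join(chunks)
print(raw)
```

If a CharacterDataHandler is also set, the text goes to that handler
instead, which is the "depending on what other handlers you set" caveat
above.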


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From martin@loewis.home.cs.tu-berlin.de  Fri May 25 21:15:42 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 25 May 2001 22:15:42 +0200
Subject: [XML-SIG] cStringIO
In-Reply-To: <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
 (fdrake@acm.org)
References: <3B0E5763.FC2ED68E@zolera.com> <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
Message-ID: <200105252015.f4PKFga01183@mira.informatik.hu-berlin.de>

>   cStringIO works with 8-bit strings, regardless of the encoding.  It
> does not work with non-ASCII Unicode strings.  Fixing that is on my
> plate, but I don't have time allotted for it yet.

One issue with reading UTF-8, whether from cStringIO or elsewhere: a
read might break the result string inside a character (i.e. not on a
character boundary). So be careful with applying unicode() or .decode on
such a string - you may have to save some bytes for the next .read() call.
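
The "save some bytes for the next read" bookkeeping can be left to an
incremental decoder. The sketch below assumes a current Python 3
interpreter, where io.BytesIO plays the role of cStringIO; the 5-byte
read size is chosen only so that a read ends in the middle of the
three-byte euro sign.

```python
import codecs
import io

data = "0.02\N{EURO SIGN}".encode("utf-8")   # 4 ASCII bytes + 3 bytes
stream = io.BytesIO(data)

decoder = codecs.getincrementaldecoder("utf-8")()
pieces = []
while True:
    chunk = stream.read(5)        # first read ends inside the euro sign
    if not chunk:
        break
    # The decoder keeps the incomplete trailing bytes until the next call.
    pieces.append(decoder.decode(chunk))
pieces.append(decoder.decode(b"", final=True))

text = "".join(pieces)
print(text)
```

Calling chunk.decode("utf-8") directly on the first 5-byte chunk would
raise UnicodeDecodeError instead.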

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Fri May 25 21:27:28 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 25 May 2001 22:27:28 +0200
Subject: [XML-SIG] cStringIO
In-Reply-To: <3B0E8728.365E67B@zolera.com> (message from Rich Salz on Fri, 25
 May 2001 12:24:08 -0400)
References: <3B0E5763.FC2ED68E@zolera.com>
 <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
 <3B0E7D16.3FA06CE8@zolera.com> <15118.32888.681428.716667@cj42289-a.reston1.va.home.com> <3B0E8728.365E67B@zolera.com>
Message-ID: <200105252027.f4PKRS601186@mira.informatik.hu-berlin.de>

> It looks to me (from skimming the code in cStringIO.c), that the code
> is 8bit transparent.  I thought UTF-8 made all multi-byte values have
> the 8th bit on.  So, if I'm using cStringIO I should be okay, if I'm
> just using cStringIO to transport data, or maybe do readline or
> similar.  Once I need to look at individual characters, I'm hosed.  But
> if I want to collect the value of a bunch of TEXT_NODE elements and
> output them, won't that work?

Depends on how exactly you do that. If you just write the text.data
attribute to the cStringIO, it might fail, if text.data is a Unicode
object (please note that a string object that is UTF-8-encoded is
*not* a Unicode object, it is a byte string).

To see the problem, do

import cStringIO 
o = cStringIO.StringIO()
o.write(u"My 0.02\N{EURO SIGN}")
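
For readers on a current Python 3 interpreter (an assumption; cStringIO
no longer exists there), the same distinction between text and byte
strings can be sketched with io.BytesIO, which only accepts bytes:

```python
import io

buf = io.BytesIO()
text = "My 0.02\N{EURO SIGN}"

# Writing the text object directly is rejected.
try:
    buf.write(text)
    rejected = False
except TypeError:
    rejected = True

# Encoding it to a byte string first works, whatever the encoding.
buf.write(text.encode("utf-8"))
data = buf.getvalue()

print(rejected, data)
```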

Regards,
Martin


From martin@loewis.home.cs.tu-berlin.de  Fri May 25 21:38:43 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 25 May 2001 22:38:43 +0200
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <3B0E8612.45BAF00D@mitre.org> (message from Alex Morgan on Fri,
 25 May 2001 12:19:30 -0400)
References: <3B0D8D99.B94E7542@mitre.org> <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de> <3B0E8612.45BAF00D@mitre.org>
Message-ID: <200105252038.f4PKchI01340@mira.informatik.hu-berlin.de>

> An example of the behavior I am talking about is input that includes the
> following:
> 
> '<reference> Morse &amp; Feshbach </reference>'
> 
> With a CDATA handler:
> 
> 'def char_data(data):
> 	print data'
> 
> Will return 'Morse & Feshbach', when I would like it to return the
> original string, as is.

Fred already mentioned the default handler, but I'd like you to
reconsider your request: &amp; and & are really the same thing; one is
marked-up, the other is not.

If you have a need to output the contents again as XML, you may find
xml.sax.saxutils.escape useful.
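
A short sketch of that round trip, assuming a current Python 3
interpreter; the input string is the expanded text from the example
above:

```python
from xml.sax.saxutils import escape

expanded = "Morse & Feshbach"      # what the character data handler sees
marked_up = escape(expanded)       # re-escaped for writing out as XML

print(marked_up)
```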

Regards,
Martin


From fdrake@acm.org  Fri May 25 21:39:52 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 16:39:52 -0400 (EDT)
Subject: [XML-SIG] cStringIO
In-Reply-To: <200105252015.f4PKFga01183@mira.informatik.hu-berlin.de>
References: <3B0E5763.FC2ED68E@zolera.com>
 <15118.27630.468805.814729@cj42289-a.reston1.va.home.com>
 <200105252015.f4PKFga01183@mira.informatik.hu-berlin.de>
Message-ID: <15118.49944.451281.919103@cj42289-a.reston1.va.home.com>

Martin v. Loewis writes:
 > One issue of reading UTF-8, whether from cStringIO or elsewhere, might
 > break result strings inside a character (i.e. between character
 > boundaries). So be careful with applying unicode() or .decode on such
 > a string - you may have to save some bytes for the next .read() call.

  Correct -- the cStringIO object is just a stream of bytes, like a
file object.  To read characters, you'll need to wrap it with a
decoder using the codecs module, or pass the bytes to a parser that
can handle them properly (like Expat).
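
Wrapping a byte stream with a decoder from the codecs module might look
like the sketch below. It assumes a current Python 3 interpreter, with
io.BytesIO standing in for cStringIO, and made-up sample data.

```python
import codecs
import io

# A byte stream holding UTF-8 encoded text, as a socket or file would.
raw = io.BytesIO("My 0.02\N{EURO SIGN}\nnext line\n".encode("utf-8"))

# codecs.getreader() returns a StreamReader class for the encoding;
# instances decode bytes from the wrapped stream into characters.
reader = codecs.getreader("utf-8")(raw)

first = reader.readline()
rest = reader.read()

print(repr(first), repr(rest))
```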


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From fdrake@acm.org  Fri May 25 21:50:59 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 16:50:59 -0400 (EDT)
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <200105252038.f4PKchI01340@mira.informatik.hu-berlin.de>
References: <3B0D8D99.B94E7542@mitre.org>
 <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
 <3B0E8612.45BAF00D@mitre.org>
 <200105252038.f4PKchI01340@mira.informatik.hu-berlin.de>
Message-ID: <15118.50611.898283.174028@cj42289-a.reston1.va.home.com>


Martin v. Loewis writes:
 > Fred already mentioned the default handler, but I'd like you to
 > reconsider your request: &amp; and & are really the same thing; one is
 > marked-up, the other is not.

  I only wish it were that easy!  In cases where you want to preserve
the input as much as possible, it can be important to distinguish
between an internal entity reference and the expansion:

<!DOCTYPE doc [
    <!ENTITY MyEmployer "Digital Creations">
]>
<doc>&MyEmployer;</doc>

  Now, if I want to load the document into a DOM, modify a few things,
and dump it back out for further human editing, I want the entity
references intact.  With Expat, the only way I've found to do this is
to use the DefaultHandler to capture this information.  Whether or not
the text is expanded directly or made a child of an entity reference
node should be determined by the application.  The DOM Level 3
Load/Save working draft has knobs to control this behavior.
  (If anyone knows a way to determine whether a document contains &lt;,
&#60;, &#x3c;, or &#x3C;, I'd love to hear about it!)
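
The difference for the MyEmployer document can be sketched as follows,
assuming a current Python 3 interpreter; the helper name parse and its
flag are made up for this example. Setting DefaultHandler turns off
expansion of internal entity references, and the reference is handed to
the default handler unexpanded.

```python
import xml.parsers.expat

document = """<!DOCTYPE doc [
    <!ENTITY MyEmployer "Digital Creations">
]>
<doc>&MyEmployer;</doc>"""


def parse(use_default_handler):
    """Return (character data, default-handler data) for the document."""
    text, raw = [], []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = text.append
    if use_default_handler:
        parser.DefaultHandler = raw.append
    parser.Parse(document, True)
    return "".join(text), "".join(raw)


expanded_text, _ = parse(False)      # reference expanded by the parser
kept_text, kept_raw = parse(True)    # reference kept as '&MyEmployer;'

print(repr(expanded_text))
print(repr(kept_text))
print("&MyEmployer;" in kept_raw)
```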


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From martin@loewis.home.cs.tu-berlin.de  Fri May 25 22:00:13 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 25 May 2001 23:00:13 +0200
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <15118.50611.898283.174028@cj42289-a.reston1.va.home.com>
 (fdrake@acm.org)
References: <3B0D8D99.B94E7542@mitre.org>
 <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
 <3B0E8612.45BAF00D@mitre.org>
 <200105252038.f4PKchI01340@mira.informatik.hu-berlin.de> <15118.50611.898283.174028@cj42289-a.reston1.va.home.com>
Message-ID: <200105252100.f4PL0Dv01673@mira.informatik.hu-berlin.de>

>   Now, if I want to load the document into a DOM, modify a few things,
> and dump it back out for further human editing, I want the entity
> references intact.  With Expat, the only way I've found to do this is
> to use the DefaultHandler to capture this information.

Of course, those of us contributing to Expat should have no problems
to make it not expand internal entity references :-)

Regards,
Martin


From fdrake@acm.org  Fri May 25 22:15:12 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 25 May 2001 17:15:12 -0400 (EDT)
Subject: [XML-SIG] xml.parsers.expat not converting aliased CDATA elements
In-Reply-To: <200105252100.f4PL0Dv01673@mira.informatik.hu-berlin.de>
References: <3B0D8D99.B94E7542@mitre.org>
 <200105250006.f4P06ZX01302@mira.informatik.hu-berlin.de>
 <3B0E8612.45BAF00D@mitre.org>
 <200105252038.f4PKchI01340@mira.informatik.hu-berlin.de>
 <15118.50611.898283.174028@cj42289-a.reston1.va.home.com>
 <200105252100.f4PL0Dv01673@mira.informatik.hu-berlin.de>
Message-ID: <15118.52064.181731.967296@cj42289-a.reston1.va.home.com>


Martin v. Loewis writes:
 > Of course, those of us contributing to Expat should have no problems
 > to make it not expand internal entity references :-)

  Of course not.  ;-)  Now, is there anyone *actually* contributing?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations


From haering_python@gmx.de  Sun May 27 02:29:47 2001
From: haering_python@gmx.de (Gerhard Häring)
Date: Sun, 27 May 2001 03:29:47 +0200
Subject: [XML-SIG] SRPMs
Message-ID: <20010527032946.A12024@lilith.hqd-internal>

Sorry if this is not the right place to ask.

From the SourceForge page, I can download a Windows installer and RPMs for
PyXML. There isn't a source RPM available, however. Could this possibly be
fixed?

I don't want to write yet another SPEC file myself if I can avoid it.

Gerhard
-- 
mail:   gerhard <at> bigfoot <dot> de       registered Linux user #64239
web:    http://highqualdev.com              public key at homepage
public key fingerprint: DEC1 1D02 5743 1159 CD20  A4B6 7B22 6575 86AB 43C0
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))


From teg@redhat.com  Sun May 27 04:09:56 2001
From: teg@redhat.com (Trond Eivind Glomsrød)
Date: 26 May 2001 23:09:56 -0400
Subject: [XML-SIG] SRPMs
In-Reply-To: <20010527032946.A12024@lilith.hqd-internal>
References: <20010527032946.A12024@lilith.hqd-internal>
Message-ID: <xuy1ypb32t7.fsf@halden.devel.redhat.com>

Gerhard Häring <haering_python@gmx.de> writes:

> Sorry if this is not the right place to ask.
> 
> From the SourceForge page, I can download a Windows installer and RPMs for
> PyXML. There isn't a source RPM available, however. Could this possibly be
> fixed?
> 
> I don't want to write yet another SPEC file myself if I can avoid it.

Just download the tar file - it contains all you need.
That said, you can find handmade rpms in rawhide.


-- 
Trond Eivind Glomsrød
Red Hat, Inc.


From martin@loewis.home.cs.tu-berlin.de  Sun May 27 09:56:01 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 27 May 2001 10:56:01 +0200
Subject: [XML-SIG] SRPMs
In-Reply-To: <20010527032946.A12024@lilith.hqd-internal> (message from
 Gerhard Häring on Sun, 27 May 2001 03:29:47 +0200)
References: <20010527032946.A12024@lilith.hqd-internal>
Message-ID: <200105270856.f4R8u1q01134@mira.informatik.hu-berlin.de>

> Sorry if this is not the right place to ask.

Hi Gerhard,

This is certainly the right place to ask.

> From the SourceForge page, I can download a Windows installer and RPMs for
> PyXML. There isn't a source RPM available, however. Could this possibly be
> fixed?

No. The source RPM does not add any additional information, so I won't
upload it.

> I don't want to write yet another SPEC file myself if I can avoid it.

You don't have to. To build a source RPM, just unpack the sources, and
invoke

python setup.py bdist_rpm

Of course, if you merely want to install the package, doing

python setup.py install

is good enough.

Hope this helps,
Martin


From haering_python@gmx.de  Sun May 27 17:12:06 2001
From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Sun, 27 May 2001 18:12:06 +0200
Subject: [XML-SIG] SRPMs
In-Reply-To: <200105270856.f4R8u1q01134@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sun, May 27, 2001 at 10:56:01AM +0200
References: <20010527032946.A12024@lilith.hqd-internal> <200105270856.f4R8u1q01134@mira.informatik.hu-berlin.de>
Message-ID: <20010527181206.A1304@lilith.hqd-internal>

On Sun, May 27, 2001 at 10:56:01AM +0200, Martin v. Loewis wrote:
> > I don't want to write yet another SPEC file myself if I can avoid it.
> 
> You don't have to. To build a source RPM, just unpack the sources, and
> invoke
> 
> python setup.py bdist_rpm

Oh, I forgot about that feature of distutils. Works very nicely.

Thanks,

Gerhard
-- 
mail:   gerhard <at> bigfoot <dot> de       registered Linux user #64239
web:    http://highqualdev.com              public key at homepage
public key fingerprint: DEC1 1D02 5743 1159 CD20  A4B6 7B22 6575 86AB 43C0
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))


From linudom@hotmail.com  Mon May 28 19:17:50 2001
From: linudom@hotmail.com (Dom Linu)
Date: Mon, 28 May 2001 18:17:50 -0000
Subject: [XML-SIG] getAttribute??
Message-ID: <F153weDc7kJIZAqscCQ000101a3@hotmail.com>

I have tried this many different ways, but it never seems to work and I
always abandon PyXML in favor of something else... so I'll ask here, why
does this fail:

>>> from xml.dom.ext.reader.Sax2 import FromXml
>>> doc = FromXml("<mydoc id='123'>text here</mydoc>")
>>> elem = doc.documentElement
>>> attr = elem.getAttribute("id")
>>> print attr

>>> type(attr)
<type 'string'>

I've tried other document, other platforms (both Unix and Win32), and
other techniques, but I just can't seem to get an attribute.  Any
enlightenment would be illuminating.

thx.


From Alexandre.Fayolle@logilab.fr  Mon May 28 19:51:32 2001
From: Alexandre.Fayolle@logilab.fr (Alexandre Fayolle)
Date: Mon, 28 May 2001 20:51:32 +0200 (CEST)
Subject: [XML-SIG] getAttribute??
In-Reply-To: <F153weDc7kJIZAqscCQ000101a3@hotmail.com>
Message-ID: <Pine.LNX.4.21.0105282050510.27654-100000@leo.logilab.fr>

On Mon, 28 May 2001, Dom Linu wrote:

> I have tried this many different ways, but it never seems to work and I
> always abandon PyXML in favor of something else... so I'll ask here, why
> does this fail:
>
> >>> from xml.dom.ext.reader.Sax2 import FromXml
> >>> doc = FromXml("<mydoc id='123'>text here</mydoc>")
> >>> elem = doc.documentElement
> >>> attr = elem.getAttribute("id")

Try this: 

attr = elem.getAttributeNS('','id')

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).


From dag@orion.no  Mon May 28 20:19:24 2001
From: dag@orion.no (Dag Sunde)
Date: Mon, 28 May 2001 21:19:24 +0200
Subject: [XML-SIG] getAttribute??
References: <Pine.LNX.4.21.0105282050510.27654-100000@leo.logilab.fr>
Message-ID: <055901c0e7ab$1af82db0$43145c3e@orion.no>

Ah!

I got interested in Dom Linu's problem, and was able
to get the attribute with:

>>> atr = elem.attributes['','id'].value

but couldn't for the life of me understand the first param...
It's the namespace! :-)

But why isn't getAttribute('id') working when the NS
is an empty string?

Does getAttribute('id') work if the NS somehow is None?


Dag.



From linudom@hotmail.com  Mon May 28 20:54:50 2001
From: linudom@hotmail.com (Dom Linu)
Date: Mon, 28 May 2001 19:54:50 -0000
Subject: [XML-SIG] getAttribute??
Message-ID: <F52EFk2H3b98KtdhQYZ000100d0@hotmail.com>

As always, the SIG rules.  Bravo!  Still perplexing is the interesting
"feature" of getAttribute(<attrname>)...

Thanks again!

>From: Alexandre Fayolle <Alexandre.Fayolle@logilab.fr>
>To: Dom Linu <linudom@hotmail.com>
>CC: xml-sig@python.org
>Subject: Re: [XML-SIG] getAttribute??
>Date: Mon, 28 May 2001 20:51:32 +0200 (CEST)
>
>On Mon, 28 May 2001, Dom Linu wrote:
>
> > I have tried this many different ways, but it never seems to work and I
> > always abandon PyXML in favor of something else... so I'll ask here, why
> > does this fail:
> >
> > >>> from xml.dom.ext.reader.Sax2 import FromXml
> > >>> doc = FromXml("<mydoc id='123'>text here</mydoc>")
> > >>> elem = doc.documentElement
> > >>> attr = elem.getAttribute("id")
>
>Try this:
>
>attr = elem.getAttributeNS('','id')
>
>Alexandre Fayolle


From Mike.Olson@fourthought.com  Mon May 28 21:48:52 2001
From: Mike.Olson@fourthought.com (Mike Olson)
Date: Mon, 28 May 2001 14:48:52 -0600
Subject: [XML-SIG] getAttribute??
References: <F153weDc7kJIZAqscCQ000101a3@hotmail.com>
Message-ID: <3B12B9B4.A5C38A8E@FourThought.com>

Dom Linu wrote:
> 
> I have tried this many different ways, but it never seems to work and
> I always abandon PyXML in favor of something else... so I'll ask here,
> why does this fail:
> 
> >>> from xml.dom.ext.reader.Sax2 import FromXml
> >>> doc = FromXml("<mydoc id='123'>text here</mydoc>")
> >>> elem = doc.documentElement
> >>> attr = elem.getAttribute("id")
> >>> print attr
> 
> >>> type(attr)
> <type 'string'>

Because the Sax2 reader is namespace-aware, you need to use the DOM
Level 2 interface, i.e. getAttributeNS('', 'id').
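
For readers without PyXML installed, the same distinction can be sketched
with the standard library's xml.dom.minidom. This is an assumption made
for illustration, not the 4DOM code discussed in this thread: minidom
also keys attributes by (namespace, local name), but it represents "no
namespace" as None rather than '', and its getAttribute() works on
namespace-aware documents as well.

```python
from xml.dom.minidom import parseString

doc = parseString("<mydoc id='123'>text here</mydoc>")
elem = doc.documentElement

# DOM Level 2 lookup: namespace first, then local name.
# minidom uses None for "no namespace" (unlike the '' used by the
# Sax2 reader in this thread).
value_ns = elem.getAttributeNS(None, "id")

# DOM Level 1 lookup by attribute name; minidom supports this too.
value_plain = elem.getAttribute("id")

print(value_ns, value_plain)
```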


> 
> I've tried other document, other platforms (both Unix and Win32), and
> other techniques, but I just can't seem to get an attribute.  Any
> enlightenment would be illuminating.
> 
> thx.
> 
> 

-- 
Mike Olson				 Principal Consultant
mike.olson@fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From linudom@hotmail.com  Mon May 28 22:11:07 2001
From: linudom@hotmail.com (Dom Linu)
Date: Mon, 28 May 2001 21:11:07 -0000
Subject: [XML-SIG] getAttribute??
Message-ID: <F61AxRYG6jj69EHaz8k00010246@hotmail.com>

Wow -- very informative.  Thank you.  I was working on the assumption
that if namespaces weren't in use, that you use non-namespace
functions.  That seems to have worked for everything else that I'm
doing, but to be honest I can't remember if I've always been using the
Sax2 reader-- I would have to dig.  I mean, with the Sax2 reader
(implied by using FromXml) getElementsByTagName works, without using
getElementsByTagNameNS I'm pretty sure...  is this inconsistent, or am
I missing something?  (the latter probably being true!)

dl.

>From: Mike Olson <Mike.Olson@fourthought.com>
>To: Dom Linu <linudom@hotmail.com>
>CC: xml-sig@python.org
>Subject: Re: [XML-SIG] getAttribute??
>Date: Mon, 28 May 2001 14:48:52 -0600
>
>Dom Linu wrote:
> >
> > I have tried this many different ways, but it never seems to work and
> > I always abandon PyXML in favor of something else... so I'll ask here,
> > why does this fail:
> >
> > >>> from xml.dom.ext.reader.Sax2 import FromXml
> > >>> doc = FromXml("<mydoc id='123'>text here</mydoc>")
> > >>> elem = doc.documentElement
> > >>> attr = elem.getAttribute("id")
> > >>> print attr
> >
> > >>> type(attr)
> > <type 'string'>
>
>Because the Sax2 reader is namespace aware so you need to use the DOM
>level II interface of getAttributeNS('','id')


From eliot@isogen.com  Tue May 29 00:24:29 2001
From: eliot@isogen.com (W. Eliot Kimber)
Date: Mon, 28 May 2001 18:24:29 -0500
Subject: [XML-SIG] getAttribute??
References: <F61AxRYG6jj69EHaz8k00010246@hotmail.com>
Message-ID: <3B12DE2D.6B57DF91@isogen.com>

Dom Linu wrote:
> 
> Wow -- very informative.  Thank you.  I was working on the assumption
> that if namespaces weren't in use, that you use non-namespace
> functions.  That seems to have worked for everything else that I'm
> doing, but to be honest I can't remember if I've always been using the
> Sax2 reader-- I would have to dig.  I mean, with the Sax2 reader
> (implied by using FromXml) getElementsByTagName works, without using
> getElementsByTagNameNS I'm pretty sure...  is this inconsistent, or am
> I missing something?  (the latter probably being true!)

I considered the current behavior a bug (that non-namespace functions
require a null namespace value) and fixed it in my local copy of the
code. Unfortunately, I haven't had a chance to package up these fixes
and submit them back to the SIG yet.

The problem I found was that the dictionaries where things were indexed
all assumed a tuple key with a possibly null namespace value. The fix
was easy: just synthesize the tuple for the non-namespace lookup
methods.
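
The idea can be sketched roughly as follows. This is a toy illustration
only, not the actual 4DOM patch: the class AttributeMap and its method
names are hypothetical, and '' is assumed as the "no namespace" value
because that is what the examples in this thread use.

```python
class AttributeMap:
    """Toy attribute store keyed by (namespace, local_name) tuples."""

    NO_NAMESPACE = ""  # assumption: '' means "no namespace", as in this thread

    def __init__(self):
        self._attrs = {}

    def set_ns(self, namespace, local_name, value):
        self._attrs[(namespace, local_name)] = value

    def get_ns(self, namespace, local_name):
        # Namespace-aware lookup; a missing attribute yields "".
        return self._attrs.get((namespace, local_name), "")

    def get(self, name):
        # Non-namespace lookup: synthesize the tuple key instead of
        # looking up the bare name, which would never match.
        return self.get_ns(self.NO_NAMESPACE, name)


attrs = AttributeMap()
attrs.set_ns("", "id", "123")
print(attrs.get("id"))
```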

Cheers,

Eliot

-- 
. . . . . . . . . . . . . . . . . . . . . . . .

W. Eliot Kimber | Lead Brain

1016 La Posada Dr. | Suite 240 | Austin TX  78752
    T 512.656.4139 |  F 512.419.1860 | eliot@isogen.com

w w w . d a t a c h a n n e l . c o m


From larsga@garshol.priv.no  Tue May 29 08:14:45 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 29 May 2001 09:14:45 +0200
Subject: [XML-SIG] building XML docs using ?
In-Reply-To: <200105171709.f4HH9SX17328@localhost.local>
References: <200105171709.f4HH9SX17328@localhost.local>
Message-ID: <m34ru4y6ca.fsf@lambda.garshol.priv.no>

(I've been at XML Europe, and so didn't see your response until now.)

* Uche Ogbuji
| 
| Why not?  Because most XML handling tools are not very scalable,
| XSLT being the foremost example.
 
That is true, but it still doesn't mean that there is something wrong
with documents that are 100MB in size, just that there is something
wrong with part of the tool set. The other part of the tool set will
handle this just fine.

I've been working with things like encyclopedias needing to be
imported into CMSs as well as turning the Open Directory Project data
into a topic map, and in these cases the documents naturally become
very big. 

Processing these documents using SAX was no problem at all, although
it admittedly took a while. In fact, an event-based representation was
quite natural for these applications, though I admit that this will
not apply to all applications.
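
As a minimal sketch of what event-based processing looks like with the
standard library's xml.sax: the handler below only keeps a running
count per element name, so memory use does not grow with document
size. The names ElementCounter and count_elements are made up for this
example, and the tiny in-memory sample stands in for a large file.

```python
import io
import xml.sax
from collections import Counter


class ElementCounter(xml.sax.ContentHandler):
    """Counts element names as events arrive; no tree is built."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElement(self, name, attrs):
        self.counts[name] += 1


def count_elements(source):
    """Parse a filename or file-like object and return the name counts."""
    handler = ElementCounter()
    xml.sax.parse(source, handler)
    return handler.counts


# A small sample; a 100MB file would be passed by filename instead.
sample = io.BytesIO(b"<entries><entry/><entry/><entry/></entries>")
counts = count_elements(sample)
print(counts["entry"])
```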

| Also because XML eliminates the need, which I think quite
| unnecessary, of storing mountains of data in a single file.
| Inclusion, transclusion, other linking mechanisms, and many tools
| are available for breaking XML into manageable packets.

Packets of 100MB are quite manageable with the right tools.
 
| Opinion of others might vary, of course.

It does. :-)

--Lars M.


From larsga@garshol.priv.no  Tue May 29 08:16:58 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 29 May 2001 09:16:58 +0200
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <3B03D8B4.9108432D@zolera.com>
References: <3B03D8B4.9108432D@zolera.com>
Message-ID: <m33d9oy68l.fsf@lambda.garshol.priv.no>

* Rich Salz
| 
| I would be more than happy to add this to PyXML if there's interest.
| Since it operates on DOM nodes, perhaps xml.dom.utils ?

I know this is a little late now, but anyway: why did we do this based
on the DOM? Isn't SAX far more natural for something as simple as this?
It's faster, it works for DOM representations as well, and it scales
much better.
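
To make the suggestion concrete, here is a rough sketch of one small
piece of the job done SAX-style: re-serializing a document with
attributes sorted by name and text escaped. It is not an implementation
of Canonical XML (namespaces, comments, processing instructions and the
exact quoting rules are all ignored), and the names
SortedAttributeWriter and normalize are invented for this example.

```python
import io
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class SortedAttributeWriter(xml.sax.ContentHandler):
    """Writes each event back out, with attributes in sorted order."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def startElement(self, name, attrs):
        parts = [name]
        for attr_name in sorted(attrs.getNames()):
            value = quoteattr(attrs.getValue(attr_name))
            parts.append("%s=%s" % (attr_name, value))
        self.out.write("<%s>" % " ".join(parts))

    def endElement(self, name):
        # Empty elements are written as a start/end tag pair.
        self.out.write("</%s>" % name)

    def characters(self, content):
        self.out.write(escape(content))


def normalize(xml_bytes):
    """Return the document re-serialized with sorted attributes."""
    out = io.StringIO()
    xml.sax.parseString(xml_bytes, SortedAttributeWriter(out))
    return out.getvalue()


result = normalize(b"<doc b='2' a='1'><empty/>text &amp; more</doc>")
print(result)
```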

--Lars M.


From cadeau@kipix.com  Tue May 29 18:12:33 2001
From: cadeau@kipix.com (cadeau@kipix.com)
Date: Tue, 29 May 2001 10:12:33 PDT
Subject: [XML-SIG] Kipix(r) va vous aider a VENDRE PLUS...
Message-ID: <3b1361873b1a9576@andira.wanadoo.fr> (added by andira.wanadoo.fr)

<html>
<head>
<title>Document sans-titre</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<script language="JavaScript">
<!--
function MM_timelinePlay(tmLnName, myID) { //v1.2
  //Copyright 1997 Macromedia, Inc. All rights reserved.
  var i,j,tmLn,props,keyFrm,sprite,numKeyFr,firstKeyFr,propNum,theObj,firstTime=false;
  if (document.MM_Time == null) MM_initTimelines(); //if *very* 1st time
  tmLn = document.MM_Time[tmLnName];
  if (myID == null) { myID = ++tmLn.ID; firstTime=true;}//if new call, incr ID
  if (myID == tmLn.ID) { //if Im newest
    setTimeout('MM_timelinePlay("'+tmLnName+'",'+myID+')',tmLn.delay);
    fNew = ++tmLn.curFrame;
    for (i=0; i<tmLn.length; i++) {
      sprite = tmLn[i];
      if (sprite.charAt(0) == 's') {
        if (sprite.obj) {
          numKeyFr = sprite.keyFrames.length; firstKeyFr = sprite.keyFrames[0];
          if (fNew >= firstKeyFr && fNew <= sprite.keyFrames[numKeyFr-1]) {//in range
            keyFrm=1;
            for (j=0; j<sprite.values.length; j++) {
              props = sprite.values[j]; 
              if (numKeyFr != props.length) {
                if (props.prop2 == null) sprite.obj[props.prop] = props[fNew-firstKeyFr];
                else        sprite.obj[props.prop2][props.prop] = props[fNew-firstKeyFr];
              } else {
                while (keyFrm<numKeyFr && fNew>=sprite.keyFrames[keyFrm]) keyFrm++;
                if (firstTime || fNew==sprite.keyFrames[keyFrm-1]) {
                  if (props.prop2 == null) sprite.obj[props.prop] = props[keyFrm-1];
                  else        sprite.obj[props.prop2][props.prop] = props[keyFrm-1];
        } } } } }
      } else if (sprite.charAt(0)=='b' && fNew == sprite.frame) eval(sprite.value);
      if (fNew > tmLn.lastFrame) tmLn.ID = 0;
  } }
}

function MM_timelineGoto(tmLnName, fNew, numGotos) { //v2.0
  //Copyright 1997 Macromedia, Inc. All rights reserved.
  var i,j,tmLn,props,keyFrm,sprite,numKeyFr,firstKeyFr,lastKeyFr,propNum,theObj;
  if (document.MM_Time == null) MM_initTimelines(); //if *very* 1st time
  tmLn = document.MM_Time[tmLnName];
  if (numGotos != null)
    if (tmLn.gotoCount == null) tmLn.gotoCount = 1;
    else if (tmLn.gotoCount++ >= numGotos) {tmLn.gotoCount=0; return}
  jmpFwd = (fNew > tmLn.curFrame);
  for (i = 0; i < tmLn.length; i++) {
    sprite = (jmpFwd)? tmLn[i] : tmLn[(tmLn.length-1)-i]; //count bkwds if jumping back
    if (sprite.charAt(0) == "s") {
      numKeyFr = sprite.keyFrames.length;
      firstKeyFr = sprite.keyFrames[0];
      lastKeyFr = sprite.keyFrames[numKeyFr - 1];
      if ((jmpFwd && fNew<firstKeyFr) || (!jmpFwd && lastKeyFr<fNew)) continue; //skip if untouchd
      for (keyFrm=1; keyFrm<numKeyFr && fNew>=sprite.keyFrames[keyFrm]; keyFrm++);
      for (j=0; j<sprite.values.length; j++) {
        props = sprite.values[j];
        if (numKeyFr == props.length) propNum = keyFrm-1 //keyframes only
        else propNum = Math.min(Math.max(0,fNew-firstKeyFr),props.length-1); //or keep in legal range
        if (sprite.obj != null) {
          if (props.prop2 == null) sprite.obj[props.prop] = props[propNum];
          else        sprite.obj[props.prop2][props.prop] = props[propNum];
      } }
    } else if (sprite.charAt(0)=='b' && fNew == sprite.frame) eval(sprite.value);
  }
  tmLn.curFrame = fNew;
  if (tmLn.ID == 0) eval('MM_timelinePlay(tmLnName)');
}

function MM_initTimelines() {
    //MM_initTimelines() Copyright 1997 Macromedia, Inc. All rights reserved.
    var ns = navigator.appName == "Netscape";
    document.MM_Time = new Array(1);
    document.MM_Time[0] = new Array(5);
    document.MM_Time["Timeline1"] = document.MM_Time[0];
    document.MM_Time[0].MM_Name = "Timeline1";
    document.MM_Time[0].fps = 15;
    document.MM_Time[0][0] = new String("behavior");
    document.MM_Time[0][0].frame = 14;
    document.MM_Time[0][0].value = "MM_timelineGoto('Timeline1','1')";
    document.MM_Time[0][1] = new String("sprite");
    document.MM_Time[0][1].slot = 1;
    if (ns)
        document.MM_Time[0][1].obj = document["Layer1"];
    else
        document.MM_Time[0][1].obj = document.all ? document.all["Layer1"] : null;
    document.MM_Time[0][1].keyFrames = new Array(1, 10);
    document.MM_Time[0][1].values = new Array(1);
    document.MM_Time[0][1].values[0] = new Array("visible","visible");
    document.MM_Time[0][1].values[0].prop = "visibility";
    if (!ns)
        document.MM_Time[0][1].values[0].prop2 = "style";
    document.MM_Time[0][2] = new String("sprite");
    document.MM_Time[0][2].slot = 1;
    if (ns)
        document.MM_Time[0][2].obj = document["Layer1"];
    else
        document.MM_Time[0][2].obj = document.all ? document.all["Layer1"] : null;
    document.MM_Time[0][2].keyFrames = new Array(11, 13);
    document.MM_Time[0][2].values = new Array(1);
    document.MM_Time[0][2].values[0] = new Array("hidden","hidden");
    document.MM_Time[0][2].values[0].prop = "visibility";
    if (!ns)
        document.MM_Time[0][2].values[0].prop2 = "style";
    document.MM_Time[0][3] = new String("sprite");
    document.MM_Time[0][3].slot = 2;
    if (ns)
        document.MM_Time[0][3].obj = document["Layer2"];
    else
        document.MM_Time[0][3].obj = document.all ? document.all["Layer2"] : null;
    document.MM_Time[0][3].keyFrames = new Array(1, 10);
    document.MM_Time[0][3].values = new Array(1);
    document.MM_Time[0][3].values[0] = new Array("hidden","hidden");
    document.MM_Time[0][3].values[0].prop = "visibility";
    if (!ns)
        document.MM_Time[0][3].values[0].prop2 = "style";
    document.MM_Time[0][4] = new String("sprite");
    document.MM_Time[0][4].slot = 2;
    if (ns)
        document.MM_Time[0][4].obj = document["Layer2"];
    else
        document.MM_Time[0][4].obj = document.all ? document.all["Layer2"] : null;
    document.MM_Time[0][4].keyFrames = new Array(11, 13);
    document.MM_Time[0][4].values = new Array(1);
    document.MM_Time[0][4].values[0] = new Array("visible","visible");
    document.MM_Time[0][4].values[0].prop = "visibility";
    if (!ns)
        document.MM_Time[0][4].values[0].prop2 = "style";
    document.MM_Time[0].lastFrame = 14;
    for (i=0; i<document.MM_Time.length; i++) {
        document.MM_Time[i].ID = null;
        document.MM_Time[i].curFrame = 0;
        document.MM_Time[i].delay = 1000/document.MM_Time[i].fps;
    }
}
//-->
</script>
</head>

<body bgcolor="#FFFFFF" onLoad="MM_timelinePlay('Timeline1')">
<table width="565" border="0">
  <tr>
    <td> <font color="#0000FF"><b><font color="#000000">De :</font></b> Laurette 
      Hassan <b> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<font color="#000000">A&gt;</font></b> 
      Directeur Marketing/Communication <b><font color="#000000"> &nbsp;Copie 
      :</font></b> Dir. Commercial<br>
      <font size="4" color="#000000">Kipix</font><font size="1" color="#000000">�</font><font size="4" color="#000000"> 
      est le nouveau cadeau publicitaire qui va vous aider � <b>vendre plus !</b></font><br>
      (explications sur <a href="http://www.kipix.com">www.kipix.com</a> et ci-apr&egrave;s...)</font> 
      <font color="#0000FF"> &nbsp;If&nbsp;you speak <b>English</b> go on <a href="http://www.kipix.com">www.kipix.com</a></font><br>
      <br>
      <font size="4">Comment placer l'adresse de <b>votre site Internet</b> (ou 
      le n� de t�l�phone de votre service client�le, etc...) directement sous 
      les yeux de vos prospects ?</font><br>
      <div id="Layer1" style="position:absolute; width:293px; height:220px; z-index:1; top: 145px; left: 0px; visibility: visible"><img src="http://www.mbiz.fr/avant.jpg" width="293" height="220"></div>
      <div id="Layer2" style="position:absolute; width:293px; height:220px; z-index:2; left: 0px; top: 145px; visibility: hidden"><img src="http://www.mbiz.fr/apres.jpg" width="293" height="220"></div>
      <div id="Layer3" style="position:absolute; width:85px; height:70px; z-index:3; left: 298px; top: 145px"><img src="http://www.mbiz.fr/logokipix.jpg" width="85" height="70"></div>
      <div id="Layer4" style="position:absolute; width:154px; height:54px; z-index:4; left: 385px; top: 153px"> 
        <p align="center"><font size="2" color="#CC0000">2 grammes de<br>
          concentr&eacute; de communication<br>
          </font><font size="2"><a href="http://www.kipix.com"><font color="#0000FF" size="4">www.kipix.com</font></a></font></p>
      </div>
      <div id="Layer6" style="position:absolute; width:242px; height:94px; z-index:6; left: 297px; top: 270px">C'est 
        parce qu'il permet &agrave; vos prospects d'afficher ainsi leurs notes 
        d'un seul geste, que Kipix� assure la pr�sence de <b>votre message publicitaire 
        � un endroit strat�gique...</b></div>
      <p><br>
        <br>
        <br>
        <br>
        <br>
        <font size="3">Comment placer l'adresse de</font><br>
        <font size="3">de <b>votre site Internet</b></font><br>
        <font size="3">(ou le n� de t�l�phone</font><br>
        <font size="3">de votre service client�le, etc...)</font><br>
        <font size="3">directement sous les yeux de vos prospects ?</font> <br>
        <br>
        <br>
        &nbsp;&nbsp;&nbsp;La soci�t� Kozatis s.a.s. (sp�cialis�e dans la conception 
        de supports publicitaires innovants) vous propose une solution efficace 
        et originale <b>pour que votre message publicitaire soit vraiment VU et 
        LU</b>... Tr&egrave;s fr&eacute;quemment VU et tr&egrave;s fr&eacute;quemment 
        LU !!!</p>
      <p>&nbsp;&nbsp;&nbsp;Une des raisons du succ&egrave;s publicitaire du porte-notes 
        Kipix� (<b>Syst&egrave;me Brevet&eacute;</b>,<b> M�daille d'Or des Inventions 
        </b>et <b>Prix du Pr&eacute;sident du Concours L�pine 2000</b>) est qu'il 
        est per�u par vos (futurs) clients comme un cadeau original et tr&egrave;s 
        pratique : il rend un service concret qui assurera <b>votre pr�sence permanente</b> 
        aupr�s de vos prospects... </p>
      <p>&nbsp;&nbsp;&nbsp;En effet, Kipix� sera rapidement adopt� par vos clients 
        ou prospects car il leur permet de mettre en �vidence leurs notes, m�mos 
        et feuillets <b>d'un seul geste</b> ; et ce � un endroit strat�gique pour 
        votre communication : sur le pourtour de l'�cran de leur ordinateur ! 
        (avez-vous remarqu&eacute; la quantit&eacute; de documents qu'ils essaient 
        d'afficher quotidiennement &agrave; cet endroit ?)</p>
      <p> &nbsp;&nbsp;&nbsp;Vous b&eacute;n&eacute;ficierez de l' &quot;effet 
        Kipix�&quot; de multiples fa�ons :<br>
        <br>
        <font face="Bookman Old Style" size=3><font face="Symbol" size="2">&#183;</font></font> 
        en prospection : vos commerciaux laisseront dor&eacute;navant une trace 
        visible, durable et positive de leur passage... (avec Kipix� <b>votre 
        message publicitaire sera bien en vue jusqu'au jour o&ugrave; votre prospect 
        aura besoin de vos produits/services</b>. Vendre, c'est souvent &ecirc;tre 
        l&agrave; au bon moment : Kipix� est justement con&ccedil;u pour &ecirc;tre 
        l&agrave; au bon moment !)...<br>
        <font face="Bookman Old Style" size=3><font face="Symbol" size="2">&#183;</font></font> 
        pendant vos salons -ou autres &eacute;v&eacute;nements- Kipix� fera merveille 
        : dispos&eacute; dans un r&eacute;cipient transparent, et aper�u depuis 
        les all�es d'un salon, <b>Kipix� intrigue les visiteurs et les pousse 
        � s'approcher</b>, augmentant ainsi le nombre de vos contacts !...<br>
        <font face="Bookman Old Style" size=3><font face="Symbol" size="2">&#183;</font></font> 
        dans vos courriers et vos mailings (il est extra plat et p�se moins lourd 
        qu'une feuille A4 : &agrave; peine 4 grammes packaging inclu !). De plus, 
        Kipix� procure une sensation tactile tr&egrave;s particuli&egrave;re au 
        travers d'une enveloppe, <b>ce qui &quot;force&quot; litt&eacute;ralement 
        vos prospects &agrave; ouvrir les courriers que vous leur adressez</b>...<br>
        <font face="Bookman Old Style" size=3><font face="Symbol" size="2">&#183;</font></font> 
        en tant que prime directe...<br>
        <font face="Bookman Old Style" size=3><font face="Symbol" size="2">&#183;</font></font> 
        etc... </p>
      <p>&nbsp;&nbsp;&nbsp;N'h�sitez-pas � me t�l�phoner pour toute information 
        suppl�mentaire <b>ou pour recevoir un &eacute;chantillon gratuit</b>,</p>
      <p>&nbsp;&nbsp;&nbsp;Sinc�res salutations <b><font color="#0000FF">:-)</font></b> 
      </p>
      <p><img src="http://www.mbiz.fr/signaturelolopouremails.jpg" width="104" height="59"></p>
      <p>Laurette Hassan - Directrice Commerciale de Kozatis s.a.s.<br>
        +33(0)6 61 93 46 69 ou +33(0)1 58 53 52 62 <a href="mailto:cadeau@kipix.com?subject=Kipix%AE%20sur%20l'Internet"><font color="#0000FF">cadeau@kipix.com</font></a></p>
      <p>&nbsp;&nbsp;&nbsp;P.S. : quelques-uns des annonceurs qui font confiance 
        &agrave; Kipix� : <b>www.nomade.fr, SNCF, Microsoft, IBM, Cegetel, FFF, 
        Johnson &amp; Johnson, Nortel, BNP, CIC, Cr&eacute;dit Agricole, Lufthansa, 
        Groupe CASINO, ORT Reuters, UNIX,</b> <b>Badoit,</b> etc... (pour d&eacute;couvrir 
        quelques-uns des visuels de leurs Kipix�, visitez <a href="http://www.kipix.com"><font color="#0000FF">www.kipix.com</font></a>)</p>
      <p><font color="#0000FF">&nbsp;&nbsp;&nbsp;Une petite <b>DEMONSTRATION VIDEO</b> 
        r&eacute;alis&eacute;e &quot;au pied lev&eacute;&quot; vous donnera un 
        aper&ccedil;u dynamique des qualit&eacute;s fonctionnelles de Kipix� ; 
        pour la d&eacute;couvrir cliquez sur le lien suivant : <a href="http://www.mbiz.fr/kipixdemovideo.mpg"><font color="#CC00CC">T&eacute;l&eacute;chargement 
        de la d&eacute;monstration vid&eacute;o (fichier &quot;.mpg&quot; ; environ 
        1 minute)</font></a><br>
        (c'est peu probable, mais si la d&eacute;monstration ne d&eacute;marrait 
        pas automatiquement, utilisez <i>Windows Media Player</i> (int&eacute;gr&eacute; 
        &agrave; Windows Millenium) ou <i>RealPlayer 8 Basic</i> (disponible gratuitement 
        sur <a href="http://www.realplayer.com"><font color="#CC00CC">www.realplayer.com</font></a>, 
        ou plus directement en cliquant sur le lien suivant : <a href="http://huxley.real.com/real/player/player.html?src=001201realhome_1,001201rpchoice_h2&amp;dc=124123122"><font color="#D000D0">http://www...</font></a>))</font></p>
      <p>&nbsp;</p>
      <p align="center"><font size="4">If you don't speak french, but english, 
        please visit our web site :<br>
        <a href="http://www.kipix.com">www.kipix.com</a></font></p>
</td>
  </tr>
</table>
</body>
</html>
 <BR>   ___________________________________  <BR> To remove your email address from this list: <BR><A HREF="cadeau@kipix.com"> cadeau@kipix.com</A>


From rsalz@zolera.com  Tue May 29 11:48:17 2001
From: rsalz@zolera.com (Rich Salz)
Date: Tue, 29 May 2001 06:48:17 -0400
Subject: [XML-SIG] XML Canonicalization
References: <3B03D8B4.9108432D@zolera.com> <m33d9oy68l.fsf@lambda.garshol.priv.no>
Message-ID: <3B137E71.C76FC3C7@zolera.com>

> I know this is a little late now, but anyway: why did we do this based
> on the DOM? Isn't SAX far more natural for something as simple as this?
> It's faster, it works for DOM representations as well, and it scales
> much better.

It met my needs at the time, and I thought the community would appreciate
it.

Hopefully someone will get useful ideas from my code and do a SAX one.
	/r$


From larsga@garshol.priv.no  Tue May 29 13:27:00 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 29 May 2001 14:27:00 +0200
Subject: [XML-SIG] XML Canonicalization
In-Reply-To: <3B137E71.C76FC3C7@zolera.com>
References: <3B03D8B4.9108432D@zolera.com> <m33d9oy68l.fsf@lambda.garshol.priv.no> <3B137E71.C76FC3C7@zolera.com>
Message-ID: <m3ofsc1gtn.fsf@lambda.garshol.priv.no>

* Rich Salz
| 
| It met my needs at the time, and I thought the community would appreciate
| it.

Sorry, Rich, I didn't mean to be ungrateful, and I do appreciate
this.  It is a useful piece of code, and we've seen already that there
are people interested in this.
 
| Hopefully someone will get useful ideas from my code and do a SAX
| one.

Indeed, that was the intention behind my posting, even if it may not
have been very clear. Sorry about that.

--Lars M.


From larsga@garshol.priv.no  Tue May 29 13:36:25 2001
From: larsga@garshol.priv.no (Lars Marius Garshol)
Date: 29 May 2001 14:36:25 +0200
Subject: [XML-SIG] external entities and CDATA sections
In-Reply-To: <Pine.LNX.4.21.0105250930240.6432-100000@orion.logilab.fr>
References: <Pine.LNX.4.21.0105250930240.6432-100000@orion.logilab.fr>
Message-ID: <m3n17w1gdy.fsf@lambda.garshol.priv.no>

* Alexandre Fayolle
| 
| While writing some documentation, I wanted to include some python
| code in a docbook document.

Some ideas:

 - reference it using unparsed entities (you must then pull in the
   code yourself)

 - reference the code using XInclude, with the parse attribute set to
   'text', and write a simple SAX parser filter that does the inclusions
   for you (I have demo code that does this, email me if interested)

 - preprocess the source code and use entity references to the
   processed code

I hope this helps and isn't too late.

--Lars M.
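
[Illustration of the XInclude idea above. This is not the demo code
mentioned in the message; it is a minimal sketch in present-day Python 3,
assuming the XInclude namespace http://www.w3.org/2001/XInclude, an
include element carrying parse="text" and an href that names a local
UTF-8 file. The names TextIncludeFilter, expand, and hello.py are made up
for the example. The filter sits between the parser and the real handler
and replaces each such include element with the file's text.]

```python
import io
import os
import tempfile
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_namespaces
from xml.sax.saxutils import XMLFilterBase

XINCLUDE_ELEMENT = ("http://www.w3.org/2001/XInclude", "include")


class TextIncludeFilter(XMLFilterBase):
    """Replace <xi:include parse="text" href="..."/> with the file's text."""

    def __init__(self, parent, base_dir="."):
        super().__init__(parent)
        self._base_dir = base_dir
        self._swallowed = []  # one flag per open xi:include element

    def startElementNS(self, name, qname, attrs):
        if name == XINCLUDE_ELEMENT:
            as_text = attrs.get((None, "parse")) == "text"
            self._swallowed.append(as_text)
            if as_text:
                path = os.path.join(self._base_dir, attrs[(None, "href")])
                with open(path, encoding="utf-8") as source:
                    # Delivered as character data, so "<" and "&" in the
                    # included code need no escaping by the author.
                    self.characters(source.read())
                return
        super().startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        if name == XINCLUDE_ELEMENT and self._swallowed.pop():
            return  # the matching start tag was replaced by text
        super().endElementNS(name, qname)


class TextCollector(ContentHandler):
    """Downstream handler that just gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def expand(xml_bytes, base_dir):
    reader = TextIncludeFilter(make_parser(), base_dir)
    reader.setFeature(feature_namespaces, True)
    collector = TextCollector()
    reader.setContentHandler(collector)
    reader.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)


# Hypothetical demo: include a small Python file into a listing element.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "hello.py"), "w", encoding="utf-8") as f:
        f.write("print('a < b & c')\n")
    document = (
        b'<programlisting xmlns:xi="http://www.w3.org/2001/XInclude">'
        b'<xi:include href="hello.py" parse="text"/>'
        b'</programlisting>'
    )
    expanded = expand(document, demo_dir)
```

[Include elements without parse="text" are passed through untouched, so
a later stage could still handle them.]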


From rsalz@zolera.com  Tue May 29 15:22:10 2001
From: rsalz@zolera.com (Rich Salz)
Date: Tue, 29 May 2001 10:22:10 -0400
Subject: [XML-SIG] XML Canonicalization
References: <3B03D8B4.9108432D@zolera.com> <m33d9oy68l.fsf@lambda.garshol.priv.no> <3B137E71.C76FC3C7@zolera.com> <m3ofsc1gtn.fsf@lambda.garshol.priv.no>
Message-ID: <3B13B092.7DC96963@zolera.com>

> Indeed, that was the intention behind my posting, even if it may not
> have been very clear. Sorry about that.

It was totally clear, and I'm only wasting bandwidth on this list
because you apologized twice in the same message. :)

No problem at all.
	/r$


From cipher@redback.com  Tue May 29 18:31:29 2001
From: cipher@redback.com (J B Bell)
Date: Tue, 29 May 2001 10:31:29 -0700
Subject: [XML-SIG] "simple" config file parser problems
Message-ID: <20010529103128.A5656@login002.redback.com>

I'm having the very devil of a time trying to do something that I
assume would be simple (if I knew what I was doing) with xml.sax under
Python 2.0 & 2.1.

I'd go into the structure I'm looking to get from the XML, but at this
point, the event-handling methods I have don't come into play before
something deep inside xml.expat explodes.  Likely the object I'm using
lacks a needed trait (it appears to be something to do with name,
though that seems to be there), but I'm not sure what.

Without further ado, too much code, followed by a stack trace.  Any
help at all is greatly appreciated.  If this isn't the appropriate
list, please accept my copious apologies, and if you are kindly
disposed, a pointer to the right place to get assistance would be a
bonus.

--JB

# Note, I have tried with both saxlib.HandlerBase and the presumably
# more generic ContentHandler.  Both give the exact same error.

from xml.sax import make_parser
from xml.sax import saxlib
from xml.sax.handler import feature_namespaces
from xml.sax import ContentHandler

class Config:
    """A base class for all types of configuration information, whether to be
       found in plain files, xml, or databases.  Subclass as appropriate."""

    def parseConfig(self, args):
        """Override this in your subclassed Config"""
        pass

    def __init__(self, *args):
        newConfig = self.parseConfig(args)
        return newConfig

#class RsyncConfigHandler(ContentHandler):
class RsyncConfigHandler(saxlib.HandlerBase):
    """Read in & return a config file for rsync jobs"""

    # Errors should be signaled, so we'll output a message and raise
    # the exception to stop processing
    def fatalError(self, exception):
        sys.stderr.write('ERROR: '+ str(exception)+'\n')
        sys.exit(1)
    error = fatalError
    warning = fatalError

    def startDocument(self):
        self.jobList = []

    def startElement(self, name, attrs):
        methodName = "start" + str(name).capitalize()
        try:
            method = getattr(self, methodName)
        except:
            raise "Unknown element name '<%s>'" % name
        self.attrs = attrs
        if DEBUG: print "Invoking %s with attrs %s" % (methodName, str(attrs))
        apply(method, attrs)

    def endElement(self, name):
        methodName = "start" + str(name).capitalize()
        try:
            method = getattr(self, methodName)
        except:
            raise "Unknown element name '</%s>'" % name
        if DEBUG: print "Invoking %s with attrs %s" % (methodName, str(attrs))
        apply(method, attrs)

    def startConfig(self, attrs):
        """<config> just starts the whole shebang, no need to do anything."""
        pass

    def endConfig(self):
        pass

    def startQueue(self, attrs):
        pass

    def endQueue(self):
        pass

    def startJob(self, attrs):
        pass

    def endJob(self):
        pass
 
class RsyncConfig(Config):
    """Return an rsync configuration object"""

    def parseConfig(self, args):
        parser = make_parser()
        parser.setFeature(feature_namespaces, 0)
        dh = RsyncConfigHandler()    # Might want arguments here one day
        parser.setContentHandler(dh)
        configFile = "/home/cipher/cvs/itdoc/servers/rsync_config.xml"
        parser.parse(configFile)

[And now the stack trace:]

Python 2.0 (#1, Nov  3 2000, 12:11:00) 
[GCC egcs-2.91.66 19990314 (egcs-1.1.2 release)] on netbsd1
Type "copyright", "credits" or "license" for more information.
>>> from rsynct import RsyncConfig
>>> foo = RsyncConfig()
Invoking startConfig with attrs <xml.sax.xmlreader.AttributesImpl
instance at 0x8309bcc>
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "rsynct.py", line 65, in __init__
    newConfig = self.parseConfig(args)
  File "rsynct.py", line 129, in parseConfig
    parser.parse(configFile)
  File
"/usr/pkg/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py",
line 43, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File
"/usr/pkg/lib/python2.0/site-packages/_xmlplus/sax/xmlreader.py", line
121, in parse
    self.feed(buffer)
  File
"/usr/pkg/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py",
line 87, in feed
    self._parser.Parse(data, isFinal)
  File
"/usr/pkg/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py",
line 155, in start_element
    self._cont_handler.startElement(name, AttributesImpl(attrs))
  File "rsynct.py", line 90, in startElement
    apply(method, attrs)
  File
"/usr/pkg/lib/python2.0/site-packages/_xmlplus/sax/xmlreader.py", line
314, in __getitem__
    return self._attrs[name]
KeyError: 0
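
[Note on the trace above. The KeyError: 0 is consistent with
apply(method, attrs) treating the attributes object as the argument
sequence itself: to unpack it, Python indexes attrs[0], and
AttributesImpl looks 0 up as an attribute name. Passing the object as a
single argument, method(attrs), or apply(method, (attrs,)) on Python
2.0, avoids that. What follows is a minimal sketch in present-day
Python 3 rather than the Python 2.0 of the post; the DispatchingHandler
name, the inline sample document, and the events list are made up for
illustration. It also looks up "end..." methods in endElement and calls
them without arguments, which the posted code does not do.]

```python
import xml.sax
from xml.sax.handler import ContentHandler


class DispatchingHandler(ContentHandler):
    """Route <foo> and </foo> events to startFoo(attrs) and endFoo()."""

    def __init__(self):
        super().__init__()
        self.events = []  # hypothetical record of what was seen

    def _lookup(self, prefix, name):
        method = getattr(self, prefix + name.capitalize(), None)
        if method is None:
            raise ValueError("Unknown element name %r" % name)
        return method

    def startElement(self, name, attrs):
        # Pass attrs as ONE argument. method(*attrs), like the old
        # apply(method, attrs), would index attrs[0] and raise KeyError.
        self._lookup("start", name)(attrs)

    def endElement(self, name):
        self._lookup("end", name)()

    def startConfig(self, attrs):
        self.events.append(("start", "config", dict(attrs.items())))

    def endConfig(self):
        self.events.append(("end", "config"))

    def startJob(self, attrs):
        self.events.append(("start", "job", dict(attrs.items())))

    def endJob(self):
        self.events.append(("end", "job"))


# Hypothetical sample document standing in for rsync_config.xml.
SAMPLE = b'<config><job host="alpha" path="/srv"/></config>'
handler = DispatchingHandler()
xml.sax.parseString(SAMPLE, handler)
```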


From uche.ogbuji@fourthought.com  Tue May 29 20:20:55 2001
From: uche.ogbuji@fourthought.com (Uche Ogbuji)
Date: Tue, 29 May 2001 13:20:55 -0600
Subject: [XML-SIG] getAttribute??
In-Reply-To: Message from "W. Eliot Kimber" <eliot@isogen.com>
 of "Mon, 28 May 2001 18:24:29 CDT." <3B12DE2D.6B57DF91@isogen.com>
Message-ID: <200105291920.f4TJKtL05250@localhost.local>

> Dom Linu wrote:
> > 
> > Wow -- very informative.  Thank you.  I was working on the assumption
> > that if namespaces weren't in use, that you use non-namespace
> > functions.  That seems to have worked for everything else that I'm
> > doing, but to be honest I can't remember if I've always been using the
> > Sax2 reader-- I would have to dig.  I mean, with the Sax2 reader
> > (implied by using FromXml) getElementsByTagName works, without using
> > getElementsByTagNameNS I'm pretty sure...  is this inconsistent, or am
> > I missing something?  (the latter probably being true!)
> 
> I considered the current behavior a bug (that non-namespace functions
> require a null namespace value) and fixed it in my local copy of the
> code. Unfortunately, I haven't had a chance to package up these fixes
> and submit them back to the SIG yet.

I consider this a bug in DOM, not the implementation.  Certainly, the current 
behavior of the Python DOMs is fully conformant.  We've been through this 
dance before.  Basically, as the DOM itself sternly warns: you don't mix NS 
and non-NS DOM usage unless you want trouble.

I'm not sure that I'd be willing to support any "fixes" that basically hack 
around this DOM confusion.
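
[Illustration of the mixing hazard. This sketch uses the standard
library's xml.dom.minidom in present-day Python 3, not 4DOM, so the
details of the behavior discussed in this thread may differ; the
namespace URI and attribute names are made up. It shows that an
attribute stored through one family of calls can be invisible to lookups
made through the other.]

```python
from xml.dom import minidom

NS = "urn:example:demo"  # hypothetical namespace URI

doc = minidom.getDOMImplementation().createDocument(None, "root", None)
root = doc.documentElement

root.setAttribute("d:plain", "level-1")        # non-NS (DOM Level 1) call
root.setAttributeNS(NS, "d:aware", "level-2")  # NS-aware (DOM Level 2) call

# Lookups that stay within one family find their attribute.
found_by_qname = root.getAttribute("d:plain")
found_by_ns = root.getAttributeNS(NS, "aware")

# Crossing families misses: the Level 1 attribute carries no namespace,
# and the Level 1 lookup matches the full qualified name only.
missed_by_ns = root.getAttributeNS(NS, "plain")
missed_by_local = root.getAttribute("aware")
```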


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


From eliot@isogen.com  Tue May 29 20:27:07 2001
From: eliot@isogen.com (W. Eliot Kimber)
Date: Tue, 29 May 2001 14:27:07 -0500
Subject: [XML-SIG] getAttribute??
References: <200105291920.f4TJKtL05250@localhost.local>
Message-ID: <3B13F80B.C224C543@isogen.com>

Uche Ogbuji wrote:

> I consider this a bug in DOM, not the implementation.  Certainly, the current
> behavior of the Python DOMs is fully conformant.  We've been through this
> dance before.  Basically, as the DOM itself sternly warns: you don't mix NS
> and non-NS DOM usage unless you want trouble.
> 
> I'm not sure that I'd be willing to support any "fixes" that basically hack
> around this DOM confusion.

I'll have to look at the code again, but it looked like a bug to me: the
API of the DOM-1 calls was not changed but they started failing when
using the DOM-2 code, and there was no reason for them to fail. They
failed because the DOM implementation code was not accounting for the
null namespace qualifier in the dictionary. 

But it's possible I've misunderstood how the code is supposed to work
and inappropriately fixed it. 

Cheers,

E.
-- 
. . . . . . . . . . . . . . . . . . . . . . . .

W. Eliot Kimber | Lead Brain

1016 La Posada Dr. | Suite 240 | Austin TX  78752
    T 512.656.4139 |  F 512.419.1860 | eliot@isogen.com

w w w . d a t a c h a n n e l . c o m


From Joern.Schrader@R-KOM.de  Wed May 30 15:44:11 2001
From: Joern.Schrader@R-KOM.de (=?iso-8859-1?Q?J=F6rn_Schrader?=)
Date: Wed, 30 May 2001 16:44:11 +0200
Subject: [XML-SIG] PyExpat and german umlaute
Message-ID: <09DBFC8BDA15D411A41D0090275130BC205CB8@ffserver>

I am trying to use pyexpat with an ISO-8859-1 encoded XML file, but if
the file contains any German umlauts, PyExpat raises an exception:
UnicodeError: ASCII encoding error: ordinal not in range(128).
The PyExpat version is 2.4.

What is wrong with it?


From noreply@sourceforge.net  Wed May 30 17:19:45 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Wed, 30 May 2001 09:19:45 -0700
Subject: [XML-SIG] [ pyxml-Bugs-428712 ] Installer problems: missing features
Message-ID: <E1558hF-0000Xs-00@usw-sf-web1.sourceforge.net>

Bugs item #428712, was updated on 2001-05-30 09:19
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=428712&group_id=6473

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mats Wichmann (mwichmann)
Assigned to: Nobody/Anonymous (nobody)
Summary: Installer problems: missing features

Initial Comment:
This is the result of operator error, but nonetheless...
I accidentally launched an install of PyXML on a w2k system where
it was already installed.  I know the instructions say to remove
old installations first (and this was not even an old installation);
as I said, operator error.  However, at this point:
(a) the existing installation is not detected, and no bailout
option is offered
(b) there's no way to abort the installation once it starts
(c) you are prompted for EACH file as to whether to
replace it or not; there is no "yes to all" (or "no to all"), so one
would potentially have to click "yes" or "no" hundreds of
times to complete the installation.


----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=106473&aid=428712&group_id=6473


From martin@loewis.home.cs.tu-berlin.de  Wed May 30 18:17:38 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 30 May 2001 19:17:38 +0200
Subject: [XML-SIG] PyExpat and german umlaute
In-Reply-To: <09DBFC8BDA15D411A41D0090275130BC205CB8@ffserver> (message from
 =?ISO-8859-1?Q?J=F6rn?= Schrader on Wed, 30 May 2001 16:44:11 +0200)
References: <09DBFC8BDA15D411A41D0090275130BC205CB8@ffserver>
Message-ID: <200105301717.f4UHHcM01025@mira.informatik.hu-berlin.de>

> I am trying to use pyexpat with an ISO-8859-1 encoded XML file, but if
> the file contains any German umlauts, PyExpat raises an exception:
> UnicodeError: ASCII encoding error: ordinal not in range(128).
> The PyExpat version is 2.4.
>
> What is wrong with it?

It is an error in your code. You should not try to write Unicode
objects directly into byte-oriented files; instead, you should invoke
an appropriate .encode method first.

Regards,
Martin
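
[Illustration of the advice above, as a minimal sketch in present-day
Python 3 (where str plays the role of the Unicode object), since the
original poster's code was not shown. The parser hands back Unicode
text whatever the file's declared encoding; the text has to be encoded
explicitly before it goes into a byte-oriented file. The sample
document and the names TextCollector, text, and out are made up.]

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Gather character data delivered by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


# An ISO-8859-1 document containing an umlaut (U+00F6, "o" with diaeresis).
source = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>J\u00f6rn</name>'
).encode("iso-8859-1")

collector = TextCollector()
xml.sax.parseString(source, collector)
text = "".join(collector.chunks)  # Unicode text, not bytes

# Encoding with ASCII fails for the umlaut, much like the reported error.
ascii_error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = exc

# Choosing a suitable encoding explicitly works.
out = io.BytesIO()  # stands in for a byte-oriented file
out.write(text.encode("iso-8859-1"))
```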


From noreply@sourceforge.net  Thu May 31 19:44:44 2001
From: noreply@sourceforge.net (noreply@sourceforge.net)
Date: Thu, 31 May 2001 11:44:44 -0700
Subject: [XML-SIG] [ pyxml-Patches-429102 ] Node.appendChild: raise if ancestor
Message-ID: <E155XR6-0004oD-00@usw-sf-web1.sourceforge.net>

Patches item #429102, was updated on 2001-05-31 11:44
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=429102&group_id=6473

Category: 4Suite
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Karl Anderson (karlanderson)
Assigned to: Nobody/Anonymous (nobody)
Summary: Node.appendChild: raise if ancestor

Initial Comment:
This patch raises a HierarchyRequestErr on an attempt
to appendChild with self or an ancestor.

This is required behavior, and besides, such attempts
were causing hangs during the _4dom_fireMutationEvent
call.

Found when running my PyXML checkout through the
Zope ParsedXML DOM test suite.  With this patch,
PyXML completes the suite without hanging.
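
[Illustration of the check described above. This is not the submitted
patch; it is a minimal sketch in present-day Python 3 with a made-up,
stripped-down Node class, assuming only parentNode and childNodes. It
walks up from the target node and refuses the append if the new child
is the node itself or one of its ancestors, which would otherwise
create a cycle in the tree.]

```python
from xml.dom import HierarchyRequestErr


class Node:
    """Hypothetical minimal tree node, not the 4DOM implementation."""

    def __init__(self, name):
        self.nodeName = name
        self.parentNode = None
        self.childNodes = []

    def appendChild(self, newChild):
        # Walk from self up to the root; newChild must not be on that path.
        ancestor = self
        while ancestor is not None:
            if ancestor is newChild:
                raise HierarchyRequestErr(
                    "cannot append a node to itself or to a descendant"
                )
            ancestor = ancestor.parentNode
        # Detach from the old parent before attaching to the new one.
        if newChild.parentNode is not None:
            newChild.parentNode.childNodes.remove(newChild)
        newChild.parentNode = self
        self.childNodes.append(newChild)
        return newChild


root = Node("root")
child = root.appendChild(Node("child"))
grandchild = child.appendChild(Node("grandchild"))

rejected = []
for target, candidate in ((grandchild, root), (child, child)):
    try:
        target.appendChild(candidate)
    except HierarchyRequestErr:
        rejected.append((target.nodeName, candidate.nodeName))
```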

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=306473&aid=429102&group_id=6473