From derekfountain at yahoo.co.uk  Fri Jul  2 03:23:09 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Fri Jul  2 03:19:25 2004
Subject: [XML-SIG] xmlproc and 4DOM
Message-ID: <200407021523.09172.derekfountain@yahoo.co.uk>

Is it possible to use the xmlproc validating parser to parse an XML document 
into a 4DOM object model? If so, is there an example somewhere of how to do 
it?

-- 
> eatapple
core dump

From derekfountain at yahoo.co.uk  Mon Jul  5 01:37:50 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Mon Jul  5 01:33:32 2004
Subject: [XML-SIG] xmlproc and 4DOM
In-Reply-To: <40E792CE.2020306@doxdesk.com>
References: <200407021523.09172.derekfountain@yahoo.co.uk>
	<40E792CE.2020306@doxdesk.com>
Message-ID: <200407051337.50933.derekfountain@yahoo.co.uk>

On Sunday 04 July 2004 13:17, you wrote:
> > Is it possible to use the xmlproc validating parser to parse an XML
> > document into a 4DOM object model?
>
> Yes. Don't know if this is canonical, but I use:
>  >>> from xml.dom.ext.reader import Sax2
>  >>> Sax2.FromXmlFile('something.xml', validate= 1)

Ah, OK. I didn't think of looking at the Sax interface. Thanks, I'll try it.

> The 1.1 beta release of pxdom which I mentioned then is now out. It's
> slow, but on compliance to DOM specs it's fanatical...

Great, I'll look at that too.

That's my afternoon sorted out. :o)

-- 
> eatapple
core dump

From derekfountain at yahoo.co.uk  Mon Jul  5 06:37:22 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Mon Jul  5 06:33:02 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
Message-ID: <200407051837.22009.derekfountain@yahoo.co.uk>

I've spent the last few days tinkering with DOM trees and the DOM API. A 
couple of years back I wrote a fairly complex application which found the 
data it required using this nextSibling, firstChild, sort of navigation. I 
recall the development experience wasn't a terribly happy one, and I have 
always presumed that XPATH was largely invented to get past all this mucking 
about.

So it occurs to me to ask on the SIG list: do people still use the original 
DOM style navigation? When is it preferable to XPATH? Why, in short, is the 
whole "document hopping" idea not deprecated?!
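
For readers comparing the two styles, here is a minimal sketch using only the Python standard library. The sample document and variable names are invented for illustration, and ElementTree's path support is a limited subset of XPath rather than a full XPath engine.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Invented sample document; no whitespace between elements, so hopping works.
XML = "<order><customer><name>Ada</name></customer><item>apple</item></order>"

# DOM-style "document hopping": every step assumes an exact position.
doc = minidom.parseString(XML)
name_by_hopping = doc.documentElement.firstChild.firstChild.firstChild.nodeValue

# Path-style lookup: names the elements wanted rather than their positions.
root = ET.fromstring(XML)
name_by_path = root.findtext("customer/name")

print(name_by_hopping, name_by_path)
```

The hopping version stops working as soon as a new first child or some indentation is added; the path version keeps working as long as a customer/name pair still exists under the root.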

From lance.ellinghaus at eds.com  Mon Jul  5 11:34:24 2004
From: lance.ellinghaus at eds.com (Ellinghaus, Lance)
Date: Mon Jul  5 11:35:04 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
Message-ID: <79D80D394197764997DC956801CABCEEA9A8EF@ushem204.exse01.exch.eds.com>

I use the DOM navigation all the time.
I do not know about XPATH so I cannot say if I would use that more than DOM.

Lance

Lance Ellinghaus
TWAI Operations Integration/Special Projects
Work Phone: 214-922-6458
Work Cell: 972-877-0409
Nextel: 142*52*5511
Home Phone: 940-271-1274
Email: lance.ellinghaus@eds.com



From derekfountain at yahoo.co.uk  Tue Jul  6 05:40:38 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Tue Jul  6 05:36:10 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <79D80D394197764997DC956801CABCEEA9A8EF@ushem204.exse01.exch.eds.com>
References: <79D80D394197764997DC956801CABCEEA9A8EF@ushem204.exse01.exch.eds.com>
Message-ID: <200407061140.38949.derekfountain@yahoo.co.uk>

On Monday 05 July 2004 23:34, you wrote:
> I use the DOM navigation all the time.
> I do not know about XPATH so I cannot say if I would use that more than
> DOM.

How do you cope with the fact that documents are to some extent unpredictable? 
Do you make heavy use of the methods/attributes which allow you to "feel 
around" to see what's coming (hasChildNodes, nodeType and so on)? Or do you 
only use DOM when you can be guaranteed about the structure of the document, 
and you therefore know that, for example, 
currentNode.firstChild.firstChild.lastChild.firstChild.nodeValue will give 
you text you're after?

I'm starting to wonder if I've been doing the DOM right, as it were. It seems 
to me that when you don't know in advance how many children an element has, 
and you have to start feeling your way around, it makes the code rather 
fragile. Someone adds an extra child where your test cases never had one, and 
boom, the code breaks. Perhaps people code to the DTD, rather than any one 
document itself?
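
One way to make such code less fragile is to navigate by element name and to treat a missing step as a normal outcome. The sketch below assumes the standard library's minidom; the helper names child_elements and descend, and the sample document, are hypothetical.

```python
from xml.dom import minidom

def child_elements(node, tag=None):
    """Return the element children of node, optionally filtered by tag name."""
    return [c for c in node.childNodes
            if c.nodeType == c.ELEMENT_NODE and (tag is None or c.tagName == tag)]

def descend(node, *tags):
    """Follow a chain of child tag names; return None if any step is missing."""
    for tag in tags:
        matches = child_elements(node, tag)
        if not matches:
            return None
        node = matches[0]
    return node

# An extra <note> child and indentation do not disturb the lookup.
doc = minidom.parseString(
    "<report>\n  <note>added later</note>\n"
    "  <summary><title>Q2</title></summary>\n</report>")
title = descend(doc.documentElement, "summary", "title")
missing = descend(doc.documentElement, "summary", "author")
```

Here title is the title element and missing is None, so the caller can decide what an absent element means instead of catching an AttributeError from a chain of firstChild accesses.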
From tpassin at comcast.net  Tue Jul  6 06:04:26 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Tue Jul  6 06:00:54 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <200407061140.38949.derekfountain@yahoo.co.uk>
References: <79D80D394197764997DC956801CABCEEA9A8EF@ushem204.exse01.exch.eds.com>
	<200407061140.38949.derekfountain@yahoo.co.uk>
Message-ID: <40EA24CA.50408@comcast.net>

Derek Fountain wrote:

>>I use the DOM navigation all the time.
>>I do not know about XPATH so I cannot say if I would use that more than
>>DOM.
> 
> 
> How do you cope with the fact that documents are to some extent unpredictable? 
> Do you make heavy use of the methods/attributes which allow you to "feel 
> around" to see what's coming (hasChildNodes, nodeType and so on)? Or do you 
> only use DOM when you can be guaranteed about the structure of the document, 
> and you therefore know that, for example, 
> currentNode.firstChild.firstChild.lastChild.firstChild.nodeValue will give 
> you text you're after?
> 

I tend to use getElementsByTagName() and getElementById() as much as 
possible.  These, along with parentNode, help you avoid - well, OK, 
reduce - that kind of dependence on precise structural details.  Of 
course, these are only useful if you know pretty well what you are 
looking for.  If you do, they reduce the fussiness.

In my own html/xhtml, I find that I also am helped by looking at the 
values of class attributes.  I would find it helpful if there were a 
specific call (in the html dom, anyway), thisNode.getElementsByClassName().
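
Such a call can be approximated in a few lines on top of getElementsByTagName(). This is only a sketch, assuming minidom from the standard library; get_elements_by_class_name is a hypothetical helper, not part of any DOM implementation discussed here.

```python
from xml.dom import minidom

def get_elements_by_class_name(node, class_name):
    """Hypothetical helper: descendants whose class attribute lists class_name."""
    return [el for el in node.getElementsByTagName("*")
            if class_name in el.getAttribute("class").split()]

# Invented sample markup for illustration.
doc = minidom.parseString(
    '<div><p class="intro lead">a</p><p class="body">b</p>'
    '<span class="lead">c</span></div>')
hits = get_elements_by_class_name(doc.documentElement, "lead")
tags = [el.tagName for el in hits]
```

Splitting the attribute value on whitespace means an element with several classes still matches, which a plain string comparison would miss.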

> I'm starting to wonder if I've been doing the DOM right, as it were. It seems 
> to me that when you don't know in advance how many children an element has, 
> and you have to start feeling your way around, it makes the code rather 
> fragile. 

You especially want to avoid getting fooled by whitespace-only text 
nodes and more generally, multiple PCData fragments.
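
A small sketch of both pitfalls, assuming minidom from the standard library; the sample document and the text_of helper are invented for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<list>\n  <item>one</item>\n  <item>two</item>\n</list>")
root = doc.documentElement

# firstChild is the whitespace between <list> and the first <item>,
# not the first <item> element itself.
first = root.firstChild
first_is_whitespace = first.nodeType == first.TEXT_NODE and not first.data.strip()

def text_of(node):
    """Join every text fragment below node, however the parser split them."""
    parts = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)

items = [text_of(el) for el in root.getElementsByTagName("item")]
```

Collecting all text fragments rather than reading firstChild.nodeValue avoids depending on how many text nodes a given parser happens to create.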

Cheers,

Tom P

-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin
From bhartsho at yahoo.com  Tue Jul  6 08:08:19 2004
From: bhartsho at yahoo.com (brett hartshorn)
Date: Tue Jul  6 08:08:24 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <40EA24CA.50408@comcast.net>
Message-ID: <20040706060819.73583.qmail@web13422.mail.yahoo.com>

I use DOM for almost everything; it's great, but you have to extend it to do your searches in an
effective way.
See dotWith, this is my DOM extension. http://opart.org/dotWith/dotWith.py
-brett



From derekfountain at yahoo.co.uk  Tue Jul  6 10:09:15 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Tue Jul  6 10:04:43 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM
Message-ID: <200407061609.15077.derekfountain@yahoo.co.uk>

According to my book (Inside XML, New Riders) a document type declaration can 
have one of several forms:

<!DOCTYPE rootname [DTD]>
<!DOCTYPE rootname SYSTEM url>
<!DOCTYPE rootname SYSTEM url [DTD]>
<!DOCTYPE rootname PUBLIC identifier>
<!DOCTYPE rootname PUBLIC identifier [DTD]>

According to the DOM Level 3 specification, the createDocumentType method of 
the DOMImplementation interface takes 3 parameters: qualified name, publicId 
and systemId. I would have thought that rootname is the qualified name, 
publicId is the identifier (in the last two cases) and systemId is the url 
(in the 2nd and 3rd cases). I could be wrong, but that made sense. :o}

I want to create a doctype like this:

<!DOCTYPE test>

so I'd have thought that, using the 4DOM implementation from Python, I could 
say:

docType = implementation.createDocumentType( "test", None, None )

but when serialised, that doesn't produce a DOCTYPE line at all. This produces 
what I want:

docType =  implementation.createDocumentType( "", "test", "" )

and this:

docType =  implementation.createDocumentType( "1", "test", "2" )

produces:

<!DOCTYPE storage PUBLIC "test" "2">

with the "1" nowhere to be seen.

I have no idea what is going on. Can someone explain? Thanks! :o)
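
For comparison, the same argument order can be tried against minidom from the standard library, which is a different implementation from 4DOM. This sketch says nothing about what 4DOM does internally; the public and system identifiers are made-up examples.

```python
from xml.dom import minidom

impl = minidom.getDOMImplementation()

# Arguments are (qualifiedName, publicId, systemId), in that order.
bare = impl.createDocumentType("test", None, None)
doc = impl.createDocument(None, "test", bare)
serialized = doc.toxml()

# Made-up identifiers, only to show which argument lands in which attribute.
full = impl.createDocumentType("test", "-//Example//DTD Test//EN", "test.dtd")
print(serialized)
print(full.name, full.publicId, full.systemId)
```

With minidom the first argument becomes the name after <!DOCTYPE, and a doctype with neither identifier is still written out when the document is serialized.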


From bkline at rksystems.com  Tue Jul  6 13:30:59 2004
From: bkline at rksystems.com (Bob Kline)
Date: Tue Jul  6 13:03:55 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <200407061140.38949.derekfountain@yahoo.co.uk>
Message-ID: <Pine.LNX.4.44.0407060729240.3701-100000@rksystems.com>

On Tue, 6 Jul 2004, Derek Fountain wrote:

> On Monday 05 July 2004 23:34, you wrote:
> > I use the DOM navigation all the time.
> > I do not know about XPATH so I cannot say if I would use that more than
> > DOM.
> 
> How do you cope with the fact that documents are to some extent
> unpredictable?  Do you make heavy use of the methods/attributes which
> allow you to "feel around" to see what's coming (hasChildNodes,
> nodeType and so on)? Or do you only use DOM when you can be guaranteed
> about the structure of the document, and you therefore know that, for
> example,
> currentNode.firstChild.firstChild.lastChild.firstChild.nodeValue will
> give you text you're after?
> 
> I'm starting to wonder if I've been doing the DOM right, as it were.
> It seems to me that when you don't know in advance how many children
> an element has, and you have to start feeling your way around, it
> makes the code rather fragile. Someone adds an extra child where your
> test cases never had one, and boom, the code breaks. Perhaps people
> code to the DTD, rather than any one document itself?

We use XSL/T to boil down the source document to the pieces we're 
looking for into a predictable structure, then go after it with the DOM 
interface.

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com

From rmunn at pobox.com  Tue Jul  6 15:15:44 2004
From: rmunn at pobox.com (rmunn@pobox.com)
Date: Tue Jul  6 15:15:53 2004
Subject: [XML-SIG] Unwanted behavior in PrettyPrint: &gt; doesn't round-trip
Message-ID: <20040706131544.GD10151@rmunnlfs.dyndns.org>

I'm trying to use xml.dom.ext.PrettyPrint to pretty-print some XML data
to a file, and discovering that it doesn't quite do what I want. Here's
an example:

Python 2.3.4 (#1, Jun  5 2004, 10:44:08) 
[GCC 3.3.3 20040412 (Gentoo Linux 3.3.3-r5, ssp-3.3-7, pie-8.7.5.3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.dom import minidom
>>> from xml.dom.ext import PrettyPrint
>>> doc = minidom.parseString('<description>This contains a nested &lt;b&gt; tag</description>')
>>> doc
<xml.dom.minidom.Document instance at 0x403b8a8c>
>>> PrettyPrint(doc)
<?xml version='1.0' encoding='UTF-8'?>
<description>This contains a nested &lt;b> tag</description>
>>> 

I'd prefer the output to be:
"""<?xml version='1.0' encoding='UTF-8'?>
<description>This contains a nested &lt;b&gt; tag</description>
"""

This XML data is eventually going to be going into an HTML page and sent
to the user's browser. Since the > character doesn't close any tags,
most browsers will probably display it. But with the vast number of
different browsers out there, with slightly different behavior, I'd
rather not rely on "probably". :-( I'd prefer for the &gt; entity to
make it through a round trip (parse to print) untouched.

Is there any way for me to tell PrettyPrint not to dereference character
entities?

-- 
Robin Munn
rmunn@pobox.com
From cbearden at hal-pc.org  Tue Jul  6 17:07:45 2004
From: cbearden at hal-pc.org (Chuck Bearden)
Date: Tue Jul  6 17:07:52 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <200407051837.22009.derekfountain@yahoo.co.uk>
References: <200407051837.22009.derekfountain@yahoo.co.uk>
Message-ID: <20040706150745.GC7473@hal-pc.org>

On Mon, Jul 05, 2004 at 06:37:22PM +0800, Derek Fountain wrote:
> I've spent the last few days tinkering with DOM trees and the DOM API. A 
> couple of years back I wrote a fairly complex application which found the 
> data it required using this nextSibling, firstChild, sort of navigation. I 
> recall the development experience wasn't a terribly happy one, and I have 
> always presumed that XPATH was largely invented to get past all this mucking 
> about.
> 
> So it occurs to me to ask on the SIG list: do people still use the original 
> DOM style navigation? When is it preferable to XPATH? Why, in short, is the 
> whole "document hopping" idea not deprecated?!

My main use of the DOM has been to scrape the USPTO[1] pages containing 
individual records (sample patent[2]).  I don't count elements; rather, 
I use clues that are both structural and semantic.  Typically, the 
elements I want are labeled, either in a preceding table cell, or in 
a preceding center, bold, or italicized text element.  E.g. to find 
the patent number and issue date of a patent, I use 
getElementsByTagName() to find all table cells, then look for one 
whose text content reduces to "United States Patent".  At this point 
I know that the next sibling TD contains the patent number, and that 
the second cell of the succeeding row contains the issue date (go up to
parent TR, go up to parent TBODY, choose the second TD of the second 
child TR).  Or, to find the abstract, I examine the direct children 
of BODY until I find a CENTER element whose text reduces to "Abstract",
whereupon I accumulate text until the next HR.  I'm sure this is very
un-XML-like, but I need this data and the approach works.
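
The label-then-sibling technique can be sketched as follows. This uses minidom from the standard library on a tiny, well-formed table rather than twisted.web.microdom on a real USPTO page; the helper names and the patent number are invented placeholders.

```python
from xml.dom import minidom

def text_of(node):
    """Concatenate all text below node."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == child.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts)

def next_element_sibling(node):
    """Skip whitespace text nodes to reach the next element sibling, if any."""
    sib = node.nextSibling
    while sib is not None and sib.nodeType != sib.ELEMENT_NODE:
        sib = sib.nextSibling
    return sib

def value_after_label(root, label):
    """Find the cell whose text reduces to label; return the next cell's text."""
    for cell in root.getElementsByTagName("td"):
        if " ".join(text_of(cell).split()) == label:
            value_cell = next_element_sibling(cell)
            if value_cell is not None:
                return " ".join(text_of(value_cell).split())
    return None

# Invented sample markup; the number is a placeholder, not a real patent.
PAGE = """<table>
  <tr>
    <td><b>United States  Patent</b></td>
    <td><b>0,000,000</b></td>
  </tr>
</table>"""

number = value_after_label(minidom.parseString(PAGE).documentElement,
                           "United States Patent")
```

Reducing each cell's text to single-spaced words before comparing is what lets the label match regardless of inline markup or line breaks in the page.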

I use twisted.web.microdom with the 'beExtremelyLenient' flag set to
True.  There are some crude HTML flaws that first must be fixed, then I
run the document through mx.Tidy, then I build the extremely lenient 
microdom.

Chuck


[1] http://www.uspto.gov/patft/index.html
[2] http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/search-bool.html&r=4&f=G&l=50&co1=AND&d=ptxt&s1=tobacco&OS=tobacco&RS=tobacco
From jgoldfarb at mitre.org  Tue Jul  6 20:39:54 2004
From: jgoldfarb at mitre.org (Joshua M. Goldfarb)
Date: Tue Jul  6 20:40:12 2004
Subject: [XML-SIG] SAML Request Question
Message-ID: <200407061851.i66IpqJ26308@smtp-mclean.mitre.org>

Skipped content of type multipart/alternative
From mike at skew.org  Tue Jul  6 21:16:47 2004
From: mike at skew.org (Mike Brown)
Date: Tue Jul  6 21:17:23 2004
Subject: [XML-SIG] Unwanted behavior in PrettyPrint: &gt;
	doesn't round-trip
In-Reply-To: <20040706131544.GD10151@rmunnlfs.dyndns.org> "from rmunn@pobox.com
	at Jul 6, 2004 08:15:44 am"
Message-ID: <200407061916.i66JGl4J088888@chilled.skew.org>

rmunn@pobox.com wrote:
> <?xml version='1.0' encoding='UTF-8'?>
> <description>This contains a nested &lt;b> tag</description>
> >>> 
> 
> I'd prefer the output to be:
> """<?xml version='1.0' encoding='UTF-8'?>
> <description>This contains a nested &lt;b&gt; tag</description>
> """
> 
> This XML data is eventually going to be going into an HTML page and sent
> to the user's browser. Since the > character doesn't close any tags,
> most browsers will probably display it. But with the vast number of
> different browsers out there, with slightly different behavior, I'd
> rather not rely on "probably". :-( I'd prefer for the &gt; entity to
> make it through a round trip (parse to print) untouched.

There are no browsers that will have a problem with an unescaped ">".

This is one of those situations where paranoia about web browser behavior is 
not supported by reality, much like when people freak out about putting 
"&amp;" in an href.

> Is there any way for me to tell PrettyPrint not to dereference character
> entities?

Dereferencing occurs during parsing. What you want is to be able to customize
the serialization behavior.

Runtime modifications to xml.dom.ext.Printer.g_charToEntity don't seem to have 
any effect, so I'd say no, it's not possible. Don't worry about it, IMHO.
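
The split between parsing and serialization can be seen with the standard library alone. This sketch uses minidom's own serializer and xml.sax.saxutils.escape, not xml.dom.ext.PrettyPrint, so it illustrates the principle rather than a fix for PrettyPrint.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

doc = minidom.parseString(
    "<description>This contains a nested &lt;b&gt; tag</description>")
doc.documentElement.normalize()  # make sure the text is a single node

# After parsing, the references are gone: the node holds literal characters.
parsed_text = doc.documentElement.firstChild.data

# Escaping is decided again, independently, when the tree is written out.
minidom_output = doc.documentElement.toxml()
escaped_by_hand = escape(parsed_text)

print(parsed_text)
print(minidom_output)
```

Since the tree no longer records whether the source said &gt; or >, the only place to control the output form is the serializer; current CPython minidom and saxutils.escape both happen to write ">" as "&gt;".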
From mina_pp at hotmail.com  Wed Jul  7 03:45:32 2004
From: mina_pp at hotmail.com (Jumpei Aoki)
Date: Wed Jul  7 03:49:20 2004
Subject: [XML-SIG] Questions about XBEL(licenses, namespaces, etc)
Message-ID: <BAY8-DAV50EinbnmW060004fda9@hotmail.com>

Hello,

I program as a hobby, 
and I want to create a bookmark interchange software for my own study.
I think XBEL is a great format to use, and I wish to use this format, 
but a few questions came along and I was wondering if you could help.

1) Is there a namespace for these XBEL elements?
If so, is it "http://pyxml.sourceforge.net/topics/xbel/"?
If it does not exist, could I use the above as the namespace,
or do I have to leave the namespace out?

2) If I have enough skill, I would like to create 
freeware and distribute it over the net.  
Is there a license for XBEL?  In other words,
is there anything that I need to do if I use XBEL in my software?
I read http://pyxml.sourceforge.net/topics/xbel/ but I could not find
any statements about licenses.

3) Am I free to extend XBEL?  I don't think I would need to, but
if there is need, could I extend XBEL and add some other elements?

It would be a great help if you could answer these questions for me.
Thanks for your time.

Jumpei Aoki
From tpassin at comcast.net  Wed Jul  7 04:04:13 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Wed Jul  7 04:00:38 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM
In-Reply-To: <200407061609.15077.derekfountain@yahoo.co.uk>
References: <200407061609.15077.derekfountain@yahoo.co.uk>
Message-ID: <40EB5A1D.1070906@comcast.net>

Derek Fountain wrote:

> According to my book (Inside XML, New Riders) a document type declaration can 
> have one of several forms:
> 
> <!DOCTYPE rootname [DTD]>
> <!DOCTYPE rootname SYSTEM url>
> <!DOCTYPE rootname SYSTEM url [DTD]>
> <!DOCTYPE rootname PUBLIC identifier>
> <!DOCTYPE rootname PUBLIC identifier [DTD]>
> 

Not quite ... the XML Rec requires a system identifier even when there 
is a PUBLIC keyword and identifier.

> According to the DOM Level 3 specification, the createDocumentType method of 
> the DOMImplementation interface takes 3 parameters: qualified name, publicId 
> and systemId. I would have thought that rootname is the qualified name, 
> publicId is the identifier (in the last two cases) and systemId is the url 
> (in the 2nd and 3rd cases). I could be wrong, but that made sense. :o}
> 
> I want to create a doctype like this:
> 
> <!DOCTYPE test>
> 
> and this:
> 
> docType =  implementation.createDocumentType( "1", "test", "2" )
> 
> produces:
> 
> <!DOCTYPE storage PUBLIC "test" "2">
> 
> with the "1" nowhere to be seen.
> 
> I have no idea what is going on. Can someone explain? Thanks! :o)

I can see how it would not know what to do with the "1", since it cannot 
be a legal element name, but some of those outputs look pretty strange, 
don't they?  But what version of the PyXML package are you using?  I 
just created document types like yours and they at least had the proper 
system and public IDs (I did not serialize anything, though)

Cheers,

Tom P

-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin
From derekfountain at yahoo.co.uk  Wed Jul  7 09:58:12 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Wed Jul  7 09:53:30 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM
In-Reply-To: <40EB5A1D.1070906@comcast.net>
References: <200407061609.15077.derekfountain@yahoo.co.uk>
	<40EB5A1D.1070906@comcast.net>
Message-ID: <200407071558.12919.derekfountain@yahoo.co.uk>

On Wednesday 07 July 2004 10:04, Thomas B. Passin wrote:
> Derek Fountain wrote:
> > According to my book (Inside XML, New Riders) a document type declaration
> > can have one of several forms:
> >
> > <!DOCTYPE rootname [DTD]>
> > <!DOCTYPE rootname SYSTEM url>
> > <!DOCTYPE rootname SYSTEM url [DTD]>
> > <!DOCTYPE rootname PUBLIC identifier>
> > <!DOCTYPE rootname PUBLIC identifier [DTD]>
>
> Not quite ... the XML Rec requires a system identifier even when there
> is a PUBLIC keyword and identifier.

Um, I was just quoting the book! Since my example doesn't have a DTD I was 
just after the first option without the optional DTD.

> > I want to create a doctype like this:
> >
> > <!DOCTYPE test>
> >
> > and this:
> >
> > docType =  implementation.createDocumentType( "1", "test", "2" )
> >
> > produces:
> >
> > <!DOCTYPE storage PUBLIC "test" "2">
> >
> > with the "1" nowhere to be seen.
> >
> > I have no idea what is going on. Can someone explain? Thanks! :o)
>
> I can see how it would not know what to do with the "1", since it cannot
> be a legal element name,

Fair point, but it's actually the same with any legal element name.

> but some of those outputs look pretty strange, 
> don't they?  But what version of the PyXML package are you using?  I
> just created document types like yours and they at least had the proper
> system and public IDs (I did not serialize anything, though)

PyXML-0.8.3 on SUSE-9.1. If you didn't serialize anything, how are you seeing 
the system and public IDs?

I'm totally confused. It seems the only way to get a DOCTYPE with the correct 
root element and no SYSTEM or PUBLIC ids is to provide a non blank PUBLIC id:

docType =  implementation.createDocumentType( None, "xxx", None )
document = implementation.createDocument( None, "test", docType )

which gives:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE test>
<test>
...

The "xxx" seems to be ignored, but passing None or a blank string in its place 
means the serialisation doesn't produce a DOCTYPE line at all.

-- 
> eatapple
core dump
From derekfountain at yahoo.co.uk  Wed Jul  7 10:27:45 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Wed Jul  7 10:23:02 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM (4DOM bug?)
In-Reply-To: <200407071558.12919.derekfountain@yahoo.co.uk>
References: <200407061609.15077.derekfountain@yahoo.co.uk>
	<40EB5A1D.1070906@comcast.net>
	<200407071558.12919.derekfountain@yahoo.co.uk>
Message-ID: <200407071627.46005.derekfountain@yahoo.co.uk>

> PyXML-0.8.3 on SUSE-9.1. If you didn't serialize anything, how are you
> seeing the system and public IDs?
>
> I'm totally confused. It seems the only way to get a DOCTYPE with the
> correct root element and no SYSTEM or PUBLIC ids is to provide a non blank
> PUBLIC id:
>
> docType =  implementation.createDocumentType( None, "xxx", None )
> document = implementation.createDocument( None, "test", docType )
>
> which gives:
>
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE test>
> <test>
> ...
>
> The "xxx" seems to be ignored, but passing None or a blank string in its
> place means the serialisation doesn't produce a DOCTYPE line at all.

Replying to myself, it seems to be a problem specific to 4DOM. The serialiser 
in dom/ext/Printer.py is quite clear:

    def visitDocumentType(self, doctype):
        if not doctype.systemId and not doctype.publicId: return

Both the sax/saxutils.py serialiser and the dom/minidom.py serialiser print 
the "<!DOCTYPE" part of the line before considering if a system or public id 
is known about.

That visitDocumentType() method is pretty straightforward, but it doesn't look 
right. The XML-1.1 spec, which I'm not really familiar with, but which isn't 
too hard to read, seems to say that a doctype with neither an External ID nor 
an internal subset is valid, so shouldn't the serialiser be able to produce 
it?
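
As a sketch of the behaviour being asked for, the function below always emits the declaration and only then decides which identifiers to add. It is a standalone illustration, not the PyXML Printer code or a patch for it; the name format_doctype and its parameters are hypothetical, and quoting of identifiers that themselves contain quotes is ignored.

```python
def format_doctype(name, public_id=None, system_id=None):
    """Build a <!DOCTYPE ...> declaration string (illustrative only)."""
    decl = "<!DOCTYPE " + name
    if public_id:
        # XML 1.0 requires a system literal after a public identifier.
        decl += ' PUBLIC "%s" "%s"' % (public_id, system_id or "")
    elif system_id:
        decl += ' SYSTEM "%s"' % system_id
    # With neither identifier the declaration is still produced.
    return decl + ">"

print(format_doctype("test"))
print(format_doctype("test", system_id="test.dtd"))
```

The point of the ordering is that the absence of identifiers shortens the declaration instead of suppressing it.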

/me remains puzzled... :o)

-- 
> eatapple
core dump
From m at mongers.org  Wed Jul  7 11:03:06 2004
From: m at mongers.org (Morten Liebach)
Date: Wed Jul  7 11:09:19 2004
Subject: [XML-SIG] Unwanted behavior in PrettyPrint: &gt;
	doesn't round-trip
In-Reply-To: <200407061916.i66JGl4J088888@chilled.skew.org>
References: <20040706131544.GD10151@rmunnlfs.dyndns.org>
	<200407061916.i66JGl4J088888@chilled.skew.org>
Message-ID: <20040707090328.GB11721@mongers.org>

On 2004-07-06 13:16:47 -0600, Mike Brown wrote:
> rmunn@pobox.com wrote:
> > <?xml version='1.0' encoding='UTF-8'?>
> > <description>This contains a nested &lt;b> tag</description>
> > >>> 
> > 
> > I'd prefer the output to be:
> > """<?xml version='1.0' encoding='UTF-8'?>
> > <description>This contains a nested &lt;b&gt; tag</description>
> > """
> > 
> > This XML data is eventually going to be going into an HTML page and sent
> > to the user's browser. Since the > character doesn't close any tags,
> > most browsers will probably display it. But with the vast number of
> > different browsers out there, with slightly different behavior, I'd
> > rather not rely on "probably". :-( I'd prefer for the &gt; entity to
> > make it through a round trip (parse to print) untouched.
> 
> There are no browsers that will have a problem with an unescaped ">".
> 
> This is one of those situations where paranoia about web browser behavior is 
> not supported by reality, much like when people freak out about putting 
> "&amp;" in an href.

Probably true in this case, as the output, judging from the example, is
going to be valid XML or XHTML of some sort.

If the output is not valid and the browser spots this, it goes into
tagsoup parsing mode, and nobody knows what that means; it's not defined
by any standards or docs.  Then it might help to escape '>', otherwise
not.

Have a nice day
                                 Morten

-- 
http://m.mongers.org/ -- http://gallery.zentience.org/
__END__
From mike at skew.org  Wed Jul  7 11:45:54 2004
From: mike at skew.org (Mike Brown)
Date: Wed Jul  7 11:45:53 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM (4DOM bug?)
In-Reply-To: <200407071627.46005.derekfountain@yahoo.co.uk> "from Derek
	Fountain at Jul 7, 2004 04:27:45 pm"
Message-ID: <200407070945.i679jswi093142@chilled.skew.org>

Derek Fountain wrote:
> Replying to myself, it seems to be a problem specific to 4DOM. The serialiser 
> in dom/ext/Printer.py is quite clear:
> 
>     def visitDocumentType(self, doctype):
>         if not doctype.systemId and not doctype.publicId: return

Definitely a bug.

> 
> Both the sax/saxutils.py serialiser and the dom/minidom.py serialiser print 
> the "<!DOCTYPE" part of the line before considering if a system or public id 
> is known about.
> 
> That visitDocumentType() method is pretty straightforward, but it doesn't look 
> right. The XML-1.1 spec, which I'm not really familiar with, but which isn't 
> too hard to read, seems to say that a doctype with neither an External ID nor 
> an internal subset is valid, so shouldn't the serialiser be able to produce 
> it?

I don't think there has been any effort in PyXML to support XML 1.1, which is 
only 5 months old and doesn't have many advocates in this corner of the net, 
AFAIK.

However PyXML does of course support XML 1.0, which is not deprecated (the W3C 
encourages people to use 1.0 if they don't need the features of 1.1), and all 
3 editions of XML 1.0 agree with XML 1.1 on the almost-empty doctype issue, so 
it should be filed as a bug if you generated one but can't serialize it 
properly.

File it at http://sourceforge.net/tracker/?group_id=6473&atid=106473
and mention this thread, which starts at
http://mail.python.org/pipermail/xml-sig/2004-July/010334.html
and if you feel adventurous, include a patch.

-Mike
From tpassin at comcast.net  Wed Jul  7 23:08:01 2004
From: tpassin at comcast.net (tpassin@comcast.net)
Date: Wed Jul  7 23:08:06 2004
Subject: [XML-SIG] Unwanted behavior in PrettyPrint: &gt;
	doesn't round-trip
Message-ID: <070720042108.10877.40EC663100068A5E00002A7D220076143802079C9C0E9F9B@comcast.net>

Morten Liebach wrote -

> > On 2004-07-06 13:16:47 -0600, Mike Brown wrote:
> > There are no browsers that will have a problem with an unescaped ">".
> > 
> > This is one of those situations where paranoia about web browser behavior is 
> > not supported by reality, much like when people freak out about putting 
> > "&amp;" in an href.
> 
> Probably true in this case, as the output, judging from the example, is
> going to be valid XML or XHTML of some sort.
> 
> If the output is not valid and the browser spots this, it goes into
> tagsoup parsing mode, and nobody knows what that means; it's not defined
> by any standards or docs.  Then it might help to escape '>', otherwise
> not.

But the ">" never *has* to be escaped in xml (except in cdata sections if it appears as part of "]]>" that is actually character data).

Only a pretty old browser would have problems with ">".  The HTML 4.01 Rec says this -

"Similarly, authors should use "&gt;" (ASCII decimal 62) in text instead of ">" to avoid problems with older user agents that incorrectly perceive this as the end of a tag (tag close delimiter) when it appears in quoted attribute values."

Like Mike says, this is a non-issue in practice.
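
To make the point concrete, here is a minimal sketch using Python's standard xml.dom.minidom. The sample markup and the helper name text_of are invented for illustration. It only shows that a literal ">" in text or in a quoted attribute value parses to the same character as "&gt;".

```python
from xml.dom import minidom

# Invented sample: a literal ">" in an attribute value and in text,
# next to an escaped "&gt;" for comparison.
SAMPLE = '<p title="a > b">1 > 0 and 2 &gt; 1</p>'


def text_of(element):
    """Concatenate the data of all text-node children of an element."""
    return "".join(child.data for child in element.childNodes
                   if child.nodeType == child.TEXT_NODE)


doc = minidom.parseString(SAMPLE)
p = doc.documentElement
print(text_of(p))               # text content after parsing
print(p.getAttribute("title"))  # attribute value after parsing
```

Both spellings come back as a plain ">" once parsed, so a serializer is free to pick either one.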

Cheers,

Tom P


From noreply at sourceforge.net  Thu Jul  8 04:05:08 2004
From: noreply at sourceforge.net (SourceForge.net)
Date: Thu Jul  8 04:05:10 2004
Subject: [XML-SIG] [ pyxml-Bugs-986995 ] Serializer doesn't create DOCTYPE
	correctly
Message-ID: <E1BiOHc-0006Ya-00@sc8-sf-web3.sourceforge.net>

Bugs item #986995, was opened at 2004-07-08 10:05
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=986995&group_id=6473

Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Derek Fountain (derekfountain)
Assigned to: Nobody/Anonymous (nobody)
Summary: Serializer doesn't create DOCTYPE correctly

Initial Comment:
The 4DOM serializer doesn't generate DOCTYPE lines
properly. If the doctype node doesn't have a system or
public id, no "<!DOCTYPE..." line is generated. The
code in dom/ext/Printer.py is quite clear:

    def visitDocumentType(self, doctype):
        if not doctype.systemId and not doctype.publicId: return

which is wrong. It should print the "<!DOCTYPE
rootname>" line and then exit.

See the thread here:

http://mail.python.org/pipermail/xml-sig/2004-July/010334.html


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=986995&group_id=6473
From derekfountain at yahoo.co.uk  Thu Jul  8 04:11:34 2004
From: derekfountain at yahoo.co.uk (Derek Fountain)
Date: Thu Jul  8 04:07:27 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM (4DOM bug?)
In-Reply-To: <200407070945.i679jswi093142@chilled.skew.org>
References: <200407070945.i679jswi093142@chilled.skew.org>
Message-ID: <200407081011.34253.derekfountain@yahoo.co.uk>

> Definitely a bug.
> File it at http://sourceforge.net/tracker/?group_id=6473&atid=106473
> and mention this thread, which starts at
> http://mail.python.org/pipermail/xml-sig/2004-July/010334.html
> and if you feel adventurous, include a patch.

Done, although no patch. It should be simple to fix but I'm bound to get some 
detail wrong if I try!
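
For reference, a minimal standalone sketch of the output being asked for. This is not the actual Printer.py code. The function name and signature are made up for illustration, and it assumes the identifiers contain no double quotes.

```python
def doctype_declaration(name, public_id=None, system_id=None):
    """Build a DOCTYPE declaration for the given root element name.

    Illustrative helper only; assumes the identifiers contain no '"'.
    """
    if public_id:
        # XML requires a system literal after a public identifier,
        # so an empty one is written when no system id is known.
        external = ' PUBLIC "%s" "%s"' % (public_id, system_id or "")
    elif system_id:
        external = ' SYSTEM "%s"' % system_id
    else:
        # Neither id is present: the bare declaration is still emitted.
        external = ""
    return "<!DOCTYPE %s%s>" % (name, external)


print(doctype_declaration("rootname"))
print(doctype_declaration("rootname", system_id="root.dtd"))
```

The point is only that the "no ids" case falls through to a bare declaration instead of returning early.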
From tpassin at comcast.net  Thu Jul  8 04:27:20 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Thu Jul  8 04:23:42 2004
Subject: [XML-SIG] DOCTYPEs question with 4DOM
In-Reply-To: <200407071558.12919.derekfountain@yahoo.co.uk>
References: <200407061609.15077.derekfountain@yahoo.co.uk>	<40EB5A1D.1070906@comcast.net>
	<200407071558.12919.derekfountain@yahoo.co.uk>
Message-ID: <40ECB108.6070507@comcast.net>

Derek Fountain wrote:

> If you didn't serialize anything, how are you seeing 
> the system and public IDs?
> 

Simple - from DocumentType.py -

class DocumentType(FtNode):
     nodeType = Node.DOCUMENT_TYPE_NODE

     def __init__(self, name, entities, notations, publicId, systemId):
         FtNode.__init__(self, None)
         self.__dict__['__nodeName'] = name
         self._entities = entities
         self._notations = notations
         self._publicId = publicId
         self._systemId = systemId

I just checked the _publicId and _systemId attributes of the instances I 
had created.  At least I know they started out in life as intended.

Cheers,

Tom P

-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin

From mike at skew.org  Fri Jul  9 07:34:46 2004
From: mike at skew.org (Mike Brown)
Date: Fri Jul  9 07:34:43 2004
Subject: [XML-SIG] updated 4DOM README
Message-ID: <200407090534.i695YkHv005068@chilled.skew.org>

Attached is a new README for 4DOM.
If it looks good, can someone commit it?
The file exists in 2 places in the PyXML source tree:

README.dom
xml/dom/README

Thanks.
-Mike

-------------- next part --------------
4DOM

Description
===========

4DOM is a Python-based implementation of the W3C-recommended
Document Object Model API.  Specifically, 4DOM implements

  * DOM Level 2 Core Version 1.0 (13 November 2000 Recommendation),
  * DOM HTML Level 2 (13 November 2000 Working Draft), and
  * DOM Level 2 Traversal (13 November 2000 Recommendation)

4DOM should work on all platforms supported by Python.


Installation
============

4DOM is built and installed with PyXML's other libraries when you run
setup.py.  It cannot be installed separately.


License/Copyright
=================

For now, 4DOM retains its license and copyright that it had when
developed by Fourthought, Inc. <http://fourthought.com/>.  See the
LICENCE file in the PyXML distribution and/or the COPYRIGHT file in
the xml/dom subdirectory for complete copyright and terms of license.


Documentation
=============

Python's generic DOM API, as shared by both 4DOM and minidom, is
documented at

  http://www.python.org/doc/current/lib/module-xml.dom.html

The DOM APIs that are implemented in 4DOM are specified at

  http://www.w3.org/TR/DOM-Level-2-Core/
  http://www.w3.org/TR/DOM-Level-2-HTML/
  http://www.w3.org/TR/DOM-Level-2-Traversal-Range/

Compliance issues in the 4DOM core API are summarized at

  http://pyxml.sourceforge.net/topics/compliance.html


Development
===========

4DOM is open-source and is maintained by the PyXML development
community.  Most of 4DOM's original development was undertaken by
Fourthought, Inc., from 1998-2000.  4DOM was incorporated into PyXML
starting with the PyXML 0.6.0 release in September 2000, and was
distributed in both PyXML and 4Suite for a time.  It ceased being
distributed in 4Suite starting with the PyXML 0.6.4 release in
February 2001, at which time maintenance was handed over entirely to
the PyXML team.  Most development since then has concentrated on bug
fixes and compatibility issues.


Contact and Support
===================

Please direct comments and questions to the Python XML-SIG mailing
list at xml-sig@python.org. For more information about the list and
to subscribe or browse the archives, visit

  http://mail.python.org/mailman/listinfo/xml-sig

To search the archives, use a search engine and restrict matches to
pages in the domain mail.python.org. For example, in Google, include
the search term site:mail.python.org

You may file bug reports in the PyXML bug tracker at

  http://sourceforge.net/tracker/?group_id=6473&atid=106473

From chris.irish at libertydistribution.com  Fri Jul  9 18:44:23 2004
From: chris.irish at libertydistribution.com (Chris Irish)
Date: Fri Jul  9 18:44:27 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <200407051837.22009.derekfountain@yahoo.co.uk>
References: <200407051837.22009.derekfountain@yahoo.co.uk>
Message-ID: <40EECB67.8020204@libertydistribution.com>

Derek Fountain wrote:

>
>So it occurs to me to ask on the SIG list: do people still use the original 
>DOM style navigation? When is it preferable to XPATH? Why, in short, is the 
>whole "document hopping" idea not deprecated?!
>  
>
This may be a little late, but I try to stay away from both SAX & DOM 
whenever possible.  I too have found XPATH to be the easiest/fastest 
parser I've come across.  When I write some GUI apps that need to do a 
lot of XML parsing I find SAX to be a pain in the butt and DOM will slow 
down my programs quite a lot.  Especially if I need to parse one XML file 
to get info to find or lookup some other XML file.  Maybe it's just me 
but when someone uses a program I've written I don't want them to have 
to sit there wondering if the app froze or something else.  If anyone 
hasn't given XPATH a look/try I recommend it highly.  Chris


From bkline at rksystems.com  Fri Jul  9 19:48:14 2004
From: bkline at rksystems.com (Bob Kline)
Date: Fri Jul  9 19:20:05 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <40EECB67.8020204@libertydistribution.com>
Message-ID: <Pine.LNX.4.44.0407091342570.24273-100000@rksystems.com>

On Fri, 9 Jul 2004, Chris Irish wrote:

> This may be a little late, but I try to stay away from both SAX & DOM
> whenever possible.  I too have found XPATH to be the easiest/fastest
> parser I've come across.  When I write some GUI apps that need to do a
> lot of XML parsing I find SAX to be a pain in the butt and DOM will
> slow down my programs quite a lot.  Especially if I need to parse one
> XML file to get info to find or lookup some other XML file.  Maybe
> it's just me but when someone uses a program I've written I don't want
> them to have to sit there wondering if the app froze or something
> else.  If anyone hasn't given XPATH a look/try I recommend it highly.  

Which implementation of XPath are you using?  Do you have benchmark 
figures showing it to be faster than DOM?  My understanding (which may 
not be correct) is that XPath is generally implemented as a layer over 
the DOM, which of course would mean that by definition it could not be 
faster than the DOM alone.  I'll be happy to have this understanding 
demonstrated to be incorrect, but I'd prefer numbers over anecdotal 
reports.
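
A rough sketch of how such numbers could be gathered, in case anyone wants to try. Several assumptions here: the document is synthetic, xml.etree.ElementTree (part of the standard library in later Python versions) stands in for "a path-expression API", and the function names are made up. It compares two different libraries, so it does not by itself settle the XPath-over-DOM question, and no timings are claimed.

```python
import timeit
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Synthetic document, invented for this sketch: 1000 <group><item/></group>.
XML = "<root>" + "".join(
    '<group><item id="%d">v%d</item></group>' % (i, i) for i in range(1000)
) + "</root>"

dom_doc = minidom.parseString(XML)
et_root = ET.fromstring(XML)


def dom_navigation():
    """Collect item text by walking firstChild/nextSibling links."""
    values = []
    group = dom_doc.documentElement.firstChild
    while group is not None:
        item = group.firstChild
        values.append(item.firstChild.data)
        group = group.nextSibling
    return values


def path_expression():
    """Collect the same text with an ElementTree path expression."""
    return [item.text for item in et_root.findall("./group/item")]


if __name__ == "__main__":
    # Time 100 runs of each approach on the already-parsed trees.
    for func in (dom_navigation, path_expression):
        seconds = timeit.timeit(func, number=100)
        print("%-16s %.4f s per 100 runs" % (func.__name__, seconds))
```

Parsing time is deliberately left out of the timed section; for small files it may well dominate, which would be worth measuring separately.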

Thanks.

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com

From tpassin at comcast.net  Fri Jul  9 20:24:40 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Fri Jul  9 20:20:56 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <Pine.LNX.4.44.0407091342570.24273-100000@rksystems.com>
References: <Pine.LNX.4.44.0407091342570.24273-100000@rksystems.com>
Message-ID: <40EEE2E8.6040604@comcast.net>

Bob Kline wrote:
> 
> Which implementation of XPath are you using?  Do you have benchmark 
> figures showing it to be faster than DOM?  My understanding (which may 
> not be correct) is that XPath is generally implemented as a layer over 
> the DOM, which of course would mean that by definition it could not be 
> faster than the DOM alone.  I'll be happy to have this understanding 
> demonstrated to be incorrect, but I'd prefer numbers over anecdotal 
> reports.

Often an xpath or xslt implementation will use a special, streamlined 
DOM that is much faster than a standard W3C DOM.  That is the case for 
Saxon and 4Suite, for example.

Cheers,

Tom P

-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin
From abra9823 at mail.usyd.edu.au  Mon Jul 12 12:43:52 2004
From: abra9823 at mail.usyd.edu.au (Ajay Brar)
Date: Mon Jul 12 12:44:01 2004
Subject: [XML-SIG] xml.marshal
Message-ID: <01a901c467fd$218aae20$5700a8c0@nazgul>

hi!

I am trying to use the xml.marshal module. Basically, I have defined a few classes and have an object that contains other objects of these classes.
I would like to write this object out as XML, and also read a well-formed XML file and construct the object from it. I have already defined the DTD to be used.
What I am looking for now are examples of how to do it. The PyXML HOWTO mentions subclassing the Marshaller and Unmarshaller classes, but that's all it says. Does anyone have examples of how to do this, or links to tutorials that explain it?

many thanks
cheers


Ajay Brar
From f.probst at uni-muenster.de  Tue Jul 13 14:08:11 2004
From: f.probst at uni-muenster.de (Florian Probst)
Date: Tue Jul 13 14:08:14 2004
Subject: [XML-SIG] WSDL extension
Message-ID: <40F3D0AB.70103@uni-muenster.de>


Hi all,
after reading through the WSDL spec it is still hard to tell whether the 
extension below is valid (legal) or not. Perhaps you can tell within a 
second....

<wsdl:message name="calculatePlumeResponse">
    <wsdl:part name="calculatePlumeReturn" type="xsd:base64Binary"
        SeDA:semRef="http://www.aaa.de/A_CalPl.owl#Plume"/>
</wsdl:message>
<wsdl:message name="calculatePlumeRequest">
    <wsdl:part name="origin" type="tns1:Point"
        SeDA:semRef="http://www.aaa.de/A_CalPl.owl#Origin"/>
    <wsdl:part name="windSpeed" type="xsd:float"
        SeDA:semRef="http://www.aaa.de/A_CalPl.owl#WindSpeed"/>
    <wsdl:part name="windDirection" type="xsd:float"
        SeDA:semRef="http://www.aaa.de/A_CalPl.owl#WindDirection"/>
    <wsdl:part name="emissionRate" type="xsd:float"
        SeDA:semRef="http://www.aaa.de/A_CalPl.owl#WindEmissionRate"/>
</wsdl:message>

We plan to describe the meaning of the terms used in a WSDL with the 
help of ontologies....
Thanks in advance

Florian

-- 
Florian Probst
Institute for Geoinformatics (ifgi)
fon_________+251 83-30058
fax_________+251 83-39763
http://ifgi.uni-muenster.de/~probsfl



From JRBoverhof at lbl.gov  Thu Jul 15 07:18:28 2004
From: JRBoverhof at lbl.gov (Joshua Boverhof)
Date: Thu Jul 15 07:18:37 2004
Subject: [XML-SIG] WSDL extension
In-Reply-To: <40F3D0AB.70103@uni-muenster.de>
References: <40F3D0AB.70103@uni-muenster.de>
Message-ID: <40F613A4.6040008@lbl.gov>

According to the WSDL-1.1 schema a "part" has an "anyAttribute" that can
represent an attribute from any namespace other than the WSDL namespace.
So "SeDA:semRef" is legal as long as "SeDA" does not represent
"http://schemas.xmlsoap.org/wsdl/"

-josh
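
A small sketch of that rule as a check, using xml.dom.minidom. The SeDA namespace URI below is a placeholder (the original post does not give one), and the helper name extension_attributes is made up. The check simply lists attributes whose namespace is neither the WSDL namespace nor a namespace declaration.

```python
from xml.dom import minidom

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"  # namespace of xmlns:* declarations

# Placeholder document; "http://example.org/seda" is an assumed URI.
SAMPLE = """<wsdl:definitions
    xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:SeDA="http://example.org/seda">
  <wsdl:message name="calculatePlumeResponse">
    <wsdl:part name="calculatePlumeReturn" type="xsd:base64Binary"
        SeDA:semRef="http://www.aaa.de/A_CalPl.owl#Plume"/>
  </wsdl:message>
</wsdl:definitions>"""


def extension_attributes(element):
    """Return (namespace, local name, value) for attributes from other namespaces."""
    found = []
    attrs = element.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        ns = attr.namespaceURI
        # Unprefixed attributes such as name/type have no namespace (None).
        if ns and ns not in (WSDL_NS, XMLNS_NS):
            found.append((ns, attr.localName, attr.value))
    return found


doc = minidom.parseString(SAMPLE)
part = doc.getElementsByTagNameNS(WSDL_NS, "part")[0]
print(extension_attributes(part))
```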


Florian Probst wrote:

>
> Hi all,
> after reading through the WSDL spec it is still hard to tell whether 
> the extension below is valid (legal) or not. Perhaps you can tell 
> within a second....
>
> <wsdl:message name="calculatePlumeResponse">
>     <wsdl:part name="calculatePlumeReturn" type="xsd:base64Binary" 
> *SeDA:semRef="http://www.aaa.de/A_CalPl.owl#Plume"*/>
> </wsdl:message>
> <wsdl:message name="calculatePlumeRequest">
>     <wsdl:part name="origin" type="tns1:Point" 
> *SeDA:semRef="http://www.aaa.deA_CalPl.owl#Origin"*/>
>     <wsdl:part name="windSpeed" type="xsd:float" 
> *SeDA:semRef="http://www.aaa.de/A_CalPl.owl#WindSpeed"*/>
>     <wsdl:part name="windDirection" type="xsd:float" 
> *SeDA:semRef="http://www.aaa.de/A_CalPl.owl#WindDirection"*/>
>     <wsdl:part name="emissionRate" type="xsd:float" 
> *SeDA:semRef="http://www.aaa.de/A_CalPl.owl#WindEmissionRate"*/>
> </wsdl:message>
>
> We plan to describe the meaning of the terms used in a WSDL with the 
> help of ontologies....
> Thanks in advance
>
> Florian


From uche.ogbuji at fourthought.com  Tue Jul 20 14:38:10 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Tue Jul 20 14:38:17 2004
Subject: [XML-SIG] Does anyone do DOM navigation anymore?
In-Reply-To: <200407061140.38949.derekfountain@yahoo.co.uk>
References: <79D80D394197764997DC956801CABCEEA9A8EF@ushem204.exse01.exch.eds.com>
	<200407061140.38949.derekfountain@yahoo.co.uk>
Message-ID: <1090327090.11655.10944.camel@borgia>

On Mon, 2004-07-05 at 21:40, Derek Fountain wrote:
> On Monday 05 July 2004 23:34, you wrote:
> > I use the DOM navigation all the time.
> > I do not know about XPATH so I cannot say if I would use that more than
> > DOM.
> 
> How do you cope with the fact that documents are to some extent unpredictable? 
> Do you make heavy use of the methods/attributes which allow you to "feel 
> around" to see what's coming (hasChildNodes, nodeType and so on)? Or do you 
> only use DOM when you can be guaranteed about the structure of the document, 
> and you therefore know that, for example, 
> currentNode.firstChild.firstChild.lastChild.firstChild.nodeValue will give 
> you text you're after?
> 
> I'm starting to wonder if I've been doing the DOM right, as it were. It seems 
> to me that when you don't know in advance how many children an element has, 
> and you have to start feeling your way around, it makes the code rather 
> fragile. Someone adds an extra child where your test cases never had one, and 
> boom, the code breaks. Perhaps people code to the DTD, rather than any one 
> document itself?

Tom already mentioned getElementsByTagNameNS.  I usually use XPath.

But if you're on Python 2.2 or more recent, you can cook up a lot of
neat patterns with generators which avoid the problems you mentioned:

http://www.xml.com/pub/a/2003/01/08/py-xml.html
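
A minimal sketch of the kind of generator pattern meant here, written for current Python (it uses "yield from") and xml.dom.minidom. The function names and the sample document are illustrative, not taken from the article. The idea is that the code asks for elements by name in document order, so an extra child or wrapper element does not break it.

```python
from xml.dom import minidom, Node


def doc_order_iter(node):
    """Yield node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from doc_order_iter(child)


def elements_named(node, name):
    """Yield every descendant element with the given tag name."""
    for n in doc_order_iter(node):
        if n.nodeType == Node.ELEMENT_NODE and n.tagName == name:
            yield n


def text_of(element):
    """Concatenate all text-node data found below an element."""
    return "".join(n.data for n in doc_order_iter(element)
                   if n.nodeType == Node.TEXT_NODE)


# Invented sample: the second <sec> sits inside an unexpected wrapper.
SAMPLE = ("<doc><sec><title>One</title><p>x</p></sec>"
          "<extra><sec><title>Two</title></sec></extra></doc>")

doc = minidom.parseString(SAMPLE)
titles = [text_of(t) for t in elements_named(doc, "title")]
print(titles)
```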


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Perspective on XML: Steady steps spell success with Google - http://www.adtmag.com/article.asp?id=9663
Use XML namespaces with care - http://www-106.ibm.com/developerworks/xml/library/x-namcar.html
Managing XML libraries - http://www.adtmag.com/article.asp?id=9160
Commentary on "Objects. Encapsulation. XML?" - http://www.adtmag.com/article.asp?id=9090
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From uche.ogbuji at fourthought.com  Tue Jul 20 14:49:03 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Tue Jul 20 14:49:06 2004
Subject: [XML-SIG] Questions about XBEL(licenses, namespaces, etc)
In-Reply-To: <BAY8-DAV50EinbnmW060004fda9@hotmail.com>
References: <BAY8-DAV50EinbnmW060004fda9@hotmail.com>
Message-ID: <1090327743.11655.10964.camel@borgia>

On Tue, 2004-07-06 at 19:45, Jumpei Aoki wrote:
> Hello,
> 
> I do programming as a hobby, 
> and I want to create bookmark interchange software for my own study.
> I think XBEL is a great format to use, and I wish to use this format, 
> but a few questions came along and I was wondering if you could help.
> 
> 1) Are there "namespaces" for these XBEL elements?
> If so, is it "http://pyxml.sourceforge.net/topics/xbel/"?
> If it does not exist, could I use the above as the namespace,
> or do I have to leave the namespace out?

There is no namespace.

> 2) If I have enough skill, I would like to create 
> freeware and distribute it over the net.  
> Are there licenses for XBEL?  In other words,
> is there anything that I need to do if I use XBEL in my software?
> I read http://pyxml.sourceforge.net/topics/xbel/ but I could not find
> any statements about licenses.

The XBEL DTD is public domain.

> 3) Am I free to extend XBEL?  I don't think I would need to, but
> if there is need, could I extend XBEL and add some other elements?

Yes, you are free, and in fact XBEL already provides some handy slots
for extension.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From bigsmoke at nodiscipline.net  Tue Jul 20 16:02:53 2004
From: bigsmoke at nodiscipline.net (Rowan Rodrik)
Date: Tue Jul 20 16:00:34 2004
Subject: [XML-SIG] XBEL toolbox
Message-ID: <40FD260D.1010005@nodiscipline.net>

Hi,

I've written

   - a set of XSLT sheets to transform an XBEL file to
     * a directory of XHTML files or
     * one, big monolithic XHTML file; and
   - an XSLT sheet to alphabetically sort an XBEL file.

More info can be found at:

	http://members.home.nl/bigsmoke/en/code.htm#xbel

I hope you can add this to your list of supporting software.

Thanks for your time,

   - Rowan
From jim at drtouma.org  Fri Jul 16 18:13:55 2004
From: jim at drtouma.org (JE Touma)
Date: Tue Jul 20 16:21:00 2004
Subject: [XML-SIG] PyXML and DSD
Message-ID: <200407161613.i6GGDtx8032478@orkney.globat.com>

Hi all,

Does PyXML support parsing for DSD (Document Structure Description) documents?

Thanks,
Jimmy


From dkgunter at lbl.gov  Tue Jul 20 18:04:28 2004
From: dkgunter at lbl.gov (Dan Gunter)
Date: Tue Jul 20 18:04:37 2004
Subject: [XML-SIG] PyXML and DSD
In-Reply-To: <200407161613.i6GGDtx8032478@orkney.globat.com>
References: <200407161613.i6GGDtx8032478@orkney.globat.com>
Message-ID: <40FD428C.4020105@lbl.gov>

If, like me, you haven't heard of this schema language before, I'll save 
y'all the Google lookup:

http://www.brics.dk/DSD/

Looks interesting, but I think the answer is "no". Personally, I would 
advocate getting RELAX-NG support in there first. But I haven't heard of 
any official plans for that either.

-Dan

JE Touma wrote:
> Hi all,
> 
> Does PyXML support parsing for DSD (Document Structure Description) documents?
> 
> Thanks,
> Jimmy


From mehdi.hashemian at spirentcom.com  Thu Jul 22 03:00:19 2004
From: mehdi.hashemian at spirentcom.com (Hashemian, Mehdi)
Date: Thu Jul 22 03:01:10 2004
Subject: [XML-SIG] xml.dom.minidom question
Message-ID: <629E717C12A8694A88FAA6BEF9FFCD44034BD233@brigadoon.spirentcom.com>

Hello,
 
I apologize if I am sending my question to the wrong email list.
 
I am trying to copy a node and its children from one XML document to
another. I clone the node from document A and then append it to the
root node in document B. If the elements of the copied node in document A
are indented with '\n', then each newline becomes three newlines in the
new document. When I remove the newlines from document A, everything
looks fine in document B.
 
I use the toprettyxml function to print the document to a file.
I use the xml.dom.minidom module in Python 2.2.2 on Red Hat 9.0.
 
Appreciate any help,
Mehdi
From rsalz at datapower.com  Fri Jul 23 14:02:08 2004
From: rsalz at datapower.com (Rich Salz)
Date: Fri Jul 23 14:02:12 2004
Subject: [XML-SIG] SAML Request Question
In-Reply-To: <200407061851.i66IpqJ26308@smtp-mclean.mitre.org>
Message-ID: <Pine.LNX.4.44L0.0407230801280.19390-100000@smtp.datapower.com>

>     I'm using Python to access a web service.  The web service takes a SAML
> Request.  Is there an easy way to form this request?

No/not yet.

The web services folks (pywebsvcs-talk@lists.sf.net) are interested
in dsig and security and saml, but i don't think anyone's built
anything for saml yet.
	/r$
--
Rich Salz                  Chief Security Architect
DataPower Technology       http://www.datapower.com
XS40 XML Security Gateway  http://www.datapower.com/products/xs40.html
XML Security Overview      http://www.datapower.com/xmldev/xmlsecurity.html

From uche.ogbuji at fourthought.com  Fri Jul 23 20:30:03 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jul 23 20:30:15 2004
Subject: [XML-SIG] updated 4DOM README
In-Reply-To: <200407090534.i695YkHv005068@chilled.skew.org>
References: <200407090534.i695YkHv005068@chilled.skew.org>
Message-ID: <1090607403.11655.13294.camel@borgia>

On Thu, 2004-07-08 at 23:34, Mike Brown wrote:
> Attached is a new README for 4DOM.
> If it looks good, can someone commit it?
> The file exists in 2 places in the PyXML source tree:
> 
> README.dom
> xml/dom/README

I just did so.  Silly little typo in the check-in message, but no harm,
really.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From uche.ogbuji at fourthought.com  Fri Jul 23 20:33:21 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jul 23 20:33:27 2004
Subject: [XML-SIG] xml.dom.minidom question
In-Reply-To: <629E717C12A8694A88FAA6BEF9FFCD44034BD233@brigadoon.spirentcom.com>
References: <629E717C12A8694A88FAA6BEF9FFCD44034BD233@brigadoon.spirentcom.com>
Message-ID: <1090607601.11655.13297.camel@borgia>

On Wed, 2004-07-21 at 19:00, Hashemian, Mehdi wrote:
> Hello,
>  
> I apologize if I am sending my question to the wrong email list.

It's the right list.

> I am trying to copy a node and its children from one XML document to
> another one. I clone the node from document A and then append it to
> the
> root node in document B. If I have elements of copied node in document
> A
> correctly indented with '\n', in the new document, for each new line I
> have three new lines. When I remove the new Lines from document A,
> every
> thing looks fine in document B.
>  
> I use toprettyxml function to print document to a file.
> I use xml.dom.minidom module in python 2.2.2 on Red Hat 9.0.

So is your problem with the actual composition of cloned text nodes, or
with the way they're handled by prettyprint?  You may want to show some
code in order to clarify the problem for anyone who can help you.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From uche.ogbuji at fourthought.com  Fri Jul 23 20:35:07 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Fri Jul 23 20:35:11 2004
Subject: [XML-SIG] PyXML and DSD
In-Reply-To: <40FD428C.4020105@lbl.gov>
References: <200407161613.i6GGDtx8032478@orkney.globat.com>
	<40FD428C.4020105@lbl.gov>
Message-ID: <1090607707.11655.13299.camel@borgia>

On Tue, 2004-07-20 at 10:04, Dan Gunter wrote:
> If, like me, you haven't heard of this schema language before, I'll save 
> y'all the Google lookup:
> 
> http://www.brics.dk/DSD/
> 
> Looks interesting, but I think the answer is "no". Personally, I would 
> advocate getting RELAX-NG support in there first. But I haven't heard of 
> any official plans for that either.

http://uche.ogbuji.net/akara/nodes/2003-12-30/relaxng-python?xslt=/akara/akara.xslt


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com

From mehdi.hashemian at spirentcom.com  Fri Jul 23 23:10:05 2004
From: mehdi.hashemian at spirentcom.com (Hashemian, Mehdi)
Date: Fri Jul 23 23:10:24 2004
Subject: [XML-SIG] xml.dom.minidom question
Message-ID: <629E717C12A8694A88FAA6BEF9FFCD44034BD237@brigadoon.spirentcom.com>

> So is your problem with the actual composition of cloned text nodes, or
> with the way they're handled by prettyprint?  

I am not sure. Originally I thought it was the way toprettyxml() works, but I
see the same behavior with toxml(). Now I am starting to think that a new
'\n' may be added at every stage (reading and parsing, copying the node,
writing to the new file), which would make it more of a composition problem.

> You may want to show some code in order to clarify 
> the problem for anyone who can help you.
___________________________________________________

from xml.dom import Node
import xml.dom.minidom

impl = xml.dom.minidom.getDOMImplementation()

# New, empty document whose root element is <metaInfo>
newDoc = impl.createDocument(None, u'metaInfo', None)
topEle = newDoc.documentElement

# Parse the original document
fileName = "orig.xml"
file = open(fileName, 'r')
document = xml.dom.minidom.parse(file)

# Deep-copy every <components> element into the new document
for node in document.getElementsByTagName("components"):
    if node.nodeType == Node.ELEMENT_NODE:
        newCompsNode = topEle.appendChild(node.cloneNode(True))

# Write the new document out, pretty-printed
newFileName = "mehdi.xml"
newFile = open(newFileName, 'w')
newFile.write(newDoc.toprettyxml())
___________________________________________________
fileName (orig.xml):

<?xml version="1.0" encoding="UTF-8"?>

<components>
	<component name="xxx.cspec" version="1.0"/>
</components>
____________________________________________________
newFileName (mehdi.xml):

<?xml version="1.0" encoding="UTF-8"?>

<components>


	<component name="xxx.cspec" version="1.0"/>


</components>
___________________________________________________

Thanks,
Mehdi


From fdrake at acm.org  Fri Jul 23 23:23:13 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Jul 23 23:23:22 2004
Subject: [XML-SIG] xml.dom.minidom question
In-Reply-To: <629E717C12A8694A88FAA6BEF9FFCD44034BD237@brigadoon.spirentcom.com>
References: <629E717C12A8694A88FAA6BEF9FFCD44034BD237@brigadoon.spirentcom.com>
Message-ID: <200407231723.13693.fdrake@acm.org>

On Friday 23 July 2004 05:10 pm, Hashemian, Mehdi wrote:
 > fileName = "orig.xml"
 > file = open(fileName, 'r')

I'm not sure if this is it, but any time you open an XML file to pass to a 
parser, it should be opened in binary mode:

file = open(fileName, 'rb')
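
A short sketch of why binary mode matters, assuming current Python and xml.dom.minidom. The sample bytes and the helper name parse_xml_file are invented. Given raw bytes, the parser can honour the encoding named in the XML declaration itself, with no text-mode translation in between.

```python
import os
import tempfile
from xml.dom import minidom


def parse_xml_file(path):
    """Parse an XML file opened in binary mode, so the parser does the decoding."""
    with open(path, "rb") as stream:
        return minidom.parse(stream)


# Invented sample: an ISO-8859-1 document containing one non-ASCII byte.
SAMPLE = b'<?xml version="1.0" encoding="ISO-8859-1"?><name>Caf\xe9</name>'

with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(SAMPLE)

try:
    doc = parse_xml_file(tmp.name)
    print(doc.documentElement.firstChild.data)
finally:
    os.remove(tmp.name)
```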


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From and at doxdesk.com  Sat Jul 24 04:53:39 2004
From: and at doxdesk.com (Andrew Clover)
Date: Sat Jul 24 04:53:50 2004
Subject: [XML-SIG] xml.dom.minidom question
In-Reply-To: <629E717C12A8694A88FAA6BEF9FFCD44034BD237@brigadoon.spirentcom.com>
References: <629E717C12A8694A88FAA6BEF9FFCD44034BD237@brigadoon.spirentcom.com>
Message-ID: <4101CF33.3030208@doxdesk.com>

Mehdi Hashemian <mehdi.hashemian@spirentcom.com> wrote:

> I am not sure. Originally, I thought it is the way toprettyxml() works but I
> see the same behavior with toxml().

I don't, with your example code (tested Python 2.2 and PyXML 0.8.3 
variants). For me, the output file when just the normal toxml() is used 
is the same as the input (non-preservable document-level white space 
issues notwithstanding).

toprettyxml() does seem to insert extra newlines as well as indenting 
white space, and puts in more space than is really necessary IMO. I 
don't know if this can really be said to be 'wrong' though as prettiness 
is in the eye of the beholder, not defined by any standard.
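
The difference can be reproduced with a few lines of minidom. This is a sketch for a current Python, not the original poster's code; the sample markup and the helper name copy_into are assumptions. It copies an indented subtree into a second document and compares toxml() with toprettyxml().

```python
from xml.dom import minidom

# Hypothetical source fragment that is already indented with newlines.
SOURCE = "<a>\n  <b>text</b>\n</a>"

def copy_into(source_xml, target_xml):
    """Deep-copy the source root element under the target's root element."""
    src = minidom.parseString(source_xml)
    dst = minidom.parseString(target_xml)
    # importNode() returns a copy owned by the target document.
    clone = dst.importNode(src.documentElement, True)
    dst.documentElement.appendChild(clone)
    return dst

doc = copy_into(SOURCE, "<root/>")
plain = doc.toxml()                    # existing whitespace is kept as-is
pretty = doc.toprettyxml(indent="  ")  # adds indentation and newlines on top

print(plain)
print(pretty)
```

The whitespace text nodes from the source survive the copy, and toprettyxml() writes its own indentation and newline around each of them, which is where the extra blank lines come from.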

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From aconrad.tlv at magic.fr  Mon Jul 26 16:54:56 2004
From: aconrad.tlv at magic.fr (Alexandre CONRAD)
Date: Mon Jul 26 16:54:56 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
Message-ID: <41051B40.70604@magic.fr>

Hello,

My idea here is to :
1- read an xml file
2- make modifications to it (delete nodes)
3- save it back to a file

The way I build my DOM tree is with :

     from xml.dom.ext.reader import Sax2

     doc = open('playlist.xml', 'rb')   # fromStream() wants an open stream

     # Create Reader object
     reader = Sax2.Reader()

     # Parse the document
     xmldoc = reader.fromStream(doc)

1- That's how they do it in the manual. So now, I have a dom tree. Good.

2- Now, I can traverse and manipulate my tree using a treeWalker. Good.

3- ... But now, I'm having trouble writing my document back to an XML file.

Before, I used to generate XML files with doc.writexml(f) when doc was 
created with 'doc = xml.dom.minidom.Document()'. But now, I have a dom 
tree from the ext.Sax2.Reader() but I can't 'writexml'.

Shouldn't that 'writexml' method be there ? I need to be able to write 
an XML file without all the indentation and newline stuff.

Also, I'm curious how I can tell Sax2.Reader() to ignore indentations 
and newlines when reading from a pretty printed document.

Best regards,
-- 
Alexandre CONRAD - TLV
Research & Development
tel : +33 1 30 80 55 05
fax : +33 1 30 56 55 06
6, rue de la plaine
78860 - SAINT NOM LA BRETECHE
FRANCE

From and-xml at doxdesk.com  Wed Jul 28 07:20:51 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Wed Jul 28 07:21:05 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
In-Reply-To: <41051B40.70604@magic.fr>
References: <41051B40.70604@magic.fr>
Message-ID: <410737B3.5080407@doxdesk.com>

Alexandre Conrad <aconrad.tlv@magic.fr> wrote:

> Before, I used to generate XML files with doc.writexml(f) when doc was 
> created with 'doc = xml.dom.minidom.Document()'. But now, I have a dom 
> tree from the ext.Sax2.Reader() but I can't 'writexml'.

Yes. Trees built by xml.dom.ext.reader are from the PyXML-only 4DOM 
implementation, which is completely different code to the Python/PyXML 
minidom implementation.

There is no standard interface for serialising a document(*) so the 
implementations have different ways of doing it. With 4DOM, instead of 
writexml/toxml you get a separate serialiser object, eg:

   import sys
   from xml.dom.ext.Printer import PrintVisitor
   PrintVisitor(sys.stdout, 'utf-8').visit(document)

* - well, other than the new DOM Level 3 LS standard, which neither
     minidom nor 4DOM yet support. (Insert customary pxdom plug here.)

> Also, I'm curious how I can tell Sax2.Reader() to ignore indentations 
> and newlines when reading from a pretty printed document.

XML normally says whitespace is significant so parsers should not 
generally remove or mangle it.

The (optional) exception is 'element content whitespace', whitespace 
nodes that are inside elements whose content model (defined in the DTD, 
in a <!ELEMENT> declaration) says they contain only other elements, no text.

The Sax2 reader defaults to discarding element content whitespace 
(keepAllWs= 0), but the option doesn't actually work unless you tell it 
to use the DTD-validating parser:

   from xml.dom.ext.reader.Sax2 import Reader
   markup= '<!DOCTYPE x [<!ELEMENT x (x)*>]> <x>    <x/></x>'

   Reader().fromString(markup).documentElement.childNodes

     <NodeList [<Text Node '    '>, <Element Node 'x'>]>

   Reader(validate= 1).fromString(markup).documentElement.childNodes

     <NodeList [<Element Node 'x'>]>

If you're not using a DTD the extra whitespace nodes can't be avoided. 
(Other than with pxdom and the non-standard extension 
'pxdom-assume-element-content'.)
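
If the application itself knows that whitespace-only text is insignificant, the nodes can be removed after parsing. The following is a minimal sketch using minidom rather than 4DOM; the helper name strip_whitespace_nodes is hypothetical, and it is only safe for vocabularies without mixed content.

```python
from xml.dom import minidom, Node

def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace."""
    # Iterate over a copy, because childNodes changes as nodes are removed.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
            child.unlink()
        else:
            strip_whitespace_nodes(child)

# Same markup as above, but without the DTD.
doc = minidom.parseString("<x>    <x/></x>")
strip_whitespace_nodes(doc)
names = [child.nodeName for child in doc.documentElement.childNodes]
print(names)
```

Unlike the validating reader, this discards whitespace everywhere, so it should not be used on documents where an element may legitimately contain only spaces.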

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From mike at skew.org  Wed Jul 28 07:40:16 2004
From: mike at skew.org (Mike Brown)
Date: Wed Jul 28 07:40:16 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
In-Reply-To: <410737B3.5080407@doxdesk.com> "from Andrew Clover at Jul 28, 2004
	02:20:51 pm"
Message-ID: <200407280540.i6S5eG22023285@chilled.skew.org>

Andrew Clover wrote:
> There is no standard interface for serialising a document(*) so the 
> implementations have different ways of doing it. With 4DOM, instead of 
> writexml/toxml you get a separate serialiser object, eg:
> 
>    from xml.dom.ext.Printer import PrintVisitor
>    PrintVisitor(sys.stdout, 'utf-8').visit(document)
> 
> * - well, other than the new DOM Level 3 LS standard, which neither
>      minidom nor 4DOM yet support. (Insert customary pxdom plug here.)

...and the requisite 4Suite plug:

    import sys
    from Ft.Xml.Domlette import Print
    Print(document, stream=sys.stdout, encoding='utf-8')

While we usually only support Domlette in 4Suite, the Domlette serializer is 
actually capable of handling minidom and 4DOM documents, as well. The Domlette 
serializer is much like 4DOM's, but improved a bit (OK, improved a _lot_). We 
do defer to DOM L3 LS where it makes sense to do so, IIRC.

-Mike
From uche.ogbuji at fourthought.com  Wed Jul 28 14:59:00 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Wed Jul 28 14:59:04 2004
Subject: [XML-SIG] ANN: Scimitar 0.5.0
Message-ID: <1091019539.19713.102.camel@borgia>


http://uche.ogbuji.net/tech/4Suite/scimitar

Scimitar is an implementation of ISO Schematron that compiles a
Schematron schema into a Python validator script, making it a
faster and somewhat more flexible approach than the usual XSLT
implementations.

http://www.ascc.net/xml/resource/schematron/schematron.html

Schematron is an XML schema language in which you express a set of rules
that the document must meet, rather than expressing a full grammar for
the XML vocabulary (which is the more common approach to XML schemata).
It is by far the most flexible XML schema language available.

Scimitar supports all of the Schematron 1.5 subset except for keys.
See the TODO file for gaps in Scimitar functionality and convenience,
which are being worked on.

Scimitar is open source, provided under the 4Suite variant of the Apache
license.

The compiler program runs standalone on Python 2.2 or more recent,
although if you are using an earlier version than 2.3, you must also
install Optik 1.4.1 or more recent.  In addition to the above
requirements the generated validators require 4Suite 1.0a3 or more
recent (really only tested with latest 4Suite CVS).


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Perspective on XML: Steady steps spell success with Google - http://www.adtmag.com/article.asp?id=9663
Use XML namespaces with care - http://www-106.ibm.com/developerworks/xml/library/x-namcar.html
Managing XML libraries - http://www.adtmag.com/article.asp?id=9160
Commentary on "Objects. Encapsulation. XML?" - http://www.adtmag.com/article.asp?id=9090
Harold's Effective XML - http://www.ibm.com/developerworks/xml/library/x-think25.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From uche.ogbuji at fourthought.com  Wed Jul 28 18:55:37 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Wed Jul 28 18:55:41 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
In-Reply-To: <41051B40.70604@magic.fr>
References: <41051B40.70604@magic.fr>
Message-ID: <1091033737.19713.124.camel@borgia>

On Mon, 2004-07-26 at 08:54, Alexandre CONRAD wrote:
> Hello,
> 
> My idea here is to :
> 1- read an xml file
> 2- make modifications to it (delete nodes)
> 3- save it back to a file
> 
> The way I build my DOM tree is with :
> 
>      from xml.dom.ext.reader import Sax2

Why do you think you need to do this?  Are you sure you don't want plain
old minidom?  For one thing, you're looking for minidom APIs on a 4DOM
instance (well, almost: it's toxml() on minidom, not writexml() ).

Warning: 4DOM is very slow.  Its claim to fame used to be compliance,
but now it has been superseded in that regard by Andrew Clover's pxdom.

I'm pretty sure I wouldn't recommend 4DOM to anyone for anything right
now.  
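
For comparison, the read / delete / save cycle described in the question can be sketched with plain minidom. This assumes a current Python; the playlist element name and the file names are made up for illustration, and an in-memory buffer stands in for the output file.

```python
import io
from xml.dom import minidom

# Hypothetical playlist; only the <video> element name comes from the thread.
PLAYLIST = "<playlist><video>a.mpg</video><video>b.mpg</video></playlist>"

# 1- read the document
doc = minidom.parseString(PLAYLIST)

# 2- delete the nodes that are no longer wanted
for video in list(doc.getElementsByTagName("video")):
    if video.firstChild.data == "b.mpg":
        video.parentNode.removeChild(video)

# 3- write it back; with no indent/newl arguments nothing is added
out = io.StringIO()
doc.writexml(out)
result = out.getvalue()
print(result)
```

With a real file, the StringIO object would be replaced by a file opened for writing.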


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Perspective on XML: Steady steps spell success with Google - http://www.adtmag.com/article.asp?id=9663
Use XML namespaces with care - http://www-106.ibm.com/developerworks/xml/library/x-namcar.html
Managing XML libraries - http://www.adtmag.com/article.asp?id=9160
Commentary on "Objects. Encapsulation. XML?" - http://www.adtmag.com/article.asp?id=9090
Harold's Effective XML - http://www.ibm.com/developerworks/xml/library/x-think25.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From aconrad.tlv at magic.fr  Thu Jul 29 10:55:30 2004
From: aconrad.tlv at magic.fr (Alexandre CONRAD)
Date: Thu Jul 29 10:55:31 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
In-Reply-To: <1091033737.19713.124.camel@borgia>
References: <41051B40.70604@magic.fr> <1091033737.19713.124.camel@borgia>
Message-ID: <4108BB82.2080502@magic.fr>

>>My idea here is to :
>>1- read an xml file
>>2- make modifications to it (delete nodes)
>>3- save it back to a file
>>
>>The way I build my DOM tree is with :
>>
>>     from xml.dom.ext.reader import Sax2
> 
> 
> Why do you think you need to do this?  Are you sure you don't want plain
> old minidom?  For one thing, you're looking for minidom APIs on a 4DOM
> instance (well, almost: it's toxml() on minidom, not writexml() ).

Well, simply because the official documentation says so :
http://pyxml.sourceforge.net/topics/howto/node18.html

And because after that, I need to traverse my tree as explained in the 
same official documentation here :
http://pyxml.sourceforge.net/topics/howto/node22.html

But apparently, minidom doesn't seem to have any createTreeWalker 
method. I haven't got into it very deep actually. And I'm a newbie 
programmer too.

My project is for generating a video playlist via a web-based interface 
(mod_python).

The originally created XML playlist was used as a test XML file and was 
done before I got into the web-based stuff. And for generating a 
playlist from scratch, I just wrote python scripts and used a
     doc = xml.dom.minidom.Document()

and do some 'doc.appendChild(child)' for manipulation to build my xml. 
After that, I saved the file using 'doc.writexml(indent="", newl="")' 
which let me generate a playlist with no indentation and newline.

After the XML file is generated on the 'admin side', I send the playlist 
on the 'player' that is doing a 'createTreeWalker' on the XML file and 
pass through every node and read videos <video>some_file.mpg</video>. 
Well, it's a little bit more complicated than that because I handle 
scheduling and a lot more, but that gives you the big picture.

That's how I got there. So now, I'm getting my scripts back and adapting 
them for my web-based application in mod_python to be able to easily 
make modifications to the playlist via a GUI. So now, I'm developing the 
'edit playlist' part. So as a player would do, I'd do a
     reader = Sax2.Reader()
     doc = reader.fromStream(playlist_file)

then have a createTreeWalker that would traverse the playlist to display it.

I haven't got into the question of 'how am I going to create a new 
playlist file from scratch ?' yet. I'd probably use the 'doc = 
xml.dom.minidom.Document()' and have some traditional 
'doc.appendChild(child)' to build the 1st element and then save the 
file. Once the 1st node is written on disk, I'll parse the file again 
using Sax2 to display it and be able to add more stuff to the playlist.

> Warning: 4DOM is very slow.  Its claim to fame used to be compliance,
> but now it has been superseded in that regard by Andrew Clover's pxdom.
> 
> I'm pretty sure I wouldn't recommend 4DOM to anyone for anything right
> now.  

Well, I'm just reading the documentation. What would you recommend ?

Best regards,
-- 
Alexandre CONRAD - TLV
Research & Development
tel : +33 1 30 80 55 05
fax : +33 1 30 56 55 06
6, rue de la plaine
78860 - SAINT NOM LA BRETECHE
FRANCE

From xmlsig at codeweld.com  Thu Jul 29 12:07:59 2004
From: xmlsig at codeweld.com (xmlsig@codeweld.com)
Date: Thu Jul 29 12:08:00 2004
Subject: [XML-SIG] xml.dom.ext.reader.HtmlLib memory leak?
Message-ID: <1091095679.4108cc7f0bf70@webmail.codeweld.com>

I've python 2.3.4 on windows xp with PyXML-0.8.3.win32-py2.3

This code leaks substantially

from xml.dom.ext.reader.HtmlLib import FromHtml
import urllib
from xml.dom import ext
s = urllib.urlopen( 'http://www.google.com' ).read()
while True:
    root = FromHtml( s )
    ext.ReleaseNode( root )

However, this does not ( or only very minor )

from xml.dom.ext.reader.Sax2 import Reader
import urllib
from xml.dom import ext
s = urllib.urlopen( 'http://www.infoworld.com/rss/reviews.xml' ).read()
while True:
    reader = Reader()
    root = reader.fromString( s )
    ext.ReleaseNode( root )

Any suggestions?
From uche.ogbuji at fourthought.com  Thu Jul 29 21:34:02 2004
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Thu Jul 29 21:34:09 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
In-Reply-To: <4108BB82.2080502@magic.fr>
References: <41051B40.70604@magic.fr> <1091033737.19713.124.camel@borgia>
	<4108BB82.2080502@magic.fr>
Message-ID: <1091129642.4127.3.camel@borgia>

On Thu, 2004-07-29 at 02:55, Alexandre CONRAD wrote: 
> >>My idea here is to :
> >>1- read an xml file
> >>2- make modifications to it (delete nodes)
> >>3- save it back to a file
> >>
> >>The way I build my DOM tree is with :
> >>
> >>     from xml.dom.ext.reader import Sax2
> > 
> > 
> > Why do you think you need to do this?  Are you sure you don't want plain
> > old minidom?  For one thing, you're looking for minidom APIs on a 4DOM
> > instance (well, almost: it's toxml() on minidom, not writexml() ).
> 
> Well, simply because the official documentation says so :
> http://pyxml.sourceforge.net/topics/howto/node18.html

Honestly, most of the pyxml HOWTO is out of date.  The Akara is my own
attempt to accumulate docs that are not out of date (or at least flag
when they are):

http://uche.ogbuji.net/akara/nodes/2003-01-01/general-section?xslt=/akara/akara.xslt


> And because after that, I need to traverse my tree as explained in the 
> same official documentation here :
> http://pyxml.sourceforge.net/topics/howto/node22.html
> 
> But apparently, minidom doesn't seem to have any createTreeWalker 
> method. I haven't got into it very deep actually. And I'm a newbie 
> programmer too.

Do you think you really need treewalker?  If so, you might try using it
on a minidom, cDomlette or pxdom instance.  I don't know whether that
will work.

But more importantly, could your needs be better met using XPath or
other navigational means?
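
As one example of "other navigational means", a small recursive generator gives a document-order walk over elements on plain minidom, without a TreeWalker. This is only a sketch for a current Python; the function name walk_elements and the sample playlist (including the block element) are assumptions.

```python
from xml.dom import minidom, Node

def walk_elements(node):
    """Yield every descendant element of node in document order."""
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            yield child
            for descendant in walk_elements(child):
                yield descendant

# Hypothetical playlist with one level of nesting.
doc = minidom.parseString(
    "<playlist><block><video>a.mpg</video></block>"
    "<video>b.mpg</video></playlist>"
)

files = [el.firstChild.data for el in walk_elements(doc) if el.tagName == "video"]
print(files)
```

For a flat search like this one, doc.getElementsByTagName("video") would do the same job; the generator is more useful when the traversal has to react to the surrounding structure.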


> My project is for generating a video playlist via a web-based interface 
> (mod_python).

Sounds straightforward.


> The originally created XML playlist was used as a test XML file and was 
> done before I got into the web-based stuff. And for generating a 
> playlist from scratch, I just wrote python scripts and used a
>      doc = xml.dom.minidom.Document()
> 
> and do some 'doc.appendChild(child)' for manipulation to build my xml. 
> After that, I saved the file using 'doc.writexml(indent="", newl="")' 
> which let me generate a playlist with no indentation and newline.
> 
> After the XML file is generated on the 'admin side', I send the playlist 
> on the 'player' that is doing a 'createTreeWalker' on the XML file and 
> pass through every node and read videos <video>some_file.mpg</video>. 
> Well, it's a little bit more complicated than that because I handle 
> scheduling and a lot more, but that gives you the big picture.
> 
> That's how I got there. So now, I'm getting my scripts back and adapting 
> them for my web-based application in mod_python to be able to easily 
> make modifications to the playlist via a GUI. So now, I'm developing the 
> 'edit playlist' part. So as a player would do, I'd do a
>      reader = Sax2.Reader()
>      doc = reader.fromStream(playlist_file)
> 
> then have a createTreeWalker that would traverse the playlist to display it.

There are so many ways to do all this that I'm not sure where to start. 
What are your priorities?  Speed?  Low memory footprint?  Simplicity of
code? Avoiding installing 3rd-party tools?...


> I haven't got into the question of 'how am I going to create a new 
> playlist file from scratch ?' yet. I'd probably use the 'doc = 
> xml.dom.minidom.Document()'

Be sure to use xml.dom.minidom.getDOMImplementation() and its
createDocumentType()/createDocument() methods instead.  Do not use constructors
such as Document() and Element() directly.
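
A minimal sketch of building a playlist from scratch through the DOM implementation object, using minidom's getDOMImplementation(). The playlist root name is an assumption; the video element and file name follow the example given earlier in the thread.

```python
from xml.dom import minidom

# Ask minidom for its DOMImplementation instead of calling Document().
impl = minidom.getDOMImplementation()

# createDocument(namespaceURI, qualifiedName, doctype) also creates the root.
doc = impl.createDocument(None, "playlist", None)

# Nodes are then created through factory methods on the document.
video = doc.createElement("video")
video.appendChild(doc.createTextNode("some_file.mpg"))
doc.documentElement.appendChild(video)

xml_text = doc.documentElement.toxml()
print(xml_text)
```

Going through the factory methods keeps the code portable to other DOM implementations, which is the reason for avoiding the constructors.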


> and have some traditional 
> 'doc.appendChild(child)' to build the 1st element and then save the 
> file. Once the 1st node is written on disk, I'll parse the file again 
> using Sax2 to display it and be able to add more stuff to the playlist.
> 
> > Warning: 4DOM is very slow.  Its claim to fame used to be compliance,
> > but now it has been superseded in that regard by Andrew Clover's pxdom.
> > 
> > I'm pretty sure I wouldn't recommend 4DOM to anyone for anything right
> > now.  
> 
> > Well, I'm just reading the documentation. What would you recommend ?

I need much more info.  minidom?  cDomlette?  pxdom?  A Python "data binding"?
An output library?  Many things would work.

Of course, I write a great deal on all these options, and more in my column:

http://www.xml.com/pub/at/24


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
Perspective on XML: Steady steps spell success with Google - http://www.adtmag.com/article.asp?id=9663
Use XML namespaces with care - http://www-106.ibm.com/developerworks/xml/library/x-namcar.html
Managing XML libraries - http://www.adtmag.com/article.asp?id=9160
Commentary on "Objects. Encapsulation. XML?" - http://www.adtmag.com/article.asp?id=9090
Harold's Effective XML - http://www.ibm.com/developerworks/xml/library/x-think25.html
A survey of XML standards - http://www-106.ibm.com/developerworks/xml/library/x-stand4/

From vdv at dyomedea.com  Thu Jul 29 22:43:22 2004
From: vdv at dyomedea.com (Eric van der Vlist)
Date: Thu Jul 29 22:43:36 2004
Subject: [XML-SIG] Announce: "XML Driven Classes" OSCON paper
Message-ID: <1091133802.2134.13.camel@porteric>

Hi,

Title says it all...

A detailed version of my OSCON presentation "XML Driven Classes in
Python" is available at the following URL: 

http://dyomedea.com/papers/2004-OSCON/

I hope you'll find it useful and would be happy to discuss its content
either on this list or through private emails!

Eric
-- 
Curious about Relax NG? Read my book online.
                                   http://books.xmlschemata.org/relaxng/
Upcoming XML schema languages tutorial:
 - Portland   -half day-   (27/07/2004)        http://masl.to/?E6ED13728
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(ISO) RELAX NG   ISBN:0-596-00421-4 http://oreilly.com/catalog/relax
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------

From and-xml at doxdesk.com  Fri Jul 30 14:24:35 2004
From: and-xml at doxdesk.com (Andrew Clover)
Date: Fri Jul 30 14:24:46 2004
Subject: [XML-SIG] no 'writexml' when building a domTree from ext.Sax2
In-Reply-To: <1091033737.19713.124.camel@borgia>
References: <41051B40.70604@magic.fr> <1091033737.19713.124.camel@borgia>
Message-ID: <410A3E03.8020205@doxdesk.com>

Uche Ogbuji <uche.ogbuji@fourthought.com> wrote:

> I'm pretty sure I wouldn't recommend 4DOM to anyone for anything right
> now.  

4DOM does have other 'claims to fame'. It supports DOM Level 2 
Traversal/Range and HTML, and can use a validating parser. (These 
features might make it into pxdom at some point but it's not going to be 
this week!) It does IMO still have some usage models that aren't 
necessarily served as well by pxdom or the Domlettes.

(I certainly didn't set out to replace 4DOM, anyway. I just wanted a 
solid DOM for my own application. Ah well...)

-- 
Andrew Clover
mailto:and@doxdesk.com
http://www.doxdesk.com/
From ahmad at gharbeia.org  Fri Jul 30 15:15:13 2004
From: ahmad at gharbeia.org (Ahmad Gharbeia)
Date: Fri Jul 30 15:17:01 2004
Subject: [XML-SIG] favicon in XBEL
Message-ID: <LOBBJAPPIEJKBPAKDHOPIEBCCKAA.ahmad@gharbeia.org>

Greetings,
Storing and handling bookmarks in a cross-platform, cross-browser format has been a long-time interest of mine. It was only when I started thinking of undertaking the task myself in XML that I found your work, which I greatly admire.

Allow me to bring one suggestion to your attention:
Why not add the ability to store an encoded 'favicon', or a URI to it in a <bookmark> element?

The fact that the de facto standard favicon format is MS .ICO doesn't help much when displaying web site icons in HTML generated from XBEL, although nothing prevents browsers such as Firebird from displaying icons in the address bar in the other formats they support.

Sincerely,
Ahmad Gharbeia
From fdrake at acm.org  Fri Jul 30 21:27:14 2004
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri Jul 30 21:27:39 2004
Subject: [XML-SIG] favicon in XBEL
In-Reply-To: <LOBBJAPPIEJKBPAKDHOPIEBCCKAA.ahmad@gharbeia.org>
References: <LOBBJAPPIEJKBPAKDHOPIEBCCKAA.ahmad@gharbeia.org>
Message-ID: <200407301527.14592.fdrake@acm.org>

On Friday 30 July 2004 09:15 am, Ahmad Gharbeia wrote:
 > Storing and handling bookmarks in a cross-platform, cross-browser format
 > has been a long-time interest of mine. It was only when I started thinking
 > of undertaking the task myself in XML that I found your work, which I
 > greatly admire.

Thanks!

 > Allow me to bring one suggestion to your attention:
 > Why not add the ability to store an encoded 'favicon', or a URI to it in a
 > <bookmark> element?

This has been discussed before, and is of interest to the Konqueror crew as 
well.  I'll have to dig back in my archives to see what was said.

 > The fact that the de facto standard favicon format is MS .ICO doesn't
 > help much when displaying web site icons in HTML generated from XBEL,
 > although nothing prevents browsers such as Firebird from displaying icons
 > in the address bar in the other formats they support.

That seems like a really minor detail.  The icon will be whatever the website 
provides if it uses a <link> element to identify the icon; favicon.ico is 
just what gets used if you don't care to use an open format.

XBEL, of course, shouldn't care about that.  If you want an icon that can be 
exchanged along with the XBEL document, and displayed from XHTML generated by 
an XSLT transform, you can always load the icon (in whatever format), convert 
to a convenient, open format (I'll suggest PNG), and embed the icon into the 
XBEL document as a data: URL.

I guess the favicon URL could just live in an attribute called favicon.
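
As a rough illustration of the data: URL idea, here is a sketch in Python.
The "favicon" attribute is only the suggestion made above, not part of any
published XBEL DTD, and the helper name make_data_url and the placeholder
bytes are assumptions for the example.

```python
import base64
from xml.dom.minidom import getDOMImplementation


def make_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data: URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


# Placeholder bytes: just the 8-byte PNG signature, not a real image.
# A real tool would load the site's icon and convert it to PNG first.
png_bytes = b"\x89PNG\r\n\x1a\n"

doc = getDOMImplementation().createDocument(None, "xbel", None)
bookmark = doc.createElement("bookmark")
bookmark.setAttribute("href", "http://www.python.org/")
# "favicon" is the attribute name floated above, not an existing XBEL one.
bookmark.setAttribute("favicon", make_data_url(png_bytes))
doc.documentElement.appendChild(bookmark)
```

Because the icon travels inside the attribute value, an XSLT transform can
copy it straight into the src of an XHTML <img> element.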

Are there any other missing features from XBEL that should be added for XBEL 
1.2?  Two things I found when checking my archives were:

1.  Specify how URLs should be encoded in XBEL.
2.  Some sort of merge/include feature.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>

From tpassin at comcast.net  Fri Jul 30 23:57:47 2004
From: tpassin at comcast.net (Thomas B. Passin)
Date: Fri Jul 30 23:52:57 2004
Subject: [XML-SIG] favicon in XBEL
In-Reply-To: <200407301527.14592.fdrake@acm.org>
References: <LOBBJAPPIEJKBPAKDHOPIEBCCKAA.ahmad@gharbeia.org>
	<200407301527.14592.fdrake@acm.org>
Message-ID: <410AC45B.4070504@comcast.net>

Fred L. Drake, Jr. wrote:

> Are there any other missing features from XBEL that should be added
> for XBEL 1.2?  Two things I found when checking my archives were:
> 
> 1.  Specify how URLs should be encoded in XBEL.
> 2.  Some sort of merge/include feature.
>
>   -Fred

Currently I merge bookmarks from a number of browsers.  I do it with
xslt, which also handles de-duplicating to some degree.  Good merging 
and sorting in an xbel utility would be nice.
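
For comparison with the XSLT route, a merge with simple de-duplication
might look like the following Python sketch. It is not the stylesheet
described above: it uses the standard library's ElementTree, treats two
bookmarks as duplicates when their href attributes are identical, and
flattens folders, all of which are simplifying assumptions.

```python
import xml.etree.ElementTree as ET


def merge_bookmarks(xbel_documents):
    """Merge bookmarks from several XBEL strings into one flat <xbel>.

    Duplicates are detected by exact href match (an assumption); the
    first occurrence wins. Folder structure is discarded.
    """
    merged = ET.Element("xbel")
    seen = set()
    for text in xbel_documents:
        root = ET.fromstring(text)
        # iter() walks the whole tree, so bookmarks inside folders count.
        for bookmark in root.iter("bookmark"):
            href = bookmark.get("href")
            if href in seen:
                continue
            seen.add(href)
            merged.append(bookmark)
    # Sort case-insensitively by title; untitled bookmarks sort first.
    merged[:] = sorted(
        merged, key=lambda b: (b.findtext("title") or "").lower()
    )
    return merged
```

A real utility would probably normalize URLs before comparing them and
preserve folders, which is where most of the work lies.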

My biggest problem when working with bookmarks, and even more from sets 
of them, was the encoding of the bookmark titles.  The web pages the 
titles come from can have different encodings, and depending on the 
browser, those encodings may end up in the titles, resulting in 
inconsistent encoding.

Well, maybe that doesn't happen so often anymore (better browsers?), but 
I had to do some hacking on the current xbel code to get it to use 
unicode and stop halting with encoding errors on titles.  I haven't had 
time to post my changes yet, but maybe in a couple of weeks ...
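
The kind of fix involved can be sketched as below. This is not the actual
change to the xbel code, which has not been posted; the function name and
the list of candidate encodings are assumptions chosen for illustration.

```python
def decode_title(raw, encodings=("utf-8", "cp1252")):
    """Turn a byte-string title into unicode without raising.

    Tries each candidate encoding in order (the candidates are an
    assumption), then falls back to latin-1, which maps every byte to
    a code point and therefore cannot fail.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1")
```

Trying UTF-8 first matters: almost any byte string decodes under a
single-byte encoding, so putting one of those first would silently garble
UTF-8 titles.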

Cheers,

Tom P

-- 
Thomas B. Passin
Explorer's Guide to the Semantic Web (Manning Books)
http://www.manning.com/catalog/view.php?book=passin
From contact at gepros.com.tn  Sat Jul 31 01:38:01 2004
From: contact at gepros.com.tn (Gepros)
Date: Sat Jul 31 02:32:42 2004
Subject: [XML-SIG] Prise de contact - Gepros Tunisie - projet de partenariat
Message-ID: <20040731003707.B133C3790A@smtp.gnet.tn>


Hello,

We are contacting you with a view to developing a business relationship with you.

Line of business: Our company, "Gépro's", is an industrial company specializing in the production of cereal-based food products (wheat, corn, rice and multigrain): breakfast cereals and savory snacks.
Our products are also intended for manufacturers of ice cream, yogurt and chocolate.

Production unit: Gépro's is ISO 9001 and HACCP certified and has new, first-rate equipment.

Location: Tunis, Tunisia, North Africa

Our markets: Our distribution network currently covers the Maghreb market (Tunisia, Algeria and Libya) and the Middle East. We are achieving double-digit annual growth and wish to grow further.
We invite you to visit our web site, www.gepros.com.tn, for more information about our company.

Objectives:

1.	We wish to develop distribution partnerships in your markets. Two arrangements are possible:
a.	Distribution of our products under our brand name
b.	Distribution of our products under your brand name, if you have a brand to promote
2.	Development of an industrial partnership. This partnership can take several forms:
a.	development of subcontracting relationships on your behalf
b.	production of your products under your brand name, to be marketed in the Tunisian, Maghreb, African and Middle Eastern markets.

Advantages:
i.	development of your markets
ii.	closer proximity to your target markets
iii.	reduced storage costs and production matched to demand in the respective target markets
iv.	exemption from customs duties in the Maghreb (bilateral agreements) and Middle Eastern markets
v.	investment incentives in Tunisia  http://www.tunisieindustrie.nat.tn
From abra9823 at mail.usyd.edu.au  Sat Jul 31 12:20:39 2004
From: abra9823 at mail.usyd.edu.au (Ajay Brar)
Date: Mon Aug  2 15:50:50 2004
Subject: [XML-SIG] value error when parsing XML
Message-ID: <410B7277.3000609@mail.usyd.edu.au>

hi!

I get a ValueError when parsing an XML file. This is because the parser 
can't find the DTD:
ValueError: unknown url type: ../um_xml/um.dtd

 From what I have discovered in the archives, this happens when your XML 
and DTD files are not in your current directory.
I have the directory structure
home
      user   - this is where I am running the script from
      um_xml - this is where the XML and DTD are

Can someone please tell me how I can work around this problem? The script 
executes fine when the XML and DTD files are in user/, but I don't 
really want to put them there.
Any ideas?
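
One possible workaround, given that the parse succeeds when the files sit
in the current directory, is to make the XML file's directory the current
directory just for the duration of the parse. The sketch below assumes
this; whether it cures the ValueError depends on how the parser resolves
system identifiers, which the message does not show. The names
working_directory and parse_from_own_directory are hypothetical, and
"parse" stands for whatever parsing function the script already calls.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current directory, restoring it afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def parse_from_own_directory(xml_path, parse):
    """Call parse(filename) with the XML file's directory as the cwd.

    `parse` is any callable taking a file name, for example the
    reader function the script already uses.
    """
    directory, filename = os.path.split(os.path.abspath(xml_path))
    with working_directory(directory):
        return parse(filename)
```

With the layout above, the script could stay in user/ and call
parse_from_own_directory("../um_xml/some_file.xml", reader_function),
where both arguments are placeholders for the real file and reader.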

thanks

cheers

-- 
Ajay Brar