From dieter@handshake.de  Fri Jan  1 12:39:28 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 1 Jan 1999 13:39:28 +0100
Subject: [XML-SIG] Namespace support for DOM
In-Reply-To: <13963.51590.804687.473078@amarok.cnri.reston.va.us>
References: <Pine.LNX.3.91.981230224232.15531A-100000@amati.techno.com>
 <13963.51590.804687.473078@amarok.cnri.reston.va.us>
Message-ID: <199901011239.NAA00946@lindm.dm>

Andrew M. Kuchling writes:
 > 	Indeed; I'm frightened of adding some sort of clever,
 > invalidate-namespaces-on-a-move, scheme and opening the door to lots
 > of subtle bugs.  Also, the PyDOM representation has nodes with a list
 > of their children, and no parent pointers; this makes the traversing
 > of ancestors difficult.  I'm somewhat tempted to toss the recently
 > announced WeakDict object into the XML package and add parent
 > pointers, but it may be too late to undertake such a large change to
 > the DOM code.  Any opinions?

If we decide to use the WeakDict module, I could help to adapt the
DOM code.

There is, however, a subtle semantic difference between
the current implementation and a WeakDict-based one.
The difference shows up when we hold a reference to an internal
node of a DOM tree and then delete the tree (still holding
the reference).
In the current implementation, the referenced node contains
a "_document" ("ownerDocument") attribute which protects the complete tree
from being garbage collected.
In a WeakDict-based implementation, the reference to the
"ownerDocument" is naturally implemented as a weak reference (as
are the parent references). Deleting the DOM tree deletes
everything from its root down to the referenced internal node.
Thus, this node loses its parent and its "ownerDocument" reference.
After that it can be used only in a very restricted way.
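
A minimal sketch of this difference, assuming the "weakref" module of
present-day Python rather than the WeakDict module discussed here. The
Node class and its attribute names are made up for illustration and are
not the PyDOM API:

```python
import gc
import weakref


class Node:
    """Toy tree node: strong references down, weak references up."""

    def __init__(self, name, parent=None, document=None):
        self.name = name
        self.children = []  # strong references keep the children alive
        # Weak references do not keep the parent or the document alive.
        self._parent = weakref.ref(parent) if parent is not None else None
        self._document = weakref.ref(document) if document is not None else None

    @property
    def parent(self):
        return self._parent() if self._parent is not None else None

    @property
    def owner_document(self):
        return self._document() if self._document is not None else None

    def append(self, name):
        # Assumption: the root node plays the role of the document here.
        document = self.owner_document or self
        child = Node(name, parent=self, document=document)
        self.children.append(child)
        return child


def demo():
    root = Node("doc")
    section = root.append("section")
    para = section.append("para")
    attached = para.parent is section and para.owner_document is root
    # Drop the tree but keep the inner node.
    del root, section
    gc.collect()
    return attached, para.parent, para.owner_document
```

Once the tree is dropped, the surviving node can no longer reach its
parent or its document; both lookups come back as None instead of
keeping the whole tree alive.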

Dieter


From spepping@scaprea.hobby.nl  Sat Jan  2 19:50:14 1999
From: spepping@scaprea.hobby.nl (Simon Pepping)
Date: Sat, 2 Jan 1999 20:50:14 +0100 (MET)
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <009e01be3367$794b5220$529b90d1@synchrologic.com>
Message-ID: <Pine.LNX.3.95.990102204759.2690A-100000@scaprea.hobby.nl>

On Tue, 29 Dec 1998, Frank McGeough wrote:

> Simon,
> 
> In your doc at :
> http://www.hobby.nl/~scaprea/XML/t173.html
> 
> I believe the
> 
> 2. Call the parser factory with the name of a known driver module, e.g.,
> SAXparser=xml.sax.saxexts.make_parser("xml.sax.drivers.drv_xmlproc")
> 
> is incorrect.  The saxexts.py has the following code in it:
> parser_name = 'xml.sax.drivers.drv_' + parser_name
> 
> therefore you should create the parser with :
> 
> SAXparser=xml.sax.saxexts.make_parser("xmlproc")
> 
> This may have been a recent change. I just started in with
> Python XML stuff. I have downloaded the xml-0_5.zip
> version.

That must indeed be a change from 0.4 to 0.5. I have updated my docs.
Thanks for notifying me.
 
> Thanks for putting that doc on-line. I found it very helpful.

Good to hear.
 
Simon Pepping
email: spepping@scaprea.hobby.nl


From dieter@handshake.de  Mon Jan  4 19:55:35 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Mon, 4 Jan 1999 20:55:35 +0100
Subject: [XML-SIG] Wrong URL: addContentTable
Message-ID: <199901041955.UAA00368@lindm.dm>

Some days ago, I posted:
> Based on our xml-0.5 release, I have made a small tool which adds
> a hierarchical content table to HTML documents:
> 
>      URL:http://www.handshake.de/~dieter/pyprojects/addContentTable.html

Unfortunately, I was unaware that my ISP converts letters in file names
to lower case. Thus, the correct URL is:

       URL:http://www.handshake.de/~dieter/pyprojects/addcontenttable.html

Sorry for the inconvenience!
Dieter


From Jeff.Johnson@icn.siemens.com  Thu Jan  7 19:29:59 1999
From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com)
Date: Thu, 7 Jan 1999 14:29:59 -0500
Subject: [XML-SIG] XmlWriter update
Message-ID: <852566F2.006B115C.00@li01.lm.ssc.siemens.com>

--0__=fvpHh2vcDJQ5CEBan2fJcjeWZtCMC0eNns393xvSWedcGNOO0Rg9JzLq
Content-type: text/plain; charset=us-ascii
Content-Disposition: inline


I moved some code from xml.dom.HtmlWriter up to the super class
xml.dom.XmlWriter so that it is easier to specify where new lines should be
inserted when writing XML.  I hope you like it and it gets into the XML
package or I'll have to rewrite my code :).  This should be fully backwards
compatible too.

The following example shows how the change lets us specify that
'tree' elements get a newline before and after both the start tag and the
end tag, while 'node' elements get a newline only before the start tag.

     nl_dict = {
          'tree':(1,1,1,1),
          'node':(1,0,0,0),
          }
     w = XmlWriter(sys.stdout,nl_dict)
     w.write(doc)

The new xml.dom.writer.py is attached.
(See attached file: writer.py)
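
Judging from the attached writer.py, each tuple reads as (before start
tag, after start tag, before end tag, after end tag). The sketch below
mirrors how _setNewLines appears to sort tag names into four lists. The
function name newline_lists is hypothetical, and the code is written for
current Python, not the Python 1.5 of the attachment:

```python
def newline_lists(nl_dict):
    """Split {tag: (b_start, a_start, b_end, a_end)} into four tag lists."""
    keys = ("before_start", "after_start", "before_end", "after_end")
    lists = {key: [] for key in keys}
    for tag, flags in nl_dict.items():
        for key, flag in zip(keys, flags):
            if flag:
                # The attachment registers both spellings of each tag.
                lists[key].extend([tag, tag.upper()])
    return lists


rules = newline_lists({"tree": (1, 1, 1, 1), "node": (1, 0, 0, 0)})
```

With the dictionary from the example, 'node' ends up only in the
"before_start" list, while 'tree' appears in all four.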

Thanks


--0__=fvpHh2vcDJQ5CEBan2fJcjeWZtCMC0eNns393xvSWedcGNOO0Rg9JzLq
Content-type: application/octet-stream; 
	name="writer.py"
Content-Disposition: attachment; filename="writer.py"
Content-transfer-encoding: base64

IiIid3JpdGVyOiB3cml0ZXIvbGluZWFyaXNlciBjbGFzc2VzIGZvciBkdW1waW5nIERPTSB0cmVl
IHRvIGZpbGUuDQoNCiIiIg0KDQpmcm9tIHhtbC5kb20uY29yZSBpbXBvcnQgKg0KZnJvbSB4bWwu
ZG9tLndhbGtlciBpbXBvcnQgV2Fsa2VyDQppbXBvcnQgc3RyaW5nLCByZSwgc3lzDQoNCmZyb20g
eG1sLnV0aWxzIGltcG9ydCBlc2NhcGUNCgkNCg0KY2xhc3MgT3V0cHV0U3RyZWFtOg0KCWRlZiBf
X2luaXRfXyhzZWxmLCBmaWxlKToNCgkJc2VsZi5maWxlID0gZmlsZQ0KCQlzZWxmLm5ld19saW5l
ID0gMQ0KDQoJZGVmIHdyaXRlKHNlbGYsIHMpOg0KCQkjcHJpbnQgJ3dyaXRlJywgYHNgDQoJCXNl
bGYuZmlsZS53cml0ZShyZS5zdWIoJ1xuKycsICdcbicsIHMpKQ0KCQlpZiBzIGFuZCBzWy0xXSA9
PSAnXG4nOg0KCQkJc2VsZi5uZXdfbGluZSA9IDENCgkJZWxzZToNCgkJCXNlbGYubmV3X2xpbmUg
PSAwDQoNCglkZWYgbmV3TGluZShzZWxmKToNCgkJaWYgbm90IHNlbGYubmV3X2xpbmU6DQoJCQlz
ZWxmLndyaXRlKCdcbicpDQoNCglkZWYgX19kZWxfXyhzZWxmKToNCgkJc2VsZi5maWxlLmZsdXNo
KCkNCg0KDQpjbGFzcyBYbWxXcml0ZXIoV2Fsa2VyKToNCg0KCWRlZiBfX2luaXRfXyhzZWxmLCBz
dHJlYW09c3lzLnN0ZG91dCwgbmxfZGljdD17fSk6DQoJCXNlbGYuc3RyZWFtID0gT3V0cHV0U3Ry
ZWFtKHN0cmVhbSkNCgkJc2VsZi5lbXB0aWVzID0gW10NCgkJc2VsZi5zdHJpcCA9IFtdDQoJCXNl
bGYueG1sX3N0eWxlX2VuZHRhZ3MgPSAxDQoJCXNlbGYubmV3bGluZV9iZWZvcmVfc3RhcnQgPSBb
XQ0KCQlzZWxmLm5ld2xpbmVfYWZ0ZXJfc3RhcnQgPSBbXQ0KCQlzZWxmLm5ld2xpbmVfYmVmb3Jl
X2VuZCA9IFtdDQoJCXNlbGYubmV3bGluZV9hZnRlcl9lbmQgPSBbXQ0KCQlzZWxmLm1hcF9hdHRy
ID0gc2VsZi5tYXBfdGFnID0gbGFtYmRhIHg6IHgNCgkJc2VsZi5fc2V0TmV3TGluZXMobmxfZGlj
dCkNCg0KCWRlZiBfc2V0TmV3TGluZXMoc2VsZixubF9kaWN0KToNCgkJZm9yIGssIHYgaW4gbmxf
ZGljdC5pdGVtcygpOg0KCQkJaWYgdlswXToNCgkJCQlzZWxmLm5ld2xpbmVfYmVmb3JlX3N0YXJ0
LmFwcGVuZChrKQ0KCQkJCXNlbGYubmV3bGluZV9iZWZvcmVfc3RhcnQuYXBwZW5kKHN0cmluZy51
cHBlcihrKSkNCgkJCWlmIHZbMV06DQoJCQkJc2VsZi5uZXdsaW5lX2FmdGVyX3N0YXJ0LmFwcGVu
ZChrKQ0KCQkJCXNlbGYubmV3bGluZV9hZnRlcl9zdGFydC5hcHBlbmQoc3RyaW5nLnVwcGVyKGsp
KQ0KCQkJaWYgdlsyXToNCgkJCQlzZWxmLm5ld2xpbmVfYmVmb3JlX2VuZC5hcHBlbmQoaykNCgkJ
CQlzZWxmLm5ld2xpbmVfYmVmb3JlX2VuZC5hcHBlbmQoc3RyaW5nLnVwcGVyKGspKQ0KCQkJaWYg
dlszXToNCgkJCQlzZWxmLm5ld2xpbmVfYWZ0ZXJfZW5kLmFwcGVuZChrKQ0KCQkJCXNlbGYubmV3
bGluZV9hZnRlcl9lbmQuYXBwZW5kKHN0cmluZy51cHBlcihrKSkNCg0KCWRlZiB3cml0ZShzZWxm
LCB4KToNCgkJaWYgdHlwZSh4KSA9PSB0eXBlKCcnKToNCgkJCXNlbGYuc3RyZWFtLndyaXRlKHgp
DQoJCWVsaWYgdHlwZSh4KSBpbiAodHlwZSgoKSksIHR5cGUoW10pKToNCgkJCWZvciB5IGluIHg6
DQoJCQkJc2VsZi53cml0ZSh5KQ0KCQllbHNlOg0KCQkJc2VsZi53YWxrKHgpDQoNCg0KCWRlZiBz
dGFydEVsZW1lbnQoc2VsZiwgZWxlbWVudCkgOg0KCQlhc3NlcnQgZWxlbWVudC5nZXRfbm9kZVR5
cGUoKSA9PSBFTEVNRU5UDQoNCgkJcyA9ICc8JXMnICUgc2VsZi5tYXBfdGFnKGVsZW1lbnQuZ2V0
X25vZGVOYW1lKCkgKQ0KCQkNCgkJZm9yIG5hbWUsIHZhbHVlIGluIGVsZW1lbnQuZ2V0X2F0dHJp
YnV0ZXMoKS5pdGVtcygpOg0KCQkJcyA9IHMgKyAnICVzPSIlcyInICUgKHNlbGYubWFwX2F0dHIo
bmFtZSksDQoJCQkJCSAgICAgIGVzY2FwZSh2YWx1ZS5nZXRfbm9kZVZhbHVlKCkgKSkNCg0KCQlp
ZiBzZWxmLnhtbF9zdHlsZV9lbmR0YWdzIGFuZCBub3QgZWxlbWVudC5nZXRfY2hpbGROb2Rlcygp
Og0KCQkJcyA9IHMgKyAnLz4nDQoJCWVsc2U6DQoJCQlzID0gcyArICc+Jw0KDQoJCWlmIGVsZW1l
bnQuZ2V0X25vZGVOYW1lKCkgaW4gc2VsZi5uZXdsaW5lX2JlZm9yZV9zdGFydDoNCgkJCXNlbGYu
c3RyZWFtLm5ld0xpbmUoKQ0KCQlzZWxmLnN0cmVhbS53cml0ZShzKQ0KCQlpZiBlbGVtZW50Lmdl
dF9ub2RlTmFtZSgpIGluIHNlbGYubmV3bGluZV9hZnRlcl9zdGFydDoNCgkJCXNlbGYuc3RyZWFt
Lm5ld0xpbmUoKQ0KDQoNCglkZWYgZW5kRWxlbWVudChzZWxmLCBlbGVtZW50KToNCgkJYXNzZXJ0
IGVsZW1lbnQuZ2V0X25vZGVUeXBlKCkgPT0gRUxFTUVOVA0KDQoJCXMgPSAnJw0KCQlpZiBlbGVt
ZW50LmdldF9ub2RlTmFtZSgpIGluIHNlbGYuZW1wdGllcyA6DQoJCQlwYXNzDQoJCWVsaWYgbGVu
KGVsZW1lbnQuZ2V0X2NoaWxkTm9kZXMoKSApID09IDAgYW5kIHNlbGYueG1sX3N0eWxlX2VuZHRh
Z3M6DQoJCQlwYXNzDQoJCWVsc2U6DQoJCQlzID0gcyArICc8LyVzPicgJSBzZWxmLm1hcF90YWco
ZWxlbWVudC5nZXRfbm9kZU5hbWUoKSApDQoNCgkJaWYgZWxlbWVudC5nZXRfbm9kZU5hbWUoKSBp
biBzZWxmLm5ld2xpbmVfYmVmb3JlX2VuZDoNCgkJCXNlbGYuc3RyZWFtLm5ld0xpbmUoKQ0KCQlz
ZWxmLnN0cmVhbS53cml0ZShzKQ0KCQlpZiBlbGVtZW50LmdldF9ub2RlTmFtZSgpIGluIHNlbGYu
bmV3bGluZV9hZnRlcl9lbmQ6DQoJCQlzZWxmLnN0cmVhbS5uZXdMaW5lKCkNCg0KDQoJZGVmIGRv
VGV4dChzZWxmLCB0ZXh0X25vZGUpOg0KCQkjaWYgdGV4dF9ub2RlLmdldFBhcmVudE5vZGUoKS50
YWdOYW1lIGluIHNlbGYuc3RyaXA6DQoJCSMJZGF0YSA9IHN0cmluZy5zdHJpcCh0ZXh0X25vZGUu
ZGF0YSkNCgkJI2Vsc2U6DQoJCWRhdGEgPSB0ZXh0X25vZGUuZ2V0X25vZGVWYWx1ZSgpDQoJCXNl
bGYuc3RyZWFtLndyaXRlKGVzY2FwZShkYXRhKSkNCg0KCWRlZiBkb0NvbW1lbnQoc2VsZiwgbm9k
ZSk6DQoJCXNlbGYuc3RyZWFtLndyaXRlKG5vZGUudG94bWwoKSkNCg0KDQpjbGFzcyBYbWxMaW5l
YXJpc2VyKFhtbFdyaXRlcik6DQoNCglkZWYgX19pbml0X18oc2VsZik6DQoJCWltcG9ydCBTdHJp
bmdJTw0KCQlzZWxmLmJ1ZmZlciA9IFN0cmluZ0lPLlN0cmluZ0lPKCkNCgkJWG1sV3JpdGVyLl9f
aW5pdF9fKHNlbGYsIHNlbGYuYnVmZmVyKQ0KDQoJZGVmIGxpbmVhcmlzZShzZWxmLCBub2RlKToN
CgkJc2VsZi53cml0ZShub2RlKQ0KCQlyZXR1cm4gc2VsZi5idWZmZXIuZ2V0dmFsdWUoKQ0KCQ0K
DQpjbGFzcyBIdG1sV3JpdGVyKFhtbFdyaXRlcik6DQoJZGVmIF9faW5pdF9fKHNlbGYsIHN0cmVh
bT1zeXMuc3Rkb3V0KToNCgkJWG1sV3JpdGVyLl9faW5pdF9fKHNlbGYsIHN0cmVhbSkNCgkJc2Vs
Zi5tYXBfYXR0ciA9IHNlbGYubWFwX3RhZyA9IHN0cmluZy51cHBlcg0KCQlzZWxmLnhtbF9zdHls
ZV9lbmR0YWdzID0gMA0KDQoJCXNlbGYuZW1wdGllcyA9IFsNCgkJCSdpbWcnLCAnYnInLCAnaHIn
LCAnaW5jbHVkZScsICdsaScsICdtZXRhJywgJ2lucHV0JywNCgkJCSdJTUcnLCAnQlInLCAnSFIn
LCAnSU5DTFVERScsICdMSScsICdNRVRBJywgJ0lOUFVUJywNCgkJXQ0KCQlzZWxmLnN0cmlwID0g
Ww0KCQkJJ2gxJywgJ2gyJywgJ2gzJywgJ2g0JywgJ2g1JywgJ2g2JywgDQoJCQknbGknLCAnYnIn
LCAncCcsICdhJywgJ3RpdGxlJywgJ2ZvbnQnLA0KCQkJJ0gxJywgJ0gyJywgJ0gzJywgJ0g0Jywg
J0g1JywgJ0g2JywgDQoJCQknTEknLCAnQlInLCAnUCcsICdBJywgJ1RJVExFJywgJ0ZPTlQnLA0K
CQldDQoNCgkJbmxfZGljdCA9IHsNCgkJCSdoZWFkJzogKDEsIDEsIDEsIDEpLA0KCQkJJ2JvZHkn
OiAoMSwgMSwgMSwgMSksDQoJCQkndGl0bGUnOiAoMSwgMSwgMSwgMSksDQoJCQknbWV0YSc6ICgx
LCAxLCAwLCAwKSwNCgkJCSd1bCc6ICgxLCAxLCAxLCAxKSwNCgkJCSdsaSc6ICgxLCAwLCAwLCAw
KSwNCgkJCSdoMSc6ICgxLCAwLCAwLCAxKSwNCgkJCSdoMic6ICgxLCAwLCAwLCAxKSwNCgkJCSdo
Myc6ICgxLCAwLCAwLCAxKSwNCgkJCSdoNCc6ICgxLCAwLCAwLCAxKSwNCgkJCSdoNSc6ICgxLCAw
LCAwLCAxKSwNCgkJCSdoNic6ICgxLCAwLCAwLCAxKSwNCgkJCSdwJzogKDEsIDAsIDAsIDEpLA0K
CQkJJ2JyJzogKDEsIDEsIDAsIDApLA0KCQl9DQoJCQ0KCQlzZWxmLl9zZXROZXdMaW5lcyhubF9k
aWN0KQ0KDQoNCmNsYXNzIEh0bWxMaW5lYXJpc2VyKEh0bWxXcml0ZXIpOg0KDQoJZGVmIF9faW5p
dF9fKHNlbGYpOg0KCQlpbXBvcnQgU3RyaW5nSU8NCgkJc2VsZi5idWZmZXIgPSBTdHJpbmdJTy5T
dHJpbmdJTygpDQoJCUh0bWxXcml0ZXIuX19pbml0X18oc2VsZiwgc2VsZi5idWZmZXIpDQoNCglk
ZWYgbGluZWFyaXNlKHNlbGYsIG5vZGUpOg0KCQlzZWxmLndyaXRlKG5vZGUpDQoJCXJldHVybiBz
ZWxmLmJ1ZmZlci5nZXR2YWx1ZSgpDQoJDQoNCmNsYXNzIEFTUFdyaXRlcihYbWxXcml0ZXIpOg0K
CWRlZiBfX2luaXRfXyhzZWxmLCByZXBfZmlsZSk6DQoJCXNlbGYucmVwX2RpY3QgPSB7fQ0KCQlz
ZWxmLnBhcnNlUmVwRmlsZShyZXBfZmlsZSkNCgkJDQoNCglkZWYgcGFyc2VSZXBGaWxlKHNlbGYs
IHJlcF9maWxlKToNCgkJcyA9ICcnDQoJCWZvciBsIGluIG9wZW4ocmVwX2ZpbGUpLnJlYWRsaW5l
cygpOg0KCQkJaWYgbFswXSA9PSAnPCc6DQoJCQkJcGx1c19iZWZvcmUgPSAwDQoJCQkJcGx1c19h
ZnRlciA9IDANCgkJCQluID0gc3RyaW5nLmluZGV4KGwsICc+JykNCgkJCQl0YWdfbmFtZSA9IGxb
MTpuXQ0KCQkJCXJlcCA9IHN0cmluZy5zdHJpcChsW24rMTpdKQ0KCQkJCWlmIHJlcCBhbmQgcmVw
WzBdID09ICcrJzoNCgkJCQkJcGx1c19iZWZvcmUgPSAxDQoJCQkJCXJlcCA9IHN0cmluZy5zdHJp
cChyZXBbMTpdKQ0KCQkJCWlmIHJlcCBhbmQgcmVwWy0xXSA9PSAnKyc6DQoJCQkJCXBsdXNfYWZ0
ZXIgPSAxDQoJCQkJCXJlcCA9IHN0cmluZy5zdHJpcChyZXBbOi0xXSkNCgkJCQlpZiByZXA6DQoJ
CQkJCXNlbGYucmVwX2RpY3RbdGFnX25hbWVdID0gKHBsdXNfYmVmb3JlLCBwbHVzX2FmdGVyLCBl
dmFsKHJlcCkpDQoJCQkJZWxzZToNCgkJCQkJc2VsZi5yZXBfZGljdFt0YWdfbmFtZV0gPSAocGx1
c19iZWZvcmUsIHBsdXNfYWZ0ZXIsICcnKQ0KCQkJCQkNCg0KCWRlZiBsaW5lYXJpc2VfZWxlbWVu
dChzZWxmLCBlbGVtZW50KSA6DQoJCWFzc2VydCBlbGVtZW50Lk5vZGVUeXBlID09IEVMRU1FTlQN
CgkJcyA9ICcnDQoJCQ0KCQkjIFN0YXJ0IHRhZw0KCQlwbHVzX2JlZm9yZSwgcGx1c19hZnRlciwg
cmVwbCA9IHNlbGYucmVwX2RpY3RbZWxlbWVudC5nZXRUYWdOYW1lKCldDQoNCgkJaWYgcyBhbmQg
c1stMV0gIT0gJ1xuJyBhbmQgcGx1c19iZWZvcmU6DQoJCQlzID0gcyArICdcbicNCgkJcyA9IHMg
KyByZXBsDQoJCWlmIHMgYW5kIHNbLTFdICE9ICdcbicgYW5kIHBsdXNfYWZ0ZXI6DQoJCQlzID0g
cyArICdcbicNCg0KCQlzMSA9ICcnDQoJCWZvciBjaGlsZCBpbiBlbGVtZW50LmdldENoaWxkcmVu
KCk6DQoJCQlpZiBjaGlsZC5Ob2RlVHlwZSBpcyBFTEVNRU5UOg0KCQkJCXMxID0gczEgKyBzZWxm
LmxpbmVhcmlzZV9lbGVtZW50KGNoaWxkKQ0KCQkJZWxpZiBjaGlsZC5Ob2RlVHlwZSBpcyBURVhU
Og0KCQkJCSNzMSA9IHMxICsgZXNjYXBlKGNoaWxkLmRhdGEpDQoJCQkJczEgPSBzMSArIGNoaWxk
LmRhdGENCgkJCWVsc2UgOg0KCQkJCXMxID0gczEgKyBzdHIoY2hpbGQpDQoJCQ0KCQlzID0gcyAr
IHMxDQoNCgkJIyBFbmQgdGFnLg0KCQlwbHVzX2JlZm9yZSwgcGx1c19hZnRlciwgcmVwbCA9IHNl
bGYucmVwX2RpY3RbJy8nICsgZWxlbWVudC5nZXRUYWdOYW1lKCldDQoJCWlmIHMgYW5kIHNbLTFd
ICE9ICdcbicgYW5kIHBsdXNfYmVmb3JlOg0KCQkJcyA9IHMgKyAnXG4nDQoJCXMgPSBzICsgcmVw
bA0KCQlpZiBzIGFuZCBzWy0xXSAhPSAnXG4nIGFuZCBwbHVzX2FmdGVyOg0KCQkJcyA9IHMgKyAn
XG4nDQoNCgkJcmV0dXJuIHMNCg0K

--0__=fvpHh2vcDJQ5CEBan2fJcjeWZtCMC0eNns393xvSWedcGNOO0Rg9JzLq--


From larsga@ifi.uio.no  Thu Jan  7 20:46:57 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 07 Jan 1999 21:46:57 +0100
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <Pine.LNX.3.95.981228125504.724A-100000@scaprea.hobby.nl>
References: <Pine.LNX.3.95.981228125504.724A-100000@scaprea.hobby.nl>
Message-ID: <wk7luy4xq6.fsf@ifi.uio.no>

* Simon Pepping
| 
| I have spent quite some time with the XML package, mainly with the
| SAX interface and xmlproc. As a result I have written a(nother)
| document about the interaction of an application and a SAX parser,
| and how to write a SAX application. I also wrote a simple
| application to demonstrate it.

Great! I think this document is something people have wanted for some
time, and I think it complements AMK's documentation nicely.
 
| Pr. SAXParseException.__str__ reads:

Thanks! This is now fixed.
 
| Pr. pyexpat does not report the document name with the getSystemId
| method:

Not so strange, since pyexpat does not (as far as I can tell) make
this information available. However, I've now changed the driver to
remember the sysId passed to it as an argument to parse() and report
that. If no sysId is available (parseFile or reset/feed/close were
used) "Unknown" is returned.
 
| Pr. XMLValidator does not use my error handlers:

Hmmm. The code you cite does not match my current development version
nor the version in the public CVS tree. In fact, I suspect this to be
from a quite old release.

| Pr. XMLValidator does not accept spaces around #PCDATA as content in
| an element type declaration:

I cannot replicate this problem with my version and the error message
seems to be from version 0.51 or earlier.

Can you please check if you're using version 0.52? (Check the source
of xmlproc.py.) If not, can you please install 0.52 and try again?

| Pr. drv_xmlproc does not implement a getPublicId method:

This is because the parser does not keep track of this information at
the moment. I've added it to the todo list and hope to get this into
version 0.53.

| Pr. XMLValidator does not accept the following construction in an
| external DTD:

This is correct. Parameter entity references inside markup
declarations are not yet supported by xmlproc.
 
| <!ENTITY %  tekst                   "(#PCDATA|taxon|label|opsomming)*">
| <!ELEMENT   p                       (%tekst;)>
| 
| ERROR: Didn't match [A-Za-z_:][\-A-Za-z_:.0-9]* at waarnemingen.dtd:22:38
| TEXT: '%tekst;)>
| '
| (the declaration of p is line 22)
| 
| I am not sure whether this is allowed. nsgmls gives the warning:
| '#PCDATA in nested model group'. 

It's not. What you've written is equivalent to

<!ELEMENT   p   ((#PCDATA|taxon|label|opsomming)*)>

which does not match the grammar in the XML spec. (See productions
45-46 and 51.) Remove the parentheses around the PE reference and it
should be fine.
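
To see why the extra parentheses matter, here is a rough sketch that
expands the parameter entity by plain text substitution. The helper
expand is hypothetical, and it ignores the padding spaces a real parser
adds around the replacement text:

```python
def expand(declaration, entities):
    """Textually substitute %name; references in a declaration."""
    for name, value in entities.items():
        declaration = declaration.replace("%" + name + ";", value)
    return declaration


entities = {"tekst": "(#PCDATA|taxon|label|opsomming)*"}
nested = expand("<!ELEMENT p (%tekst;)>", entities)  # doubled parentheses
flat = expand("<!ELEMENT p %tekst;>", entities)      # mixed-content form
```

The first declaration ends up with a mixed-content group nested inside
another group; the second keeps the single group that the grammar
expects.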
 
| I hope this is useful.

It most certainly was! This makes it abundantly clear that new
releases of both saxlib and saxdrv are needed, and I'm currently
working on both. The first release will probably take 2-3 weeks.
(Anyone who needs the new versions before then can email me.)

--Lars M.


From larsga@ifi.uio.no  Thu Jan  7 21:11:16 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 07 Jan 1999 22:11:16 +0100
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <Pine.LNX.3.95.981228125504.724A-100000@scaprea.hobby.nl>
References: <Pine.LNX.3.95.981228125504.724A-100000@scaprea.hobby.nl>
Message-ID: <wk4sq24wln.fsf@ifi.uio.no>

* Simon Pepping
| 
| Check it out at http://www.hobby.nl/~scaprea/XML/index.html.

I've now read through your document more thoroughly and have some
corrections to it:

 - the application should _not_ register the driver as a locator.
   The drivers that provide location information do this themselves
   before calling the startDocument method. Those that don't simply 
   do not register a locator.

   In fact, you have no guarantee that the parser and the locator are
   the same object...

 - In <URL:http://www.hobby.nl/~scaprea/XML/t128.html> the last
   paragraph repeats this.

 - In <URL:http://www.hobby.nl/~scaprea/XML/t141.html> you write:

   "A SAX application must contain the handler classes
   DocumentHandler, DTDHandler, EntityResolver, and ErrorHandler,
   which should implement the methods prescribed by the SAX
   specification."

   SAX applications don't have to implement any of these at all, and
   in fact there exists an application that doesn't (saxtimer.py).
   
   So the text should say 'can' instead of 'must'. (A nit, I know, but
   one that would confuse literal-minded people like myself. :)

 - In <URL:http://www.hobby.nl/~scaprea/XML/t141.html> it would be
   nice if you mentioned a little-known fact:

     - saxutils.py defines two useful error handlers: ErrorPrinter and
       ErrorRaiser, both of which can be used directly if their
       behaviour is what your application needs.

 - In the same page you write on the last line:

   "At the end of the parse, your application may stop. Or it may
   continue, especially if it has stored data in memory."

   Perhaps it's better to say:

   "At the end of the parse, the SAX driver returns from the parse or
   parseFile method and your application is free to do whatever it
   wants."

 - In <URL:http://www.hobby.nl/~scaprea/XML/t173.html> you write:

   "The saxexts module defines a ParserFactory class. Upon import it
   makes an instance of it, called XMLParserFactory, which lists all
   known SAX-compliant XML parsers (actually it lists their driver
   modules). [It also makes instances with known validating XML
   parsers, HTML parsers and SGML parsers.]"

   A consequence of this is that you are wrong when you write on the
   previous page that "SAXparser=xml.sax.saxexts.make_parser()" is
   always the best method. It's not if you have special parser
   requirements.

Despite this I think this is a very useful document and that it
definitely fills a need. I've linked to it from the saxlib home page.
(The link may not become visible before tomorrow.)

--Lars M.


From mss@transas.com  Fri Jan  8 16:42:33 1999
From: mss@transas.com (Michael Sobolev)
Date: Fri, 8 Jan 1999 19:42:33 +0300
Subject: [XML-SIG] Unicode stuff in XML package
Message-ID: <19990108194233.A4170@anguish.transas.com>

I have a small suggestion.  The original package (intl??) contained
a nice utility called process_charmap, which helps to deal with
charmap files.  Unfortunately, I could not find it in my python-xml
package (under Debian, version 0.5).  I believe it would be a nice
addition for xml.unicode subpackage.  In case this sounds interesting,
I could provide a modified version of the program (this can be used
as a function).

Cheers,

--
Mike


From michael@graphion.com  Mon Jan 11 21:53:12 1999
From: michael@graphion.com (Michael Sanborn)
Date: Mon, 11 Jan 1999 13:53:12 -0800
Subject: [XML-SIG] Getting a slice from a NodeList?
Message-ID: <369A72C8.3A5D659B@graphion.com>

This is probably really basic, but I'm not understanding an error
message.

What I'm trying to do is to alter an attribute from the first two
members of a NodeList (called "wott_list") returned by
getElementsByTagName. But I don't seem to be able to get a slice of it.
The innermost part of the error message is:

  File "fed.py", line 155, in startElement
    for i in wott_list[0:2]:
  File "C:\Program Files\Python\Lib\UserList.py", line 22, in
__getslice__
    userlist = self.__class__()
TypeError: not enough arguments; expected 4, got 1

What expected 4 arguments? __class__()?

I'm using Python 1.5.2b1 and 0.5 of the XML package on Win95.

Michael Sanborn
Graphion Typesetting


From akuchlin@cnri.reston.va.us  Mon Jan 11 22:04:11 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 11 Jan 1999 17:04:11 -0500 (EST)
Subject: [XML-SIG] Getting a slice from a NodeList?
In-Reply-To: <369A72C8.3A5D659B@graphion.com>
References: <369A72C8.3A5D659B@graphion.com>
Message-ID: <13978.30012.569587.130370@amarok.cnri.reston.va.us>

Michael Sanborn writes:
>What expected 4 arguments? __class__()?
>I'm using Python 1.5.2b1 and 0.5 of the XML package on Win95.

	Yes; this looks like a bug (might be fixed in the CVS tree).
I'll look into it tonight and post a patch.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Nothing is built on stone; all is built on sand, but we must build as if the
sand were stone.
    -- Jorge Luis Borges


From larsga@ifi.uio.no  Mon Jan 11 22:42:04 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 11 Jan 1999 23:42:04 +0100
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <009e01be3367$794b5220$529b90d1@synchrologic.com>
References: <009e01be3367$794b5220$529b90d1@synchrologic.com>
Message-ID: <wkemp11lfn.fsf@ifi.uio.no>

* Frank McGeough
|
| I believe the
| 
| 2. Call the parser factory with the name of a known driver module, e.g.,
| SAXparser=xml.sax.saxexts.make_parser("xml.sax.drivers.drv_xmlproc")
| 
| is incorrect.  The saxexts.py has the following code in it:
| parser_name = 'xml.sax.drivers.drv_' + parser_name
| 
| therefore you should create the parser with :
| 
| SAXparser=xml.sax.saxexts.make_parser("xmlproc")

It now turns out that the version of saxexts.py in the CVS tree had
this change made by mistake. In other words, the behaviour that is
described here is not correct and so the document should remain as it
is (or was). I believe the error is fixed in the CVS tree now, but
can't check because of a local problem.

--Lars M.


From spepping@scaprea.hobby.nl  Wed Jan 13 19:05:32 1999
From: spepping@scaprea.hobby.nl (Simon Pepping)
Date: Wed, 13 Jan 1999 20:05:32 +0100 (MET)
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <wk7luy4xq6.fsf@ifi.uio.no>
Message-ID: <Pine.LNX.3.95.990113200345.267D-100000@scaprea.hobby.nl>

On 7 Jan 1999, Lars Marius Garshol wrote:

> | Pr. XMLValidator does not use my error handlers:
> 
> Hmmm. The code you cite does not match my current development version
> nor the version in the public CVS tree. In fact, I suspect this to be
> from a quite old release.
> 
> | Pr. XMLValidator does not accept spaces around #PCDATA as content in
> | an element type declaration:
> 
> I cannot replicate this problem with my version and the error message
> seems to be from version 0.51 or earlier.
> 
> Can you please check if you're using version 0.52? (Check the source
> of xmlproc.py.) If not, can you please install 0.52 and try again?

Now using version 0.52. The error handlers are still not mine, e.g.:

ERROR: Not a valid name at waarnemingen.dtd:22:38
TEXT: '%tekst;)>
'

The other problem has gone.

Simon Pepping
email: spepping@scaprea.hobby.nl


From spepping@scaprea.hobby.nl  Wed Jan 13 19:05:43 1999
From: spepping@scaprea.hobby.nl (Simon Pepping)
Date: Wed, 13 Jan 1999 20:05:43 +0100 (MET)
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <wk4sq24wln.fsf@ifi.uio.no>
Message-ID: <Pine.LNX.3.95.990113200120.267C-100000@scaprea.hobby.nl>

On 7 Jan 1999, Lars Marius Garshol wrote:

> I've now read through your document more thoroughly and have some
> corrections to it:
> 
>  - the application should _not_ register the driver as a locator.
>    The drivers that provide location information do this themselves
>    before calling the startDocument method. Those that don't simply 
>    do not register a locator.
> 
>    In fact, you have no guarantee that the parser and the locator are
>    the same object...

I had missed that, and I will modify my document as per your
suggestion.

Note, however, that drv_pyexpat.py does not register a locator, while,
if one registers the parser as the locator, it does implement the
locator methods (except for the fact that it does not report the
document, as noted before).

Do I understand correctly that the availability of a locator is not
guaranteed, so that the application should test for this?  Or should
every SAX parser provide at least dummy locator methods so that calls
to them do not generate errors, e.g. by inheriting from
saxlib.Locator? Then drv_pyexpat.py should register the parser as the
locator. Currently it generates an attribute error if one tries to use
the locator methods.

>  - In <URL:http://www.hobby.nl/~scaprea/XML/t128.html> the last
>    paragraph repeats this.
> 
>  - In <URL:http://www.hobby.nl/~scaprea/XML/t141.html> you write:
> 
>    "A SAX application must contain the handler classes
>    DocumentHandler, DTDHandler, EntityResolver, and ErrorHandler,
>    which should implement the methods prescribed by the SAX
>    specification."
> 
>    SAX applications don't have to implement any of these at all, and
>    in fact there exists an application that doesn't (saxtimer.py).
>    
>    So the text should say 'can' instead of 'must'. (A nit, I know, but
>    one that would confuse literal-minded people like myself. :)

I mean to say that the application in principle should have such
methods, because the parser expects them and makes calls to them. The
following paragraphs explain that these methods can be provided by
inheriting the dummy methods from the provided sax library.  I still
feel that I state this correctly.

I will follow your other suggestions. Thanks for your critical
comments.

Simon Pepping
email: spepping@scaprea.hobby.nl


From fredrik@pythonware.com  Thu Jan 14 23:26:19 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 15 Jan 1999 00:26:19 +0100
Subject: [XML-SIG] XML-RPC client library
Message-ID: <016901be4015$4bee1e60$f29b12c2@pythonware.com>

<F>

Don't recall if someone else has done something
similar, but I just whipped together a small client
library for Frontier's XML-RPC protocol:

    http://www.pythonware.com/madscientist/

This one is completely self-contained (works on
top of any standard 1.5 installation).  I'm sure
Dave Winer would be really happy if this made
it into the XML SIG distribution some day ;-)

</F>
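
For readers who want to see what an XML-RPC call looks like on the wire,
here is a small sketch using xmlrpc.client from today's standard
library, which is assumed here to be the descendant of this client. The
method name "sample.add" is made up, and nothing is sent over the
network:

```python
import xmlrpc.client

# Marshal a call with two integer parameters into an XML-RPC request body.
request = xmlrpc.client.dumps((2, 3), methodname="sample.add")

# Unmarshal it again, as a server would on receipt.
params, method = xmlrpc.client.loads(request)
```

The request is an ordinary XML document, so it can be inspected or
logged like any other string before being posted to a server.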


From akuchlin@cnri.reston.va.us  Fri Jan 15 03:51:37 1999
From: akuchlin@cnri.reston.va.us (A.M. Kuchling)
Date: Thu, 14 Jan 1999 22:51:37 -0500
Subject: [XML-SIG] Getting a slice from a NodeList?
In-Reply-To: <369A72C8.3A5D659B@graphion.com>
References: <369A72C8.3A5D659B@graphion.com>
Message-ID: <199901150351.WAA00812@207-172-45-23.s23.tnt5.brd.erols.com>

Michael Sanborn writes:
 > What I'm trying to do is to alter an attribute from the first two
 > members of a NodeList (called "wott_list") returned by
 > getElementsByTagName. But I don't seem to be able to get a slice of it.

The fix is to replace the __getslice__ method of the NodeList class
with this corrected version:

    def __getslice__(self, i, j):
        userlist = NodeList([], self._document, self._parent)
        userlist.data[:] = self.data[i:j]
        return userlist
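
The underlying issue is that the generic list wrapper has to guess how
to construct a new instance of the subclass. A sketch of the same fix
for current Python, where slicing goes through __getitem__ instead of
__getslice__; this NodeList is a stand-in with assumed constructor
arguments, not the PyDOM class:

```python
from collections import UserList


class NodeList(UserList):
    """List of nodes that also remembers its document and parent."""

    def __init__(self, data, document, parent):
        super().__init__(data)
        self._document = document
        self._parent = parent

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Build the slice ourselves so all three constructor
            # arguments are supplied.
            return NodeList(self.data[index], self._document, self._parent)
        return self.data[index]


nodes = NodeList(["a", "b", "c"], "doc", None)
first_two = nodes[0:2]
```

The slice is again a NodeList and carries the same document reference
as the list it was taken from.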

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Nature is beneficent. I praise her and all her works. She is silent and wise.
She is cunning, but for good ends. She has brought me here and will also lead
me away. She may scold me, but she will not hate her work. I trust her.
    -- Goethe


From gwachob@findlaw.com  Fri Jan 15 08:57:07 1999
From: gwachob@findlaw.com (Gabe Wachob)
Date: Fri, 15 Jan 1999 00:57:07 -0800
Subject: [XML-SIG] XML Product for Zope
Message-ID: <369F02E3.30330C36@findlaw.com>

Python XMLers-

I put together a simple Product for Zope which encapsulates an XML file
and an XSL file and renders the XML into HTML using the XSL file. Since
there is no XSL processor in Python that I could find, I use an external
one written in (gasp) Java.

Perhaps writing an XSL engine in python should be my next task.

Anyway, you can get it at
http://www.aimnet.com/~gwachob/software.html

It should install like a normal Zope product (fingers crossed - my first
real Product).

    -Gabe

P.S. "Product" is Zope's terminology for the encapsulation of an
application (on a small scale). I'm not going marketing-droid-berserk
here.... ;-)


From dieter@handshake.de  Fri Jan 15 21:57:19 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 15 Jan 1999 22:57:19 +0100
Subject: [XML-SIG] ANN: XSL-Pattern (and minor DOM patch)
Message-ID: <199901152157.WAA00981@lindm.dm>

This is a multi-part MIME message.
--------------FC5583E803777E8ABB8C4995
Content-Type: text/plain

On top of our PyDom, I have implemented XSL-Pattern, the
pattern sublanguage of the XSL working draft (16-December-1998).

Patterns are used extensively in the XSL transformation
language and its control structures.
They can be used outside XSL, too, for e.g. querying/selecting/matching
parts of HTML/SGML/XML documents.

To build the pattern parser, I have used Scott Hassan's PyBison
package.

More information and download at
	URL:http://www.handshake.de/~dieter/pyprojects/xslpattern.html


A small patch to "xml.dom.core" was needed (attached) to
fix a missing "len(...)" in the DOM's attribute handling.
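
The bug class is easy to reproduce in isolation: range() wants an
integer, not a list. A small sketch in current Python, with made-up
helper names and plain strings standing in for DOM nodes:

```python
def upper_values(mapping):
    values = list(mapping.values())  # copy, so we can assign by index
    for i in range(len(values)):     # range(values) would raise TypeError
        values[i] = values[i].upper()
    return values


def range_rejects_list():
    """Return True if passing a list to range() raises TypeError."""
    try:
        range(["a", "b"])
    except TypeError:
        return True
    return False
```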


Dieter

--------------FC5583E803777E8ABB8C4995
Content-Type: application/x-patch; name="attr.pat"
Content-Description: Patch to "xml.dom.core" fixing missing "len(...)"

--- :core.py-1  Tue Dec 29 14:59:35 1998
+++ core.py     Tue Jan 12 21:38:25 1999
@@ -203,7 +203,7 @@

     def values(self):
         L = self.data.values()
-        for i in range(L):
+        for i in range(len(L)):
             n = L[i]
             L[i] = NODE_CLASS[ n.type ](n, None, self._document )
         return L


--------------FC5583E803777E8ABB8C4995--


From larsga@ifi.uio.no  Mon Jan 18 14:17:37 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 18 Jan 1999 15:17:37 +0100
Subject: [XML-SIG] Documentation and problems
In-Reply-To: <Pine.LNX.3.95.990113200120.267C-100000@scaprea.hobby.nl>
References: <Pine.LNX.3.95.990113200120.267C-100000@scaprea.hobby.nl>
Message-ID: <wkaezgpsvy.fsf@ifi.uio.no>

* Simon Pepping
| 
| Note, however, that drv_pyexpat.py does not register a locator,
| while, if one registers the parser as the locator, it does implement
| the locator methods (except for the fact that it does not report the
| document, as noted before).

Ah, that's a bug. Thanks for reporting this! I'm afraid the pyexpat
driver saw very little testing, since I did not have pyexpat
available before I released it. (I do now.)

I've fixed this now, so the next release of the driver package will
have this.
 
| Do I understand correctly that the availability of a locator is not
| guaranteed, so that the application should test for this?

Yes. Not all parsers provide location information.

| Or should every SAX parser provide at least dummy locator methods so
| that calls to them do not generate errors, e.g. by inheriting from
| saxlib.Locator? 

I think it's better for the driver to be frank about this and not
register a locator if it doesn't actually have any location
information.
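
A sketch of what "test for this" could look like on the application
side. It uses the xml.sax module of today's standard library, not the
1999 saxlib discussed in this thread, and the ElementLogger class is
made up for illustration:

```python
import xml.sax


class ElementLogger(xml.sax.ContentHandler):
    """Records element names, plus line numbers when a locator exists."""

    def __init__(self):
        super().__init__()
        self.locator = None  # stays None if the driver registers nothing
        self.seen = []

    def setDocumentLocator(self, locator):
        # Called by the driver, not by the application.
        self.locator = locator

    def startElement(self, name, attrs):
        line = None
        if self.locator is not None:
            line = self.locator.getLineNumber()
        self.seen.append((name, line))


handler = ElementLogger()
xml.sax.parseString(b"<tree><node/></tree>", handler)
names = [name for name, line in handler.seen]
```

The handler works with drivers that register a locator and with those
that do not; only the line information is missing in the second case.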

| [Re <URL:http://www.hobby.nl/~scaprea/XML/t141.html>]
|
| I mean to say that the application in principle should have such
| methods, because the parser expects them and makes calls to them.
| The following paragraphs explain that these methods can be provided
| by inheriting the dummy methods from the provided sax library.  I
| still feel that I state this correctly.

I like the way you've written it now better. The only thing I really
would like to see changed is that you don't make it clear that
registering handlers is not required. It's implied now, but not stated
directly. (Yes, I am a nit-chaser. Feel free to ignore this.)

--Lars M.


From tismer@appliedbiometrics.com  Wed Jan 20 16:47:47 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 20 Jan 1999 17:47:47 +0100
Subject: [XML-SIG] XmlWriter update
References: <852566F2.006B115C.00@li01.lm.ssc.siemens.com>
Message-ID: <36A608B3.F5D6F61C@appliedbiometrics.com>

Jeff.Johnson@icn.siemens.com wrote:
> 
> I moved some code from xml.dom.HtmlWriter up to the super class
> xml.dom.XmlWriter so that it is easier to specify where new lines should be
> inserted when writing XML.  I hope you like it and it gets into the XML
> package or I'll have to rewrite my code :).  This should be fully backwards
> compatible too.
> 
> The following is an example of how the change allows us to specify that
> 'tree' elements should get new lines before and after the start tag and end
> tag.  The 'node' element only gets a new line before the start tag.
> 
>      nl_dict = {
>           'tree':(1,1,1,1),
>           'node':(1,0,0,0),
>           }
>      w = XmlWriter(sys.stdout,nl_dict)
>      w.write(doc)

Hi XMLers!
I found this one quite useful.
Will it make it into the lib?
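
For anyone who has not seen the patch, the idea behind nl_dict can be
sketched as follows.  This is not Jeff's XmlWriter code: it uses
xml.dom.minidom from current Python, ignores attributes, and assumes
the four flags mean (before start tag, after start tag, before end tag,
after end tag).  Only the first flag's meaning is clear from the
description above; the order of the other three is a guess.
NewlineWriter is a hypothetical name.

```python
import io
from xml.dom import minidom, Node
from xml.sax.saxutils import escape


class NewlineWriter:
    """Write elements and text, adding newlines as nl_dict requests."""

    def __init__(self, out, nl_dict):
        self.out = out
        self.nl_dict = nl_dict
        self.at_line_start = True   # avoids emitting doubled newlines

    def _emit(self, text):
        if text:
            self.out.write(text)
            self.at_line_start = text.endswith("\n")

    def _newline(self, wanted):
        if wanted and not self.at_line_start:
            self._emit("\n")

    def write(self, node):
        if node.nodeType == Node.TEXT_NODE:
            self._emit(escape(node.data))
        elif node.nodeType == Node.ELEMENT_NODE:
            # Elements not listed in nl_dict get no extra newlines.
            flags = self.nl_dict.get(node.tagName, (0, 0, 0, 0))
            self._newline(flags[0])
            self._emit("<%s>" % node.tagName)
            self._newline(flags[1])
            for child in node.childNodes:
                self.write(child)
            self._newline(flags[2])
            self._emit("</%s>" % node.tagName)
            self._newline(flags[3])


def render(xml_text, nl_dict):
    """Parse xml_text and return it rewritten according to nl_dict."""
    out = io.StringIO()
    doc = minidom.parseString(xml_text)
    NewlineWriter(out, nl_dict).write(doc.documentElement)
    return out.getvalue()
```

With the dictionary from the quoted example, each 'node' element starts
on its own line and the 'tree' tags sit on lines of their own.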

Furthermore, is anybody interested in a prettyprint mode,
(with some indentation), or has that been done already?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From akuchlin@cnri.reston.va.us  Wed Jan 20 17:03:13 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Wed, 20 Jan 1999 12:03:13 -0500 (EST)
Subject: [XML-SIG] XmlWriter update
In-Reply-To: <36A608B3.F5D6F61C@appliedbiometrics.com>
References: <852566F2.006B115C.00@li01.lm.ssc.siemens.com>
 <36A608B3.F5D6F61C@appliedbiometrics.com>
Message-ID: <13990.3023.577218.223562@amarok.cnri.reston.va.us>

Christian Tismer writes about Jeff Johnson's XmlWriter patch:
>Hi XMLers!
>I found this one quite useful.
>Will it make it into the lib?

	I haven't gotten around to looking at the patch, but
definitely would like to include it; Jeff's code submissions have been
fine in the past, so it'll probably wind up applied to the CVS tree.
(I've been thinking of issuing a 0.5.1 updated release, but haven't
had time for XML hacking lately.)

>Furthermore, is anybody interested in a prettyprint mode,
>(with some indentation), or has that been done already?

	That would be very useful.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
The life of Man is a struggle with Nature and a struggle with the Machine;
when Nature and the Machine link forces against him, Man hasn't a chance.
    -- Robertson Davies, _The Diary of Samuel Marchbanks_


From gwachob@aimnet.com  Wed Jan 20 19:49:41 1999
From: gwachob@aimnet.com (Gabe Wachob)
Date: Wed, 20 Jan 1999 11:49:41 -0800 (PST)
Subject: [XML-SIG] XmlWriter update
In-Reply-To: <13990.3023.577218.223562@amarok.cnri.reston.va.us>
Message-ID: <Pine.GSO.4.05.9901201145250.26248-100000@shell1.ncal.verio.com>

On Wed, 20 Jan 1999, Andrew M. Kuchling wrote:

> Christian Tismer writes about Jeff Johnson's XmlWriter patch:
> >Hi XMLers!
> >I found this one quite useful.
> >Will it make it into the lib?
> 
> 	I haven't gotten around to looking at the patch, but
> definitely would like to include it; Jeff's code submissions have been
> fine in the past, so it'll probably wind up applied to the CVS tree.
> (I've been thinking of issuing a 0.5.1 updated release, but haven't
> had time for XML hacking lately.)
> 
> >Furthermore, is anybody interested in a prettyprint mode,
> >(with some indentation), or has that been done already?
> 
> 	That would be very useful.

Quick-n-ugly (and I do mean ugly) prettyprint into HTML -- if someone
wants to make it better:

from xml.sax import saxlib
from xml.sax import saxexts
import sys

class XMLPrettyPrint(saxlib.HandlerBase):
    """
    Pretty print an XML source tree in HTML with colors, etc
    """
    def __init__(self):
	# Must be stored on the instance; every handler method appends to it.
	self.totalstring=""

    def startElement(self, name, attrs):
	string= "<ul><font color=\"#1155ff\">&lt;"+name
	if attrs.getLength() > 0:
	    for key in attrs.keys():
		string=string+ " <font color=\"#339922\">"+key+"=\"<font color=\"#226600\">"+attrs[key]+"</font>\"</font>"
	self.totalstring=self.totalstring+(string+"&gt;</font>")

    def characters(self, ch, start, length):
	self.totalstring=self.totalstring+"<ul>"+(ch[start:start+length])+"</ul>"

    def endElement(self, name):
	self.totalstring=self.totalstring+ "<font color=\"#1155ff\">&lt;/"+name+"&gt;</font></ul>"

    def startDocument(self):
	self.totalstring=self.totalstring+ "<tt>"

    def endDocument(self):
	self.totalstring=self.totalstring+ "</tt>"

    def processingInstruction(self, target, data):
	self.totalstring=self.totalstring+ "<font color=\"#ff2244\">&lt;?"+target+" "+data+"?&gt;</font>"


if __name__=="__main__":
    myparser=saxexts.make_parser()
    myxpp=XMLPrettyPrint()
    myparser.setDocumentHandler(myxpp)
    myparser.parseFile(sys.stdin)
    print myxpp.totalstring

------------------------------------------------------------------------
Gabe Wachob - http://www.findlaw.com - http://www.aimnet.com/~gwachob
As of today, the U.S. Constitution has been in force for 76,913 days
When this message was sent, there were 29,851,818 seconds before Y2K


From akuchlin@cnri.reston.va.us  Thu Jan 21 03:45:46 1999
From: akuchlin@cnri.reston.va.us (A.M. Kuchling)
Date: Wed, 20 Jan 1999 22:45:46 -0500
Subject: [XML-SIG] Pretty-printing DOM trees
Message-ID: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>

The format() function below pretty-prints a DOM tree.  It strips away
all the whitespace, and then inserts Text nodes containing white
space, producing output like this:

<?xml version="1.0"?>
<?IS10744:arch name="xsa"?>
<HTML>
    <HEAD>
        <TITLE>xmlproc: A Python XML parser</TITLE>
        <META xsa='last-release' VALUE='19980718'/>
    </HEAD>
    <BODY>
        <H1>
            <SPAN xsa='name'>xmlproc</SPAN>: A Python XML parser
        </H1>
    </BODY>
</HTML>

Should this be left as just a black-box function, or should it be
implemented as a subclass of the writer.XmlWriter() class?  I suppose
it depends on the envisioned application for this; if it's just to
make output a little bit more readable for debugging purposes, then
customizability isn't very important.  On the other hand, if people
will want to do careful indenting of the output, indenting some tags
and not others, then the XmlWriter solution is the way to go.
My inclination is to the former view, but then, that's also easier for 
me. :)  Thoughts?

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
We have first raised a dust and then complain we cannot see.
    -- Bishop Berkeley


from xml.dom import utils, core

d = utils.FileReader()
dom = d.readFile( '/scratch/xsademo.xml' )

def format(node, indent=4):
    """Pretty-print a DOM tree"""

    utils.strip_whitespace( node )

    if node.nodeType == core.DOCUMENT_NODE:
        node = node.documentElement

    stack = [ (0,node) ]

    document = node.get_ownerDocument()

    # Add a newline before the opening and closing tags of the root element
    parent = node.get_parentNode()
    parent.insertBefore( document.createTextNode('\n'), node )
    node.appendChild( document.createTextNode('\n') )
    
    while (stack):
        # get the top node from the stack
        depth, node = stack[-1]

        # walk this node's list of children, deleting those that are
        # all whitespace and saving the rest to be pushed onto the stack
        children = []
        for child in node.childNodes[:] :
            if child.nodeType == core.ELEMENT_NODE:
                spacing = '\n' + (' '*(depth+1)*indent)

                # Add spacing before the child element; this space goes before
                # the start tag.
                text = document.createTextNode( spacing )
                node.insertBefore( text, child )

                # Check if the child element has any element children; if so,
                # we'll add whitespace before the closing tag.
                has_element_children = 0
                for n in child.get_childNodes():
                    if n.nodeType == core.ELEMENT_NODE:
                        has_element_children=1

                if has_element_children:
                    # Add spacing as the last child of the child element; this
                    # will go before the closing tag.
                    text = document.createTextNode( spacing )
                    child.appendChild( text )

            if child.hasChildNodes():
                children.append ( (depth+1,child) )
        children.reverse()
        stack[-1:] = children
        
    # end: while stack not empty

format(dom)

print dom.toxml()
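
For readers without the PyDOM package, roughly the same idea can be
sketched against xml.dom.minidom from the current standard library.
This is a recursive adaptation under that assumption, not the code
above, and strip_whitespace and indent_tree here are illustrative
helpers, not the xml.dom.utils functions.

```python
from xml.dom import minidom, Node


def strip_whitespace(node):
    """Remove text nodes that contain only whitespace, recursively."""
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            strip_whitespace(child)


def indent_tree(element, indent=4, depth=0):
    """Insert whitespace Text nodes so nested elements are indented."""
    doc = element.ownerDocument
    spacing = "\n" + " " * (depth + 1) * indent
    has_element_children = False
    # Iterate over a copy because we insert siblings as we go.
    for child in list(element.childNodes):
        if child.nodeType == Node.ELEMENT_NODE:
            has_element_children = True
            # This text node ends up just before the child's start tag.
            element.insertBefore(doc.createTextNode(spacing), child)
            indent_tree(child, indent, depth + 1)
    if has_element_children:
        # Put the closing tag on its own line, at this element's depth.
        closing = "\n" + " " * depth * indent
        element.appendChild(doc.createTextNode(closing))


def format_xml(xml_text, indent=4):
    """Parse xml_text and return the indented root element as a string."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    strip_whitespace(root)
    indent_tree(root, indent)
    return root.toxml()
```

As in the original, an element that contains only text keeps its start
tag, text and end tag on one line.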


From gwachob@aimnet.com  Thu Jan 21 05:57:06 1999
From: gwachob@aimnet.com (Gabe Wachob)
Date: Wed, 20 Jan 1999 21:57:06 -0800 (PST)
Subject: [XML-SIG] Pretty-printing DOM trees
In-Reply-To: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
Message-ID: <Pine.GSO.4.05.9901202152190.22457-100000@shell1.ncal.verio.com>

On Wed, 20 Jan 1999, A.M. Kuchling wrote:

> Should this be left as just a black-box function, or should it be
> implemented as a subclass of the writer.XmlWriter() class?  I suppose
> it depends on the envisioned application for this; if it's just to
> make output a little bit more readable for debugging purposes, then
> customizability isn't very important.  On the other hand, if people
> will want to do careful indenting of the output, indenting some tags
> and not others, then the XmlWriter solution is the way to go.
> My inclination is to the former view, but then, that's also easier for 
> me. :)  Thoughts?

My feeling is that most purposes of writing out a DOM tree (or tree
representation of an XML tree) will either be 1) for debugging purposes or
2) highly stylized, for a particular purpose (like an editor or
something). 

In other words, the prettyprint either has to be *really* flexible or not
very useful outside of debugging. How many applications print out XML
directly? 

Even an XML source browser would want to add features like
highlighting/tagging, hiding/exposing branches, filtering, etc. Unless you
plan on including a lot of these features (or at least hooks for them), I
don't see any reason to do anything more than a black-box solution. (I
would like to see an HTML rendering like my black-box SAX-driven script I
posted earlier today).

	-Gabe
------------------------------------------------------------------------
Gabe Wachob - http://www.findlaw.com - http://www.aimnet.com/~gwachob
As of today, the U.S. Constitution has been in force for 76,914 days
When this message was sent, there were 29,815,374 seconds before Y2K


From larsga@ifi.uio.no  Thu Jan 21 10:36:58 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 21 Jan 1999 11:36:58 +0100
Subject: [XML-SIG] SAX status
Message-ID: <wkn23c52ut.fsf@ifi.uio.no>

As you probably know, I have been fixing bugs, writing new drivers
(including a JPython one) and generally working on improving the SAX
libraries, preparing for a new release, which I hoped should not be
too far into the future.

However, David Megginson (the coordinator of the original SAX design)
has just started the discussion of the next SAX version. This means a
couple of things:

 - Unless someone complains I will wait with issuing new versions of
   the packages until new versions of the Java ones come out.

 - Anyone who has strong opinions about how the SAX design should be
   should participate in the xml-dev discussions. (Email with subject
   "subscribe xml-dev" or "subscribe xml-dev-digest" to
   majordomo@ic.ac.uk.)

 - I will probably translate the Java design by hand, and possibly
   also extend it in some cases (as I did last time, clearly
   separating the extensions from the core). Again, Java and JPython
   compatibility will be considered very important (unless someone
   screams really really loudly.) I will also keep the XML-SIG
   informed and attempt to start discussions here about the design and
   translation.

 - If you don't have time to participate fully, but still want to
   voice your opinion, do so here, and I will bear it in mind in the
   xml-dev discussions.

For those who don't have the time to follow xml-dev, David basically
proposed three new extensions:

 - parser filter facilities
 - lexical events
 - namespace handling

My original proposal for parser filters is still at
<URL:http://birk105.studby.uio.no/tmp/filters.zip>

The code is really simple and there are a couple of demos, including a
namespace one.
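
To give a feel for what a parser filter is, here is a small sketch.  It
does not use the code in filters.zip; it assumes the XMLFilterBase
class found in xml.sax.saxutils in current Python, and UpperCaseFilter,
NameCollector and names_through_filter are names invented for the
example.  The filter sits between the parser and the application and
rewrites events as they pass through.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class UpperCaseFilter(XMLFilterBase):
    """Pass all events on unchanged, except that names are upper-cased."""

    def startElement(self, name, attrs):
        super().startElement(name.upper(), attrs)

    def endElement(self, name):
        super().endElement(name.upper())


class NameCollector(xml.sax.ContentHandler):
    """The 'application': it only ever sees the filtered events."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def names_through_filter(xml_bytes):
    parser = xml.sax.make_parser()
    filt = UpperCaseFilter(parser)      # the filter wraps the real parser
    collector = NameCollector()
    filt.setContentHandler(collector)   # the application talks to the filter
    filt.parse(io.BytesIO(xml_bytes))
    return collector.names
```

Because the filter looks like a parser to the application and like a
handler to the parser, several filters can be chained in the same way.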

--Lars M.


From tismer@appliedbiometrics.com  Fri Jan 22 18:00:55 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 19:00:55 +0100
Subject: [XML-SIG] Pretty-printing DOM trees
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
Message-ID: <36A8BCD7.1F992E61@appliedbiometrics.com>

A.M. Kuchling wrote:
> 
> The format() function below pretty-prints a DOM tree.  It strips away
> all the whitespace, and then inserts Text nodes containing white
> space, producing output like this:
> 
> <?xml version="1.0"?>
> <?IS10744:arch name="xsa"?>
> <HTML>
>     <HEAD>
>         <TITLE>xmlproc: A Python XML parser</TITLE>
>         <META xsa='last-release' VALUE='19980718'/>
>     </HEAD>
>     <BODY>
>         <H1>
>             <SPAN xsa='name'>xmlproc</SPAN>: A Python XML parser
>        </H1>
>    </BODY>
> </HTML>
> 
> Should this be left as just a black-box function, or should it be
> implemented as a subclass of the writer.XmlWriter() class?  I suppose
> it depends on the envisioned application for this; if it's just to
> make output a little bit more readable for debugging purposes, then
> customizability isn't very important.  On the other hand, if people
> will want to do careful indenting of the output, indenting some tags
> and not others, then the XmlWriter solution is the way to go.
> My inclination is to the former view, but then, that's also easier for
> me. :)  Thoughts?

Well, thank you - this was exactly what I wanted.
Just readable output. I took it as is, named it "format.py", 
perfect. I don't think that customization is such an issue.

Maybe it could be a drawback that applying format to a dom was
about three or four times slower than creating the dom at all,
but never mind.

Would this function belong to xml.dom.utils, besides print_tree?
But it is actually a function which happens to use DOM for its
work, so it seems to be a more general function for all xml
modules, so xml.utils may be better.
Then I could also think of recoding it as an sgmlop app.

Thanks again - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From akuchlin@cnri.reston.va.us  Thu Jan 21 23:16:12 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Thu, 21 Jan 1999 18:16:12 -0500 (EST)
Subject: [XML-SIG] Pretty-printing DOM trees
In-Reply-To: <36A8BCD7.1F992E61@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A8BCD7.1F992E61@appliedbiometrics.com>
Message-ID: <13991.45828.86417.505172@amarok.cnri.reston.va.us>

Christian Tismer writes:
>Maybe it could be a drawback that applying format to a dom was
>about three or four times slower than creating the dom at all,
>but never mind.

	Hmm... wonder why it's so slow.  One reason might be that, for 
every element, it checks whether any of its children are also
elements, in order to format the two cases differently.  As in:
  <head>
    <title>Text</title>
  </head>

It's not formatted as 
  <title>
Text
  </title>

>Would this function belong to xml.dom.utils, besides print_tree?
>But it is actually a function which happens to use DOM for its
>work, so it seems to be a more general function for all xml
>modules, so xml.utils may be better.

	But it requires that you already have a DOM tree created, so
it seems best left in xml.dom.utils.  Indenting a document using SAX
or sgmlop might be best implemented as a specialized handler, not by
the expensive process of creating a DOM tree.

	I'll try to recast it into a subclass of XmlWriter, and have
utils.indent_tree() as shorthand to create and use an instance of that
subclass.  That gives both flexibility and quick-and-dirty
convenience.
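
	As a rough illustration of the specialized-handler approach,
the sketch below indents a document straight from SAX events, with no
DOM tree.  It assumes the xml.sax module of current Python, not saxlib,
and IndentingHandler and indent_xml are hypothetical names.  Whitespace-only
character data is simply dropped, which is a simplification.

```python
import io
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class IndentingHandler(xml.sax.ContentHandler):
    """Write indented XML to 'out' as parse events arrive."""

    def __init__(self, out, indent=4):
        super().__init__()
        self.out = out
        self.indent = indent
        # One flag per open element: has it had an element child yet?
        self.has_element_children = []

    def startElement(self, name, attrs):
        depth = len(self.has_element_children)
        if depth:
            self.has_element_children[-1] = True
            self.out.write("\n" + " " * depth * self.indent)
        attr_text = "".join(" %s=%s" % (key, quoteattr(value))
                            for key, value in attrs.items())
        self.out.write("<%s%s>" % (name, attr_text))
        self.has_element_children.append(False)

    def characters(self, content):
        if content.strip():
            self.out.write(escape(content))

    def endElement(self, name):
        had_children = self.has_element_children.pop()
        if had_children:
            # Closing tag goes on its own line at the element's own depth.
            depth = len(self.has_element_children)
            self.out.write("\n" + " " * depth * self.indent)
        self.out.write("</%s>" % name)


def indent_xml(xml_bytes, indent=4):
    out = io.StringIO()
    xml.sax.parseString(xml_bytes, IndentingHandler(out, indent))
    return out.getvalue()
```

The only state kept is one flag per open element, so memory use grows
with nesting depth, not with document size.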

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
It is true greatness to have in one the frailty of a man and the security of a
god.
    -- Seneca


From tismer@appliedbiometrics.com  Fri Jan 22 12:41:10 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 13:41:10 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A8BCD7.1F992E61@appliedbiometrics.com> <13991.45828.86417.505172@amarok.cnri.reston.va.us>
Message-ID: <36A871E6.89033191@appliedbiometrics.com>

This is a multi-part message in MIME format.
--------------322B2ABE145DE09B7F9C2D0A
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I've tested the formatter with one of my XML workfiles.

Astonishingly, it breaks.
I could not find the error, but it happens when I use
utils.FileReader to build the dom.

>>> d=utils.FileReader()
>>> dom = d.readFile( r'H:\pns\Projekte\SRZ\RoteLi\birgit\SGML\praep.xml')
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
  File "D:\Python\xml\dom\utils.py", line 140, in readFile
    dom = self.readStream(file,type)
  File "D:\Python\xml\dom\utils.py", line 148, in readStream
    dom = self.readXml(stream)
  File "D:\Python\xml\dom\utils.py", line 164, in readXml
    p.feed(stream.read())
  File "i:\cvsroot\xml\sax\drivers\drv_xmlproc.py", line 132, in feed
    self.parser.feed(data)
  File "i:\cvsroot\xml\parsers\xmlproc\xmlutils.py", line 189, in feed
    self.do_parse()
  File "i:\cvsroot\xml\parsers\xmlproc\xmlproc.py", line 278, in do_parse
    self.parse_end_tag()
  File "i:\cvsroot\xml\parsers\xmlproc\xmlproc.py", line 532, in parse_end_tag
    self.app.handle_end_tag(name)
  File "i:\cvsroot\xml\sax\drivers\drv_xmlproc.py", line 64, in handle_end_tag
    self.doc_handler.endElement(name)
  File "xml\dom\builder.py", line 53, in endElement
    assert name == self.current_element.get_nodeName()
AssertionError: 
>>> 

The XML file is well-formed, so there must be a bug in the dom builder.
When I let builder.py ignore the assertion error and avoid popping
the tree, it works!
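
One way to double-check well-formedness independently of the DOM
builder is to run the file through a strict parser on its own.  The
sketch below assumes the xml.parsers.expat module of current Python,
and check_well_formed is a made-up helper name.  It reports the first
error expat finds, for instance an unquoted attribute value or an
entity that is not declared.

```python
import xml.parsers.expat


def check_well_formed(data):
    """Return None if expat accepts data, else (line, column, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)    # True: this is the final chunk
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (err.lineno, err.offset, message)
    return None
```

If this returns None for a file that still trips the assertion, the
builder is the more likely culprit; if it returns an error, the input
deserves a second look first.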

I hope this helps the author to find the bug; I don't
understand everything well enough to find it myself.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home
--------------322B2ABE145DE09B7F9C2D0A
Content-Type: text/plain; charset=iso-8859-1; name="praep.xml"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline; filename="praep.xml"

<Praeparate><Praeparat nummer=3D"12617-2" fachinfo=3D"j" datum=3D"06.07.9=
8, 01.10.98" index=3D"TEIL95008233" gliederung=3D"28.B.1.2.1." abgabe=3D"=
Rp" stoffklasse=3D"B" status=3D"FE" gesperrt=3D"NEIN"><Name>Flutide&reg; =
/
 Flutide&reg; N</Name><Firma>Glaxo Wellcome/Cascan</Firma><Darreichung zu=
lassungsnummer=3D"29117.00.00" code=3D"200523" datum=3D"010195" status=3D=
F><Form>Flutide&reg; Junior 25 Dosier-Aerosol, Suspension und
 Treibmittel</Form><Packung pharmazentralnummer=3D"7123964">1 Dosier-Aero=
sol (N1) mind. 120 Spr&uuml;hst&ouml;&szlig;e, Flutide =

 Junior 25</Packung><Zusammensetzung>1 Spr&uuml;hsto&szlig; (&ap; 85&nbsp=
;mg)<Stoff code=3D"200523"><Name>Fluticason-17-propionat</Name><Menge>,02=
5&nbsp;mg</Menge></Stoff><Hilfsstoff code=3D"200821"><Name>Trichlorfluorm=
ethan</Name></Hilfsstoff><Hilfsstoff code=3D"200819"><Name>Dichlordifluor=
methan</Name></Hilfsstoff><Hilfsstoff code=3D"067800"><Name>Lecithin</Nam=
e></Hilfsstoff></Zusammensetzung></Darreichung><Darreichung zulassungsnum=
mer=3D"29125.02.00" code=3D"200523" datum=3D"010195" status=3DF><Form>Flu=
tide&reg; 250 Rotadisk&reg;, Pulver zum Inhalieren</Form><Packung pharmaz=
entralnummer=3D"7124113">60 Plv. (N1) Flutide 250 Rotadisk</Packung><Pack=
ung pharmazentralnummer=3D"7124142">60 Plv. (N1) Flutide 250 Rotadisk + D=
iskhaler</Packung><Zusammensetzung>1 Einzeldosis<Stoff code=3D"200523"><N=
ame>Fluticason-17-propionat</Name><Menge>0,25&nbsp;mg</Menge>in 25&nbsp;m=
g Pulver</Stoff><Hilfsstoff code=3D"066830"><Name>Lactose 1H<sub>2</sub>O=
</Name></Hilfsstoff></Zusammensetzung></Darreichung><Darreichung zulassun=
gsnummer=3D"30864.01.00" code=3D"200523" datum=3D"010199" status=3DF><For=
m>Flutide&reg; mite 100 Diskus&reg;, Pulver zum Inhalieren</Form><Packung=
 pharmazentralnummer=3D"7124202">60 Plv. (N1) mite 100 Diskus</Packung><Z=
usammensetzung>1 Einzeldosis<Stoff code=3D"200523"><Name>Fluticason-17-pr=
opionat</Name><Menge>0,1&nbsp;mg</Menge>in 12,5&nbsp;mg Pulver</Stoff><Hi=
lfsstoff code=3D"066830"><Name>Lactose 1H<sub>2</sub>O</Name></Hilfsstoff=
></Zusammensetzung></Darreichung><Darreichung zulassungsnummer=3D"30867.0=
0.00" code=3D"200523" datum=3D"010199" status=3DF><Form>Flutide&reg; fort=
e 500 Diskus&reg;, Pulver zum Inhalieren</Form><Packung pharmazentralnumm=
er=3D"7124248">60 Plv. (N1) forte 500 Diskus</Packung><Zusammensetzung>1 =
Einzeldosis<Stoff code=3D"200523"><Name>Fluticason-17-propionat</Name><Me=
nge>0,5&nbsp;mg</Menge>in 12,5&nbsp;mg Pulver</Stoff><Hilfsstoff code=3D"=
066830"><Name>Lactose 1H<sub>2</sub>O</Name></Hilfsstoff></Zusammensetzun=
g></Darreichung><Anwendung>Bronchialasthma aller Schweregrade.</Anwendung=
><Gegenanzeige>Akutbehandl. eines Asthmaanfalles.
 Behandl. bei aktiver od. inaktiver Lungentuberkulose gleichz. mit einem
 gegen die Tuberkulose wirksamen AM.</Gegenanzeige><Anwendungsbeschraenku=
ng>Flutide: Kdr. unter 4&nbsp;J. (zur Zeit keine ausreichenden Erfahrunge=
n).`O
 Flutide N: Kdr. u. Jugendl. unter 16&nbsp;J.</Anwendungsbeschraenkung><N=
ebenwirkung><Signatur>G 14</Signatur>Sehr selten paradoxer Bronchospasmus=
 mit rasch einsetzender Atemnot. Die
 Nebennierenrindenfunkt. bleibt im allg. w&auml;hrend der
 Inhalationsbehandl. mit Fluticason-17-propionat im Normalbereich. Bei
 einzelnen Pat., vor allem wenn sie &uuml;ber l&auml;ngere Zeit mit hohen=
 Tagesdosen
 behandelt werden, kann es zu einer Einschr&auml;nkung der Nebennierenrin=
denfunktion
 kommen. Auch nach Umstellung von and. inhalativen od. oralen Kortikoiden=

 kann die Nebennierenrindenfunkt. noch f&uuml;r l&auml;ngere Zeit eingesc=
hr&auml;nkt sein.
 Selten &Uuml;berempfindlichkeitsreakt. mit Hautbeteiligung. Erh&ouml;hte=

 Blutzuckerspiegel u. in Einzelf. eine Zuckerausscheidung in den Urin.</N=
ebenwirkung><Dosierung>Flutide Junior 25 Dosier-Aerosol: Kdr. &uuml;ber 4=
&nbsp;J., Jugendl. u. Erw.: 2mal
 tgl. 2&nbsp;Spr&uuml;hst&ouml;&szlig;e.`O Flutide N 125 Dosier-Aerosol: =
Jugendl. &uuml;ber 16&nbsp;J. u.
 Erw.: 2mal tgl. 2&nbsp;Spr&uuml;hst&ouml;&szlig;e.`O Flutide N forte 250=
 Dosier-Aerosol: Jugendl.
 &uuml;ber 16&nbsp;J. u. Erw.: 2mal tgl. 2-4 Spr&uuml;hst&ouml;&szlig;e.`=
O Flutide Junior 50 Rotadisk/-
 Junior 50 Diskus/mite 100 Diskus:
 Kdr. &uuml;ber 4&nbsp;J., Jugendl. u. Erw.: 2mal tgl. 1&nbsp;Pulverinhal=
ation.`O Flutide
 250 Rotadisk/-250 Diskus: Jugendl. &gt;&nbsp;16&nbsp;J. u. Erw.: 2mal tg=
l. =

 1&nbsp;Pulverinhalation. forte 500 Diskus: Jugendl. &uuml;ber 16&nbsp;J.=
 u. Erw.: 2mal tgl. =

 1-2&nbsp;Pulverinhalationen.`O
 Die Dosis sollte f&uuml;r jeden Pat. so angepa&szlig;t werden,
 da&szlig; eine Kontr. der Beschw. erreicht werden kann. Danach sollte di=
e
 individuelle Erhaltungsdosis durch schrittweise Verringerung der
 Gesamttagesdosis ermittelt werden. N&auml;heres s. Packungsbeilage.</Dos=
ierung><Lagerung>Lagerungshinweis! Verfalldatum!</Lagerung></Praeparat><P=
raeparat nummer=3D"02394-0" fachinfo=3D"j" datum=3D"30.03.98" index=3D"TE=
IL02046409" gliederung=3D"60.5.B.1." abgabe=3D"Rp" stoffklasse=3D"B" stat=
us=3D"FE" gesperrt=3D"NEIN"><Name>Gastrosil&reg; / -akut / -10, -20 / -re=
tard / -retard mite/ -50</Name><Firma>Heumann</Firma><Darreichung status=3D=
F><Form>Gastrosil&reg; L&ouml;sung</Form></Darreichung><Darreichung zulas=
sungsnummer=3D"4029.00.00" code=3D"077680" datum=3D"010182" status=3DF><F=
orm>Gastrosil&reg; Tabletten</Form><Packung pharmazentralnummer=3D"237893=
2">20 Tbl. (N1)</Packung><Packung pharmazentralnummer=3D"2378949">50 Tbl.=
 (N2)</Packung><Packung pharmazentralnummer=3D"2516825">100 Tbl. (N3)</Pa=
ckung><Zusammensetzung>1 Tbl.<Stoff code=3D"077680"><Name>Metoclopramid-H=
Cl 1H<sub>2</sub>O</Name><Menge>10,54&nbsp;mg</Menge><Entsprechend><Name>=
Metoclopramid-HCl</Name><Menge>10&nbsp;mg</Menge></Entsprechend></Stoff><=
Hilfsstoff code=3D"029345"><Name>Mikrokristalline Cellulose</Name></Hilfs=
stoff><Hilfsstoff code=3D"000052/HC"><Name>Poly(O-carboxymethyl)st&auml;r=
ke-Natriumsalz</Name></Hilfsstoff><Hilfsstoff code=3D"070830"><Name>Magne=
siumstearat</Name></Hilfsstoff><Hilfsstoff code=3D"065100"><Name>hochdisp=
erses Siliciumdioxid</Name></Hilfsstoff></Zusammensetzung></Darreichung><=
Darreichung zulassungsnummer=3D"13045.00.01" code=3D"077680" datum=3D"010=
192" status=3DF><Form>Gastrosil&reg; akut L&ouml;sung</Form><Packung phar=
mazentralnummer=3D"3992829">15.00 ml (N1) Lsg. akut</Packung><Zusammenset=
zung>1 ml<Stoff code=3D"077680"><Name>Metoclopramid-HCl 1H<sub>2</sub>O</=
Name><Menge>5,97&nbsp;mg</Menge><Entsprechend><Name>Metoclopramid-HCl</Na=
me><Menge>5,67&nbsp;mg</Menge></Entsprechend></Stoff><Hilfsstoff code=3D"=
108530"><Name>Sorbitol-Lsg. 70`p</Name>(nicht kristallisiert)</Hilfsstoff=
><Hilfsstoff code=3D"016015"><Name>gereinigtes Wasser</Name></Hilfsstoff>=
<Hilfsstoff code=3D"100300"><Name>Propylenglycol</Name></Hilfsstoff><Hilf=
sstoff code=3D"080350"><Name>Natriumchlorid</Name></Hilfsstoff></Zusammen=
setzung></Darreichung><Darreichung status=3DF><Form>Gastrosil&reg; retard=
 Retardkapseln</Form></Darreichung><Darreichung status=3DF><Form>Gastrosi=
l&reg; retard mite Retardkapseln</Form></Darreichung><Darreichung status=3D=
F><Form>Gastrosil&reg; Injektionsl&ouml;sung</Form></Darreichung><Darreic=
hung status=3DF><Form>Gastrosil&reg; 50 Injektionsl&ouml;sung</Form></Dar=
reichung><Anwendung><Signatur>Z 3 (Lsg.)</Signatur>Motilit&auml;tsst&ouml=
;rungen des oberen Magen-Darmtraktes, z.&nbsp;B.
 Reflux&ouml;sophagitis, Gastritis, Sodbrennen;
 Ulcus ventriculi et duodeni; &Uuml;belkeit, Brechreiz u. Erbrechen bei M=
igr&auml;ne,
 Leber- und Nierenerkrankungen, Sch&auml;del- und Hirnverletzungen,
 Arzneimittelunvertr&auml;glichkeit, Reisekrankheiten; bei anhaltendem Sc=
hluckauf =

 ist
 ein Therapieversuch angezeigt.`O
 Gastrosil akut: Motilit&auml;tsst&ouml;rungen des oberen Magen-Darmtrakt=
es,
 funktionell bedingte Pylorusstenose, &Uuml;belkeit, Brechreiz und Erbrec=
hen,
 zur unterst&uuml;tzenden, symptomatischen Behandlung bei Magen- u.
 Zw&ouml;lffingerdarmgeschw&uuml;ren. Diabetische Gastroparese.`O
 Gastrosil 50: Hochdosierte Metoclopramidtherapie bei &Uuml;belkeit und E=
rbrechen =

 durch das Zytostatikum
 Cisplatin.</Anwendung><Gegenanzeige>Lsg.: Sorbitolintoleranz. Lsg. akut:=
 Sgl. u. Kleinkdr. bis zu =

 2&nbsp;J.</Gegenanzeige><Anwendungsbeschraenkung><Signatur>D 70</Signatu=
r>Lsg. akut: &Auml;ltere Kinder.
 s</Anwendungsbeschraenkung><Nebenwirkung><Signatur>D 70</Signatur>50 Inj=
ektionslsg.: Bradykardie, =

 Blutdruckanstieg, -abfall.</Nebenwirkung><Wechselwirkung><Signatur>D 70<=
/Signatur>Bei
 gleichzeitiger Einnahme von Sympathikomimetika kann der Blutdruck erh&ou=
ml;ht
 werden. Die Aufnahme von Digoxin aus dem Darm kann vermindert, die
 Aufnahme von Paracetamol und versch. Antibiotika sowie von Alkohol
 kann beschleunigt werden. Lsg. akut zus&auml;tzl.:
 Bei gleichzeitiger Gabe von Phenothiazinen und Sympathomimetika k&ouml;n=
nen b.
 empfindl. Pat. extrapyramidale Reaktionen auftreten.</Wechselwirkung><Hi=
nweis>#W(V)</Hinweis><Dosierung>Lsg.: Erw. u. Jugendl. ab 14 Jahren 3mal =
tgl. 15-30 Tr.,
   Kdr. 7-14 Jahre 10-20 Tr., Kdr. 3-6 Jahre 8-12 Tr. vor den
 Mahlz.`O
 Tbl.: Erw. u. Jugendl. ab 14 Jahren 3mal tgl. 1 Tbl. vor den Mahlz.`O
 Z&auml;pf. f. E.: Erw. u. Kdr. ab 14 Jahren bis zu 3mal tgl. 1 Z&auml;pf=
ch.`O Z&auml;pf. f. =

 K.: Kdr.
   zwischen 3 u. 14 Jahren bis zu 3mal tgl. 1 Z&auml;pf.`O
 Lsg. akut: Erw. u. Jugendl. ab 14&nbsp;J.: 0,1&nbsp;mg/kg KG als Einzeld=
osis; max.
 Tagesdosis 0,5&nbsp;mg/kg KG. Bei eingeschr&auml;nkter Nierenfkt. ist di=
e Dosis d.
 Funktionsst&ouml;rung anzupassen. Pat. m. einer Kreatininclearance bis 1=
0&nbsp;ml/min
 1mal tgl. 10&nbsp;mg (30&nbsp;Tr.). Pat. m. einer Kreatininclearance von=
 11-60&nbsp;ml/min
 1mal tgl. 10&nbsp;mg (30&nbsp;Tr.) u. 1mal tgl. 5&nbsp;mg (15&nbsp;Tr.).=
 Einnahme m. etw.
 Fl&uuml;ssigk. vor den Mahlzeiten.`O
 Retardkps.: Erw. u. Jugendl.
 ab 14 Jahren morgens 1 Retardkps. vor dem Essen mit etwas Fl&uuml;ssigke=
it. Bei =

 n&auml;chtlichem Aufsteigen von Magens&auml;ure in die Speiser&ouml;hre =
u. Sodbrennen soll
 die Retardkps. abends eingenommen werden.`O</Dosierung><Dosierung>Retard=
kps. mite: Erw. u. Jugendl. ab
 14 Jahren 2mal tgl. 1 Retardkps. mite vor dem Essen, zur symptomatischen=
 =

 Behandlung
 bei der diabetischen Gastroparese etwa #2 Std. vor dem Essen. Die Einnah=
men
 erfolgen im Abstand von 12 Std. Kdr. von 8-14 Jahren, Patienten mit
 eingeschr&auml;nkter Nierenfunktion u. Alterspatienten sowie Patienten m=
it einem KG
 unter 60 kg 1mal tgl. 1 Retardkps. mite morgens od. abends. Einnahmen in=
 =

 diesen
 F&auml;llen im Abstand von 24 Std. Die tgl. Dosis sollte 0,5&nbsp;mg Met=
oclopramid/kg KG =

 nicht &uuml;berschreiten.`O
 Injektionslsg.: Erw. u. Jugendl. ab 14&nbsp;Jahren 1-3mal tgl. 1&nbsp;Am=
p. i.m. oder =

 i.v.; Kdr. von
 3-14&nbsp;Jahren eine Tagesdosis von 0,5&nbsp;mg/kg KG i.m.`O
 Gastrosil 50 kann nach folgenden 3 Schemata appliziert werden: a) 2&nbsp=
;mg =

 Metoclopramid-HCl (&ap; 0,4&nbsp;ml Gastrosil 50) pro kg KG als Kurzinfu=
sion &uuml;ber =

 15 Min. Zytostatikum 30 Min. nach Therapiebeginn. Jeweils 2&nbsp;mg Meto=
clopramid-
 HCl (&ap; 0,4&nbsp;ml
 Gastrosil 50) pro kg KG werden als weitere Kurzinfusionen &uuml;ber 15 M=
in. nach
 2, 4, 6 und 9 Std. appliziert.`O</Dosierung><Dosierung>b) 1&nbsp;mg Meto=
clopramid-HCl (&ap; 0,2&nbsp;ml Gastrosil 50) pro kg KG als
 Kurzinfusion &uuml;ber 15 Min. Zytostatikum 30 Min. nach Therapiebeginn.=
 Jeweils =

 1&nbsp;mg =

 Metoclopramid-HCl (&ap; 0,2&nbsp;ml Gastrosil 50) pro kg KG werden als w=
eitere =

 Kurzinfusionen &uuml;ber 15 Min. nach 2, 4, 7, 10 und 13 Std. appliziert=
=2E`O
 c) 2&nbsp;mg Metoclopramid-HCl (&ap; 0,4&nbsp;ml Gastrosil 50) pro kg KG=
 als
 Kurzinfusion &uuml;ber 15 Min. Zytostatikum 2 Std. nach Therapiebeginn (=
w&auml;hrend der =

 Dauerinfusion). Anschlie&szlig;end werden 5&nbsp;mg Metoclopramid-HCl (&=
ap; 1&nbsp;ml
 Gastrosil 50) pro kg KG als Dauerinfusion &uuml;ber 12 Std. appliziert.`=
O
 Bei Niereninsuffizienz sollte die Dosis von Gastrosil 50 Injektionsl&oum=
l;sung
 auf #3 der normalen Dosis =

 reduziert bzw. das Dosierungsintervall zwischen den einzelnen Gaben
 entsprechend erh&ouml;ht werden.`O
 Die tgl. Tagesdosis von 0,5&nbsp;mg/kg KG sollte i.&nbsp;a. nicht &uuml;=
berschritten
 werden. Behandlungsdauer 4-6 Wochen. In Einzelf&auml;llen kann Gastrosil=
 =

 auch &uuml;ber mehrere Monate, wenn erforderlich, angewendet werden.</Do=
sierung></Praeparat><Praeparat nummer=3D"12743-8" fachinfo=3D"N" datum=3D=
"03.04.98, 25.08.98, 13.10.98" index=3D"TEIL95012342" gliederung=3D"05.2.=
1." abgabe=3D"Rp" stoffklasse=3D"B" status=3D"FE" gesperrt=3D"NEIN"><Name=
>Tramadolor&reg;</Name><Firma>Hexal</Firma><Darreichung zulassungsnummer=3D=
"zugelassen" code=3D"117575" datum=3D"010195" status=3DF><Form>Tramadolor=
&reg;, Kapseln</Form><Packung pharmazentralnummer=3D"4469515"></Darreichu=
ng><Darreichung zulassungsnummer=3D"zugelassen" code=3D"117575" datum=3D"=
010196" status=3DF><Form>Tramadolor&reg; tabs, Tabletten</Form><Packung p=
harmazentralnummer=3D"7154700">10 Tbl. (N1) tabs</Packung><Packung pharma=
zentralnummer=3D"7154717">30 Tbl. (N2) tabs</Packung><Packung pharmazentr=
alnummer=3D"7154723">50 Tbl. (N3) tabs</Packung><Zusammensetzung>1 Tbl.<S=
toff code=3D"117575"><Name>Tramadol-HCl</Name><Menge>50&nbsp;mg</Menge></=
Stoff><Hilfsstoff code=3D"029345"><Name>Cellulose</Name></Hilfsstoff><Hil=
fsstoff code=3D"066830"><Name>Lactose</Name></Hilfsstoff><Hilfsstoff code=
=3D"097350"><Name>Macrogol 4000</Name></Hilfsstoff><Hilfsstoff code=3D"07=
0830"><Name>Magnesiumstearat</Name></Hilfsstoff><Hilfsstoff code=3D"09857=
5"><Name>Povidon</Name></Hilfsstoff><Hilfsstoff code=3D"020600"><Name>Sac=
charin-Natrium</Name></Hilfsstoff><Hilfsstoff code=3D"065100"><Name>Silic=
iumdioxid</Name></Hilfsstoff><Hilfsstoff code=3D"991650"><Name>Aromastoff=
e</Name></Hilfsstoff></Zusammensetzung></Darreichung><Darreichung zulassu=
ngsnummer=3D"zugelassen" code=3D"117575" datum=3D"010195" status=3DF><For=
m>Tramadolor&reg; Z&auml;pfchen</Form><Packung pharmazentralnummer=3D"446=
9538">10 Z&auml;pf. (N1)</Packung><Packung pharmazentralnummer=3D"4469544=
">20 Z&auml;pf. (N2)</Packung><Zusammensetzung>1 Z&auml;pf.<Stoff code=3D=
"117575"><Name>Tramadol-HCl</Name><Menge>100&nbsp;mg</Menge></Stoff><Hilf=
sstoff code=3D"055460"><Name>Hartfett</Name></Hilfsstoff></Zusammensetzun=
g></Darreichung><Darreichung zulassungsnummer=3D"11558.00.00" code=3D"117=
575" datum=3D"010198" status=3DF><Form>Tramadolor&reg; 100 ID Retardtable=
tten</Form><Packung pharmazentralnummer=3D"8543303">10 Retardtbl. (N1)</P=
ackung><Packung pharmazentralnummer=3D"8543326">30 Retardtbl. (N2)</Packu=
ng><Packung pharmazentralnummer=3D"8543332">50 Retardtbl. (N3)</Packung><=
Zusammensetzung>1 Retardtbl.<Stoff code=3D"117575"><Name>Tramadol-HCl</Na=
me><Menge>100&nbsp;mg</Menge></Stoff><Hilfsstoff code=3D"026410"><Name>Ca=
-hydrogenphosphat</Name></Hilfsstoff><Hilfsstoff code=3D"029345"><Name>Ce=
llulose</Name></Hilfsstoff><Hilfsstoff code=3D"066830"><Name>Lactose</Nam=
e></Hilfsstoff><Hilfsstoff code=3D"070830"><Name>Mg-stearat</Name></Hilfs=
stoff><Hilfsstoff code=3D"123875/a"><Name>Maisst&auml;rke</Name></Hilfsst=
off><Hilfsstoff code=3D"201265"><Name>Hypromellose</Name></Hilfsstoff><Hi=
lfsstoff code=3D"000009/HC"><Name>Na-carboxymethylst&auml;rke</Name></Hil=
fsstoff><Hilfsstoff code=3D"098575"><Name>Povidon</Name></Hilfsstoff><Hil=
fsstoff code=3D"104450/a"><Name>hydriertes Rizinus&ouml;l</Name></Hilfsst=
off><Hilfsstoff code=3D"065100"><Name>Siliciumdioxid</Name></Hilfsstoff><=
/Zusammensetzung></Darreichung><Darreichung zulassungsnummer=3D"zugelasse=
n/zugelassen" code=3D"117575" datum=3D"010195" status=3DF><Form>Tramadolo=
r&reg; 50/100 Injektionsl&ouml;sung</Form><Packung pharmazentralnummer=3D=
"4469596">5 Amp. (N1) 1&nbsp;ml 50&nbsp;mg</Packung><Packung pharmazentra=
lnummer=3D"4469604">5 Amp. (N1) 2&nbsp;ml 100&nbsp;mg</Packung><Packung p=
harmazentralnummer=3D"4469610">10 Amp. (N2) 2&nbsp;ml 100&nbsp;mg</Packun=
g><Zusammensetzung>1 Amp. 1/2&nbsp;ml<Stoff code=3D"117575"><Name>Tramado=
l-HCl</Name><Menge>50&nbsp;mg/100&nbsp;mg</Menge></Stoff><Hilfsstoff code=
=3D"201106"><Name>Natriumacetat</Name></Hilfsstoff><Hilfsstoff code=3D"01=
6015"><Name>Wasser f. Inj.-zwecke</Name></Hilfsstoff></Zusammensetzung></=
Darreichung><Anwendung>M&auml;&szlig;ig starke bis starke Schmerzen.</Anw=
endung><Gegenanzeige>Akute Intoxikat. durch Alkohol, Schmerz-, Schlafm., =
Opioid, Pat., die
 MAO-Hemmer erhalten od. innerhalb d. letzten 14&nbsp;Tage angewendet hab=
en.
 -Brause/-100 Brause/-Kps./-tabs: Kdr.; 100&nbsp;ID: Kdr. #X&nbsp;12&nbsp=
;J.; -Lsg./-50/-100:
 Kdr. #X&nbsp;1&nbsp;J.; -Z&auml;pfchen: Kdr. #X&nbsp;14&nbsp;J.
 Psychopharmaka.
 Kdr. unter 14&nbsp;J. Lsg./Inj.-Lsg.: Kdr. #X&nbsp;1&nbsp;J. Drogensubst=
itution.</Gegenanzeige><Anwendungsbeschraenkung><Signatur>A 85 a-e, k</Si=
gnatur>Kopfverletzung, Schock.</Anwendungsbeschraenkung><Schwangerschaft>=
#`K <i>Gr&nbsp;4</i>, <i>Gr&nbsp;9</i>. (Chron. Anw.). Gabe von Einzeldos=
en m&ouml;gl.</Schwangerschaft><Stillzeit><Signatur>A 85 a-l, n-p, s, v, =
w, x</Signatur>#`K <i>La&nbsp;2</i>. Bei 1mal. Anw. Abstillen jedoch nich=
t erforderl.</Stillzeit><Nebenwirkung><Signatur>A 85 a, b, d</Signatur>Ep=
ileptische Krampfanf&auml;lle, Blutdruckanstieg,
 Appetit&auml;nderungen, allergische Reaktionen bis zum anaphylakt. Schoc=
k,
 Verschlimmerung von Asthma.</Nebenwirkung><Wechselwirkung><Signatur>A 85=
</Signatur>Abschw&auml;chung der Wirkung bei Verwendung von Agonisten/Ant=
agonisten.
 Hemmung durch CYP3A4-hemmende Substanzen. Das krampfausl&ouml;sende Pote=
ntial
 von selektiven Serotonin-Reuptake-Inhibitoren, trizykl. Antidepressiva,
 Antipsychotika u. andere die Krampfschwelle herabsetzende AM wird erh&ou=
ml;ht.
 Neuroleptika: Krampfanf&auml;lle, Carbamazepin: vermindert analget. Effe=
kt.
 MAO-Hemmstoffe: innerh. v. 14&nbsp;Tagen vor Anw. v. Pethidin: lebensbed=
roh.
 Wechselwirkungen (ZNS, Atmungs- u. Kreislauffkt.), die f&uuml;r Tramadol=
 nicht
 auszuschlie&szlig;en sind.</Wechselwirkung><Hinweis>#W(V) B. &Uuml;bersc=
hreit. d. empf. Dos. u. gleichz. Anw. and. zentrald&auml;mpf.
 Medik. atemd&auml;mpfende Wirk. ber&uuml;cksichtigen. Pat. mit Leber- u.=

 Nierenfunktionsst&ouml;r.: Dosierungsintervall verl&auml;ngern! Abh&auml=
;ngigkeitspotential.
 B. l&auml;ngerem Gebr.: Toleranz u. Abh&auml;ngigk. Erfahrungsgem. trete=
n
 Nebenwirkungen starker Analgetika bes. unter k&ouml;rperl. Belastung auf=
=2E Weitere
 Hinw. s. Fachinfo.</Hinweis><Dosierung>-Brause/-Kps./-tabs: B. m&auml;&s=
zlig;ig starken Schmerzen: Erw. u. Jugendl. ab
 12&nbsp;J. als ED 50&nbsp;mg, entspr. 1&nbsp;Brausetbl./Kps./Tbl. Tritt =
innerh. v. =

 30-60 Min. keine Schmerzbefr. ein, Wiederh. m&ouml;gl. B. starken Schmer=
zen als ED
 100&nbsp;mg. Nicht &gt;&nbsp;400&nbsp;mg/Tag. -100 Brause: B. m&auml;&sz=
lig;ig starken Schmerzen: Erw. u. =

 Jugendl. ab 12&nbsp;J. als ED
 50&nbsp;mg, entspr. #2&nbsp;Brausetbl. Tritt innerh. v. 30-60&nbsp;Min. =
keine Schmerzbefr.
 ein, Wiederh. m&ouml;gl. B. starken Schmerzen als ED 100&nbsp;mg. Nicht =
&gt;&nbsp;400&nbsp;mg/Tag.
 -100 ID: Erw. u. Jugendl. ab 12&nbsp;J. ED 200-400&nbsp;mg, entspr. 1-2&=
nbsp;Retardtbl.
 2mal tgl. Nicht &gt;&nbsp;400&nbsp;mg/Tag. -Lsg.: B. m&auml;&szlig;ig st=
arken Schmerzen: Erw. u.
 Jugendl. ab 12&nbsp;J. als ED 50&nbsp;mg, entspr. 20&nbsp;Tr. Tritt inne=
rh. v. 30-60&nbsp;Min.
 keine Schmerzbefr. ein, Wiederh. m&ouml;gl. B. starken Schmerzen 100&nbs=
p;mg, entspr.
 40&nbsp;Tr. Nicht &gt;&nbsp;400&nbsp;mg/Tag. -50/-100: B. m&auml;&szlig;=
ig starken Schmerzen: Erw. u.
 Jugendl. ab 14&nbsp;J. als ED 50&nbsp;mg, entspr. 1&nbsp;ml. Tritt inner=
h. v. 30-60&nbsp;Min.
 keine Schmerzbefr. ein, nochmal 1&nbsp;ml. B. starken Schmerzen als ED 2=
&nbsp;ml. Nicht
 &gt;&nbsp;400&nbsp;mg/Tag. -Z&auml;pf.: Erw. u. Jugendl. ab 14&nbsp;J. a=
ls ED 100&nbsp;mg, entspr.
 1&nbsp;Z&auml;pf. Nicht &gt;&nbsp;400&nbsp;mg/Tag. Dos. b. Tumorschmerze=
n, starken Schmerzen n.
 Operat., Dos. b. Kdr.</Dosierung><Anstaltspackung>je 50 (5x10), 100 (10x=
10)</Anstaltspackung></Praeparat><Praeparat nummer=3D"07364-0" fachinfo=3D=
"j" datum=3D"06.04.98, 25.05.98" index=3D"TEIL85004951" gliederung=3D"05.=
3.B.1.5." abgabe=3D"Rp" stoffklasse=3D"B" status=3D"FE" gesperrt=3D"NEIN"=
><Name>Voltaren&reg;</Name><Firma>Novartis Pharma</Firma><Gliederung>23.1=
=2EB.1.</Gliederung><Darreichung zulassungsnummer=3D"2502.00.01" code=3D"=
041260" datum=3D"010176" status=3DF><Form>Voltaren&reg; 25 magensaftresis=
tente Dragees</Form><Packung pharmazentralnummer=3D"3901689">20 Drg. (N1)=
 Voltaren 25</Packung><Packung pharmazentralnummer=3D"3901695">50 Drg. (N=
2) Voltaren 25</Packung><Packung pharmazentralnummer=3D"3901703">100 Drg.=
 (N3) Voltaren 25</Packung><Zusammensetzung>1 Drg.<Stoff code=3D"041260">=
<Name>Diclofenac-Natrium</Name><Menge>25&nbsp;mg</Menge></Stoff><Hilfssto=
ff code=3D"201286"><Name>Eisenoxidgelb (E&nbsp;172)</Name></Hilfsstoff><H=
ilfsstoff code=3D"066830"><Name>Lactose</Name></Hilfsstoff><Hilfsstoff co=
de=3D"097350"><Name>Macrogol</Name></Hilfsstoff><Hilfsstoff code=3D"07083=
0"><Name>Magnesiumstearat</Name></Hilfsstoff><Hilfsstoff code=3D"123875/a=
"><Name>Maisst&auml;rke</Name></Hilfsstoff><Hilfsstoff code=3D"201265"><N=
ame>Hypromellose</Name></Hilfsstoff><Hilfsstoff code=3D"000258/HC"><Name>=
Poly(methacrylat, ethylacrylat) Copolymerisat</Name></Hilfsstoff><Hilfsst=
off code=3D"098575"><Name>Povidon</Name></Hilfsstoff><Hilfsstoff code=3D"=
098425"><Name>Polysorbat 80</Name></Hilfsstoff><Hilfsstoff code=3D"000052=
/HC"><Name>Poly(O-carboxymethyl)st&auml;rke</Name></Hilfsstoff><Hilfsstof=
f code=3D"112446"><Name>Talkum</Name></Hilfsstoff><Hilfsstoff code=3D"117=
150"><Name>Titandioxid (E&nbsp;171)</Name></Hilfsstoff></Zusammensetzung>=
</Darreichung><Darreichung zulassungsnummer=3D"14651.00.01" code=3D"04126=
0" datum=3D"010177" status=3DF><Form>Voltaren&reg; retard Retarddragees</=
Form><Packung pharmazentralnummer=3D"2192452">20 Retarddrg. (N1)</Packung=
><Packung pharmazentralnummer=3D"2192469">50 Retarddrg. (N2)</Packung><Pa=
ckung pharmazentralnummer=3D"2192475">100 Retarddrg. (N3)</Packung><Zusam=
mensetzung>1 Retarddrg.<Stoff code=3D"041260"><Name>Diclofenac-Natrium</N=
ame><Menge>100&nbsp;mg</Menge></Stoff><Hilfsstoff code=3D"201286"><Name>E=
isenoxidrot (E&nbsp;172)</Name></Hilfsstoff><Hilfsstoff code=3D"029920"><=
Name>Cetylalkohol</Name></Hilfsstoff><Hilfsstoff code=3D"097350"><Name>Ma=
crogol 8000</Name></Hilfsstoff><Hilfsstoff code=3D"070830"><Name>Magnesiu=
mstearat</Name></Hilfsstoff><Hilfsstoff code=3D"098575"><Name>Povidon</Na=
me></Hilfsstoff><Hilfsstoff code=3D"201265"><Name>Hypromellose</Name></Hi=
lfsstoff><Hilfsstoff code=3D"098425"><Name>Polysorbat 80</Name></Hilfssto=
ff><Hilfsstoff code=3D"105350"><Name>Saccharose</Name></Hilfsstoff><Hilfs=
stoff code=3D"112446"><Name>Talkum</Name></Hilfsstoff><Hilfsstoff code=3D=
"117150"><Name>Titandioxid (E&nbsp;171)</Name></Hilfsstoff></Zusammensetz=
ung></Darreichung><Darreichung zulassungsnummer=3D"14651.00.02" code=3D"0=
41260" datum=3D"010177" status=3DF><Form>Voltaren&reg; 100 Z&auml;pfchen<=
/Form><Packung pharmazentralnummer=3D"2092265">10 Supp. (N1) Voltaren 100=
 Z&auml;pfchen</Packung><Packung pharmazentralnummer=3D"2092271">50 Supp.=
 (N3) Voltaren 100 Z&auml;pfchen</Packung><Zusammensetzung>1 Supp.<Stoff =
code=3D"041260"><Name>Diclofenac-Natrium</Name><Menge>100&nbsp;mg</Menge>=
</Stoff><Hilfsstoff code=3D"055460"><Name>Hartfett</Name></Hilfsstoff></Z=
usammensetzung></Darreichung><Darreichung zulassungsnummer=3D"520.00.00" =
code=3D"041260" datum=3D"010177" status=3DF><Form>Voltaren&reg; f&uuml;r =
Kdr. Z&auml;pfchen</Form><Packung pharmazentralnummer=3D"2169602">10 Supp=
=2E (N1) Voltaren f. Kdr.</Packung><Packung pharmazentralnummer=3D"216961=
9">50 Supp. (N3) Voltaren f. Kdr.</Packung><Zusammensetzung>1 Supp.<Stoff=
 code=3D"041260"><Name>Diclofenac-Natrium</Name><Menge>25&nbsp;mg</Menge>=
</Stoff><Hilfsstoff code=3D"055460"><Name>Hartfett</Name></Hilfsstoff></Z=
usammensetzung></Darreichung><Anwendung><Signatur>N 30</Signatur>Entz&uum=
l;ndliche, entz&uuml;ndl. aktivierte degenerative u. extraartikul&auml;re=

 rheumatische Erkrankungen. Akuter Gichtanfall. Nichtrheumat. schmerzh.
 Schwellungen u. Entz&uuml;ndungen. Voltaren 50/-100 Z&auml;pfchen/-retar=
d zus&auml;tzlich: =

 prim&auml;re
 Dysmenorrh&ouml;, Schmerzen bei akuter u. subakuter Adnexitis, Tumorschm=
erzen.
 Voltaren f&uuml;r Kdr. u. Voltaren f&uuml;r Kleinkdr.: juvenile chron. P=
olyarthritis
 u. nichtrheumatische entz&uuml;ndliche Schmerzzust&auml;nde.</Anwendung>=
<Gegenanzeige>Analgetika-Intoleranz. =

 (Suppos.: Proktitis),
 Z 6
 (Amp.).
 Kdr., Kleinkdr. s. Fachinfo.
 Voltaren Amp., -ret., -100: Kdr. u. Jugendliche.</Gegenanzeige><Anwendun=
gsbeschraenkung><Signatur>N 30 a-f, h-j</Signatur>Pat. m. Colitis ulceros=
a, M. Crohn, Pat. unter Diuretika-Therapie
 u. n. gr&ouml;&szlig;. chirurg. Eingriffen sorgf&auml;ltig &uuml;berwach=
en.
 F&uuml;r Kdr. u. Kleinkdr. nur p&auml;diat. Formen anwenden.
 s</Anwendungsbeschraenkung><Nebenwirkung><Signatur>N 30 a-h, k, m-p</Sig=
natur>Selten Alopezie. In Einzelf.: Herzinsuff., Vaskulitis u. Pneumoniti=
s, =

 aphth&ouml;se Stomatitis, Glossitis,
 &Ouml;sophagusl&auml;sionen, Pankreatitis, Photosensibilisierung, Herzkl=
opfen,
 Schmerzen i. d. Brust, Hypertonie.
 Vor&uuml;bergehende Hemmung d. Thrombozytenaggregation.</Nebenwirkung><W=
echselwirkung><Signatur>N 30;</Signatur>Nephrotox. v. Cyclosporin erh&oum=
l;ht. Chinolon-Antibiotika (Krampfneigung
 erh&ouml;ht).</Wechselwirkung><Toxikation>s. Fachinfo.</Toxikation><Hinw=
eis>#W(V) Bei Langzeitbehdlg. sollen als
 vorsorgl. Ma&szlig;nahme Kontrollen des Blutbildes, d. Leber- u. Nierenf=
unktion =

 durchgef&uuml;hrt werden.
 Weit. Einzelh. s. Fachinfo.</Hinweis><Dosierung>Erw. initial 150&nbsp;mg=
=2E Erhaltungsdosis 100&nbsp;mg, ggf. 75 od. 50&nbsp;mg.
 Kleinkdr. ab 1 Lebensjahr 0,5-2&nbsp;mg/kg KG pro Tag, bei juveniler chr=
on. =

 Polyarthritis Erh&ouml;h. auf max. 3&nbsp;mg/kg KG pro Tag. &Auml;ltere =
Kdr.:
 2-3&nbsp;mg/kg KG pro Tag. Einzelheiten s. Fachinfo.</Dosierung><Lagerun=
g>Verfalldatum! (Amp.), Lagerungshinweis!</Lagerung><Anstaltspackung>je 6=
00 Drg. Voltaren 25, Voltaren 50 u. Voltaren retard; je 300 Supp.
 Voltaren 50 u. Voltaren 100; 30 u. 150 Voltaren Amp.</Anstaltspackung><B=
egruendung>gzt, nzt, wzt: Durch Lit. belegt.
 nzt, wzt: Auflage Stammhaus, z.&nbsp;T. durch Lit. belegt.</Begruendung>=
</Praeparat></Praeparate>
--------------322B2ABE145DE09B7F9C2D0A--


From larsga@ifi.uio.no  Fri Jan 22 13:30:09 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 22 Jan 1999 14:30:09 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36A871E6.89033191@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 		<36A8BCD7.1F992E61@appliedbiometrics.com> <13991.45828.86417.505172@amarok.cnri.reston.va.us> <36A871E6.89033191@appliedbiometrics.com>
Message-ID: <wk1zknl9jy.fsf@ifi.uio.no>

* Christian Tismer
| 
| The XML file is well-formed, so there must be a bug in the dom
| builder.  When I let builder.py ignore the assertion error and avoid
| popping the tree, it works!

The assertion in question is one that compares the element type name
of an end tag to the name of the current element. Looks rather
strange, since xmlproc (which you apparently use) maintains its own
element stack and checks this internally.
 
Unless xmlproc swallows an event somewhere somehow, the error is
probably in the DOM. Running saxdemo.py and XMLTest.java to get two
canonized versions of the document should show conclusively whether
the problem is xmlproc or the DOM.
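
The kind of check that assertion performs can be sketched as follows.
This is only an illustration written against the modern standard-library
xml.sax module, not the builder.py code under discussion; the class name
StackChecker and the max_depth counter are hypothetical.

```python
import xml.sax


class StackChecker(xml.sax.ContentHandler):
    """Track open elements and verify each end tag closes the innermost one."""

    def __init__(self):
        super().__init__()
        self.stack = []        # names of the currently open elements
        self.max_depth = 0     # deepest nesting seen (illustration only)

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.max_depth = max(self.max_depth, len(self.stack))

    def endElement(self, name):
        # Builder-style assertion: the end tag must match the current element.
        current = self.stack.pop()
        assert current == name, "end tag %r closes %r" % (name, current)


checker = StackChecker()
xml.sax.parseString(b"<a><b/><c>text</c></a>", checker)
```

If the parser already enforces well-formedness, this assertion can only
fail when an event is lost or the handler's own stack is popped at the
wrong time.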

--Lars M.


From a.eyre@optichrome.com  Fri Jan 22 13:59:27 1999
From: a.eyre@optichrome.com (Adrian Eyre)
Date: Fri, 22 Jan 1999 13:59:27 -0000
Subject: [XML-SIG] Pretty-printing DOM trees
In-Reply-To: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
Message-ID: <003501be460f$6d3dc110$2bcbd9c2@mars.optichrome.com>

--MimeMultipartBoundary
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

> from xml.dom import utils, core
Am I using the right XML library here? Mine does not appear to have a
file called utils.py in the xml.dom directory.
I'm using: http://www.python.org/sigs/xml-sig/files/xml-0.5.tgz

> def format(node, indent=4):
>     """Pretty-print a DOM tree"""
I also find that passing in an xml.dom.core.Document instance causes the
routine to fall over.

What am I doing wrong?

+------------------------------------------+
| BFN: Adrian Eyre <a.eyre@optichrome.com> |
+------------------------------------------+

--MimeMultipartBoundary--


From fdrake@acm.org  Fri Jan 22 14:34:49 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 22 Jan 1999 09:34:49 -0500 (EST)
Subject: [XML-SIG] Pretty-printing DOM trees
In-Reply-To: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
Message-ID: <13992.35977.925124.769186@weyr.cnri.reston.va.us>

A.M. Kuchling writes:
 > Should this be left as just a black-box function, or should it be
 > implemented as a subclass of the writer.XmlWriter() class?  I suppose

Andrew,
  I'd actually like a subclassable version.  This doesn't mean you
need to write the code, though.  ;-)  A simple black-box function like 
yours can be written on top of the basic pretty-printer.
  I very much like the fact that it operates on a DOM tree, but a
SAX-based version might also be nice, especially for large documents.
(I'd probably only use the DOM version myself, though).
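
A rough sketch of that layering, assuming the modern standard-library
xml.dom.minidom rather than the PyDOM classes discussed here. The names
PrettyPrinter and format_tree are hypothetical, and putting each text
node on its own line is a simplification that changes whitespace.

```python
import io
from xml.dom import minidom, Node
from xml.sax.saxutils import escape, quoteattr


class PrettyPrinter:
    """Walk a DOM tree and write indented XML; subclasses override the hooks."""

    def __init__(self, out, indent=4):
        self.out = out
        self.indent = indent

    def write_node(self, node, level=0):
        pad = " " * (self.indent * level)
        if node.nodeType == Node.DOCUMENT_NODE:
            for child in node.childNodes:
                self.write_node(child, level)
        elif node.nodeType == Node.ELEMENT_NODE:
            self.start_element(node, pad)
            for child in node.childNodes:
                self.write_node(child, level + 1)
            self.end_element(node, pad)
        elif node.nodeType == Node.TEXT_NODE:
            text = node.data.strip()
            if text:                      # whitespace-only text is dropped
                self.text(text, pad)

    # Hooks a subclass may replace to change the output style.
    def start_element(self, node, pad):
        attrs = "".join(" %s=%s" % (name, quoteattr(value))
                        for name, value in sorted(node.attributes.items()))
        self.out.write("%s<%s%s>\n" % (pad, node.tagName, attrs))

    def end_element(self, node, pad):
        self.out.write("%s</%s>\n" % (pad, node.tagName))

    def text(self, text, pad):
        self.out.write("%s%s\n" % (pad, escape(text)))


def format_tree(node, indent=4):
    """Black-box helper written on top of the subclassable printer."""
    buf = io.StringIO()
    PrettyPrinter(buf, indent).write_node(node)
    return buf.getvalue()


doc = minidom.parseString("<a><b x='1'>hi</b></a>")
result = format_tree(doc)
```

A caller who only wants the output uses format_tree(); one who wants a
different layout subclasses PrettyPrinter and overrides a hook.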


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From tismer@appliedbiometrics.com  Fri Jan 22 15:00:07 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 16:00:07 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 		<36A8BCD7.1F992E61@appliedbiometrics.com> <13991.45828.86417.505172@amarok.cnri.reston.va.us> <36A871E6.89033191@appliedbiometrics.com> <wk1zknl9jy.fsf@ifi.uio.no>
Message-ID: <36A89277.ADF9A784@appliedbiometrics.com>

Lars Marius Garshol wrote:
> 
> * Christian Tismer
> |
> | The XML file is well-formed, so there must be a bug in the dom
> | builder.  When I let builder.py ignore the assertion error and avoid
> | popping the tree, it works!
> 
> The assertion in question is one that compares the element type name
> of an end tag to the name of the current element. Looks rather
> strange, since xmlproc (which you apparently use) maintains its own
> element stack and checks this internally.
> 
> Unless xmlproc swallows an event somewhere somehow, the error is
> probably in the DOM. Running saxdemo.py and XMLTest.java to get two
> canonized versions of the document should show conclusively whether
> the problem is xmlproc or the DOM.

I ran the file through saxdemo.py and it works.
Further I tried readXml with the sgmlop and sgmllib, giving
the same result.

So I would say: it must be the DOM.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From jim.fulton@digicool.com  Fri Jan 22 15:16:47 1999
From: jim.fulton@digicool.com (Jim Fulton)
Date: Fri, 22 Jan 1999 10:16:47 -0500
Subject: [XML-SIG] [Fwd: [Zope] - XML-RPC]
Message-ID: <36A8965F.A7BE0EEF@digicool.com>

This is a multi-part message in MIME format.
--------------6EB8E500C25FCF6AB5718FC5
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

I meant to CC the XML SIG mailing list in this message, but forgot
to.

Jim
--------------6EB8E500C25FCF6AB5718FC5
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Received: from albert.digicool.com ([206.156.192.156]) by gandalf.digicool.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.1960.3)
	id D1YQ5JFN; Fri, 22 Jan 1999 09:27:13 -0500
Received: from albert.digicool.com (localhost [127.0.0.1])
	by albert.digicool.com (8.9.1/8.9.1) with ESMTP id JAA14284;
	Fri, 22 Jan 1999 09:19:43 -0500
Received: from digicool.com (glebe.digicool.com [206.156.192.148])
	by albert.digicool.com (8.9.1/8.9.1) with ESMTP id JAA14261;
	Fri, 22 Jan 1999 09:19:24 -0500
Message-ID: <36A88774.DDF32151@digicool.com>
Date: Fri, 22 Jan 1999 09:13:08 -0500
From: Jim Fulton <jim.fulton@digicool.com>
Organization: Digital Creations, Inc.
X-Mailer: Mozilla 4.5 [en] (WinNT; I)
X-Accept-Language: en
MIME-Version: 1.0
To: Skip Montanaro <skip@calendar.com>
CC: Pavlos Christoforou <pavlos@gaaros.msrc.sunysb.edu>, zope@zope.org
Subject: Re: [Zope] - XML-RPC
References: <Pine.LNX.3.96.990121140044.2354B-100000@gaaros.msrc.sunysb.edu> <13991.34599.787787.461627@dolphin.calendar.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: zope-admin@zope.org
List-Id: Zope -- The Z Object Publishing Environment <zope.zope.org>
Errors-To: zope-admin@zope.org
X-BeenThere: zope@zope.org
X-Mailman-Version: 1.0b6


I think that XML-RPC would almost certainly be a cool thing to 
support in Zope, and Zope would be a cool server for XML RPC. 
IMO, the right way to do it would be to add support for it to 
ZPublisher.

XML-RPC (http://www.scripting.com/frontier5/xml/code/rpc.html)
uses POST requests with content type "text/xml".  
(Does anyone but me think that this content type is a bit
too broad?) It would be straightforward for ZPublisher to 
recognize this case and:

 - Add the method supplied in the body to 
   the request path, 

 - Get method parameters (positionally)
   from the body.
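
The two steps above can be sketched with the modern standard-library
xmlrpc.client module. This is not ZPublisher code: the function name
dispatch_path and the mapping of dotted method names onto path segments
are assumptions made for illustration only.

```python
import xmlrpc.client


def dispatch_path(base_path, body, content_type):
    """Return (path, args) for an XML-RPC POST, or None for other requests."""
    # Recognize the case by its content type, ignoring any parameters.
    if content_type.split(";")[0].strip() != "text/xml":
        return None
    # loads() yields the positional parameters and the method name.
    params, method = xmlrpc.client.loads(body)
    # Assumption: dotted method names become extra path segments.
    path = base_path.rstrip("/") + "/" + method.replace(".", "/")
    return path, params


body = xmlrpc.client.dumps((3, "abc"), methodname="folder.add")
result = dispatch_path("/site", body, "text/xml; charset=us-ascii")
```

Requests with any other content type return None and would fall through
to the normal publishing path.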
  
I'm in favor of this but doubt that anyone here at DC
will have time to do this for some time.  I'd gladly
accept patches though, and would be willing to discuss
details with anyone working on such patches. ;)
In fact,  if anyone does work on this, I'd prefer to 
discuss it with them before they get too far.

Jim

--
Jim Fulton           mailto:jim@digicool.com
Technical Director   (888) 344-4332              Python Powered!
Digital Creations    http://www.digicool.com     http://www.python.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission.  Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.


--------------6EB8E500C25FCF6AB5718FC5--


From tismer@appliedbiometrics.com  Fri Jan 22 18:14:05 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 19:14:05 +0100
Subject: [XML-SIG] SAX prettyprinter (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
Message-ID: <36A8BFED.BE6C3EF6@appliedbiometrics.com>

This is a multi-part message in MIME format.
--------------2FF7655D01AF62D3F1B5AD1E
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

A.M. Kuchling wrote:
> 
> The format() function below pretty-prints a DOM tree.  It strips away
> all the whitespace, and then inserts Text nodes containing white
> space, producing output like this:
> 
> <?xml version="1.0"?>
> <?IS10744:arch name="xsa"?>
> <HTML>
>     <HEAD>
>         <TITLE>xmlproc: A Python XML parser</TITLE>
>         <META xsa='last-release' VALUE='19980718'/>
>     </HEAD>
>     <BODY>
>         <H1>
>             <SPAN xsa='name'>xmlproc</SPAN>: A Python XML parser
>        </H1>
>    </BODY>
> </HTML>

I wrote something similar for the SAX interface.

indenter.py is appended.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home
--------------2FF7655D01AF62D3F1B5AD1E
Content-Type: text/plain; charset=us-ascii; name="indenter.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="indenter.py"


# pretty printer for SAX
# CT990122
# based upon the saxutils.Canonizer code

from xml.sax import saxexts, saxlib, saxutils


import string, sys   # string is needed by write_data()

class Indenter(saxlib.HandlerBase):
    "A SAX document handler that produces indented XML output."

    def __init__(self,writer=sys.stdout, indent=2):
        self.elem_level=0
        self.writer=writer
        self.indent=indent
        self.last_level=-1
    
    def processingInstruction (self,target, remainder):
        if not target=="xml":
            self.writer.write("<?"+target+" "+remainder+"?>\n")

    def startElement(self,name,amap):
        self.writer.write("\n"+self.indent*self.elem_level*" "+"<"+name)
        
        a_names=amap.keys()
        a_names.sort()

        for a_name in a_names:
            self.writer.write(" "+a_name+"=\"")
            self.write_data(amap[a_name])
            self.writer.write("\"")
        self.writer.write(">")
        self.last_level = self.elem_level
        self.elem_level=self.elem_level+1

    def endElement(self,name):
        self.elem_level=self.elem_level-1
        if self.last_level < self.elem_level:
            self.writer.write("\n"+self.indent*self.elem_level*" "+"</"+name+">")
        else:
            self.writer.write("</"+name+">")
            self.last_level = -1

    def ignorableWhitespace(self,data,start_ix,length):
        # we drop white space here.
        # self.characters(data,start_ix,length)
        pass
        
    def characters(self,data,start_ix,length):
        if self.elem_level>0:
            self.write_data(data[start_ix:start_ix+length])
            
    def write_data(self,data):
        "Writes datachars to writer."
        data=string.replace(data,"&","&amp;")
        data=string.replace(data,"<","&lt;")
        data=string.replace(data,"\"","&quot;")
        data=string.replace(data,">","&gt;")
#        data=string.replace(data,chr(9),"&#9;")
#        data=string.replace(data,chr(10),"&#10;")
#        data=string.replace(data,chr(13),"&#13;")
#        data = string.strip(data)
        self.writer.write(data)
        
    def endDocument(self):
        self.writer.write("\n")
        try:
            pass #self.writer.close()
        except NameError:
            pass # It's OK, if the method isn't there we probably don't need it


--------------2FF7655D01AF62D3F1B5AD1E--


From tismer@appliedbiometrics.com  Fri Jan 22 20:26:48 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 21:26:48 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> <36A8BFED.BE6C3EF6@appliedbiometrics.com>
Message-ID: <36A8DF08.C0E13776@appliedbiometrics.com>

This is a multi-part message in MIME format.
--------------76DB1FFD28AC0CB493D29956
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hi again,

the appended version of Indenter.py can use sgmlop to format
large XML files. It then processes a few megabytes in a few seconds.

BTW - is sgmlop deprecated?
It still has some flaws, like not allowing "_" in tagnames.
Is Fredrik no longer supporting it, or what is the current
preferred fast parser for all platforms?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home
--------------76DB1FFD28AC0CB493D29956
Content-Type: text/plain; charset=us-ascii; name="indenter.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="indenter.py"


# pretty printer for SAX
# CT990122
# based upon the saxutils.Canonizer code

# V.0.2 support for sgmlop which doesn't give ignorableWhitespace info

from xml.sax import saxexts, saxlib, saxutils

import string, sys

class Indenter(saxlib.HandlerBase):
    "A SAX document handler that produces indented XML output."

    def __init__(self,writer=sys.stdout, indent=2):
        self.elem_level=0
        self.writer=writer
        self.indent=indent
        self.last_level=-1
        self.buffer = ""   # lazy buffer for whitespace stripping
    
    def processingInstruction (self,target, remainder):
        #if not target=="xml":
        self.writer.write("<?"+target+" "+remainder+"?>\n")

    def startElement(self,name,amap):
        if self.buffer:
            self.write_buffer()
        self.writer.write("\n"+self.indent*self.elem_level*" "+"<"+name)
        
        a_names=amap.keys()
        a_names.sort()

        for a_name in a_names:
            self.writer.write(" "+a_name+"=\"")
            self.write_data(amap[a_name], 1)
            self.writer.write("\"")
        self.writer.write(">")
        self.last_level = self.elem_level
        self.elem_level=self.elem_level+1

    def endElement(self,name):
        if self.buffer:
            self.write_buffer()
        self.elem_level=self.elem_level-1
        if self.last_level < self.elem_level:
            self.writer.write("\n"+self.indent*self.elem_level*" "+"</"+name+">")
        else:
            self.writer.write("</"+name+">")
            self.last_level = -1

    def ignorableWhitespace(self,data,start_ix,length):
        # we drop white space here.
        # self.characters(data,start_ix,length)
        pass
        
    def characters(self,data,start_ix,length):
        if self.elem_level>0:
            self.put_buffer(data[start_ix:start_ix+length])
            
    def put_buffer(self, txt):
        self.buffer = self.buffer+txt
        
    def write_buffer(self):
        if self.buffer:
            self.write_data(string.strip(self.buffer))
            self.buffer = ""
            
    def write_data(self,data, quotes=0):
        "Writes datachars to writer."
        data=string.replace(data,"&","&amp;")
        data=string.replace(data,"<","&lt;")
        if quotes:
            data=string.replace(data,"\"","&quot;")
        data=string.replace(data,">","&gt;")
        self.writer.write(data)
        
    def endDocument(self):
        self.write_buffer()
        self.writer.write("\n")
        try:
            pass #self.writer.close()
        except NameError:
            pass # It's OK, if the method isn't there we probably don't need it


"""
Example to format a DOM:

>>> i=Indenter()
>>> p=saxexts.make_parser()
>>> p.setErrorHandler(saxutils.ErrorPrinter())
>>> p.setDocumentHandler(i)
>>> p.parseFile(cStringIO.StringIO(dom.toxml()))

Example to format a file to a file, with sgmlop as parser:

>>> f=open(r'd:\tmp\test.xml',"w")
>>> i=Indenter(f)
>>> p=saxexts.make_parser("xml.sax.drivers.drv_sgmlop")
>>> p.setErrorHandler(saxutils.ErrorPrinter())
>>> p.setDocumentHandler(i)
>>> p.parseFile(r"h:\pns\projekte\srz\roteli\birgit\sgml\praep.sgm.umgebrochen.xml")
>>> f.close()
"""

--------------76DB1FFD28AC0CB493D29956--


From tismer@appliedbiometrics.com  Fri Jan 22 20:27:58 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 21:27:58 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> <36A8BFED.BE6C3EF6@appliedbiometrics.com>
Message-ID: <36A8DF4E.2D3852D7@appliedbiometrics.com>

This is a multi-part message in MIME format.
--------------F46600A1D3B1D2BC0AA2B68F
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hi again,

the appended version of Indenter.py can use sgmlop to format
large XML files. It then processes a few megabytes in a few seconds.

sgmlop does not report ignorableWhitespace, so I handle this
myself, by delayed writing and postprocessing.

BTW - is sgmlop deprecated?
It still has some flaws, like not allowing "_" in tagnames.
Is Fredrik no longer supporting it, or what is the current
preferred fast parser for all platforms?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home
--------------F46600A1D3B1D2BC0AA2B68F
Content-Type: text/plain; charset=us-ascii; name="indenter.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="indenter.py"


# pretty printer for SAX
# CT990122
# based upon the saxutils.Canonizer code

# V.0.2 support for sgmlop which doesn't give ignorableWhitespace info

from xml.sax import saxexts, saxlib, saxutils

import string, sys

class Indenter(saxlib.HandlerBase):
    "A SAX document handler that produces indented XML output."

    def __init__(self,writer=sys.stdout, indent=2):
        self.elem_level=0
        self.writer=writer
        self.indent=indent
        self.last_level=-1
        self.buffer = ""   # lazy buffer for whitespace stripping
    
    def processingInstruction (self,target, remainder):
        #if not target=="xml":
            self.writer.write("<?"+target+" "+remainder+"?>\n")

    def startElement(self,name,amap):
        if self.buffer:
            self.write_buffer()
        self.writer.write("\n"+self.indent*self.elem_level*" "+"<"+name)
        
        a_names=amap.keys()
        a_names.sort()

        for a_name in a_names:
            self.writer.write(" "+a_name+"=\"")
            self.write_data(amap[a_name], 1)
            self.writer.write("\"")
        self.writer.write(">")
        self.last_level = self.elem_level
        self.elem_level=self.elem_level+1

    def endElement(self,name):
        if self.buffer:
            self.write_buffer()
        self.elem_level=self.elem_level-1
        if self.last_level < self.elem_level:
            self.writer.write("\n"+self.indent*self.elem_level*" "+"</"+name+">")
        else:
            self.writer.write("</"+name+">")
            self.last_level = -1

    def ignorableWhitespace(self,data,start_ix,length):
        # we drop white space here.
        # self.characters(data,start_ix,length)
        pass
        
    def characters(self,data,start_ix,length):
        if self.elem_level>0:
            self.put_buffer(data[start_ix:start_ix+length])
            
    def put_buffer(self, txt):
        self.buffer = self.buffer+txt
        
    def write_buffer(self):
        if self.buffer:
            self.write_data(string.strip(self.buffer))
            self.buffer = ""
            
    def write_data(self,data, quotes=0):
        "Writes datachars to writer."
        data=string.replace(data,"&","&amp;")
        data=string.replace(data,"<","&lt;")
        if quotes:
            data=string.replace(data,"\"","&quot;")
        data=string.replace(data,">","&gt;")
        self.writer.write(data)
        
    def endDocument(self):
        self.write_buffer()
        self.writer.write("\n")
        try:
            pass #self.writer.close()
        except NameError:
            pass # It's OK, if the method isn't there we probably don't need it


"""
Example to format a DOM:

>>> i=Indenter()
>>> p=saxexts.make_parser()
>>> p.setErrorHandler(saxutils.ErrorPrinter())
>>> p.setDocumentHandler(i)
>>> p.parseFile(cStringIO.StringIO(dom.toxml()))

Example to format a file to a file, with sgmlop as parser:

>>> f=open(r'd:\tmp\test.xml',"w")
>>> i=Indenter(f)
>>> p=saxexts.make_parser("xml.sax.drivers.drv_sgmlop")
>>> p.setErrorHandler(saxutils.ErrorPrinter())
>>> p.setDocumentHandler(i)
>>> p.parseFile(r"h:\pns\projekte\srz\roteli\birgit\sgml\praep.sgm.umgebrochen.xml")
>>> f.close()
"""

--------------F46600A1D3B1D2BC0AA2B68F--
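
For readers on a current Python, the attached handler can be
approximated as follows. This is only a sketch under stated assumptions:
it uses the standard library's xml.sax (expat based) instead of the
saxlib/saxexts modules of the attachment, the class name Indenter3 is
made up for illustration, and attribute quoting is delegated to
xml.sax.saxutils.quoteattr rather than the hand-written replace calls.
The delayed-writing idea is the same: text is buffered and stripped only
when the next tag shows where it ends.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import escape, quoteattr


class Indenter3(ContentHandler):
    """Python 3 sketch of the Indenter above (hypothetical name)."""

    def __init__(self, writer, indent=2):
        super().__init__()
        self.writer = writer
        self.indent = indent
        self.elem_level = 0
        self.last_level = -1
        self.buffer = []          # text collected since the last tag

    def _flush(self):
        # Delayed writing: strip the collected text once a tag boundary
        # shows where it ends, so whitespace-only runs disappear.
        text = "".join(self.buffer).strip()
        self.buffer = []
        if text:
            self.writer.write(escape(text))

    def startElement(self, name, attrs):
        self._flush()
        pad = " " * (self.indent * self.elem_level)
        self.writer.write("\n" + pad + "<" + name)
        for key in sorted(attrs.getNames()):
            self.writer.write(" %s=%s" % (key, quoteattr(attrs.getValue(key))))
        self.writer.write(">")
        self.last_level = self.elem_level
        self.elem_level += 1

    def endElement(self, name):
        self._flush()
        self.elem_level -= 1
        if self.last_level < self.elem_level:
            # The element had child elements: close it on its own line.
            self.writer.write("\n" + " " * (self.indent * self.elem_level))
        else:
            self.last_level = -1
        self.writer.write("</" + name + ">")

    def characters(self, content):
        if self.elem_level > 0:
            self.buffer.append(content)

    def endDocument(self):
        self._flush()
        self.writer.write("\n")


out = io.StringIO()
xml.sax.parseString(b'<a x="1">\n <b>hi &amp; bye</b>\n <c/>\n</a>',
                    Indenter3(out))
print(out.getvalue())
```

Like the original, the sketch drops leading and trailing whitespace of
every text run, so it is not suitable for documents where that
whitespace is significant.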


From dieter@handshake.de  Fri Jan 22 20:19:15 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 22 Jan 1999 21:19:15 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36A871E6.89033191@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com>
Message-ID: <199901222019.VAA00392@lindm.dm>

Hello Christian

Using the PDB, I got the following sequence of parser events:

START: Praeparate
START:   Praeparat
START:     Name
END:       /Name
START:     Firma
END:       /Firma
END:       /Form

The last event, obviously, is wrong.
It seems, "xmlproc" does something wrong.

I append the PDB log.

Dieter


----------------------------------------------------------------------------
>>> d.run("p.parse('ct.xml')")
> <string>(0)?()
(Pdb) b
{'/usr/local/lib/python1.5/site-packages/xml/dom/builder.py': [44, 53]}
(Pdb) c
> <string>(1)?()
(Pdb) 
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(44)startElement()
-> def startElement(self, name, attrs = {}):
(Pdb) p name
'Praeparate'
(Pdb) c
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(44)startElement()
-> def startElement(self, name, attrs = {}):
(Pdb) p name
'Praeparat'
(Pdb) c
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(44)startElement()
-> def startElement(self, name, attrs = {}):
(Pdb) p name
'Name'
(Pdb) c
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(53)endElement()
-> def endElement(self, name):
(Pdb) p name
'Name'
(Pdb) c
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(44)startElement()
-> def startElement(self, name, attrs = {}):
(Pdb) p name
'Firma'
(Pdb) c
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(53)endElement()
-> def endElement(self, name):
(Pdb) p name
'Firma'
(Pdb) c
> /usr/local/lib/python1.5/site-packages/xml/dom/builder.py(53)endElement()
-> def endElement(self, name):
(Pdb) p name
'Form'
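
The same kind of event trace can be produced without a debugger by
plugging a small recording handler into the parser. The following is a
sketch only, assuming the standard library's xml.sax on a current
Python; the class name EventTrace and the sample document are made up
for illustration and are not taken from ct.xml.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class EventTrace(ContentHandler):
    """Record start/end events, indented by nesting depth (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def startElement(self, name, attrs):
        self.lines.append("START: " + "  " * self.depth + name)
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1
        self.lines.append("END:   " + "  " * self.depth + "/" + name)


trace = EventTrace()
xml.sax.parseString(
    b"<Praeparate><Praeparat><Name/><Firma/></Praeparat></Praeparate>",
    trace)
print("\n".join(trace.lines))
```

An end event whose name has no matching start event in such a trace
points at the parser (or the document) rather than at the DOM builder.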


From tismer@appliedbiometrics.com  Fri Jan 22 20:59:10 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 22 Jan 1999 21:59:10 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com> <199901222019.VAA00392@lindm.dm>
Message-ID: <36A8E69E.D90FDB0@appliedbiometrics.com>

Dieter Maurer wrote:
> 
> Hello Christian
> 
> Using the PDB, I got the following sequence of parser events:
> 
> START: Praeparate
> START:   Praeparat
> START:     Name
> END:       /Name
> START:     Firma
> END:       /Firma
> END:       /Form
> 
> The last event, obviously, is wrong.
> It seems, "xmlproc" does something wrong.
> 
> I append the PDB log.

Thank you!
Actually I claimed that the XML file was right, but it wasn't 
completely. This one was not closed:
<Packung pharmazentralnummer="4469515"> <!-- something was missing here! -->
</Packung>

But after that change, FileReader still barfs.

With my SAX prettyprinter everything works fine, with sgmlop, 
xmlproc, whatever I used.

So I doubt xmlproc is wrong. There must be something deeper.
Did you recognize the incompatibility of SAX and DOM?
After playing with several SAX tools, it was impossible
to import xml.dom any longer.
Something is wrong, deep in the classes which are already
a little complicated for my small brain.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From akuchlin@cnri.reston.va.us  Fri Jan 22 21:08:01 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 22 Jan 1999 16:08:01 -0500 (EST)
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36A8DF4E.2D3852D7@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A8BFED.BE6C3EF6@appliedbiometrics.com>
 <36A8DF4E.2D3852D7@appliedbiometrics.com>
Message-ID: <13992.59381.973689.430970@amarok.cnri.reston.va.us>

Christian Tismer writes:
>BTW - is sgmlop deprecated?
>It still has some flaws, like not allowing "_" in tagnames.
>Is Fredrik no longer supporting it, or what is the current
>preferred fast parser for all platforms?

	I haven't heard anything about sgmlop being deprecated; as far
as I know it's still being supported, and there is no preferred fast
parser; use sgmlop or PyExpat as you wish.  A while back Fredrik told
me that he still had some small fixes for sgmlop, but he's been
busy since then, and I haven't heard anything more; perhaps the _
problem you report is one of them.

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
Not everybody knows that looking at people in 'a funny way' is the commonest
cause of sudden murder. I happen to know that because I read a Home Office
brochure once.
    -- Tom Baker, in his autobiography


From akuchlin@cnri.reston.va.us  Fri Jan 22 21:26:27 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 22 Jan 1999 16:26:27 -0500 (EST)
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36A8E69E.D90FDB0@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com>
 <199901222019.VAA00392@lindm.dm>
 <36A8E69E.D90FDB0@appliedbiometrics.com>
Message-ID: <13992.60573.126910.674312@amarok.cnri.reston.va.us>

Christian Tismer writes:
>So I doubt xmlproc is wrong. There must be something deeper.
>Did you recognize the incompatibility of SAX and DOM?
>After playing with several SAX tools, it was impossible
>to import xml.dom any longer.

	That's bizarre, and I don't see how that would be possible in Python.
What were the symptoms?  What happened when the import failed?

	(I'll look into the problem with FileReader tonight; no time
to do it at work.)

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
I spent a busy day today, but got little done. This is because I am at last
becoming perfect in the art of seeming busy, even when very little is going on
in my head or under my hands. This is an art which every man learns, if he
does not intend to work himself to death.
    -- Robertson Davies, _The Table Talk of Samuel Marchbanks_


From dieter@handshake.de  Fri Jan 22 21:44:09 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 22 Jan 1999 22:44:09 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36A871E6.89033191@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com>
Message-ID: <199901222144.WAA00911@lindm.dm>

Hello Christian

I have investigated the problem further:

"xmlproc" requires *ALL* attribute values to be enclosed
in either single or double quotes.

The problem is caused by your

   <Darreichung zulassungsnummer="29117.00.00" code="200523" datum="010195" status=F>

more precisely, the "status=F", where the "F" is not enclosed in quotes.

"xmlproc" sees the problem and reports an error "3016" (you will
see it, if you install an error handler). Then it skips beyond
the closing '>'.
However, it is still in attribute processing for "Darreichung"
-- an "xmlproc" bug. In this mode, it cannot understand "<Form>"
and its content and keeps skipping until the "</Form>" which
is reported as end tag -- an end tag without corresponding
start tag.

Dieter
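
The unquoted attribute value is easy to reproduce in isolation. The
sketch below assumes a current Python and its standard xml.sax module,
which is backed by expat rather than xmlproc, so the error number and
wording will differ from xmlproc's. The helper name check and the two
sample documents are made up for illustration.

```python
import xml.sax

# The start tag from the thread, reduced so that the only problem left
# is the unquoted value in status=F.
BAD = (b'<Praeparat><Darreichung code="200523" status=F>'
       b'</Darreichung></Praeparat>')
GOOD = (b'<Praeparat><Darreichung code="200523" status="F">'
        b'</Darreichung></Praeparat>')


def check(document):
    """Return None if the document parses, else a short error description."""
    try:
        # The default error handler raises on fatal errors.
        xml.sax.parseString(document, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as exc:
        return "line %d, column %d: %s" % (
            exc.getLineNumber(), exc.getColumnNumber(), exc.getMessage())
    return None


print(check(GOOD))
print(check(BAD))
```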


From larsga@ifi.uio.no  Sat Jan 23 10:21:08 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Jan 1999 11:21:08 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36A8DF4E.2D3852D7@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> <36A8BFED.BE6C3EF6@appliedbiometrics.com> <36A8DF4E.2D3852D7@appliedbiometrics.com>
Message-ID: <wksod22stn.fsf@ifi.uio.no>

* Christian Tismer
| 
| the appended version of Indenter.py can use sgmlop to format large
| XML files. It then processes a few megabytes in a few seconds.

How is the performance when you use sgmlop directly compared to when
you use its SAX driver?
 
| BTW - is sgmlop deprecated?

If it works with your XML it should be OK, but it does not conform
very closely to the standard, unlike expat.

--Lars M.


From larsga@ifi.uio.no  Sat Jan 23 10:43:59 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Jan 1999 11:43:59 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <199901222144.WAA00911@lindm.dm>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 	<36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm>
Message-ID: <wkr9sm2rrk.fsf@ifi.uio.no>

* Dieter Maurer
| 
| I have investigated the problem further:
| 
| "xmlproc" requires *ALL* attribute values to be enclosed in either
| single or double quotes.

This is correct, simply because the XML standard also requires this.
A document that does not quote all its attribute values is not
well-formed and thus not considered an XML document at all. In fact,
by continuing to report data events to the client after a
well-formedness error, xmlproc violates the standard. 

Since it appears to be so common to not use error handlers, perhaps I
should make it conform. What do people think?
 
| The problem is caused by your
| 
| <Darreichung zulassungsnummer="29117.00.00" code="200523"
|              datum="010195" status=F>
| 
| more precisely, the "status=F", where the "F" is not enclosed in
| quotes.
| 
| "xmlproc" sees the problem and reports an error "3016"

Almost correct, it reports 3004: "One of ' or " expected".

| (you will see it, if you install an error handler).

Just a tip: my experience is that if you don't always install error
handlers little nitty problems with your XML will cause you a lot of
headaches that you can't figure out at first. 

xml.sax.saxutils contains two default error handlers that you can plug
in and use directly. One prints errors, the other raises exceptions.
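
As a rough illustration of the two styles, here is a sketch for a
current Python, where the standard xml.sax.saxutils no longer ships
ready-made handlers of this kind. The class names PrintingErrors and
RaisingErrors are made up; only the ErrorHandler base class and its
three methods come from the standard library.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class PrintingErrors(ErrorHandler):
    """Print every problem and keep a copy; never raises (hypothetical)."""

    def __init__(self):
        self.seen = []

    def _report(self, level, exc):
        self.seen.append((level, str(exc)))
        print("%s: %s" % (level, exc))

    def warning(self, exc):
        self._report("warning", exc)

    def error(self, exc):
        self._report("error", exc)

    def fatalError(self, exc):
        self._report("fatal", exc)


class RaisingErrors(ErrorHandler):
    """Turn errors and fatal errors into exceptions (hypothetical)."""

    def warning(self, exc):
        pass

    def error(self, exc):
        raise exc

    def fatalError(self, exc):
        raise exc


DOC = b"<a status=F/>"

# Style 1: the parse returns normally, the problem is only printed.
printer = PrintingErrors()
xml.sax.parseString(DOC, ContentHandler(), printer)

# Style 2: the first fatal error aborts the parse with an exception.
try:
    xml.sax.parseString(DOC, ContentHandler(), RaisingErrors())
    raised = False
except xml.sax.SAXParseException:
    raised = True
print("raised:", raised)
```

With the printing style the application has to check afterwards whether
anything was reported; with the raising style it cannot miss it.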

| Then it skips beyond the closing '>'.

This is correct. This is xmlproc in 'panic mode'. Since it doesn't do
tokenization it has no clues as to what is coming up next, and tries
to skip to the end of the start tag.

| However, it is still in attribute processing for "Darreichung" -- an
| "xmlproc" bug.

So it is. Even though the application has no right to expect correct
information about the document any more, it is pointless not to get
this right when it is so easy to do it. We'll pay a slight performance
penalty for it, though.

Thank you very much for diagnosing the problem so clearly. I'll fix
this now so that the problem does not occur in 0.60. (0.60, by the
way, should have full support for parameter entities.)

--Lars M.


From fredrik@pythonware.com  Sat Jan 23 11:48:10 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 23 Jan 1999 12:48:10 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
Message-ID: <00b701be46c6$40ea7a10$f29b12c2@pythonware.com>

>BTW - is sgmlop deprecated?
>It still has some flaws, like not allowing "_" in tagnames.
>Is Fredrik no longer supporting it, or what is the current
>preferred fast parser for all platforms?

the XML session on the Houston conference decided
to lobby for sgmlop to be included in a future Python
release.  don't know if anyone is actually doing something
about that, though...

sgmlop was intentionally designed to have a very efficient
Python interface, be small enough to ship with the standard
distribution without anyone noticing, and to be compatible
with both sgmllib and xmllib. it's currently somewhat sloppy
(that is, you can use it to parse most xml data, but you
shouldn't use it to verify that your xml writing code creates
perfectly portable xml).  one big problem with it is that it's
being ignored by the sgmllib and xmllib maintainers, so keeping
things in sync is pretty hard.

on the other hand, looks like people don't care about backwards
compatibility any more.  xmllib in 1.5.2b1 silently broke
ALL our existing XML code (including xmlrpclib).  I'm seriously
considering to just ignore the standard stuff, and stick to our
proprietary XML hacks in future applications...

patches to sgmlop.c are still welcome, though.

Cheers /F


From larsga@ifi.uio.no  Sat Jan 23 15:03:38 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Jan 1999 16:03:38 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <00b701be46c6$40ea7a10$f29b12c2@pythonware.com>
References: <00b701be46c6$40ea7a10$f29b12c2@pythonware.com>
Message-ID: <wk679y2fqt.fsf@ifi.uio.no>

* Fredrik Lundh
|
| on the other hand, looks like people don't care about backwards
| compatibility any more.  xmllib in 1.5.2b1 silently broke ALL our
| existing XML code (including xmlrpclib). 

Did you complain? It's still in beta, so they might revert back, no?

| I'm seriously considering to just ignore the standard stuff, and
| stick to our proprietary XML hacks in future applications...

Maybe you could use SAX? It won't break things if it is at all
possible to avoid it, and at least you'll have a chance to voice your
opinion first here on the XML-SIG. And with mllib you can even get an
xmllib-like interface.

(I qualify this because two things may change: EntityResolver and
setLocale. Hopefully nobody will veto EntityResolver.)

--Lars M.


From tismer@appliedbiometrics.com  Sat Jan 23 15:04:31 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 23 Jan 1999 16:04:31 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm>
Message-ID: <36A9E4FF.76B8E3D5@appliedbiometrics.com>

Dieter Maurer wrote:
> 
> Hello Christian
> 
> I have investigated the problem further:
> 
> "xmlproc" requires *ALL* attribute values to be enclosed
> in either single or double quotes.
> 
> The problem is caused by your
> 
>    <Darreichung zulassungsnummer="29117.00.00" code="200523" datum="010195" status=F>
> 
> more precisely, the "status=F", where the "F" is not enclosed in quotes.

Aaahh, oh, whow, thanks.
Maybe xmlproc should be a little more forgiving for this case
and not skip beyond ">" but just skip (or repair) the attribute.

XMLers, please accept my apology, it was not DOM but a faulty
Python script from my course. My XMLpro viewer didn't complain,
so I thought it was correct.

thanks for all the support - chris
(and especially to Dieter)

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Sat Jan 23 15:14:54 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 23 Jan 1999 16:14:54 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com>
 <199901222019.VAA00392@lindm.dm>
 <36A8E69E.D90FDB0@appliedbiometrics.com> <13992.60573.126910.674312@amarok.cnri.reston.va.us>
Message-ID: <36A9E76E.48EF2A46@appliedbiometrics.com>

Andrew M. Kuchling wrote:
> 
> Christian Tismer writes:
> >So I doubt xmlproc is wrong. There must be something deeper.
> >Did you recognize the incompatibility of SAX and DOM?
> >After playing with several SAX tools, it was impossible
> >to import xml.dom any longer.
> 
>         That's bizarre, and I don't see how that would be possible in Python.
> What were the symptoms?  What happened when the import failed?

it was simply impossible to import xml.dom any longer. xml.sax
was still working. I closed my PyWin session, started over and
it was alive again. I don't know how to reproduce this yet,
but this was the second time it happened. Some combination of trying
this driver and that one... I need to turn my session logger
on forever, then I can see what I did.

>         (I'll look into the problem with FileReader tonight; no time
> to do it at work.)

I think there is something more. When I pass a file object to my
Indenter (which is basically similar to Normalizer) and later
delete that file, also delete all instances of handlers which
I created, the file doesn't get closed. There is something sticky
which keeps references alive.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Sat Jan 23 15:44:02 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 23 Jan 1999 16:44:02 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> <36A8BFED.BE6C3EF6@appliedbiometrics.com> <36A8DF4E.2D3852D7@appliedbiometrics.com> <wksod22stn.fsf@ifi.uio.no>
Message-ID: <36A9EE42.78F166D5@appliedbiometrics.com>

Lars Marius Garshol wrote:
> 
> * Christian Tismer
> |
> | the appended version of Indenter.py can use sgmlop to format large
> | XML files. It then processes a few megabytes in a few seconds.
> 
> How is the performance when you use sgmlop directly compared to when
> you use its SAX driver?

I didn't try yet since I was very happy with the speed.

> | BTW - is sgmlop deprecated?
> 
> If it works with your XML it should be OK, but it does not conform
> very closely to the standard, unlike expat.

I could not use pyexpat yet, since a pyexpat dll is missing.
I will build one for Windows (as I also did before with sgmlop,
the binary in the CVS was broken). I just wasn't aware that
I need to get an extra tar file for that.

When I find the time, I will also provide a patch for sgmlop for
a couple of things.
What I need to find is the fastest acceptable parser which allows
me to turn masses of XML data into Python structures. We don't
work with complicated but smaller documents, but we are processing
XML encoded database records which are quite irregular (useless
to use a relational database) and quite simple, but the standard
size is some 50MB. This is why I'm after speed, much more than
conformance.

A general question (comes up because I had to hack my Indenter
especially for sgmlop):
Is a SAX parser required to report ignorableWhitespace events?
Or is it also allowed to never call this method, as sgmlop does?
If so, then the interface doesn't make too much sense since I have
to collect all data and handle whitespace when the next tag appears.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From larsga@ifi.uio.no  Sat Jan 23 15:54:30 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 23 Jan 1999 16:54:30 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36A9EE42.78F166D5@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> <36A8BFED.BE6C3EF6@appliedbiometrics.com> <36A8DF4E.2D3852D7@appliedbiometrics.com> <wksod22stn.fsf@ifi.uio.no> <36A9EE42.78F166D5@appliedbiometrics.com>
Message-ID: <wk4spi2de1.fsf@ifi.uio.no>

* Lars Marius Garshol
|
| How is the performance when you use sgmlop directly compared to when
| you use its SAX driver?

* Christian Tismer
| 
| I didn't try yet since I was very happy with the speed.

Would be interesting to know, though, since it will tell us something
about what the penalty of using SAX is, compared to doing it directly.

| I could not use pyexpat yet, since a pyexpat dll is missing.  I will
| build one for Windows (as I also did before with sgmlop, the binary
| in the CVS was broken).

Both the pyexpat and the sgmlop DLLs are in CVS and both of them work
for me. Maybe you should try a 'cvs update'? :)

| Is a SAX parser required to report ignorableWhitespace events?

No, and in fact non-validating parsers cannot tell the difference if
they haven't read the DTD. (AElfred reads the DTD to be able to
provide this information, but does not validate.)

See

<URL:http://www.stud.ifi.uio.no/~larsga/download/python/xml/sax-spec.html#DocumentHandler>

| Or is it also allowed to never call this method, as sgmlop does?  If
| so, then the interface doesn't make too much sense since I have to
| collect all data and handle whitespace when the next tag appears.

I agree that this is suboptimal, but the problem springs from the
design of XML itself. Most parsers simply do not have the information
required to know when to call this method.
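
A quick way to see which callback a given parser uses is to record
both. The sketch below assumes a current Python with the standard,
expat-based xml.sax, whose handler methods take a single string instead
of the (data, start, length) triple used in the Indenter; the class name
WhitespaceProbe and the sample document are made up for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class WhitespaceProbe(ContentHandler):
    """Record which callback each run of text arrives through."""

    def __init__(self):
        super().__init__()
        self.via_characters = []
        self.via_ignorable = []

    def characters(self, content):
        self.via_characters.append(content)

    def ignorableWhitespace(self, whitespace):
        self.via_ignorable.append(whitespace)


probe = WhitespaceProbe()
xml.sax.parseString(b"<a>\n  <b>text</b>\n</a>", probe)

# Whitespace-only runs that a DTD-aware parser could have flagged.
blank_runs = [c for c in probe.via_characters if not c.strip()]
print("ignorableWhitespace calls:", len(probe.via_ignorable))
print("whitespace-only characters calls:", len(blank_runs))
```

Without a DTD the whitespace between tags comes in through characters,
so a handler that wants to drop it has to buffer and decide for itself.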

--Lars M.


From tismer@appliedbiometrics.com  Sat Jan 23 16:30:05 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sat, 23 Jan 1999 17:30:05 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> <36A8BFED.BE6C3EF6@appliedbiometrics.com> <36A8DF4E.2D3852D7@appliedbiometrics.com> <wksod22stn.fsf@ifi.uio.no> <36A9EE42.78F166D5@appliedbiometrics.com> <wk4spi2de1.fsf@ifi.uio.no>
Message-ID: <36A9F90D.B6759872@appliedbiometrics.com>

Lars Marius Garshol wrote:
> 
> * Lars Marius Garshol
> |
> | How is the performance when you use sgmlop directly compared to when
> | you use its SAX driver?
> 
> * Christian Tismer
> |
> | I didn't try yet since I was very happy with the speed.
> 
> Would be interesting to know, though, since it will tell us something
> about what the penalty of using SAX is, compared to doing it directly.

I will provide timings when I have time, also with expat.

> | I could not use pyexpat yet, since a pyexpat dll is missing.  I will
> | build one for Windows (as I also did before with sgmlop, the binary
> | in the CVS was broken).
> 
> Both the pyexpat and the sgmlop DLLs are in CVS and both of them work
> for me. Maybe you should try a 'cvs update'? :)

:)) it is *my* dll which is in the cvs now.

But you are right, the (py)expat dlls are all there. I just
cannot import pyexpat. The dlls are not found. sgmlop works
off-the-shelf. Is it necessary to adjust path variables for 
pyexpat? If so, then I'll change the layout for Windows a little
to make this unnecessary. Until now, I could simply plug the
whole package into my Python dir and use it.

And thanks about the info concerning whitespace.

ciao - chris 

p.s.: now busy building an ultra-light DOM which needs less memory
than its XML string representation. It's becoming fun :-)

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From gstein@lyra.org  Sun Jan 24 00:48:52 1999
From: gstein@lyra.org (Greg Stein)
Date: Sat, 23 Jan 1999 16:48:52 -0800
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm> <36A9E4FF.76B8E3D5@appliedbiometrics.com>
Message-ID: <36AA6DF4.662C7ED5@lyra.org>

Christian Tismer wrote:
> 
> Dieter Maurer wrote:
> >
> > Hello Christian
> >
> > I have investigated the problem further:
> >
> > "xmlproc" requires *ALL* attribute values to be enclosed
> > in either single or double quotes.
> >
> > The problem is caused by your
> >
> >    <Darreichung zulassungsnummer="29117.00.00" code="200523" datum="010195" status=F>
> >
> > more precisely, the "status=F", where the "F" is not enclosed in quotes.
> 
> Aaahh, oh, whow, thanks.
> Maybe xmlproc should be a little more forgiving for this case
> and not skip beyond ">" but just skip (or repair) the attribute.

It should *NOT* repair the attribute. That will simply encourage poor
XML authoring. It should report the error properly (or, alternatively,
the error should be responded to properly).

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From larsga@ifi.uio.no  Sun Jan 24 11:28:17 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 12:28:17 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36AA6DF4.662C7ED5@lyra.org>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 			<36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm> <36A9E4FF.76B8E3D5@appliedbiometrics.com> <36AA6DF4.662C7ED5@lyra.org>
Message-ID: <wkaez8vrji.fsf@ifi.uio.no>

* Christian Tismer
| 
| Maybe xmlproc should be a little more forgiving for this case and
| not skip beyond ">" but just skip (or repair) the attribute.

* Greg Stein
| 
| It should *NOT* repair the attribute. That will simply encourage
| poor XML authoring. It should report the error properly (or,
| alternatively, the error should be responded to properly).

The error is reported properly as it is and the attribute is not
repaired, but subsequent data events are wrong. That's now fixed (data
events, not the attribute), but the question remains whether the
parser should follow the XML recommendation and stop reporting data
events after a well-formedness bug.

I'm inclined to make that the default behaviour, but one that can
be turned off. Opinions are welcome.

--Lars M.


From gstein@lyra.org  Sun Jan 24 11:39:35 1999
From: gstein@lyra.org (Greg Stein)
Date: Sun, 24 Jan 1999 03:39:35 -0800 (PST)
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <wkaez8vrji.fsf@ifi.uio.no>
Message-ID: <Pine.SUN.3.95.990124033848.3315A-100000@svpal.svpal.org>

On 24 Jan 1999, Lars Marius Garshol wrote:
> The error is reported properly as it is and the attribute is not
> repaired, but subsequent data events are wrong. That's now fixed (data
> events, not the attribute), but the question remains whether the
> parser should follow the XML recommendation and stop reporting data
> events after a well-formedness bug.
> 
> I'm inclined to make that default behaviour, but behaviour it is
> possible to turn off. Opinions are welcome.

Sounds good -- default is to "abort" on bad input.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From larsga@ifi.uio.no  Sun Jan 24 12:20:09 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 13:20:09 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <Pine.SUN.3.95.990124033848.3315A-100000@svpal.svpal.org>
References: <Pine.SUN.3.95.990124033848.3315A-100000@svpal.svpal.org>
Message-ID: <wk3e50vp52.fsf@ifi.uio.no>

* Lars Marius Garshol
|
| [...] the question remains whether the parser should follow the XML
| recommendation and stop reporting data events after a
| well-formedness bug.
| 
| I'm inclined to make that default behaviour, but behaviour it is
| possible to turn off. Opinions are welcome.

* Greg Stein
| 
| Sounds good -- default is to "abort" on bad input.

I know, but the user might want to know if there are more errors, to
avoid having to run the parser n times for n well-formedness errors.
So I prefer not reporting more data events, but to keep sending error
events. The application can stop the parse at any time by throwing an
exception, anyway.
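
A minimal sketch of that policy, assuming a hypothetical EventGate class
that sits between a parser and the application. None of these names come
from xmlproc; the sketch only illustrates dropping data events after the
first error while still passing every error event through.

```python
class EventGate:
    """Forward parser events, dropping data events once an error was seen."""

    def __init__(self, on_data, on_error, stop_data_on_error=True):
        self.on_data = on_data            # callback for data events
        self.on_error = on_error          # callback for error events
        self.stop_data_on_error = stop_data_on_error
        self.error_count = 0

    def data(self, event):
        # After the first error, data events are silently discarded
        # (unless the caller switched that behaviour off).
        if self.stop_data_on_error and self.error_count:
            return
        self.on_data(event)

    def error(self, message):
        # Error events are always delivered, so one run can list them all.
        self.error_count += 1
        self.on_error(message)


seen_data, seen_errors = [], []
gate = EventGate(seen_data.append, seen_errors.append)
gate.data("start:doc")
gate.error("attribute value not quoted")
gate.data("chars:hello")          # dropped: an error was already reported
gate.error("mismatched end tag")  # still reported
```

Passing stop_data_on_error=False gives the "turn it off" mode, in which
data events keep flowing after an error.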

Thanks for the opinion. Once I get a couple more of those I'll do the
necessary patch.

--Lars M.


From fredrik@pythonware.com  Sun Jan 24 12:37:37 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 24 Jan 1999 13:37:37 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
Message-ID: <006001be4796$54879230$f29b12c2@pythonware.com>

>A general question (comes up because I had to hack my Indenter
>especially for sgmlop):
>Is a SAX parser required to report ignorableWhitespace events?
>Or is it also allowed to never call this method, as sgmlop does?
>If so, then the interface doesn't make too much sense since I have
>to collect all data and handle whitespace when the next tag appears.

If I understand things correctly, sgmlop cannot figure
out what's ignorable and not; you need to have access
to the DTD to handle that.

Our internal xml libraries allow the user to indicate
whether a resource is "xml text" or "xml data".  the
latter doesn't allow elements to contain both text
and other elements, which means that it's easy to
figure out what to ignore.

Cheers /F
fredrik@pythonware.com
http://www.pythonware.com


From tismer@appliedbiometrics.com  Sun Jan 24 12:29:19 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 24 Jan 1999 13:29:19 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm> <36A9E4FF.76B8E3D5@appliedbiometrics.com> <36AA6DF4.662C7ED5@lyra.org>
Message-ID: <36AB121F.7620E5D@appliedbiometrics.com>

Greg Stein wrote:
> 
> Christian Tismer wrote:
...
> > Aaahh, oh, whow, thanks.
> > Maybe xmlproc should be a little more forgiving for this case
> > and not skip beyond ">" but just skip (or repair) the attribute.
> 
> It should *NOT* repair the attribute. That will simply encourage poor
> XML authoring. It should report the error properly (or, alternatively,
> the error should be responded to properly).

Well, I agree. It should not encourage bad authoring.
But I, as a complete newbie to a SIG which is evolving quickly,
was kind of struggling with a lot of code, many parsers, and so
on. I think, others will get into at least as much trouble
as I had.
Furthermore, the file which I wanted to inspect wasn't mine.
What should I do if I'm confronted with foreign XML files
which have some flaws, and the parser doesn't make it through
them? The argument is fine for me, but in this case I have
no chance.
For my custom work, it would be best to have a parser which
*does* complain about an error, but also repairs easy cases
like this. This gives me a chance to work with the file,
inspect it and complain to my customer.
This is easy after all since I now know enough of
the XML package and can help myself.

The remaining question is: How should faulty XML be handled
at all? There are enough examples where you cannot simply
reject the document. You need to read it.
Does it make sense to think of a "correcting"
parser which turns a bad document into something well-formed
which can be inspected with an XML browser, together with
some error-annotation tags?

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From gstein@lyra.org  Sun Jan 24 12:41:14 1999
From: gstein@lyra.org (Greg Stein)
Date: Sun, 24 Jan 1999 04:41:14 -0800 (PST)
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36AB121F.7620E5D@appliedbiometrics.com>
Message-ID: <Pine.SUN.3.95.990124043323.3520A-100000@svpal.svpal.org>

On Sun, 24 Jan 1999, Christian Tismer wrote:
> Well, I agree. It should not encourage bad authoring.
> But I, as a complete newbie to a SIG which is very evolving,
> was kind of struggling with a lot of code, many parsers, and so
> on. I think, others will get into at least as much trouble
> as I had.

Well, that was simply because the errors weren't reported properly. That
can be fixed.

> Furthermore, the file which I wanted to inspect wasn't mine.
> What should I do if I'm confronted with foreign XML files
> which have some flaws, and the parser doesn't make it through
> it. The argument is fine for me, but in this case I have
> no chance.

Push back against where the file came from. What if somebody sent you a
bad executable? Do you try to correct it? What if they send a bad MSFT
Word file? Do you try to correct it? Makefiles with spaces instead of
tabs? crontab files with a missing column? etc. etc.

Well, the same for XML. If it is bad, then you ask for a correct one. Why
should XML be any different than the multitude of documents that you deal
with every day?

> For my custom work, it would be best to have a parser which
> *does* complain about an error, but also repairs easy cases
> like this. This gives me a chance to work with the file,
> inspect it and complain to my customer.
> This is easy after all since I now know enough of
> the XML package and can help myself.

By default, it should not correct it. That simply continues to encourage
poor XML authoring. As a programmer, if you want to try to auto-correct,
then okay, but I would not recommend it.

> The remaining question is: How should faulty XML be handled
> at all? There are enough examples where you cannot simply
> reject the document. You need to read it.
> Does it make sense to think of a "correcting"
> parser which turns a bad document into something well-formed
> which can be inspected with an XML browser, together with
> some error-annotation tags?

No. No. No. No....

HTML is a huge mess because people started writing parsers that were
flexible and would correct things for you. Go try to write an HTML parser
that works against all the stuff out on the Internet. It is frightening
how difficult that is. There is just so much crap out there because people
said, "well, we can just correct that for them." Mismatched tags. Missing
quotes. Illegal characters. Missing close brackets. Simply crap.

With XML, the designers said, "No way. The document has to be correct, or
it gets rejected. Tough shit for the authors of bad documents."

Yes, I'm rather fascist on this one :-). I simply cannot condone or
recommend *any* allowance of flexibility in parsers. That will just lead
us back to the horrible situation that we are in now with HTML.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/


From tismer@appliedbiometrics.com  Sun Jan 24 13:22:55 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 24 Jan 1999 14:22:55 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <Pine.SUN.3.95.990124043323.3520A-100000@svpal.svpal.org>
Message-ID: <36AB1EAF.7F0273E5@appliedbiometrics.com>

Greg Stein wrote:
[but the file was from my course, and I'm correcting their homework]

> Push back against where the file came from. What if somebody sent you a
> bad executable? Do you try to correct it? What if they send a bad MSFT
> Word file? Do you try to correct it? Makefiles with spaces instead of
> tabs? crontab files with a missing column? etc. etc.

:-) Of course, I usually don't correct them. No exes.
Word files: Sometimes, if they come to me, whining about
their single copy of a Word file which is broken. I can give
them the plain text back in most cases, and this is ok.

> Well, the same for XML. If it is bad, then you ask for a correct one. Why
> should XML be any different than the multitude of documents that you deal
> with every day?

I'd say, since XML is not binary but very redundant ascii which
I can read, and also most often understand and correct by hand,
it is not so simple. You could also throw a faulty C program
away since it is not proper C. Instead, I correct it.
Well, this was a bit far off, but somewhere between is the truth.

...
> By default, it should not correct it. That simply continues to encourage
> poor XML authoring. As a programmer, if you want to try to auto-correct,
> then okay, but I would not recommend it.

150% agreed.

[correcting parser]
> No. No. No. No....
> 
> HTML is a huge mess because people started writing parsers that were
> flexible and would correct things for you. Go try to write an HTML parser
> that works against all the stuff out on the Internet. It is frightening
> how difficult that is. There is just so much crap out there because people
> said, "well, we can just correct that for them." Mismatched tags. Missing
> quotes. Illegal characters. Missing close brackets. Simply crap.

Yes, I also don't want this again. You are right.

> With XML, the designers said, "No way. The document has to be correct, or
> it gets rejected. Tough shit for the authors of bad documents."
> 
> Yes, I'm rather fascist on this one :-). I simply cannot condone or
> recommend *any* allowance of flexibility in parsers. That will just lead
> us back to the horrible situation that we are in now with HTML.

Ok, let me put it differently, since my thought was different.

I don't want bad XML to be corrected automatically.
Instead, when it is rejected, I thought of generating a
different document, say an "error document" which gives
a description of the errors. This is a new (well-formed:)
XML document which wraps the source, inserts comments
or anything where the parsing broke, leaves correct
passages intact so far, but of course does not try
to produce correct XML from wrong XML. I'd apply this
tool to a file after I know it is wrong, for debugging
purposes. A little like a compiler listing.
Maybe it would suffice to escape the wrong parts and add the
XML error code and message to the error doc.
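
A minimal sketch of such an "error document", assuming today's standard
library xml.sax module rather than the XML package discussed here. The
function name error_document and the element names error-report, line
and error are made up for illustration. Only the first error is
annotated, since the underlying parser stops there.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import escape, quoteattr


def error_document(source):
    """Wrap possibly broken XML text in a well-formed report document."""
    err_line = message = None
    try:
        xml.sax.parseString(source.encode("utf-8"), ContentHandler())
    except xml.sax.SAXParseException as exc:
        err_line = exc.getLineNumber()   # 1-based line of the first error
        message = exc.getMessage()

    out = ["<error-report>"]
    annotated = False
    for number, text in enumerate(source.splitlines(), start=1):
        # The source is escaped, never repaired.
        out.append('  <line n="%d">%s</line>' % (number, escape(text)))
        if number == err_line:
            out.append("  <error message=%s/>" % quoteattr(message))
            annotated = True
    if message is not None and not annotated:
        # The error was reported past the last line, e.g. at end of input.
        out.append("  <error message=%s/>" % quoteattr(message))
    out.append("</error-report>")
    return "\n".join(out)


report = error_document("<a status=F>text</a>")
```

The sketch does not handle control characters that are illegal in XML;
a real tool would have to replace those before writing the report.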

This was my reason to write the little indenter - debugging.

Thanks for your commitment, we're on the same side - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Sun Jan 24 13:34:12 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 24 Jan 1999 14:34:12 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <006001be4796$54879230$f29b12c2@pythonware.com>
Message-ID: <36AB2154.658633AD@appliedbiometrics.com>

Fredrik Lundh wrote:
> 
> >A general question (comes up because I had to hack my Indenter
> >especially for sgmlop):
> >Is a SAX parser required to report ignorableWhitespace events?
> >Or is it also allowed to never call this method, as sgmlop does?
> >If so, then the interface doesn't make too much sense since I have
> >to collect all data and handle whitespace when the next tag appears.
> 
> If I understand things correctly, sgmlop cannot figure
> out what's ignorable and not; you need to have access
> to the DTD to handle that.

Well, I understand. Lars also mentioned that without a
DTD and a parser which understands it, this event is useless.

> Our internal xml libraries allow the user to indicate
> whether a resource is "xml text" or "xml data".  the
> latter doesn't allow elements to contain both text
> and other elements, which means that it's easy to
> figure out what to ignore.

That sounds good, this is exactly what we need to distinguish,
too. How do you indicate this without a DTD?
A list of tags which are treated as raw data? (kind of a
sub-sub-DTD?)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From fredrik@pythonware.com  Sun Jan 24 13:50:02 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 24 Jan 1999 14:50:02 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
Message-ID: <003f01be47a0$71978f10$f29b12c2@pythonware.com>

>> Our internal xml libraries allow the user to indicate
>> whether a resource is "xml text" or "xml data".  the
>> latter doesn't allow elements to contain both text
>> and other elements, which means that it's easy to
>> figure out what to ignore.
>
>That sounds good, this is exactly what we need to distinguish,
>too. How do you indicate this without a DTD?

the caller must tell the library what to do based on
his/her knowledge of the DTD in question.

(in my experience, most data-oriented DTD's are
"xml data" in the sense that values are only stored
in leaf elements.  That's definitely true for every-
thing we design).

Cheers /F
fredrik@pythonware.com
http://www.pythonware.com


From digitome@iol.ie  Sun Jan 24 14:15:52 1999
From: digitome@iol.ie (Sean Mc Grath)
Date: Sun, 24 Jan 1999 14:15:52 +0000
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <Pine.SUN.3.95.990124043323.3520A-100000@svpal.svpal.org>
References: <36AB121F.7620E5D@appliedbiometrics.com>
Message-ID: <3.0.6.32.19990124141552.009262f0@gpo.iol.ie>

[Greg Stein]
>
>Push back against where the file came from. What if somebody sent you a
>bad executable? Do you try to correct it? What if they send a bad MSFT
>Word file? Do you try to correct it? Makefiles with spaces instead of
>tabs? crontab files with a missing column? etc. etc.
>
>Well, the same for XML. If it is bad, then you ask for a correct one. Why
>should XML be any different than the multitude of documents that you deal
>with every day?
>

Some "document" types such as C++ source code for example
benefit, in my opinion, from error recovery parsing. Nobody
wants a C++ compiler to generate executable code in the face of
errors but getting a listing of more than one error
increases your chances of fixing more than one error
in a single edit-compile cycle.

I believe an analogy with XML here is valid.
In production use, it makes total sense for an XML parser
to stop stone dead on error. For development use,
an XML parser that can recover from certain types
of error is a darned useful thing.

To give a concrete example, an XML parser with
optional error recovery would be wonderful for
XML up-translation work. There are many occasions
when you have automated the creation of pseudo-XML
and you want to cut code to get it the rest of the
way to full XML. Stop dead parsers are useless for
this type of work.

So, I would like to see xmlproc having
some optional error recovery functionality
that I could turn on for up-translation
parsing.

I realize that this is a contentious opinion:-)


<Sean uri="http://www.digitome.com/sean.htm"/>


From larsga@ifi.uio.no  Sun Jan 24 15:00:32 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 16:00:32 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36AB2154.658633AD@appliedbiometrics.com>
References: <006001be4796$54879230$f29b12c2@pythonware.com> <36AB2154.658633AD@appliedbiometrics.com>
Message-ID: <wkzp78u35b.fsf@ifi.uio.no>

* Christian Tismer
|
| [ignorableWhitespace] 
|
| Well, I understand. Lars also mentioned that without a DTD and a
| parser which understands it, this event is useless.

Not useless, just impossible to fire as distinguished from the
characters event.
 
* Fredrik Lundh
|
| Our internal xml libraries allow the user to indicate whether a
| resource is "xml text" or "xml data".  the latter doesn't allow
| elements to contain both text and other elements, which means that
| it's easy to figure out what to ignore.

This sounds like a good approach to me. The XML recommendation
(sensibly) requires parsers to report all whitespace to the
application, but an application-specific layer on top of that sounds
good to me.
 
* Christian Tismer
|
| That sounds good, this is exactly what we need to distinguish,
| too. How do you indicate this without a DTD?  A list of tags which
| are treated as raw data? (kind of a sub-sub-DTD?)

Why not make a simple SAX parser filter that reads in such a list of
element type names and then filters characters events into characters
and ignorableWhitespace, possibly also doing whitespace normalization?

Sounds like something that is both simple to develop and eminently
reusable. 
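
A minimal sketch of such a filter, assuming the modern standard library
xml.sax module (XMLFilterBase from xml.sax.saxutils) rather than the SAX
driver interface of this package. The names WhitespaceFilter, Recorder
and split_whitespace are made up for illustration, and whitespace
normalization is left out.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class WhitespaceFilter(XMLFilterBase):
    """Report whitespace-only text in listed elements as ignorable."""

    def __init__(self, parent, element_only):
        super().__init__(parent)
        # Element type names whose content is known to be elements only.
        self._element_only = frozenset(element_only)
        self._open = []   # stack of currently open element names

    def startElement(self, name, attrs):
        self._open.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        self._open.pop()
        super().endElement(name)

    def characters(self, content):
        inside = self._open[-1] if self._open else None
        if inside in self._element_only and not content.strip():
            super().ignorableWhitespace(content)
        else:
            super().characters(content)


class Recorder(ContentHandler):
    """Collect the two kinds of text events separately."""

    def __init__(self):
        super().__init__()
        self.text = []
        self.ignorable = []

    def characters(self, content):
        self.text.append(content)

    def ignorableWhitespace(self, whitespace):
        self.ignorable.append(whitespace)


def split_whitespace(document, element_only):
    recorder = Recorder()
    reader = WhitespaceFilter(xml.sax.make_parser(), element_only)
    reader.setContentHandler(recorder)
    reader.parse(io.BytesIO(document))
    return "".join(recorder.text), "".join(recorder.ignorable)


text, ignorable = split_whitespace(
    b"<list>\n  <item>a b</item>\n  <item> c </item>\n</list>", {"list"})
```

Here only "list" is declared as element-only, so the indentation between
the items is reported as ignorable while the spaces inside each item
stay ordinary character data.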

--Lars M.


From tismer@appliedbiometrics.com  Sun Jan 24 15:18:55 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 24 Jan 1999 16:18:55 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <003f01be47a0$71978f10$f29b12c2@pythonware.com>
Message-ID: <36AB39DF.9AB57131@appliedbiometrics.com>

Fredrik,

Playing a little more with sgmlop, I realized that it
doesn't resolve entities when run under SAX.

What's the problem? Is there any obstacle other than the time needed?
Should I try to add this, or forget about SAX and use
sgmlop directly?

I'm still very happy with this and would like to work
on it, but need advice.
If no entityresolver is defined, shouldn't the standard
entities &lt; &gt; &amp; be resolved internally?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From tismer@appliedbiometrics.com  Sun Jan 24 15:30:45 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 24 Jan 1999 16:30:45 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <006001be4796$54879230$f29b12c2@pythonware.com> <36AB2154.658633AD@appliedbiometrics.com> <wkzp78u35b.fsf@ifi.uio.no>
Message-ID: <36AB3CA5.8D95AAD3@appliedbiometrics.com>

Lars Marius Garshol wrote:
> 
> * Christian Tismer
> |
> | [ignorableWhitespace]
> |
> | Well, I understand. Lars also mentioned that without a DTD and a
> | parser which understands it, this event is useless.
> 
> Not useless, just impossible to fire as distinguished from the
> characters event.

But after all, I'm baffled. I got whitespace events when
I didn't specify the parser. It seems it was using xmlproc.
xmlproc reported whitespace to me, I think, between the
closing tag of a sublevel and the next closing tag,
i.e.
      </thisthing>
  </outerthing>

Between these I got whitespace, ignored it, and handled my
own indentation, and everything looked pretty.

Is this correct behavior, then?

...
> Why not make a simple SAX parser filter that reads in such a list of
> element type names and then filters characters events into characters
> and ignorableWhitespace, possibly also doing whitespace normalization?
> 
> Sounds like something that is both simple to develop and eminently
> reusable.

Well, good idea. For many simple data applications, it also
makes sense to simply default to keeping whitespace at leaf nodes,
as Fredrik pointed out.
But before, I have to understand that topic above :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From paul@prescod.net  Sun Jan 24 16:25:26 1999
From: paul@prescod.net (Paul Prescod)
Date: Sun, 24 Jan 1999 10:25:26 -0600
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 			<36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm> <36A9E4FF.76B8E3D5@appliedbiometrics.com> <36AA6DF4.662C7ED5@lyra.org> <wkaez8vrji.fsf@ifi.uio.no>
Message-ID: <36AB4976.1C0983CA@prescod.net>

Lars Marius Garshol wrote:
> 
> The error is reported properly as it is and the attribute is not
> repaired, but subsequent data events are wrong. That's now fixed (data
> events, not the attribute), but the question remains whether the
> parser should follow the XML recommendation and stop reporting data
> events after a well-formedness bug.
> 
> I'm inclined to make that default behaviour, but behaviour it is
> possible to turn off. Opinions are welcome.

I think that optional error recovery is a good idea. There are legitimate
uses for it and also the potential for serious abuse. If I ever used an
XML editor that refused to load half of a document because of missing
quotes I would dump it Pretty Damn Quick.

 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"You have the wrong number."
"Eh? Isn't that the Odeon?"
"No, this is the Great Theater of Life. Admission is free, but the 
taxation is mortal. You come when you can, and leave when you must. The 
show is continuous. Good-night." -- Robertson Davies, "The Cunning Man"


From larsga@ifi.uio.no  Sun Jan 24 17:17:08 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 18:17:08 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36AB39DF.9AB57131@appliedbiometrics.com>
References: <003f01be47a0$71978f10$f29b12c2@pythonware.com> <36AB39DF.9AB57131@appliedbiometrics.com>
Message-ID: <wku2xgtwtn.fsf@ifi.uio.no>

* Christian Tismer
| 
| Playing a little more with sgmlop, I realized that it doesn't
| resolve entities when run under SAX.
| 
| [...]  Should I try to add this, or forget about SAX and use sgmlop
| directly?

If it's possible, I'd very much like either you or me to add it to the
driver. As far as I can see one must set a handle_entity handler that
does this somehow. Don't know the exact details, though.
 
| If no entityresolver is defined, shouldn't the standard entities
| &lt; &gt; &amp; be resolved internally?

Yes. This is part of the XML recommendation. However, EntityResolver
is only used for external entities, not internal ones.
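
A minimal sketch of resolving the predefined entities internally. The
function name resolve_entityref and its declared argument are
hypothetical and are not the sgmlop or driver interface; only the table
of five predefined entities comes from the XML recommendation.

```python
# The five entities every XML processor must recognise without a DTD.
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "apos": "'", "quot": '"'}


def resolve_entityref(name, declared=None):
    """Return the replacement text for the reference &name;."""
    if name in PREDEFINED:
        return PREDEFINED[name]
    # Internal entities declared in the DTD would be looked up here.
    if declared and name in declared:
        return declared[name]
    raise KeyError("undeclared entity: &%s;" % name)


less_than = resolve_entityref("lt")
company = resolve_entityref("co", {"co": "Example Corp"})
```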

--Lars M.


From tismer@appliedbiometrics.com  Sun Jan 24 18:12:06 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Sun, 24 Jan 1999 19:12:06 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <003f01be47a0$71978f10$f29b12c2@pythonware.com> <36AB39DF.9AB57131@appliedbiometrics.com> <wku2xgtwtn.fsf@ifi.uio.no>
Message-ID: <36AB6276.6098AD89@appliedbiometrics.com>

Lars Marius Garshol wrote:
> 
> * Christian Tismer
> |
> | Playing a little more with sgmlop, I realized that it doesn't
> | resolve entities when run under SAX.
> |
> | [...]  Should I try to add this, or forget about SAX and use sgmlop
> | directly?
> 
> If it's possible, I'd very much like either you or me to add it to the
> driver. As far as I can see one must set a handle_entity handler that
> does this somehow. Don't know the exact details, though.

Fredrik handled this differently: he has an extra mode for SAX
where he does not use his callback for entities. I have no
idea why, must wait for his answer.

> | If no entityresolver is defined, shouldn't the standard entities
> | &lt; &gt; &amp; be resolved internally?
> 
> Yes. This is part of the XML recommendation. However, EntityResolver
> is only used for external entities, not internal ones.

Aha! And sgmlop didn't do this, so that's the reason why I got
&amp;lt in my attributes which contained "<" encoded as &lt;

So this is funny: If I just do some reformatting and juggling,
the process is this: The parser gives me characters and
tags and entities and whatsoever, strips the encodings off,
and I have to insert them back. What a mess.
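
A minimal sketch of putting the escapes back on output, assuming the
escape and quoteattr helpers from today's standard library module
xml.sax.saxutils. The start_tag helper is made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def start_tag(name, attrs):
    """Serialize a start tag, re-escaping attribute values."""
    parts = [name]
    for key, value in attrs.items():
        # quoteattr escapes the value and adds the surrounding quotes.
        parts.append("%s=%s" % (key, quoteattr(value)))
    return "<" + " ".join(parts) + ">"


tag = start_tag("price", {"rule": "a < b"})   # value as the parser delivered it
text = escape("R&D <draft>")                  # character data for output
# If the parser did NOT resolve the reference, escaping the raw text
# produces the doubled form seen above.
doubled = escape("&lt;")
```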

It appears to me that XML parsers are already doing quite a lot,
also in cases where I don't need it. In my case, I would have
been comfortable with a kind of XML scanner which just recognizes
tokens, makes no attempt to resolve anything or to parse and
reorder the parameters (which is ok but I hate it), and
gives the plain text to me.
From that point of view, my basic simple parser building block
would be something which can correctly recognize tags and doesn't
change anything, just gives me indices into the text.
Marc Lemburg's tagging engine springs to mind...

Anyway, if sgmlop doesn't resolve external entities but handles
the standards internally, this is ok with me. Again, I need
advice from /F.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From larsga@ifi.uio.no  Sun Jan 24 20:48:39 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 21:48:39 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: <36AB4976.1C0983CA@prescod.net>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 			<36A871E6.89033191@appliedbiometrics.com> <199901222144.WAA00911@lindm.dm> <36A9E4FF.76B8E3D5@appliedbiometrics.com> <36AA6DF4.662C7ED5@lyra.org> <wkaez8vrji.fsf@ifi.uio.no> <36AB4976.1C0983CA@prescod.net>
Message-ID: <wkognotn14.fsf@ifi.uio.no>

* Lars Marius Garshol wrote:
| 
| The error is reported properly as it is and the attribute is not
| repaired, but subsequent data events are wrong. That's now fixed
| (data events, not the attribute), but the question remains whether
| the parser should follow the XML recommendation and stop reporting
| data events after a well-formedness bug.
| 
| I'm inclined to make that default behaviour, but behaviour it is
| possible to turn off. Opinions are welcome.

Since Christian, Greg, Paul and Sean all seem to be in agreement that
this is a good idea I've now made this change. It will appear in 0.60
together with a lot of other stuff.

--Lars M.


From Jack.Jansen@cwi.nl  Sun Jan 24 20:55:26 1999
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Sun, 24 Jan 1999 21:55:26 +0100
Subject: [XML-SIG] Big Bug? (was:Pretty-printing DOM trees)
In-Reply-To: Message by Lars Marius Garshol <larsga@ifi.uio.no> ,
 24 Jan 1999 12:28:17 +0100 , <wkaez8vrji.fsf@ifi.uio.no>
Message-ID: <UTC199901242055.VAA05521.jack@snelboot.cwi.nl>

Recently, Lars Marius Garshol <larsga@ifi.uio.no> said:
> The error is reported properly as it is and the attribute is not
> repaired, but subsequent data events are wrong. That's now fixed (data
> events, not the attribute), but the question remains whether the
> parser should follow the XML recommendation and stop reporting data
> events after a well-formedness bug.
> 
> I'm inclined to make that default behaviour, but behaviour it is
> possible to turn off. Opinions are welcome.

This sounds like the right way to go. Most applications should stop on 
non-well-formed documents, but there are definitely applications that
should be able to continue (like applications that try to repair
documents). It would be a bit silly to have to hand-craft code for
these if it could be an optional feature of the standard parser.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@cwi.nl      | ++++ if you agree copy these lines to your sig ++++
http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From uche.ogbuji@fourthought.com  Sun Jan 24 21:06:36 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sun, 24 Jan 1999 14:06:36 -0700
Subject: [XML-SIG] xmlproc, SAX and EntityResolver
Message-ID: <199901242106.OAA00826@malatesta.local>

According to the (Java) SAX docs,

"""
public interface EntityResolver 

Basic interface for resolving entities. 

If a SAX application needs to implement customized handling for external 
entities, it must implement this interface and register
an instance with the SAX parser using the parser's setEntityResolver method.

The parser will then allow the application to intercept any external entities 
(including the external DTD subset and external
parameter entities, if any) before including them.
"""

And this is how the xmlproc in xml-0.4 used to work.  If I implemented 
entityResolver in a handler, and registered it, I'd get the entity events for 
the external DTD declaration as well as any other entities declared.

This no longer appears to work in xml-0.5.  Unfortunately, my current code is 
pretty complex, and I first of all want to make sure this wasn't an 
intentional change.  I'm pretty sure I've narrowed it to xmlproc, but if I'm 
told this should _not_ be so, I'll work on a stripped-down test-case.

Thanks.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From larsga@ifi.uio.no  Sun Jan 24 21:14:23 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 22:14:23 +0100
Subject: [XML-SIG] Re: xmlproc, SAX and EntityResolver
In-Reply-To: <199901242106.OAA00826@malatesta.local>
References: <199901242106.OAA00826@malatesta.local>
Message-ID: <wkn238tlu8.fsf@ifi.uio.no>

* uche ogbuji
| 
| [reporting of external DTD subset and external parameter entities]
|
| This no longer appears to work in xml-0.5.  Unfortunately, my
| current code is pretty complex, and I first of all want to make sure
| this wasn't an intentional change.  I'm pretty sure I've narrowed it
| to xmlproc, but if I'm told this should _not_ be so, I'll work on a
| stripped-down test-case.

As I recall, you fixed the reporting of the external DTD subset in
xmlproc (the version in xml-0.5). However, you didn't do it correctly,
so in my development code I have the correct patch (which is not
released yet).

Adding this to drv_xmlproc.py should do the trick for the external
subset:

    def resolve_doctype_pubid(self,pubid,sysid):
        return self.ent_handler.resolveEntity(pubid,sysid)

This will not affect external parameter entities. If you need those as
well, let me know. They will be reported by 0.60, but that may still
be a couple of weeks into the future.


A worse problem is that, as Paul pointed out, Python SAX
EntityResolvers return the system identifier of the entity rather than
an object from which the entity contents can be read. I intend to fix
this when we release SAX 2.0 unless someone screams loudly.
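
For comparison, here is a minimal sketch of the "return an object the
contents can be read from" style, written against the xml.sax module in
today's Python standard library rather than the 1999 saxlib discussed
here. The DTD_TEXT table, RecordingResolver and parse_with_resolver are
names made up for illustration; the resolver serves the DTD from memory
so that nothing is fetched from disk or the network.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical in-memory "catalog": system identifier -> entity text.
DTD_TEXT = {"addr_book.dtd": b"<!ELEMENT ADDRBOOK EMPTY>"}


class RecordingResolver(EntityResolver):
    """Records each lookup and returns a readable InputSource."""

    def __init__(self):
        self.seen = []

    def resolveEntity(self, publicId, systemId):
        self.seen.append((publicId, systemId))
        source = InputSource(systemId)
        # Unknown identifiers resolve to an empty entity in this sketch.
        source.setByteStream(io.BytesIO(DTD_TEXT.get(systemId, b"")))
        return source


def parse_with_resolver(document):
    """Parse `document` (bytes) and return the lookups the resolver saw."""
    resolver = RecordingResolver()
    parser = xml.sax.make_parser()
    # Recent Python releases leave external entities off by default.
    parser.setFeature(feature_external_ges, True)
    parser.setContentHandler(ContentHandler())
    parser.setEntityResolver(resolver)
    parser.parse(io.BytesIO(document))
    return resolver.seen


if __name__ == "__main__":
    doc = b'<!DOCTYPE ADDRBOOK SYSTEM "addr_book.dtd"><ADDRBOOK/>'
    print(parse_with_resolver(doc))
```

Returning a plain system identifier string from resolveEntity is also
accepted by that module, which is the behaviour described above.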

--Lars M.


From uche.ogbuji@fourthought.com  Sun Jan 24 22:24:19 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sun, 24 Jan 1999 15:24:19 -0700
Subject: [XML-SIG] Re: xmlproc, SAX and EntityResolver
In-Reply-To: Your message of "24 Jan 1999 22:14:23 +0100."
 <wkn238tlu8.fsf@ifi.uio.no>
Message-ID: <199901242224.PAA00906@malatesta.local>

> 
> * uche ogbuji
> | 
> | [reporting of external DTD subset and external parameter entities]
> |
> | This no longer appears to work in xml-0.5.  Unfortunately, my
> | current code is pretty complex, and I first of all want to make sure
> | this wasn't an intentional change.  I'm pretty sure I've narrowed it
> | to xmlproc, but if I'm told this should _not_ be so, I'll work on a
> | stripped-down test-case.
> 
> As I recall, you fixed the reporting of the external DTD subset in
> xmlproc (the version in xml-0.5). However, you didn't do it correctly,
> so in my development code I have the correct patch (which is not
> released yet).

Oh yeah.  I actually fixed this for xml-0.4, but it was so long ago that I 
forgot.  I recently discovered that I had the wrong sym-link, and I've been 
using xml-0.4 instead of xml-0.5, even after installing the latter.  So when I 
fixed the link and started using xml-0.5, my patch to report the external DTD 
subset wasn't there.  Duh!

You'd mentioned before that this patch of mine was "wrong", but I didn't know
how to do it the "right" way, so thanks for the code snippet below.

> Adding this to drv_xmlproc.py should do the trick for the external
> subset:
> 
>     def resolve_doctype_pubid(self,pubid,sysid):
>         return self.ent_handler.resolveEntity(pubid,sysid)

Unfortunately, on preliminary testing it doesn't appear to work.  I'll work on 
an isolated test case and get back to you.

> This will not affect external parameter entities. If you need those as
> well, let me know. They will be reported by 0.60, but that may still
> be a couple of weeks into the future.

It looks as if 0.60 will be very helpful to me when it's released.

> A worse problem is that, as Paul pointed out, Python SAX
> EntityResolvers return the system identifier of the entity rather than
> an object from which the entity contents can be read. I intend to fix
> this when we release SAX 2.0 unless someone screams loudly.

For my selfish purposes (constructing DOM trees from SAX events), this doesn't 
affect me, so it's okay with me to wait until SAX 2.0.

The only thing I'd mention is that discussion of SAX 2.0 on XML-DEV appears to 
be going at a pretty deliberate pace (a good thing!), and so 2.0 might be a 
ways off.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From larsga@ifi.uio.no  Sun Jan 24 22:35:36 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 23:35:36 +0100
Subject: [XML-SIG] Re: xmlproc, SAX and EntityResolver
In-Reply-To: <199901242224.PAA00906@malatesta.local>
References: <199901242224.PAA00906@malatesta.local>
Message-ID: <wkhftgti2v.fsf@ifi.uio.no>

* uche ogbuji
| 
| Oh yeah.  I actually fixed this for xml-0.4, but it was so long ago
| that I forgot.  I recently discovered that I had the wrong sym-link,
| and I've been using xml-0.4 instead of xml-0.5, even after
| installing the latter.  So when I fixed the link and started using
| xml-0.5, my patch to report the external DTD subset wasn't there.
| Duh!

Ah, that explains it. 
 
| Unfortunately, on preliminary testing it doesn't appear to work.
| I'll work on an isolated test case and get back to you.

Arh! I'm getting confused by all the different versions here. Sorry,
you need this in xmlproc.py as well to have it call that method (just
replace the existing method with this, hopefully this does not depend
on other changes):

    def parse_doctype(self):
        "Parses the document type declaration."

        if self.seen_doctype:
            self.report_error(3032)
        if self.seen_root:
            self.report_error(3033)

        self.skip_ws(1)
        rootname=self._get_name()
        self.skip_ws(1)

        (pub_id,sys_id)=self.parse_external_id()
        self.skip_ws()

        if self.now_at("["):
            self.parse_internal_dtd()
        elif not self.now_at(">"):
            self.report_error(3005,">")

        # External subset must be parsed _after_ the internal one
        if pub_id!=None or sys_id!=None: # Was there an external id at all?
            sys_id=self.pubres.resolve_doctype_pubid(pub_id,sys_id)
            self.app.handle_doctype(rootname,pub_id,sys_id)

        self.dtd.prepare_for_parsing()
        self.seen_doctype=1 # Has to be at the end to avoid block trouble

 
| It looks as if 0.60 will be very helpful to me when it's released.

Hopefully it will be to a lot of people. And after I decided to delay
DDML support (the standard previously known as XSchema) and DTD
caching to 0.61, what remains is mainly upgrading the regression test,
writing documentation, running tests, etc.

Just the mechanics of getting out a new version.
 
| The only thing I'd mention is that discussion of SAX 2.0 on XML-DEV
| appears to be going at a pretty deliberate pace (a good thing!), and
| so 2.0 might be a ways off.

It may, yes. I guess it all depends on David Megginson and how much
time he can devote to this. If the discussion turns out to take too
long we'll just have to put out a SAX 1.1 in the meantime.

--Lars M.


From larsga@ifi.uio.no  Sun Jan 24 22:59:47 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 24 Jan 1999 23:59:47 +0100
Subject: [XML-SIG] XSA 1.0 specification released
Message-ID: <wkd844tgyk.fsf@ifi.uio.no>

XSA is an XML-based system that allows anyone who is interested to
automatically discover new versions of software products as they are
released by polling XML documents describing the products. It is
mainly intended to help software index maintainers keep their indexes
up to date.

I have now finalized the XSA 1.0 specification and XSA is thus ready
for use. 

<URL:http://birk105.studby.uio.no/www_work/xsa/>
<URL:http://birk105.studby.uio.no/www_work/xsa/xsaprop.txt>


The accompanying software is still being tested, but will be released
as soon as possible, probably in a week or so. I will announce it here
when it is ready.


What this means is that we now have an XML application for publishing
structured information on the web that is ready for use. I am using it
(via a cron job on my Linux machine) to keep track of new releases on
my XML tools list[1], and I'm confident that other software list
maintainers will start using the system as well once I release the
software.

So, to all you developers of XML software: please make yourself an XSA
document and publish it on the web. That way we can both keep the
software indexes updated and demonstrate that XML can actually be
used. The more people who do this, the more useful the system will be.

The XSA site contains both a wizard for making documents, an online
validator and a form for registering new XSA documents.

--Lars M.

[1] <URL:http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html>


From uche.ogbuji@fourthought.com  Sun Jan 24 23:48:41 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sun, 24 Jan 1999 16:48:41 -0700
Subject: [XML-SIG] Re: xmlproc, SAX and EntityResolver
In-Reply-To: Your message of "24 Jan 1999 23:35:36 +0100."
 <wkhftgti2v.fsf@ifi.uio.no>
Message-ID: <199901242348.QAA01110@malatesta.local>

This is a multipart MIME message.

--==_Exmh_-20107729380
Content-Type: text/plain; charset=us-ascii

> | Unfortunately, on preliminary testing it doesn't appear to work.
> | I'll work on an isolated test case and get back to you.
> 
> Arh! I'm getting confused by all the different versions here. Sorry,
> you need this in xmlproc.py as well to have it call that method (just
> replace the existing method with this, hopefully this does not depend
> on other changes):

Thanks, but maybe there is yet a dependency, because it still doesn't work.  
Here's a small test case.  The SAX app, xml file and DTD are attached.  I get 
results from the startElement (of course), but not from any of the other 
events.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


--==_Exmh_-20107729380
Content-Type: text/plain ; name="addr_book.dtd"; charset=us-ascii
Content-Description: addr_book.dtd
Content-Disposition: attachment; filename="addr_book.dtd"

<!ELEMENT ADDRBOOK (ENTRY*)>
<!ELEMENT ENTRY (NAME, ADDRESS, PHONENUM*, EMAIL)>
<!ATTLIST ENTRY
    ID ID #REQUIRED
>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT PHONENUM (#PCDATA)>
<!ATTLIST PHONENUM
    DESC CDATA #REQUIRED
>
<!ELEMENT EMAIL (#PCDATA)>

--==_Exmh_-20107729380
Content-Type: text/plain ; name="addr_book1.xml"; charset=us-ascii
Content-Description: addr_book1.xml
Content-Disposition: attachment; filename="addr_book1.xml"

<?xml version = "1.0"?>
<!DOCTYPE ADDRBOOK SYSTEM "addr_book.dtd" [
	<!ENTITY xxx SYSTEM "Mein Kampf">
]>
<ADDRBOOK>
	<ENTRY ID="pa">
		<NAME>Pieter Aaron</NAME>
		<ADDRESS>404 Error Way</ADDRESS>
		<PHONENUM DESC="Work">404-555-1234</PHONENUM>
		<PHONENUM DESC="Fax">404-555-4321</PHONENUM>
		<PHONENUM DESC="Pager">404-555-5555</PHONENUM>
		<EMAIL>pieter.aaron@inter.net</EMAIL>
	</ENTRY>
	<ENTRY ID="en">
		<NAME>Emeka Ndubuisi</NAME>
		<ADDRESS>42 Spam Blvd</ADDRESS>
		<PHONENUM DESC="Work">767-555-7676</PHONENUM>
		<PHONENUM DESC="Fax">767-555-7642</PHONENUM>
		<PHONENUM DESC="Pager">800-SKY-PAGEx767676</PHONENUM>
		<EMAIL>endubuisi@spamtron.com</EMAIL>
	</ENTRY>
	<ENTRY ID="vz">
		<NAME>Vasia Zhugenev</NAME>
		<ADDRESS>2000 Disaster Plaza</ADDRESS>
		<PHONENUM DESC="Work">000-987-6543</PHONENUM>
		<PHONENUM DESC="Cell">000-000-0000</PHONENUM>
		<EMAIL>vxz@magog.ru</EMAIL>
	</ENTRY>
</ADDRBOOK>

--==_Exmh_-20107729380
Content-Type: text/plain; name="test_doctype.py"; charset=us-ascii
Content-Description: test_doctype.py
Content-Disposition: attachment; filename="test_doctype.py"
Content-Transfer-Encoding: quoted-printable

import sys
from xml.sax import saxlib, saxexts, drivers

class test_doctype(saxlib.HandlerBase):
    def unparsedEntityDecl (self, publicId, systemId, notationName):
        print "unparsedEntityDecl", publicId, systemId, notationName

    def resolveEntity (self, name, publicId, systemId):
        print "entity", name, publicId, systemId

    def startElement(self, name, attribs):
        print "element", name, attribs

    def warning(self, exception):
        raise exception

    def error(self, exception):
        raise exception

    def fatalError(self, exception):
        raise exception


if __name__ == "__main__":
    parser = saxexts.XMLValParserFactory.make_parser()
    handler = test_doctype()
    parser.setDocumentHandler(handler)
    parser.setDTDHandler(handler)
    parser.setEntityResolver(handler)
    parser.setErrorHandler(handler)

    parser.parseFile(open("addr_book1.xml"))


--==_Exmh_-20107729380--


From uche.ogbuji@fourthought.com  Sun Jan 24 23:53:21 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sun, 24 Jan 1999 16:53:21 -0700
Subject: [XML-SIG] xmlproc and parameter entities in external DTD subsets
Message-ID: <199901242353.QAA01124@malatesta.local>

I know that xmlproc 0.52 doesn't support parameter entities in external DTD 
subsets within declarations yet, but is there a chance that they will be
supported in 0.6?  We are working with the xsl.dtd, and it requires _many_ 
parameter entities to avoid being of near-infinite length.

Thanks.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From larsga@ifi.uio.no  Sun Jan 24 23:53:20 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 25 Jan 1999 00:53:20 +0100
Subject: [XML-SIG] Re: xmlproc and parameter entities in external DTD subsets
In-Reply-To: <199901242353.QAA01124@malatesta.local>
References: <199901242353.QAA01124@malatesta.local>
Message-ID: <wkzp78rzwv.fsf@ifi.uio.no>

* uche ogbuji
|
| I know that xmlproc 0.52 doesn't support parameter entities in
| external DTD subsets within declarations yet, but is there a chance
| that they will be supported in 0.6?

They are already. :) I have implemented this, and it works. Better
testing remains, but, yes, I am confident that this will be in 0.6.

| We are working with the xsl.dtd, and it requires _many_ parameter
| entities to avoid being of near-infinite length.

Hmmm. Well, if you want to be a beta tester on xmlproc, just send me
an email and I'll put out a zip of my current version with the current
SAX driver. (That won't happen until tomorrow, though, since I'm going
to sleep now.)

--Lars M.


From fdrake@acm.org  Mon Jan 25 15:05:16 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 25 Jan 1999 10:05:16 -0500 (EST)
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <00b701be46c6$40ea7a10$f29b12c2@pythonware.com>
References: <00b701be46c6$40ea7a10$f29b12c2@pythonware.com>
Message-ID: <13996.34860.568582.467756@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > being ignored by the sgmllib and xmllib maintainers, so keeping
 > things in sync is pretty hard.

  Not ignored; I for one am simply swamped with some other concerns at 
the moment.  I plan to update sgmllib when I can, I just can't promise 
when that will be.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From wunder@infoseek.com  Mon Jan 25 18:17:09 1999
From: wunder@infoseek.com (Walter Underwood)
Date: Mon, 25 Jan 1999 10:17:09 -0800
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36A9EE42.78F166D5@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A8BFED.BE6C3EF6@appliedbiometrics.com>
 <36A8DF4E.2D3852D7@appliedbiometrics.com>
 <wksod22stn.fsf@ifi.uio.no>
Message-ID: <3.0.5.32.19990125101709.00ca7580@corp>

At 04:44 PM 1/23/99 +0100, Christian Tismer wrote:
>What I need to find is the fastest acceptable parser which allows
>me to turn masses of XML data into Python structures. [...] we are 
>processing XML encoded database records which are quite irregular 
>(useless to use a relational database) and quite simple, but the 
>standard size is some 50MB. This is why I'm after speed, much more than
>conformance.

I'm using pyexpat for the XML support in our search engine.
At this point in development, I'm collecting text and associating
it with *every* enclosing element. So this is worst-case for
parsing time.
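
A minimal sketch of that kind of collection, using the
xml.parsers.expat module from today's standard library (the successor
of pyexpat). The collect_text function and its choice to key results by
element name are assumptions made for illustration, not the search
engine's actual code.

```python
import xml.parsers.expat


def collect_text(data):
    """Map each element name to all text found anywhere inside it."""
    text_by_element = {}
    open_elements = []  # stack of names of the currently open elements

    def start(name, attrs):
        open_elements.append(name)
        text_by_element.setdefault(name, [])

    def end(name):
        open_elements.pop()

    def chars(text):
        # Worst case: every enclosing element receives the text.
        for name in open_elements:
            text_by_element[name].append(text)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(data, True)
    return {name: "".join(parts) for name, parts in text_by_element.items()}


if __name__ == "__main__":
    sample = "<book><chapter><verse>a</verse><verse>b</verse></chapter></book>"
    print(collect_text(sample))
```

The inner loop over open_elements is what makes the cost grow with
nesting depth; recording text only for the innermost element would drop
that loop.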

Parsing Jon Bosak's tagged "Old Testament" (3.4 megabytes) takes
30 seconds. That document is pretty heavily tagged, with an element
for each verse, each chapter, each book, the body, etc.

Collecting less information would probably be faster.

If you need a lot more speed than this (integer factors faster) 
you might need to do all the parsing in C. Remember that there
is a difference between a parser that implements all of XML and
a parser that extracts the data you need from your XML documents.
If you can trust the documents to be legal (perhaps they are 
checked when generated), then a hard-coded parser may be the
answer.

wunder


Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://www.best.com/~wunder/
1-408-543-6946


From tismer@appliedbiometrics.com  Mon Jan 25 20:50:30 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 25 Jan 1999 21:50:30 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com>
 <36A8BFED.BE6C3EF6@appliedbiometrics.com>
 <36A8DF4E.2D3852D7@appliedbiometrics.com>
 <wksod22stn.fsf@ifi.uio.no> <3.0.5.32.19990125101709.00ca7580@corp>
Message-ID: <36ACD916.2F97E1FF@appliedbiometrics.com>

Walter Underwood wrote:
> 
> At 04:44 PM 1/23/99 +0100, Christian Tismer wrote:
> >What I need to find is the fastest acceptable parser which allows
> >me to turn masses of XML data into Python structures. [...] we are
> >processing XML encoded database records which are quite irregular
> >(useless to use a relational database) and quite simple, but the
> >standard size is some 50MB. This is why I'm after speed, much more than
> >conformance.
> 
> I'm using pyexpat for the XML support in our search engine.
> At this point in development, I'm collecting text and associating
> it with *every* enclosing element. So this is worst-case for
> parsing time.
> 
> Parsing Jon Bosak's tagged "Old Testament" (3.4 megabytes) takes
> 30 seconds. That document is pretty heavily tagged, with an element
> for each verse, each chapter, each book, the body, etc.
> 
> Collecting less information would probably be faster.

Interesting. I tested my Indenter with this file
(what a nice example).
It takes 11.75 seconds to indent this through SAX, using sgmlop.
With xmlproc, it takes 30.87 seconds.
Running the whole text through sgmlop without any
associated events ran in below one second.

> If you need a lot more speed than this (integer factors faster)
> you might need to do all the parsing in C. Remember that there
> is a difference between a parser that implements all of XML and
> a parser that extracts the data you need from your XML documents.
> If you can trust the documents to be legal (perhaps they are
> checked when generated), then a hard-coded parser may be the
> answer.

Well, both are true. I want to validate small amounts of newly
added data "records" which are in XML format, but then
kept in a special repository, and I want to be able to
re-import large amounts of XML which were exported by my
app before. This means, I need a validating parser of
acceptable speed, where I think xmlproc is very good?
And I need something that simply eats large amounts
of approved data.
But I won't go so far as to code this all in C since these imports
will not be so frequent. I would even prefer to do it all
in Python if possible.

There are also cases where even sgmlop does much more
than I need. There are applications where I just want to
know where the tags start and end, and I don't want
substitutions, no parsing and reordering of parameters,
just to be able to juggle with unmodified pieces of XML.

Therefore I proposed an XML scanner which just provides
the tools to build up what you actually need. Maybe I
overlooked it and we have that already somewhere.
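
A rough sketch of what such a scanner could look like in pure Python.
It assumes, as a deliberate simplification, that a tag is whatever lies
between "<" and the next ">", so comments, CDATA sections and ">"
inside attribute values are not handled. The scan function is a
hypothetical name, not an existing tool.

```python
import re

# Simplification: a tag is anything from "<" up to the next ">".
TAG = re.compile(r"<[^>]*>")


def scan(text):
    """Yield (kind, start, end); text[start:end] is the unmodified piece."""
    pos = 0
    for match in TAG.finditer(text):
        if match.start() > pos:
            yield ("text", pos, match.start())
        yield ("tag", match.start(), match.end())
        pos = match.end()
    if pos < len(text):
        yield ("text", pos, len(text))


if __name__ == "__main__":
    sample = '<a x="1">hi &amp; bye</a>'
    for kind, start, end in scan(sample):
        print(kind, repr(sample[start:end]))
```

Because only offsets are returned, no entity substitution or attribute
parsing takes place, and the pieces can be rejoined into the original
text unchanged.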

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From larsga@ifi.uio.no  Mon Jan 25 21:18:29 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 25 Jan 1999 22:18:29 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
In-Reply-To: <36ACD916.2F97E1FF@appliedbiometrics.com>
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 	 <36A8BFED.BE6C3EF6@appliedbiometrics.com> 	 <36A8DF4E.2D3852D7@appliedbiometrics.com> 	 <wksod22stn.fsf@ifi.uio.no> <3.0.5.32.19990125101709.00ca7580@corp> <36ACD916.2F97E1FF@appliedbiometrics.com>
Message-ID: <wkognnuk4a.fsf@ifi.uio.no>

* Christian Tismer
| 
| [About ot.xml]
|
| Interesting. I tested my Indenter with this file (what a nice
| example).

A rather misleading one, I'm afraid, since it doesn't use entities,
comments, PIs, marked sections or attributes, only elements and
PCDATA.

| It takes 11.75 seconds to indent this through SAX, using sgmlop.
| With xmlproc, it takes 30.87 seconds.  

Interesting. (And pleasing. :)

| Running the whole text through sgmlop without any associated events
| ran in below one second.
 
It's worth noting that this is just the time for the raw parse. As far
as I know, sgmlop will not call handlers if there aren't any and so
this entire second will be spent in C source.

| I want to validate small amounts of newly added data "records" which
| are in XML format, but then kept in a special repository, and I want
| to be able to re-import large amounts of XML which were exported by
| my app before. This means, I need a validating parser of acceptable
| speed, where I think xmlproc is very good? 

I think the Java parsers are probably faster, but xmlproc should be
acceptable, yes. 

When I release 0.60 the DTD parser and DTD objects are separated from
the XML parser. This means that provided you can get the external and
internal DTD subsets from expat it's possible to build an expat-based
validator using the xmlproc sources. This will require a bit of work,
though.

With DTD caching (scheduled for 0.61 in my current plans) you won't
have to keep reparsing the DTD for each document either, thus saving
even more speed. (Parse times for large DTDs such as TEI-XML take
substantial amounts of time.)
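
The general idea of such caching can be sketched as plain memoization
keyed on the system identifier. This is only an illustration of the
concept; make_cached_loader and the parse_dtd callable are hypothetical
and say nothing about how xmlproc will actually implement it.

```python
def make_cached_loader(parse_dtd):
    """Wrap parse_dtd(system_id) so each DTD is parsed at most once."""
    cache = {}

    def load(system_id):
        if system_id not in cache:
            cache[system_id] = parse_dtd(system_id)
        return cache[system_id]

    return load


if __name__ == "__main__":
    calls = []

    def fake_parse(system_id):
        # Stand-in for an expensive DTD parse.
        calls.append(system_id)
        return {"system_id": system_id}

    load = make_cached_loader(fake_parse)
    for _ in range(3):
        load("tei.dtd")
    print(len(calls))
```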

--Lars M.


From tismer@appliedbiometrics.com  Mon Jan 25 21:58:04 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Mon, 25 Jan 1999 22:58:04 +0100
Subject: [XML-SIG] SAX prettyprinter V2 and SGMLOP
References: <199901210345.WAA29899@207-172-49-200.s200.tnt14.ann.erols.com> 	 <36A8BFED.BE6C3EF6@appliedbiometrics.com> 	 <36A8DF4E.2D3852D7@appliedbiometrics.com> 	 <wksod22stn.fsf@ifi.uio.no> <3.0.5.32.19990125101709.00ca7580@corp> <36ACD916.2F97E1FF@appliedbiometrics.com> <wkognnuk4a.fsf@ifi.uio.no>
Message-ID: <36ACE8EC.6130BD08@appliedbiometrics.com>

Lars Marius Garshol wrote:
> 
> * Christian Tismer
> |
> | [About ot.xml]
> |
> | Interesting. I tested my Indenter with this file (what a nice
> | example).
> 
> A rather misleading one, I'm afraid, since it doesn't use entities,
> comments, PIs, marked sections or attributes, only elements and
> PCDATA.

Right, very simple.

> | It takes 11.75 seconds to indent this through SAX, using sgmlop.
> | With xmlproc, it takes 30.87 seconds.
> 
> Interesting. (And pleasing. :)

And then I wrote a simple plain vanilla indenter in
pure Python which does the same in 5 seconds.
Just splitting away, finding tags correctly, counting
levels, and doing nothing else at all.

I think this will not become much faster by using sgmlop,
so the test which you mentioned a while ago is obsolete.
5 seconds is the need for indentation, the rest is
gymnastics which is useless in this case.
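
For readers who want to try the idea, here is a small sketch of such a
"split, find tags, count levels" indenter. It is not the indenter
timed above; the indent function is an assumed name, and the tag
pattern is naive (no comments, CDATA sections, or ">" inside attribute
values).

```python
import re

# Simplification: a tag is anything from "<" up to the next ">".
TAG = re.compile(r"<[^>]*>")


def indent(text, step="  "):
    """Put every tag and text run on its own line, indented by depth."""
    lines = []
    level = 0
    pos = 0
    for match in TAG.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            lines.append(step * level + chunk)
        tag = match.group()
        pos = match.end()
        if tag.startswith("</"):
            # End tag: step back out before printing it.
            level -= 1
            lines.append(step * level + tag)
        elif tag.endswith("/>") or tag.startswith(("<?", "<!")):
            # Empty element, PI or declaration: depth is unchanged.
            lines.append(step * level + tag)
        else:
            # Start tag: print it, then go one level deeper.
            lines.append(step * level + tag)
            level += 1
    tail = text[pos:].strip()
    if tail:
        lines.append(step * level + tail)
    return "\n".join(lines)


if __name__ == "__main__":
    print(indent("<a><b>hi</b><c/></a>"))
```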

> | Running the whole text through sgmlop without any associated events
> | ran in below one second.
> 
> It's worth noting that this is just the time for the raw parse. As far
> as I know, sgmlop will not call handlers if there aren't any and so
> this entire second will be spent in C source.

Right, this is the "naked" time.

> | I want to validate small amounts of newly added data "records" which
> | are in XML format, but then kept in a special repository, and I want
> | to be able to re-import large amounts of XML which were exported by
> | my app before. This means, I need a validating parser of acceptable
> | speed, where I think xmlproc is very good?
> 
> I think the Java parsers are probably faster, but xmlproc should be
> acceptable, yes.
> 
> When I release 0.60 the DTD parser and DTD objects are separated from
> the XML parser. This means that provided you can get the external and
> internal DTD subsets from expat it's possible to build an expat-based
> validator using the xmlproc sources. This will require a bit of work,
> though.
> 
> With DTD caching (scheduled for 0.61 in my current plans) you won't
> have to keep reparsing the DTD for each document either, thus saving
> even more speed. (Parse times for large DTDs such as TEI-XML take
> substantial amounts of time.)

I'm happy to hear this.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From jae@kavi.com  Tue Jan 26 02:12:40 1999
From: jae@kavi.com (John Eikenberry)
Date: Mon, 25 Jan 1999 18:12:40 -0800 (PST)
Subject: [XML-SIG] Bug or Delusion
Message-ID: <Pine.LNX.3.96.990125174100.13717G-100000@taos.kavi.com>

Hello,

I'm in the process of writing my first DTD, and am having a bit of a
problem. I'm attempting to create valueless attributes (like <DL compact>
in html). Now my XML book has this statement:

 For an XML document to be valid, whenever an element type with an
 #IMPLIED attribute appears and does not have a value, the XML processor
 must report the missing value and continue processing.

In addition, in the ibtwsh.dtd (Itsy Bitsy Teeny Weeny Simple Hypertext
DTD), they have the 'compact' attribute defined like this:

 <!ENTITY % compact "compact (compact) #IMPLIED">

 <!-- Definition list -->
 <!ELEMENT DL (DT|DD)+>
 <!ATTLIST DL
        %compact;
        %basic;>

When I try something like this in my DTD...

<!ATTLIST setup
	c	(c)	#IMPLIED
	...>

and run xvcmd over a test XML document, I get these errors:

xmysql.xml:4:10: Document root element 'package' does not match declared
root element
xmysql.xml:40:9: '=' expected
xmysql.xml:40:11: One of '>' or '/>' expected
Parse complete, 3 error(s) and 0 warning(s)

The first error I've been getting, and just haven't gotten around to
tracking it down (the package element seems fine to me... but I don't 
think this is relevant to the problem at hand).

Are these errors the system's way of reporting the missing value (as the
paragraph from my book states)? I thought that errors were fatal, and
things to be avoided. I was expecting maybe a warning.

BTW, Here's line 40:

<setup c/>


This seems to either be a mistake in xmlproc, or I'm not understanding
this very well (probably the latter). If this is a mistake on my part, I'd
appreciate any tips/advice. 

Thanks,

---

John Eikenberry
[jae@taos.kavi.com - http://taos.kavi.com/~jae/] 
______________________________________________________________
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
                                         --B. Franklin


From hiren@infoseek.com  Tue Jan 26 04:42:59 1999
From: hiren@infoseek.com (Hirendra Hindocha)
Date: Mon, 25 Jan 1999 20:42:59 -0800 (PST)
Subject: [XML-SIG] ampersand in name how to parse
Message-ID: <Pine.GSO.3.96.990125203719.28261h-100000@ahab>

Hi,

I've just started working with the xml package and I was 
trying to parse a document which looks like this - 

<node name="root" id="">
 <node name="test & 2" id="1234">
 </node>
 </node>

The & in the name above seems to cause an exception.

I have the following code fragment - 
class TaxonomyHandler(saxlib.DocumentHandler):
    def startElement(self, name, attrs):
        nodename = attrs['name']
        id = attrs['id']
        print nodename,id

If I inherit from the BaseHandler, the second node is silently
ignored. When I use the DocumentHandler as above, an exception is
generated.  If I drop the "&" then everything works. 

What do I need to do to be able to accept the & in the name?

Any help is appreciated,
Thanks,
Hiren
--------------------------------------------------------
USER ERROR: replace user and press any key to continue. 


From larsga@ifi.uio.no  Tue Jan 26 07:30:11 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 26 Jan 1999 08:30:11 +0100
Subject: [XML-SIG] ampersand in name how to parse
In-Reply-To: <Pine.GSO.3.96.990125203719.28261h-100000@ahab>
References: <Pine.GSO.3.96.990125203719.28261h-100000@ahab>
Message-ID: <wk3e4y8pa4.fsf@ifi.uio.no>

* Hirendra Hindocha
| 
| <node name="root" id="">
|  <node name="test & 2" id="1234">
|  </node>
|  </node>
| 
| the & in the name above seems to cause an exception .

As it should, since the document above is not well-formed. (XML is
much stricter than HTML.)
 
| What do I need to do to be able to accept the & in the name ? 

Write it as &amp; instead. :)
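
When the document is generated by a program, the escaping can be left
to a library. The sketch below uses quoteattr from xml.sax.saxutils and
ElementTree from today's Python standard library, neither of which is
the 1999 xml package discussed in this thread; is_well_formed is a
helper name invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def is_well_formed(text):
    """Return True if `text` parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The bare ampersand from the question, and the escaped form.
raw = '<node name="test & 2" id="1234"></node>'
escaped = '<node name=%s id="1234"></node>' % quoteattr("test & 2")

if __name__ == "__main__":
    print(is_well_formed(raw))
    print(escaped)
    print(ET.fromstring(escaped).get("name"))
```

The parser turns &amp; back into a plain & before the application sees
the attribute value, so the handler code does not need to change.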

--Lars M.


From larsga@ifi.uio.no  Tue Jan 26 07:36:48 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 26 Jan 1999 08:36:48 +0100
Subject: [XML-SIG] Bug or Delusion
In-Reply-To: <Pine.LNX.3.96.990125174100.13717G-100000@taos.kavi.com>
References: <Pine.LNX.3.96.990125174100.13717G-100000@taos.kavi.com>
Message-ID: <wk1zki8oz3.fsf@ifi.uio.no>

* John Eikenberry
| 
|  For an XML document to be valid, whenever an element type with an
|  #IMPLIED attribute appears and does not have a value, the XML processor
|  must report the missing value and continue processing.

This quote is rather misleading. What it's trying to say (or should be
trying to say) is that #IMPLIED attributes are optional. Not the
value, but the whole attribute.
 
| In addition, in the ibtwsh.dtd (Itsy Bitsy Teeny Weeny Simple
| Hypertext DTD), they have the 'compact' attribute defined like this:
| 
|  <!ENTITY % compact "compact (compact) #IMPLIED">
| 
|  <!-- Definition list -->
|  <!ELEMENT DL (DT|DD)+>
|  <!ATTLIST DL
|         %compact;

What this means is that DLs can look like:

<DL compact="compact">

or

<DL>

| And run the xvcmd over a test xml document. I get these errors:
| 
| xmysql.xml:4:10: Document root element 'package' does not match declared
| root element

This means that you have 

<!DOCTYPE something-other-than-package 

in your document.

| xmysql.xml:40:9: '=' expected
| xmysql.xml:40:11: One of '>' or '/>' expected
| Parse complete, 3 error(s) and 0 warning(s)
| 
| Are these errors the system's way of reporting the missing value (as the
| paragraph from my book states)? I thought that errors were fatal, and
| things to be avoided.

Not really, this is the system's way of reporting that your document is
not well-formed. All XML attributes _must_ have a value if they are
present in the start tag.

The XML grammar shows this clearly:

<URL:http://www.w3.org/TR/REC-xml#NT-Attribute>

| This seems to either be a mistake in xmlproc, or I'm not
| understanding this very well (probably the latter). If this is a
| mistake on my part, I'd appreciate any tips/advice.

Set

<setup c="c"/>

instead and it will work.
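
The difference is easy to check with any XML parser, since this is a
well-formedness rule rather than a validity rule. A small sketch using
xml.parsers.expat from today's Python standard library (not xmlproc);
first_error is a helper name made up for this example.

```python
import xml.parsers.expat


def first_error(document):
    """Return (line, column) of the first well-formedness error, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as err:
        return (err.lineno, err.offset)
    return None


if __name__ == "__main__":
    for doc in ('<setup c/>', '<setup c="c"/>'):
        print(doc, first_error(doc))
```

No DTD is involved here: a non-validating parser rejects the valueless
form as well.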

--Lars M.


From jae@kavi.com  Tue Jan 26 08:10:08 1999
From: jae@kavi.com (John Eikenberry)
Date: Tue, 26 Jan 1999 00:10:08 -0800 (PST)
Subject: [XML-SIG] Bug or Delusion
In-Reply-To: <wk1zki8oz3.fsf@ifi.uio.no>
Message-ID: <Pine.LNX.3.96.990126000555.15322B-100000@taos.kavi.com>

On 26 Jan 1999, Lars Marius Garshol wrote:

> | And run the xvcmd over a test xml document. I get these errors:
> | 
> | xmysql.xml:4:10: Document root element 'package' does not match declared
> | root element
> 
> This means that you have 
> 
> <!DOCTYPE something-other-than-package 
> 
> in your document.

Cool. I thought you just needed this in the dtd. I'd been spending all my
time trying to figure out the other problem. I hopefully would have
figured this out after looking at the top of an xbel document. :)

> Not really, this is the system's way of reporting that your document is
> not well-formed. All XML attributes _must_ have a value if they are
> present in the start tag.
> 
> The XML grammar shows this clearly:
> 
> <URL:http://www.w3.org/TR/REC-xml#NT-Attribute>

Thanks for the clarification, Lars. I guess I just assumed that you could
reproduce HTML in XML, and therefore (I assumed) there had to be a way to
have a valueless attribute.


Thanks again,

---

John Eikenberry
[jae@taos.kavi.com - http://taos.kavi.com/~jae/] 
______________________________________________________________
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
                                         --B. Franklin


From fredrik@pythonware.com  Tue Jan 26 10:25:45 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Jan 1999 11:25:45 +0100
Subject: [XML-SIG] Python 1.5.2b1's xmllib.py Considered Harmful
Message-ID: <01f901be4916$3d3fc490$f29b12c2@pythonware.com>

xmllib.py currently got a completely new interface in 1.5.2b1.
The new interface silently breaks all existing implementations
(it no longer calls start and end handlers), something that has
caused us a LOT of trouble lately.  For example, our highly
successful xmlrpclib.py implementation doesn't work at all
under 1.5.2b1.

I hereby propose that the old implementation of xmllib.py
should be put back in Python 1.5.2 final, and that the new
incompatible version is shipped under a new name (e.g.
xmllib2).  I don't mind if the old version is deprecated,
just don't remove it from Python before 2.0.

Regards /F
fredrik@pythonware.com
http://www.pythonware.com


From fredrik@pythonware.com  Tue Jan 26 11:45:32 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 26 Jan 1999 12:45:32 +0100
Subject: [XML-SIG] Python 1.5.2b1's xmllib.py Considered Harmful
Message-ID: <028401be4921$625727e0$f29b12c2@pythonware.com>

>xmllib.py currently got a completely new interface in 1.5.2b1.

duh. s/current/sudden/g

    /F


From gherman@darwin.in-berlin.de  Wed Jan 27 17:48:00 1999
From: gherman@darwin.in-berlin.de (Dinu C. Gherman)
Date: Wed, 27 Jan 1999 18:48:00 +0100
Subject: [XML-SIG] XML package as RPM anywhere?
Message-ID: <36AF5150.8E7A178D@darwin.in-berlin.de>

Are the various versions of the XML add-ons distributed also
in the popular RPM format? If so, where can they be found?
It seems Oliver Andrich does not provide them.

Thanks,

Dinu

-- 
Dinu C. Gherman       :  Mit Berlin kannste mir jagen!
................................................................
LHS International AG  :  http://www.lhsgroup.com
8050 Zurich           :  http://www.zurich.ch
Switzerland           :  http://pgp.ai.mit.edu 
                      :  mobile://49.172.3060751


From tismer@appliedbiometrics.com  Thu Jan 28 08:29:43 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Thu, 28 Jan 1999 09:29:43 +0100
Subject: [XML-SIG] XML package as RPM anywhere?
References: <36AF5150.8E7A178D@darwin.in-berlin.de>
Message-ID: <36B01FF7.7E016A85@appliedbiometrics.com>

Dinu C. Gherman wrote:
> 
> Are the various versions of the XML add-ons distributed also
> in the popular RPM format? If so, where can they be found?
> It seems Oliver Andrich does not provide them.

I don't think that this makes sense yet. The XML SIG has
made great progress but is still evolving quickly. The current
snapshot releases are very easy to install, since you
just need to unpack the archive into a dir which is in the
Python path, and you can import the modules instantly.

If you want to follow the latest releases, I'd recommend
using CVS. An RPM seems to be a little early.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From olli@rhein-zeitung.de  Thu Jan 28 09:05:03 1999
From: olli@rhein-zeitung.de (Oliver Andrich)
Date: Thu, 28 Jan 1999 10:05:03 +0100
Subject: [XML-SIG] XML package as RPM anywhere?
In-Reply-To: <36B01FF7.7E016A85@appliedbiometrics.com>; from Christian Tismer on Thu, Jan 28, 1999 at 09:29:43AM +0100
References: <36AF5150.8E7A178D@darwin.in-berlin.de> <36B01FF7.7E016A85@appliedbiometrics.com>
Message-ID: <19990128100502.C2267@rwpc.rhein-zeitung.de>

Hi,

I use the xml stuff a lot at work myself, but because it is changing
so fast, I have kept the xml 0.5 packages to myself. If someone needs them, I
can send them to him/her. I think chris is right, but if you think that
xml 0.5 should be released precompiled, then this is no problem.

Bye, Oliver

On Thu, Jan 28, 1999 at 09:29:43AM +0100, Christian Tismer wrote:
> Dinu C. Gherman wrote:
> > 
> > Are the various versions of the XML add-ons distributed also
> > in the popular RPM format? If so, where can they be found?
> > It seems Oliver Andrich does not provide them.
> 
> I don't think that this makes sense yet. The XML SIG has
> made great progress but is still evolving quickly. The current
> snapshot releases are very easy to install, since you
> just need to unpack the archive into a dir which is in the
> Python path, and you can import the modules instantly.
> 
> If you want to follow the latest releases, I'd recommend
> using CVS. An RPM seems to be a little early.
> 
> ciao - chris

-- 
Oliver Andrich, RZ-Online, Schlossstrasse Str. 42, D-56068 Koblenz
Telefon: 0261-3921027 / Fax: 0261-3921033 / Web: http://rhein-zeitung.de 
Private Homepage: http://andrich.net/


From akuchlin@cnri.reston.va.us  Fri Jan 29 14:43:52 1999
From: akuchlin@cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 29 Jan 1999 09:43:52 -0500 (EST)
Subject: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
Message-ID: <14001.50784.335761.791226@amarok.cnri.reston.va.us>

Paul Everitt writes:
>
>Chris wrote:
>> 	<?ztml #var arg ?>
>
>Ahh, tis nuthin better than seeing a patch accompany a proposal :^)
>
>Here's my main beef with this.  The ostensible goal of the XML syntax is
>to make it parse-able by new tools.  Unfortunately, a valid use of the
>current syntax:
>
>  <font size="<!--#var font_size-->">
>
>which is legal, would become:
>
>  <font size="<?ztml #var arg ?>">
>
>which *not* valid XML...is it?  That is, can you have markup inside
>markup?

	I don't believe so, but have CC'ed this to the XML-SIG where
the real experts hang out.  PIs have to be outside other markup; I
suspect the XML way of handling your second case would be to define an
entity:

  <font size="&arg;">

This is unfortunate for the application of HTML templating, because it
collides with the use of entities in HTML.  It also makes things
difficult because the entity would have to be declared at the
beginning of the file in the DOCTYPE declaration.  Making the
templating identical to XML, while keeping it conveniently
human-editable, may not be possible.
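
A small sketch of the entity approach, assuming Python's standard
xml.etree.ElementTree module and an entity declared in the internal subset
of the DOCTYPE; the replacement value "6" is only an illustration.

```python
import xml.etree.ElementTree as ET

# The entity "arg" is declared in the internal DTD subset and then
# referenced inside the attribute value.
document = """<!DOCTYPE font [
  <!ENTITY arg "6">
]>
<font size="&arg;">some text</font>"""

font = ET.fromstring(document)
print(font.get("size"))
```

The parser expands the reference itself, so the application only ever sees
the replacement text as the attribute value.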

-- 
A.M. Kuchling			http://starship.skyport.net/crew/amk/
    "You? What are you?"
    "Me? Lady, I'm your worst nightmare -- a pumpkin with a gun."
    -- The Furies and Mervyn, in SANDMAN #66: "The Kindly Ones:10"


From fdrake@acm.org  Fri Jan 29 15:07:40 1999
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 29 Jan 1999 10:07:40 -0500 (EST)
Subject: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <14001.50784.335761.791226@amarok.cnri.reston.va.us>
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
 <14001.50784.335761.791226@amarok.cnri.reston.va.us>
Message-ID: <14001.52924.23614.735389@weyr.cnri.reston.va.us>

Paul Everitt writes:
 >  <font size="<!--#var font_size-->">
 >
 >which is legal, would become:

  This is legal:  The "<!--#var font_size-->" is the CDATA value of
the size attribute, not a comment.

 >  <font size="<?ztml #var arg ?>">
 >
 >which *not* valid XML...is it?  That is, can you have markup inside

  The "<?ztml #var arg ?>" is a perfectly valid string value of the
size attribute, just as before.

 > 	I don't believe so, but have CC'ed this to the XML-SIG where
 > the real experts hang out.  PIs have to be outside other markup; I
 > suspect the XML way of handling your second case would be to define an
 > entity:
 > 
 >   <font size="&arg;">

  In neither SGML nor XML can markup be nested like this.  The use of
entities is the proper way to do this in either case.  Perhaps a
processing tool needs to be available which can perform "entity
expansion" for specified entity names only?


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From petrilli@amber.org  Fri Jan 29 15:26:52 1999
From: petrilli@amber.org (Christopher G. Petrilli)
Date: Fri, 29 Jan 1999 10:26:52 -0500
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <14001.52924.23614.735389@weyr.cnri.reston.va.us>; from Fred L. Drake on Fri, Jan 29, 1999 at 10:07:40AM -0500
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com> <14001.50784.335761.791226@amarok.cnri.reston.va.us> <14001.52924.23614.735389@weyr.cnri.reston.va.us>
Message-ID: <19990129102652.06893@amber.org>

On Fri, Jan 29, 1999 at 10:07:40AM -0500, Fred L. Drake wrote:
> 
> Paul Everitt writes:
>  >  <font size="<!--#var font_size-->">
>  >
>  >which is legal, would become:
> 
>   This is legal:  The "<!--#var font_size-->" is the CDATA value of
> the size attribute, not a comment.

Right, this is the current scheme (note that this is one use of the DTML
command set that is embedded in an HTML tag; a lot aren't).  And this is
also how I read the standard.

>  >  <font size="<?ztml #var arg ?>">
>  >
>  >which *not* valid XML...is it?  That is, can you have markup inside
> 
>   The "<?ztml #var arg ?>" is a perfectly valid string value of the
> size attribute, just as before.

Wouldn't the DTD restrict the use of < inside?  I thought the spec
required that, except inside a couple of things like PIs, the <
and & characters must be escaped?

>  >   <font size="&arg;">
> 
>   In neither SGML nor XML can markup be nested like this.  The use of
> entities is the proper way to do this in either case.  Perhaps a
> processing tool needs to be available which can perform "entity
> expansion" for specified entity names only?

I'm confused by what you mean here, being a newbie to XMLish things.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From fdrake@acm.org  Fri Jan 29 15:40:22 1999
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 29 Jan 1999 10:40:22 -0500 (EST)
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <19990129102652.06893@amber.org>
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
 <14001.50784.335761.791226@amarok.cnri.reston.va.us>
 <14001.52924.23614.735389@weyr.cnri.reston.va.us>
 <19990129102652.06893@amber.org>
Message-ID: <14001.54886.704665.224130@weyr.cnri.reston.va.us>

Christopher G. Petrilli writes:
 > Wouldn't the DTD restrict the use of < inside?  I thought the spec
 > required that except inside a couple things ... like PIs... that the <
 > and & characters must be escaped?

  Hmm... not the DTD, but you got me:  the XML spec may well restrict
the use of < and & in quoted attribute values.  While avoiding some of 
the delimiter-in-context rules from SGML for the benefit of parser
implementors, we end up with some ugly markup.  ;-(

 > >  >   <font size="&arg;">
 > > 
 > >   In neither SGML nor XML can markup be nested like this.  The use of
 > > entities is the proper way to do this in either case.  Perhaps a
 > > processing tool needs to be available which can perform "entity
 > > expansion" for specified entity names only?
 > 
 > I'm confused by what you mean here, being a newbie to XMLish things.

  I meant that '<foo bar="<!-- ... -->">' (SGML this time!) did not
contain nested markup.  (Same for the PI in an attribute value.)
  '<foo bar="&arg;">' does contain nested markup, but not nested
structure.
  My thought was that a tool could be written which would convert:

	<!DOCTYPE thing PUBLIC "..." [
		<!ENTITY frob CDATA "replacement text">
	]>
	<thing>
	  &frob;
	  &amp;
	</thing>

into this:

	<!DOCTYPE thing PUBLIC "...">
	<thing>
	  replacement text
	  &amp;
	</thing>

  Such a tool could perform expansion on either all the entities
defined in the internal subset (the stuff in [ ... ] in the DOCTYPE
declaration), or allow the user to specify a list of names (and
possibly values) from another source.
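
A very rough sketch of such a tool in Python, under loud assumptions: it
works on plain text with a regular expression, takes the names and values
from a dictionary supplied by the caller, and does not read the DOCTYPE or
skip CDATA sections and comments. The function name expand_selected is
invented for this example.

```python
import re

# Matches general entity references such as &frob; or &amp;
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expand_selected(text, replacements):
    """Expand only the entity names listed in `replacements`.

    References to any other entity (for example &amp;) are left alone.
    """
    def substitute(match):
        name = match.group(1)
        return replacements.get(name, match.group(0))

    return ENTITY_REF.sub(substitute, text)


source = "<thing>\n  &frob;\n  &amp;\n</thing>"
print(expand_selected(source, {"frob": "replacement text"}))
```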


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From petrilli@amber.org  Fri Jan 29 15:43:23 1999
From: petrilli@amber.org (Christopher G. Petrilli)
Date: Fri, 29 Jan 1999 10:43:23 -0500
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <14001.54886.704665.224130@weyr.cnri.reston.va.us>; from Fred L. Drake on Fri, Jan 29, 1999 at 10:40:22AM -0500
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com> <14001.50784.335761.791226@amarok.cnri.reston.va.us> <14001.52924.23614.735389@weyr.cnri.reston.va.us> <19990129102652.06893@amber.org> <14001.54886.704665.224130@weyr.cnri.reston.va.us>
Message-ID: <19990129104323.11790@amber.org>

On Fri, Jan 29, 1999 at 10:40:22AM -0500, Fred L. Drake wrote:
> 
>   I meant that '<foo bar="<!-- ... -->">' (SGML this time!) did not
> contain nested markup.  (Same for the PI in an attribute value.)

So my implementation of the <?ztml ... ?> is acceptable under XML
guidelines?  That was how I interpreted it, but gods only know!

>   '<foo bar="&arg;">' does contain nested markup, but not nested
> structure.
>   My thought was that a tool could be written which would convert:

My head just exploded Thank you Very Much :-)

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From fdrake@acm.org  Fri Jan 29 15:49:06 1999
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 29 Jan 1999 10:49:06 -0500 (EST)
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <19990129104323.11790@amber.org>
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
 <14001.50784.335761.791226@amarok.cnri.reston.va.us>
 <14001.52924.23614.735389@weyr.cnri.reston.va.us>
 <19990129102652.06893@amber.org>
 <14001.54886.704665.224130@weyr.cnri.reston.va.us>
 <19990129104323.11790@amber.org>
Message-ID: <14001.55410.760735.96752@weyr.cnri.reston.va.us>

Christopher G. Petrilli writes:
 > So my implementation of the <?ztml ... ?> is acceptable under XML
 > guidelines?  That was how I interpreted it, but gods only know!

  (I didn't catch any of the discussion before Andrew CC'd the
XML-SIG, so I think I'm missing some of the context here.)
  What you probably want to do is to write an example that uses the PI
syntax in all situations that you intend to support (including in
attribute values if you want that), and pass it through a validating
parser.  If it complains, you'll know what's broken.  If it doesn't,
then go ahead and use it.  I'd rather see PI syntax used over comment
syntax for either SGML or XML for this sort of processing.

 > My head just exploded Thank you Very Much :-)

  You're welcome.  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From petrilli@amber.org  Fri Jan 29 15:53:40 1999
From: petrilli@amber.org (Christopher G. Petrilli)
Date: Fri, 29 Jan 1999 10:53:40 -0500
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <14001.55410.760735.96752@weyr.cnri.reston.va.us>; from Fred L. Drake on Fri, Jan 29, 1999 at 10:49:06AM -0500
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com> <14001.50784.335761.791226@amarok.cnri.reston.va.us> <14001.52924.23614.735389@weyr.cnri.reston.va.us> <19990129102652.06893@amber.org> <14001.54886.704665.224130@weyr.cnri.reston.va.us> <19990129104323.11790@amber.org> <14001.55410.760735.96752@weyr.cnri.reston.va.us>
Message-ID: <19990129105340.15248@amber.org>

On Fri, Jan 29, 1999 at 10:49:06AM -0500, Fred L. Drake wrote:
> 
> Christopher G. Petrilli writes:
>  > So my implementation of the <?ztml ... ?> is acceptable under XML
>  > guidelines?  That was how I interpreted it, but gods only know!
> 
>   (I didn't catch any of the discussion before Andrew CC'd the
> XML-SIG, so I think I'm missing some of the context here.)
>   What you probably want to do is to pass an example that uses the PI
> syntax in all situations that you intend to support (including in
> attribute values if you want that), and pass it through a validating
> parser.  If it complains, you'll know what's broken.  If it doesn't,
> then go ahead and use it.  I'd rather see PI syntax used over comment
> syntax for either SGML or XML for this sort of processing.

Well, then I'll sit down and write a test suite as soon as I get the
brainpower back from your exploding my head, and we can see what
explodes and what doesn't.  Also, I'm going to tweak the syntax since
everyone seems to want to get rid of some vestigial old pieces...

Also, I'm not sure it's intended to BE XML; more accurately, it's
intended to LOOK like XML to an XML editor. The move to full XML for
this could be troublesome... for example:

<?ztml in variable ?> <!-- iterate over a sequence -->
  <?ztml if sequence-start ?>
    Print something here.
  <?ztml /if ?>

  <?ztml var foo ?> <!-- Print a variable from the sequence iterated -->
<?ztml /in ?>

Yes, I realise that XML has its own constructs for doing things like
this, BUT ... what I'm trying to do is create a migration path, and move
it to something that starts to LOOK like XML, so that people using
<?ztml var favorite-editor ?> can use it, and not have to worry about
the troublesome side-effects of putting logic in comment code.
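
For illustration, a small sketch of how an XML parser sees PIs like these,
assuming Python's standard xml.parsers.expat module. The function name
list_ztml_commands is made up, and the template is wrapped in a <body>
element here only so that the parser gets a single root element.

```python
import xml.parsers.expat


def list_ztml_commands(text):
    """Collect the data of every <?ztml ... ?> PI in document order."""
    commands = []

    def on_pi(target, data):
        if target == "ztml":
            commands.append(data.strip())

    parser = xml.parsers.expat.ParserCreate()
    parser.ProcessingInstructionHandler = on_pi
    parser.Parse(text, True)
    return commands


template = """<body>
<?ztml in variable ?>
  <?ztml if sequence-start ?>Print something here.<?ztml /if ?>
  <?ztml var foo ?>
<?ztml /in ?>
</body>"""
print(list_ztml_commands(template))
```

The parser reports each PI as an isolated event; pairing "in" with "/in" is
left entirely to the application.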

Please no religious responses though :-) 

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From larsga@ifi.uio.no  Fri Jan 29 16:01:23 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 29 Jan 1999 17:01:23 +0100
Subject: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <14001.50784.335761.791226@amarok.cnri.reston.va.us>
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com> <14001.50784.335761.791226@amarok.cnri.reston.va.us>
Message-ID: <wkaez25ar0.fsf@ifi.uio.no>

* Paul Everitt
| 
| <font size="<!--#var font_size-->">
| [...]
| <font size="<?ztml #var arg ?>">
| 
| which *not* valid XML...is it?

Neither of these are well-formed XML, since '<'s are not allowed in
attribute values. The spec is less clear than it ought to be on this[1],
perhaps, but xmlproc, XP, Lark and the Sun XML parser are all in
agreement that this isn't allowed. 

AElfred allows it, but then some checks have been left out of AElfred,
ostensibly for class file size reasons.

| That is, can you have markup inside markup?

No. Even if you write 

  <font size="&lt;?ztml #var arg ?>">

the PI in the attribute won't be recognized as one.
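
A short sketch of both points, assuming Python's standard
xml.etree.ElementTree module (which sits on top of expat) rather than any
of the parsers named above: the raw '<' is rejected, and the escaped form
yields a plain string, not a PI.

```python
import xml.etree.ElementTree as ET

raw = '<font size="<?ztml #var arg ?>">text</font>'
escaped = '<font size="&lt;?ztml #var arg ?>">text</font>'

# A literal '<' inside an attribute value is a well-formedness error.
try:
    ET.fromstring(raw)
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

# The escaped form parses, but the value is just character data.
font = ET.fromstring(escaped)
print(raw_accepted)
print(font.get("size"))
```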
 
However, not knowing Zope I don't think this is fatal if Zope
substitutes this before any XML/HTML parsers see the result. If you're
trying to use XML/HTML/SGML syntax for a preprocessor then maybe that
isn't the way to go.

* Andrew M. Kuchling
|
| I don't believe so, but have CC'ed this to the XML-SIG where the
| real experts hang out.  PIs have to be outside other markup; I
| suspect the XML way of handling your second case would be to define
| an entity:
| 
|   <font size="&arg;">

This is right, yes.

--Lars M.

[1] The relevant part is a WFC to production 41 in section 3.1.


From fdrake@acm.org  Fri Jan 29 16:17:59 1999
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 29 Jan 1999 11:17:59 -0500 (EST)
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <19990129105340.15248@amber.org>
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
 <14001.50784.335761.791226@amarok.cnri.reston.va.us>
 <14001.52924.23614.735389@weyr.cnri.reston.va.us>
 <19990129102652.06893@amber.org>
 <14001.54886.704665.224130@weyr.cnri.reston.va.us>
 <19990129104323.11790@amber.org>
 <14001.55410.760735.96752@weyr.cnri.reston.va.us>
 <19990129105340.15248@amber.org>
Message-ID: <14001.57143.185047.67235@weyr.cnri.reston.va.us>

Christopher G. Petrilli writes:
 > Also, I'm not sure it's intended to BE XML, more accurately it's
 > intended to LOOK like XML to an XML editor, the move to full XML for
 > this could be troublesome... for example:

  If an XML editor is going to handle it, it better be XML!  If it
looks like XML, someone will want to use an editor for it, so....


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191


From petrilli@amber.org  Fri Jan 29 16:21:23 1999
From: petrilli@amber.org (Christopher G. Petrilli)
Date: Fri, 29 Jan 1999 11:21:23 -0500
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <14001.57143.185047.67235@weyr.cnri.reston.va.us>; from Fred L. Drake on Fri, Jan 29, 1999 at 11:17:59AM -0500
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com> <14001.50784.335761.791226@amarok.cnri.reston.va.us> <14001.52924.23614.735389@weyr.cnri.reston.va.us> <19990129102652.06893@amber.org> <14001.54886.704665.224130@weyr.cnri.reston.va.us> <19990129104323.11790@amber.org> <14001.55410.760735.96752@weyr.cnri.reston.va.us> <19990129105340.15248@amber.org> <14001.57143.185047.67235@weyr.cnri.reston.va.us>
Message-ID: <19990129112123.13500@amber.org>

On Fri, Jan 29, 1999 at 11:17:59AM -0500, Fred L. Drake wrote:
> 
> Christopher G. Petrilli writes:
>  > Also, I'm not sure it's intended to BE XML, more accurately it's
>  > intended to LOOK like XML to an XML editor, the move to full XML for
>  > this could be troublesome... for example:
> 
>   If an XML editor is going to handle it, it better be XML!  If it
> looks like XML, someone will want to use an editor for it, so....

Well, my goal has not been to convert DTML to an XML, but to make it
more LIKE XML, something that is familiar, and something that wouldn't
look out of place.  Honestly, I do not expect people to try and write
HTML+ZTML with an XML tool, were such a beast to actually end up
existing in the hands of a normal human... 

Chris
--
| Christopher Petrilli
| petrilli@amber.org


From petrilli@amber.org  Fri Jan 29 16:23:32 1999
From: petrilli@amber.org (Christopher G. Petrilli)
Date: Fri, 29 Jan 1999 11:23:32 -0500
Subject: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <wkaez25ar0.fsf@ifi.uio.no>; from Lars Marius Garshol on Fri, Jan 29, 1999 at 05:01:23PM +0100
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com> <14001.50784.335761.791226@amarok.cnri.reston.va.us> <wkaez25ar0.fsf@ifi.uio.no>
Message-ID: <19990129112332.41792@amber.org>

On Fri, Jan 29, 1999 at 05:01:23PM +0100, Lars Marius Garshol wrote:
>  
> However, not knowing Zope I don't think this is fatal if Zope
> substitutes this before any XML/HTML parsers see the result. If you're
> trying to use XML/HTML/SGML syntax for a preprocessor then maybe that
> isn't the way to go.

Currently, and I can't speak for the future of this, but currently, Zope
is designed to parse DTML (the current syntax, using comments) into pure
raw HTML, and nothing else... it's not intended to go to XML/SGML, and
quite honestly, I don't think it would be a good fit for that.  What it
is, quite honestly, is a tiny little scripting ability (like PHP), not a
full blown mark-up language.  I believe PHP also uses <?php ?> as its
syntax, and I've not seen any huge explosions of fire from that one.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From pharris@forfree.at  Fri Jan 29 16:23:03 1999
From: pharris@forfree.at (Phil Harris)
Date: Fri, 29 Jan 1999 16:23:03 -0000
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
Message-ID: <01ec01be4ba3$bc4abe40$5c773fc1@ml.uwcm.ac.uk>

Surely, xml would allow <'s and >'s within quoted strings?

if not, boy is that weird!


----- Original Message ----- 
From: Lars Marius Garshol <larsga@ifi.uio.no>
To: <xml-sig@python.org>; <zope@zope.org>
Sent: Friday, January 29, 1999 4:01 PM
Subject: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code


>
>* Paul Everitt
>| 
>| <font size="<!--#var font_size-->">
>| [...]
>| <font size="<?ztml #var arg ?>">
>| 
>| which *not* valid XML...is it?
>
>Neither of these are well-formed XML, since '<'s are not allowed in
>attribute values. The spec is less clear than it ought to be on this[1],
>perhaps, but xmlproc, XP, Lark and the Sun XML parser are all in
>agreement that this isn't allowed. 
>
>AElfred allows it, but then some checks have been left out of AElfred,
>ostensibly for class file size reasons.
>
>| That is, can you have markup inside markup?
>
>No. Even if you write 
>
>  <font size="&lt;?ztml #var arg ?>">
>
>the PI in the attribute won't be recognized as one.
> 
>However, not knowing Zope I don't think this is fatal if Zope
>substitutes this before any XML/HTML parsers see the result. If you're
>trying to use XML/HTML/SGML syntax for a preprocessor then maybe that
>isn't the way to go.
>
>* Andrew M. Kuchling
>|
>| I don't believe so, but have CC'ed this to the XML-SIG where the
>| real experts hang out.  PIs have to be outside other markup; I
>| suspect the XML way of handling your second case would be to define
>| an entity:
>| 
>|   <font size="&arg;">
>
>This is right, yes.
>
>--Lars M.
>
>[1] The relevant part is a WFC to production 41 in section 3.1.


From larsga@ifi.uio.no  Fri Jan 29 16:35:39 1999
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 29 Jan 1999 17:35:39 +0100
Subject: [XML-SIG] Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code
In-Reply-To: <01ec01be4ba3$bc4abe40$5c773fc1@ml.uwcm.ac.uk>
References: <01ec01be4ba3$bc4abe40$5c773fc1@ml.uwcm.ac.uk>
Message-ID: <wk679q595w.fsf@ifi.uio.no>

* Phil Harris
|
| Surely, xml would allow <'s and >'s within quoted strings?

It does not, unfortunately. Well, you can have them in entities, but
if you use those entities in the wrong places then you're not
well-formed.
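
A sketch of that distinction, assuming Python's standard
xml.etree.ElementTree module; the entity name "e" and the element names are
only illustrative. The same entity is accepted when referenced in element
content and rejected when referenced in an attribute value.

```python
import xml.etree.ElementTree as ET

DTD = '<!DOCTYPE r [<!ENTITY e "<b/>">]>'


def parses(text):
    """Return True if the parser accepts the text as well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


in_content = DTD + "<r>&e;</r>"        # entity used in element content
in_attribute = DTD + '<r a="&e;"/>'    # entity used in an attribute value

print(parses(in_content))
print(parses(in_attribute))
```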
 
| if not, boy is that weird!

It might be to keep people from thinking that <foo/> inside an
attribute value is an element instead of just a string that looks like
an element.

--Lars M.


From paul@prescod.net  Fri Jan 29 16:34:12 1999
From: paul@prescod.net (Paul Prescod)
Date: Fri, 29 Jan 1999 10:34:12 -0600
Subject: [XML-SIG] RE: [Zope] - XML-style DTML code
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
 <14001.50784.335761.791226@amarok.cnri.reston.va.us> <14001.52924.23614.735389@weyr.cnri.reston.va.us>
Message-ID: <36B1E304.8E5165CE@prescod.net>

"Fred L. Drake" wrote:
> 
>   In neither SGML nor XML can markup be nested like this.  The use of
> entities is the proper way to do this in either case.  Perhaps a
> processing tool needs to be available which can perform "entity
> expansion" for specified entity names only?

I discussed this a couple of months ago on the Zope list. I suggested that
they use XSL template syntax. It's more verbose but it separates the
levels more cleanly. The syntax for doing attributes would look something
like this:

<FONT>
   <xsl:attribute name="size">6</xsl:attribute>
   blah blah blah
</FONT>

The Zope equivalent would be:

<FONT>
   <zope:attribute name="size">6</zope:attribute>
   blah blah blah
</FONT>

IMHO, the current Zope syntax cannot survive into the "XML age." People
will want to author their templates in XML editors and Zope's illegal
syntax will prevent this.
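
A hedged sketch of how a processor might apply such attribute elements,
using Python's standard xml.etree.ElementTree module. Everything
Zope-specific here is an assumption made for the example: the namespace
URI, the function name apply_attribute_elements, and the idea that the
element is simply folded into its parent.

```python
import xml.etree.ElementTree as ET

ZOPE_NS = "urn:example:zope-template"  # hypothetical namespace URI


def apply_attribute_elements(root):
    """Turn each <zope:attribute name="x">v</zope:attribute> child into
    an attribute x="v" on its parent, keeping the surrounding text."""
    tag = "{%s}attribute" % ZOPE_NS
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if child.tag != tag:
                previous = child
                continue
            parent.set(child.get("name"), child.text or "")
            # Re-attach the text that followed the removed element.
            tail = child.tail or ""
            if previous is None:
                parent.text = (parent.text or "") + tail
            else:
                previous.tail = (previous.tail or "") + tail
            parent.remove(child)
    return root


doc = ET.fromstring(
    '<FONT xmlns:zope="urn:example:zope-template">'
    '<zope:attribute name="size">6</zope:attribute>'
    "blah blah blah</FONT>"
)
apply_attribute_elements(doc)
print(doc.get("size"))
print(doc.text)
```

Note that a namespace-aware parser needs the zope: prefix to be declared,
which the snippets above leave out.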

 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

Don't you know that the smart bombs are so clever, they only kill 
bad people."
	- http://www.boingo.com/lyrics/WarAgain.html


From co@daisybytes.su.uunet.de  Fri Jan 29 17:00:29 1999
From: co@daisybytes.su.uunet.de (Carsten Oberscheid)
Date: Fri, 29 Jan 1999 18:00:29 +0100
Subject: [XML-SIG] AW: [Zope] - XML-style DTML code
Message-ID: <01BE4BB1.42EC9C90.co@daisybytes.su.uunet.de>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>
> Paul Everitt writes:
> >
> >Chris wrote:
> >> 	<?ztml #var arg ?>
> >
> >Ahh, tis nuthin better than seeing a patch accompany a proposal :^)
> >
> >Here's my main beef with this.  The ostensible goal of the XML syntax is
> >to make it parse-able by new tools.  Unfortunately, a valid use of the
> >current syntax:
> >
> >  <font size="<!--#var font_size-->">
> >
> >which is legal, would become:

Sorry, I think this isn't legal either. It's OK with SGML (at least nsgmls 
doesn't complain), but the XML spec says you can't use "<" inside attribute 
values at all.

> >
> >  <font size="<?ztml #var arg ?>">
> >
> >which *not* valid XML...is it?  That is, can you have markup inside
> >markup?
>
> 	I don't believe so, but have CC'ed this to the XML-SIG where
> the real experts hang out.  PIs have to be outside other markup; I
> suspect the XML way of handling your second case would be to define an
> entity:
>
>   <font size="&arg;">
>
> This is unfortunate for the application of HTML templating, because it
> collides with the use of entities in HTML.  It also makes things
> difficult because the entity would have to be declared at the
> beginning of the file in the DOCTYPE declaration.  Making the
> templating identical to XML, while keeping it conveniently
> human-editable, may not be possible.
>

What about this:

   <?ztml store("var arg") ?><font size="&ztml;">

where &ztml; is a dummy entity declared once in the DTD. This should be valid 
XML.
The DTML engine then interprets the PI as "I store this string as a DTML 
command, then next time I encounter &ztml; I replace it with the results of the 
DTML command".
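
A minimal sketch of this store-and-replace idea in Python, working on plain
text with a regular expression. It is not Zope code: the function names and
the run_command callback are invented for the example, and the callback
stands in for whatever actually evaluates a DTML command.

```python
import re

# Matches either a store PI or the &ztml; placeholder.
TOKEN = re.compile(r'<\?ztml\s+store\("([^"]*)"\)\s*\?>|&ztml;')


def expand_stored_commands(text, run_command):
    """Remember the command from each store PI and substitute its result
    for the next &ztml; placeholder."""
    stored = [None]

    def substitute(match):
        if match.group(1) is not None:
            stored[0] = match.group(1)   # store PI: remember, emit nothing
            return ""
        if stored[0] is None:
            return match.group(0)        # placeholder with nothing stored
        result, stored[0] = run_command(stored[0]), None
        return result

    return TOKEN.sub(substitute, text)


def fake_run(command):
    # Stand-in for the DTML engine: a fixed lookup table.
    return {"var arg": "6"}.get(command, "")


source = '<?ztml store("var arg") ?><font size="&ztml;">'
print(expand_stored_commands(source, fake_run))
```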

I admit that this is less editable/readable than the current DTML syntax, but 
it's quite close, especially if the "store" PI is kept close to the &ztml; 
"placeholder". For the "simple" case of DTML commands within character data 
Chris' proposal still works:

   <P> ...plain text ... <?ztml "var arg" ?>  ... </P>

Such a PI, lacking the "store" wrapper, can be "executed" and replaced
immediately without the entity stunt, and it is valid XML.


Regards

.co.


+------------------------------------------------------- daisy bytes! --------+
 Carsten Oberscheid
 co@daisybytes.su.uunet.de                        digital document processing
 http://www.pweb.de/daisybytes.su                     electronic publishing

-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 5.5.3i for non-commercial use <http://www.pgpi.com>

iQA/AwUBNrHbHowjR4jmR8/DEQKZpgCguMJhCDXh/sHIcP+uCeqz3PpF/PMAoP4U
btpPwlkRa66yQC9vahx904oU
=ibSb
-----END PGP SIGNATURE-----


From tismer@appliedbiometrics.com  Fri Jan 29 17:00:56 1999
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Fri, 29 Jan 1999 18:00:56 +0100
Subject: [XML-SIG] Please stop the cross posting
Message-ID: <36B1E948.42CE8FE3@appliedbiometrics.com>

Friends,

it is ok to cross post things which belong to two mailing
lists. But can we *please* take care of the subject lines?
It drives me crazy when I have to read 

   Re: [Zope] - Re: [XML-SIG] RE: [Zope] - XML-style DTML code

Is it possible to always reply from the SIG where the 
thread originated?

I will also propose to change Mailman to handle this
in a better way. My own patched version on Starship
never prepends the list name if it can be matched in the
"re" already.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101   :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgp.ai.mit.edu/
     we're tired of banana software - shipped green, ripens at home


From paul@prescod.net  Fri Jan 29 16:35:59 1999
From: paul@prescod.net (Paul Prescod)
Date: Fri, 29 Jan 1999 10:35:59 -0600
Subject: [XML-SIG] RE: [Zope] - XML-style DTML code
References: <613145F79272D211914B0020AFF64019049E8C@gandalf.digicool.com>
 <14001.50784.335761.791226@amarok.cnri.reston.va.us> <14001.52924.23614.735389@weyr.cnri.reston.va.us>
Message-ID: <36B1E36F.9A5AE072@prescod.net>

"Fred L. Drake" wrote:
> 
> Paul Everitt writes:
>  >  <font size="<!--#var font_size-->">
>  >
>  >which is legal, would become:
> 
>   This is legal:  The "<!--#var font_size-->" is the CDATA value of
> the size attribute, not a comment.

That is legal SGML but not XML.

>  >  <font size="<?ztml #var arg ?>">
>  >
>  >which *not* valid XML...is it?  That is, can you have markup inside
> 
>   The "<?ztml #var arg ?>" is a perfectly valid string value of the
> size attribute, just as before.

Ditto.

>   In neither SGML nor XML can markup be nested like this.  The use of
> entities is the proper way to do this in either case.  Perhaps a
> processing tool needs to be available which can perform "entity
> expansion" for specified entity names only?

In a valid XML document, all entities must be defined in the DTD. XML does
not provide for them to be supplied by the containing application. SGML
did, but XML does not. The usual way to do this is with elements, as
described in XSL.
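
These points can be checked with any XML parser. The sketch below uses
Python's standard xml.etree.ElementTree module, which is non-validating, so
it tests well-formedness only, not validity. The helper name is_well_formed
is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the (non-validating) parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A literal "<" inside an attribute value is rejected by an XML parser.
markup_in_attr = '<font size="<?ztml #var arg ?>"/>'

# An entity reference with no declaration is rejected as well.
undeclared = '<font size="&arg;"/>'

# Declared in the internal DTD subset, the entity is expanded by the parser.
declared = '<!DOCTYPE font [<!ENTITY arg "3">]><font size="&arg;"/>'

print(is_well_formed(markup_in_attr))
print(is_well_formed(undeclared))
print(ET.fromstring(declared).get("size"))
```

The last case also shows the cost mentioned earlier in the thread: the
replacement text has to sit in the DOCTYPE declaration of the document
itself, because the application cannot supply it from outside.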

 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Don't you know that the smart bombs are so clever, they only kill 
bad people."
	- http://www.boingo.com/lyrics/WarAgain.html


From bwarsaw@python.org  Fri Jan 29 17:16:48 1999
From: bwarsaw@python.org (Barry A. Warsaw)
Date: Fri, 29 Jan 1999 12:16:48 -0500 (EST)
Subject: [XML-SIG] Re: Please stop the cross posting
References: <36B1E948.42CE8FE3@appliedbiometrics.com>
Message-ID: <14001.60672.211755.185830@anthem.cnri.reston.va.us>

>>>>> "CT" == Christian Tismer <tismer@appliedbiometrics.com> writes:

    CT> I will also propose to change Mailman to handle this
    CT> in a better way. My own patched version on Starship
    CT> never prepends the list name if it can be matched in the
    CT> "re" already.

I thought Mailman already does this too.  I'll double check.  Chris,
you might want to send your patches to mailman-developers@python.org

-Barry


From dieter@handshake.de  Sun Jan 31 16:48:41 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Sun, 31 Jan 1999 17:48:41 +0100
Subject: [XML-SIG] minor BUG and Patch for "html_builder"
Message-ID: <199901311648.RAA01338@lindm.dm>

This is a multi-part MIME message.
--------------FC5583E803777E8ABB8C4995
Content-Type: text/plain; charset=iso-8859-1

"HtmlBuilder" (from the xml-0.5 distribution) goes into an
infinite loop when it encounters an explicitly closed
empty tag, e.g.:

  <LINK HREF="style.css" REL="stylesheet" TYPE="text/css"></LINK>

"HtmlWriter" generates such constructs.

A patch is appended.
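
Why "continue" loops forever can be seen in a simplified model of the loop.
This is a sketch, not the real HtmlBuilder code: close_element, the "fixed"
switch, the max_steps guard and the final exit condition are assumptions
added so that the example is self-contained and terminates.

```python
def close_element(stack, tag, empties, fixed=True, max_steps=1000):
    """Pop open elements until `tag` is closed; return the tags that were closed."""
    closed = []
    steps = 0
    while stack:
        steps += 1
        if steps > max_steps:
            # Guard for the demonstration only; the real loop has no such limit.
            raise RuntimeError("loop did not terminate")
        if tag in empties:
            if fixed:
                break  # patched behaviour: nothing to close, leave the loop
            continue  # original behaviour: the stack never changes, so we spin
        start_tag = stack.pop()
        closed.append(start_tag)
        if start_tag == tag:  # assumed exit condition, not shown in the patch
            break
    return closed


print(close_element(["html", "head"], "link", {"link"}, fixed=True))
try:
    close_element(["html", "head"], "link", {"link"}, fixed=False)
except RuntimeError as exc:
    print("unpatched:", exc)
```

With "continue", neither the stack nor the tag changes between iterations,
so the loop condition stays true forever. "break" leaves the loop without
touching the stack.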


Dieter

--------------FC5583E803777E8ABB8C4995
Content-Type: application/x-patch; name="html_builder.pat"

--- :html_builder.py	Tue Dec 29 10:45:25 1998
+++ html_builder.py	Sun Jan 31 16:22:35 1999
@@ -72,7 +72,7 @@
 
 		while self.stack:
 			if tag in self.empties:
-				continue
+				break
 			start_tag = self.stack[-1]
 			del self.stack[-1]
 			Builder.endElement(self, start_tag)


--------------FC5583E803777E8ABB8C4995--