From D.Hoeppner@tu-bs.de  Tue Aug  3 13:22:04 1999
From: D.Hoeppner@tu-bs.de (Dierk Höppner)
Date: Tue, 3 Aug 1999 14:22:04 +0200
Subject: [XML-SIG] SAX and HTML
Message-ID: <5D650DE026C@buch.biblio.etc.tu-bs.de>

Hello,

I want to use SAX to extract data from HTML. I began by 
modifying the example saxstats.py, but I did not get very far 
because my HTML sources are not well-formed XML documents. 
Then I forced the parser to use drv_htmllib, but this failed because 
htmllib's HTMLParser wants a formatter. drv_htmllib passes None, 
which of course doesn't work.

Any hints on what to do? Even RTFM is welcome, but please give a 
hint to a good page ;-)

greetings

Dierk Hoeppner

Braunschweig University Library
Pockelsstr. 13
D-38106 Braunschweig
Germany
Tel: +49-531-391-5066 Fax: -5836
E-Mail: d.hoeppner@tu-bs.de     


From fdrake@acm.org  Tue Aug  3 14:41:22 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 3 Aug 1999 09:41:22 -0400 (EDT)
Subject: [XML-SIG] SAX and HTML
In-Reply-To: <5D650DE026C@buch.biblio.etc.tu-bs.de>
References: <5D650DE026C@buch.biblio.etc.tu-bs.de>
Message-ID: <14246.61826.383361.367841@weyr.cnri.reston.va.us>

Dierk Höppner writes:
 > I want to use SAX to extract data from HTML. I began with 
 > modifying the example saxstats.py but it did not come very far 
 > because my html-sources are not well constructed xml-documents. 
 > Then I forced the parser to use drv_htmllib but this failed because 
 > HTMLParser of htmllib wants a formatter. drv_htmllib gives None 
 > which doesn't work of course.

Dierk,
  Try changing drv_htmllib to use a formatter.NullFormatter instance.
Let us know how that works; if a simple fix to drv_htmllib does the
trick, I think we can do that!


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From kantel@mpiwg-berlin.mpg.de  Tue Aug  3 16:38:04 1999
From: kantel@mpiwg-berlin.mpg.de (Jörg Kantel)
Date: Tue, 3 Aug 1999 17:38:04 +0200
Subject: [XML-SIG] XML to XML Conversion via SAX
Message-ID: <l03130301b3ccba8167eb@[194.94.223.157]>

(maybe a very stupid question ;-)

We have a collection of (very) large XML files that we have to convert to
XML files. That sounds stupid, but we have to insert or update attributes
in various tags depending on the contents of the files.

I thought I could do that with Python and the saxlib, but I ran into a
problem writing the attributes back (in other words: I'm too stupid to use
the saxlib.AttributeList methods). I tried the following (mostly inspired
by the saxlib tutorial ;-)

#!/usr/local/bin/python

from xml.sax import saxlib
import string

class WriteTags(saxlib.HandlerBase, saxlib.AttributeList):

    def makeStartTag(self, name):
        # Build the text of a start tag, including its attributes.
        tagText = "<" + name
        numbers = saxlib.AttributeList().getLength()
        print numbers
        if numbers:
            for i in range(numbers):
                tagName = saxlib.AttributeList().getName(i)
                tagValue = saxlib.AttributeList().getValue(i)
                print tagName
                print tagValue
                tagText = tagText + " " + tagName + "=\"" + tagValue + "\" "
        tagText = tagText + ">"
        return tagText

(...)


makeStartTag is called from the startElement method, but numbers (and
therefore tagName and tagValue too) is always None. I'm really not
sure how to connect the AttributeList methods with the HandlerBase. Any
hints are welcome.

TIA
Jörg

--
--------------------------------------------------------------------------
Jörg Kantel                Max-Planck-Institute for the History of Science
Computer-Department                             kantel@mpiwg-berlin.mpg.de
Wilhelmstr. 44     http://www.mpiwg-berlin.mpg.de/staff/kantel/kantel.html
D-10117 Berlin     fon: +4930-22667-220               fax: +4930-22667-299
--------------------------------------------------------------------------
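
A minimal sketch of how the attributes reach the handler, written against
the xml.sax module of today's Python standard library rather than the 1999
saxlib API used above (so the class layout and the helper name start_tags
are assumptions, not the saxlib interface). The point it illustrates is
that the parser passes the attribute collection to startElement as an
argument; the handler reads from that object instead of creating its own
empty AttributeList.

```python
import io
import xml.sax
from xml.sax.saxutils import quoteattr


class WriteTags(xml.sax.ContentHandler):
    """Echo every start tag, with its attributes, to an output stream."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def startElement(self, name, attrs):
        # attrs is supplied by the parser for *this* element.
        self.out.write(self.make_start_tag(name, attrs))

    def make_start_tag(self, name, attrs):
        parts = ["<" + name]
        for attr_name in attrs.getNames():
            # quoteattr() escapes the value and adds the surrounding quotes.
            value = attrs.getValue(attr_name)
            parts.append(" %s=%s" % (attr_name, quoteattr(value)))
        parts.append(">")
        return "".join(parts)


def start_tags(xml_text):
    """Hypothetical helper: return all start tags of xml_text as one string."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), WriteTags(out))
    return out.getvalue()
```

Inserting or updating an attribute would then amount to copying the names
and values into a dictionary inside startElement and changing it before the
tag is written.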


From paul@prescod.net  Tue Aug  3 21:05:33 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 03 Aug 1999 15:05:33 -0500
Subject: [XML-SIG] Python DOM Unification -- level
References: <3724CC49.AAB857A5@prescod.net>
 <14116.55422.189139.235663@amarok.cnri.reston.va.us>
 <3724E2A1.62223458@prescod.net> <14117.52230.551462.836651@weyr.cnri.reston.va.us> <37A019E2.B334709D@FourThought.com>
Message-ID: <37A74B8D.9C7E38C0@prescod.net>

Mike Olson wrote:
> 
>     We're  gonna have some free time in August to do some major work on 4DOM and
> 4XSLT, including getting 4XSLT up to date with the latest XSLT draft and
> breaking out the patterns into 4XPath (or some clever name).

Cool!

>     I wanted to bring up the DOM interface unification topic again as we will be
> working on 4DOM this month and may have time to experiment with some Lit/ python
> interfaces.  Last we left off, we couldn't decide how many and how lit the
> interfaces should be.  Is anyone still doing work to come up with a unified
> interface(s)?  Is it something we still want to consider? Should we
> (Fourthought) just produce a lit interface as pythonic as possible and then
> mold/wrap pydom and 4dom to meet it?

Well I think that the main issues for the pythonic interface are:

 * mappings should act as Python mappings. (in fact the only
standardized interfaces should probably be the __getitem__ stuff)

 * node lists should act as Python sequences. (ditto)

 * namespace properties should be modelled on the relevant operators in
XPath (I think that the real DOM will be copying XPath)

XPath support should probably be available both as a module and as
methods on the DOM. The module is cool because it could be made
available for any DOM. The methods are cool because they could be really
optimized for *this* DOM. Microsoft calls the XPath-using methods
"selectNodes" and "selectSingleNode". They also have "transformNode". 

Any DOM could add "simple" support by redirecting the methods to a
DOM-generic method. They could add optimized support by writing code for
the methods themselves. They could even use a mix where they call the
method for complex queries!

Many of the methods we discussed before like getChild, getText and so
forth can be done easily as queries like node.selectNodes( "text()" ),
node.selectNodes( "//text()" ) and so forth.

One issue with selectNodes is how to count nodes. XSL mandates that
adjacent text nodes must be merged. The DOM does not (but probably
should!).

>     For my 2 cents worth, I guess I see a need for 2 interfaces.  The one
> defined by W3C and a totally pythonic interface.  Then a wrapper that can be
> used to turn a DOM compliant interface implementation into the pythonic
> interface.  ORB/ORBless I think we decided is orthagonal to this decision.

That all sounds right to me.

 Paul Prescod
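
The "simple support" idea above, where a DOM method merely redirects to a
DOM-generic function, might be sketched as follows. This is only an
illustration under stated assumptions: it uses xml.dom.minidom from the
current standard library, the names select_nodes and PythonicNode are
hypothetical, and the generic function understands just the two queries
mentioned in the message ("text()" and "//text()"), not XPath in general.

```python
from xml.dom import Node, minidom


def select_nodes(node, query):
    """DOM-generic fallback that supports only two toy queries."""
    if query == "text()":
        # Text nodes that are direct children of the context node.
        return [c for c in node.childNodes if c.nodeType == Node.TEXT_NODE]
    if query == "//text()":
        # Every text node in the whole document, in document order.
        root = node.ownerDocument or node
        found = []
        stack = [root]
        while stack:
            current = stack.pop()
            if current.nodeType == Node.TEXT_NODE:
                found.append(current)
            stack.extend(reversed(current.childNodes))
        return found
    raise NotImplementedError(query)


class PythonicNode:
    """Hypothetical wrapper that adds the query methods to any DOM node."""

    def __init__(self, node):
        self.node = node

    def selectNodes(self, query):
        # "Simple" support: redirect to the DOM-generic function. An
        # optimized DOM could override this with its own code.
        return select_nodes(self.node, query)

    def selectSingleNode(self, query):
        nodes = self.selectNodes(query)
        return nodes[0] if nodes else None
```

Note that the sketch counts text nodes exactly as the DOM stores them; it
does not merge adjacent text nodes the way XSL requires.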


From dieter@handshake.de  Tue Aug  3 22:57:15 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Tue,  3 Aug 1999 23:57:15 +0200 (CEST)
Subject: [XML-SIG] [Ann] PyXPath 0.1 -- Implementation of the XPath July working
 draft on top of PyDom
Message-ID: <14247.25863.875238.893246@lindm.dm>

I have just released PyXPath 0.1, an implementation
of the XPath July working draft on top of PyDOM.

For more information and download, see

	URL:http://www.handshake.de/~dieter/pyprojects/pyxpath.html


- Dieter


From D.Hoeppner@tu-bs.de  Wed Aug  4 09:03:33 1999
From: D.Hoeppner@tu-bs.de (Dierk Höppner)
Date: Wed, 4 Aug 1999 10:03:33 +0200
Subject: [XML-SIG] SAX and HTML - success!??
Message-ID: <5EA03460720@buch.biblio.etc.tu-bs.de>

Fred,

you mentioned pylibs.py. I played around a little (far from 
understanding the whole thing). Perhaps I found a solution: In 
pylibs.SGMLParsers I just added

    def handle_starttag(self,tag,method,attributes): ####
        "Handles start tags."
        attrs={}
        for (a,v) in attributes:
            attrs[a]=v

        self.doc_handler.startElement(tag,saxutils.AttributeMap(attrs))

The demo xml/demo/saxsaxstats.py worked for me with one little 
change. I changed the line

p=saxexts.make_parser()

to

p=saxexts.make_parser("xml.sax.drivers.drv_htmllib")

I don't know whether this is the solution, but perhaps it is a hint for 
your search.

Thanks for your help

Dierk

Braunschweig University Library
Pockelsstr. 13
D-38106 Braunschweig
Germany
Tel: +49-531-391-5066 Fax: -5836
E-Mail: d.hoeppner@tu-bs.de     
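
The same bridging idea, turning the (name, value) pairs of a tolerant HTML
parser into a SAX startElement call, can be sketched with modules that
exist in the current standard library. This is an assumption-laden
illustration, not the drv_htmllib driver: it uses html.parser.HTMLParser
instead of htmllib, and the class names HtmlToSax and TagCounter and the
helper count_tags are hypothetical.

```python
from html.parser import HTMLParser
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class HtmlToSax(HTMLParser):
    """Forward HTML parse events to a SAX-style document handler."""

    def __init__(self, doc_handler):
        super().__init__()
        self.doc_handler = doc_handler

    def handle_starttag(self, tag, attributes):
        # attributes is a list of (name, value) pairs; value is None for
        # attributes written without "=", so substitute an empty string.
        attrs = {}
        for name, value in attributes:
            attrs[name] = value if value is not None else ""
        self.doc_handler.startElement(tag, AttributesImpl(attrs))

    def handle_endtag(self, tag):
        self.doc_handler.endElement(tag)

    def handle_data(self, data):
        self.doc_handler.characters(data)


class TagCounter(ContentHandler):
    """Tiny stand-in for saxstats: count start tags by element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(html_text):
    counter = TagCounter()
    parser = HtmlToSax(counter)
    parser.feed(html_text)
    parser.close()
    return counter.counts
```

Because the HTML parser does not insist on well-formedness, unclosed
elements such as <p> or <br> still produce start-tag events.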


From fdrake@acm.org  Wed Aug  4 15:33:49 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 4 Aug 1999 10:33:49 -0400 (EDT)
Subject: [XML-SIG] SAX and HTML - success!??
In-Reply-To: <5EA03460720@buch.biblio.etc.tu-bs.de>
References: <5EA03460720@buch.biblio.etc.tu-bs.de>
Message-ID: <14248.20301.683337.109525@weyr.cnri.reston.va.us>

Dierk Höppner writes:
 > you mentioned pylibs.py. I played around a little (far from 
 > understanding the whole thing). Perhaps I found a solution: In 
 > pylibs.SGMLParsers I just added

Dierk,
  That would have been my first thing to try!  Does this solve your
immediate problems?
  If you can send along a test script (you mentioned a modified demo;
were the modifications relevant to the problem?) and example data, I
can still play with it some.  If this still looks like a good fix,
I'll commit it to the repository.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From michel.plu@cnet.francetelecom.fr  Thu Aug  5 14:14:34 1999
From: michel.plu@cnet.francetelecom.fr (PLU Michel CNET/DSM/LAN)
Date: Thu, 5 Aug 1999 15:14:34 +0200
Subject: [XML-SIG] string encoding translater
Message-ID: <B932E841DDE0D0119D2300609759036C010EF2C3@l-mhs4.lannion.cnet.fr>


Here is my problem.

I want to parse an ISO-8859-1 (ISO Latin 1) encoded XML file.
But when I parse it with the Python SAX parser (saxexts.make_parser), all
strings in the resulting DOM tree (attribute values or node data) are
stored as UTF-8 (Unicode) encoded strings.

For example, an XML line such as
<term  name="Matières" fatherId="root"/>

produces a node where the value of the attribute name is: MatiÃ¨res

Is there a way in Python to translate the UTF-8 encoded string back to the
original ISO-8859-1?

	Thanks for any answers,

			Michel
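
A short sketch of the conversion asked about, written for current Python,
where byte strings and text strings are separate types (the function name
utf8_to_latin1 is hypothetical). It also reproduces the garbled display
from the message: the two UTF-8 bytes of "è" show up as "Ã¨" when they are
read as ISO-8859-1.

```python
def utf8_to_latin1(data):
    """Re-encode UTF-8 bytes as ISO-8859-1 bytes."""
    # Raises UnicodeEncodeError if a character has no Latin-1 equivalent.
    return data.decode("utf-8").encode("iso-8859-1")


utf8_bytes = "Matières".encode("utf-8")

# Reading the UTF-8 bytes as if they were Latin-1 gives the garbled text.
garbled = utf8_bytes.decode("iso-8859-1")

# Converting properly yields one byte (0xE8) for the accented letter.
latin1_bytes = utf8_to_latin1(utf8_bytes)
```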


From sean@digitome.com  Thu Aug 12 15:26:21 1999
From: sean@digitome.com (Sean Mc Grath)
Date: Thu, 12 Aug 1999 15:26:21 +0100
Subject: [XML-SIG] pyDOM NamedNodeMap - bug report and problem
Message-ID: <3.0.6.32.19990812152621.00965710@gpo.iol.ie>

I am trying to print out attribute name,value pairs using pyDOM and
having some problems. Here is the relevant part of my code:

for n in doc.documentElement.childNodes:
	if n.nodeType == core.ELEMENT_NODE:
		attrs = n.attributes
		for i in range (0,attrs.get_length()):
			attr = attrs.item(i)
			print attr.name
			print attr.value


The item() method initially did not work, returning an unsubscriptable object
error. This is a buglet in NamedNodeMap:

Before fix:
    # Additional methods specified in the DOM Recommendation
    def item(self, index):
        return self.data.values[ index ]

After fix: (parenthesis in call to values method of data dictionary)
    # Additional methods specified in the DOM Recommendation
    def item(self, index):
        return self.data.values()[ index ]

I am now getting my attribute names through just fine, but all my attribute
values are None. They are definitely there in the DOM structure because
toxml puts 'em out just fine. Ideas?

regards,


<Sean URI="http://www.digitome.com/sean.html">
Developers Day Co-Chair, 9th International World Wide Web Conference
16-19, May, 2000, Amsterdam, The Netherlands http://www9.org
</Sean>
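
For comparison, the same loop can be sketched against xml.dom.minidom from
the current standard library. This is not pyDOM, so it says nothing about
the bug above; it only shows the intended result of walking a NamedNodeMap
by index. The function name attribute_pairs is hypothetical.

```python
from xml.dom import Node, minidom


def attribute_pairs(xml_text):
    """Return (name, value) for every attribute of the root's child elements."""
    doc = minidom.parseString(xml_text)
    pairs = []
    for n in doc.documentElement.childNodes:
        if n.nodeType == Node.ELEMENT_NODE:
            attrs = n.attributes
            for i in range(attrs.length):
                # item() returns an Attr node, which has name and value.
                attr = attrs.item(i)
                pairs.append((attr.name, attr.value))
    return pairs
```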


From akuchlin@mems-exchange.org  Fri Aug 13 03:44:40 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Thu, 12 Aug 1999 22:44:40 -0400 (EDT)
Subject: [XML-SIG] pyDOM NamedNodeMap - bug report and problem
In-Reply-To: <3.0.6.32.19990812152621.00965710@gpo.iol.ie>
References: <3.0.6.32.19990812152621.00965710@gpo.iol.ie>
Message-ID: <14259.34456.576862.738920@amarok.cnri.reston.va.us>

Sean Mc Grath writes:
>I am now getting my attribute names through just fine but all my attribute
>values are None. There are definitely there in the DOM structure because
>toxml puts 'em out just fine. Ideas?

       Try this patch; without the NODE_CLASS stuff in the patch, the
item() method returns a _node instance, which shouldn't be exposed to
the user.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Welcome, one and all, to the far-flung future of -- 1965!
    -- Zot, in ZOT! #1


Index: core.py
===================================================================
RCS file: /home/cvsroot/xml/dom/core.py,v
retrieving revision 1.46
diff -C2 -r1.46 core.py
*** core.py	1999/05/08 20:18:18	1.46
--- core.py	1999/08/13 02:26:34
***************
*** 259,265 ****
      # Additional methods specified in the DOM Recommendation
      def item(self, index):
!         return self.data.values[ index ]
  
      getNamedItem = UserDict.UserDict.__getitem__
--- 259,266 ----
      # Additional methods specified in the DOM Recommendation
      def item(self, index):
!         n = self.data.values()[ index ]
!         return NODE_CLASS[ n.type ](n, self._document )
  
      getNamedItem = UserDict.UserDict.__getitem__


From dieter@handshake.de  Thu Aug 12 18:06:22 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Thu, 12 Aug 1999 19:06:22 +0200
Subject: [XML-SIG] XML 0.5.1 bug: 'amp' character reference not handled correctly by "HtmlBuilder/HtmlWriter"
Message-ID: <199908121706.TAA00810@lindm.dm>

"HtmlBuilder" translates '&amp;' into an entity reference.
This does not follow the DOM spec. It specifies that
character references are expected to be expanded by the
HTML/XML processor.

"XmlWriter/HtmlWriter" does not output the 'amp' entity reference.
This, obviously, is a bug in "XmlWriter/HtmlWriter".
By the way, processing instructions are not output either.

I have fixed my "&amp;" problem by adding "amp" to the
"expand_entities" tuple in "HtmlBuilder". This, however,
is not a general solution.

- Dieter


From dieter@handshake.de  Fri Aug 13 07:41:24 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 13 Aug 1999 08:41:24 +0200 (CEST)
Subject: [XML-SIG] pyDOM NamedNodeMap - bug report and problem
In-Reply-To: <3.0.6.32.19990812152621.00965710@gpo.iol.ie>
References: <3.0.6.32.19990812152621.00965710@gpo.iol.ie>
Message-ID: <14259.47828.462367.393783@lindm.dm>

Hello Sean

Sean Mc Grath writes:
 > After fix: (parenthesis in call to values method of data dictionary)
 >     # Additional methods specified in the DOM Recommendation
 >     def item(self, index):
 >         return self.data.values()[ index ]
 > 
 > I am now getting my attribute names through just fine but all my attribute
 > values are None. There are definitely there in the DOM structure because
 > toxml puts 'em out just fine. Ideas?
For some unknown reason (a bug, I think),
the real attribute information is in the "children[0]" attribute
of the returned "item".

You may try:
 >     def item(self, index):
 >         return self.data.values()[ index ].children[0]

But I am not sure whether this will work for all NamedNodeMaps.
And it is probably not the correct solution, because it returns
a "_nodeData" instance rather than an "Attr" instance.

Almost surely, the correct implementation is:
 >     def item(self, index):
 >         return Attr(self.data.values()[ index ],self._document)

- Dieter


From D.Hoeppner@tu-bs.de  Fri Aug 13 07:53:26 1999
From: D.Hoeppner@tu-bs.de (Dierk Höppner)
Date: Fri, 13 Aug 1999 08:53:26 +0200
Subject: [XML-SIG] entity munching monster tracked down!
Message-ID: <6C0E0697493@buch.biblio.etc.tu-bs.de>

Dear SIGgers,

while playing around with the xml package I sent an ordinary HTML 
file through a slightly modified xml/demo/dom/html2html.py. The 
output was HTML, too. Almost, because all entities except '<', '&' 
and '>' vanished :-(( You can see it in the output of the 
original html2html. The data contains the word 'trouv&eacute;s', 
which becomes 'trouvs' in the HTML output.

My solution (the experts among you will have to decide whether it is 
all right): 

xml.dom.writer.HtmlWriter derives from xml.dom.writer.XmlWriter 
which has a method doText. The last line says 

self.stream.write(escape(data))

xml.utils.escape() just 'escapes' the three entities mentioned above. 
But it may be called with an extra table of entities to be converted. 
I modified XmlWriter a little: I added

self.escapes={}

to __init__()

and in doText the last line now is

self.stream.write(escape(data, self.escapes))

In html2html I now build an almost inverse version of 
htmlentitydefs.entitydefs but leave out <, &, and >. (My routine 
MakeEscapes().) The lines

w = HtmlWriter()
w.write(b.document)

became

w = HtmlWriter()
w.escapes = MakeEscapes()
w.write(b.document)

It works but not perfectly. In another text I had an image

<IMG ... ALT="N&auml;chster" ...>

which becomes

<IMG ... ALT="N&amp;auml;chster" ...>

I haven't found a solution for this problem yet :-(

Greetings

Dierk Hoeppner

Universitaetsbibliothek
Pockelsstr. 13
D-38106 Braunschweig
Germany
Tel: +49-531-391-5066 Fax: -5836
E-Mail: d.hoeppner@tu-bs.de     
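
The MakeEscapes() routine is not shown in the message, so the following is
only a guess at its shape, written for current Python: it inverts the
standard entity table (html.entities.codepoint2name, the successor of
htmlentitydefs) while leaving out <, & and >, and passes the result as the
extra table to xml.sax.saxutils.escape, which stands in here for
xml.utils.escape. The name make_escapes is hypothetical.

```python
from html.entities import codepoint2name
from xml.sax.saxutils import escape


def make_escapes():
    """Map each character to its entity reference, except <, & and >."""
    table = {}
    for codepoint, name in codepoint2name.items():
        # escape() already handles these three on its own.
        if name not in ("lt", "gt", "amp"):
            table[chr(codepoint)] = "&%s;" % name
    return table


escapes = make_escapes()
escaped_text = escape("trouvés <b>", escapes)
```

The table also maps the double quote to &quot;, which is harmless in text
content. Like the change described above, this covers text nodes only and
does nothing about the double escaping seen in attribute values.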


From akuchlin@mems-exchange.org  Fri Aug 13 14:41:20 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 13 Aug 1999 09:41:20 -0400 (EDT)
Subject: [XML-SIG] pyDOM NamedNodeMap - bug report and problem
In-Reply-To: <14259.47828.462367.393783@lindm.dm>
References: <3.0.6.32.19990812152621.00965710@gpo.iol.ie>
 <14259.47828.462367.393783@lindm.dm>
Message-ID: <14260.8320.385791.157498@amarok.cnri.reston.va.us>

Dieter Maurer writes:
>For some unknown reason (a bug, I think),
>the real attribute information is in the "children[0]" attribute
>of the returned "item".

The DOM implementation builds an internal tree of objects of class
_node; however, users of the implementation never see a _node
instance, but instead an instance of Element, Text, or whatever, that
acts as a proxy for the _node instance.  The user-visible proxy holds
the parent pointer for the node, thus avoiding creating a cycle of
references and leaking memory.  To create the proxy for a _node n, you
use NODE_CLASS[ n.type ](n, self._document ) .  As a note to users, if
you *ever* get returned a _node instance, that is a bug and should be
reported.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
The story so far: In the beginning the Universe was created. This has made a
lot of people very angry and has been widely regarded as a bad move.
    -- Douglas Adams, _The Restaurant at the End of the Universe_


From fdrake@acm.org  Fri Aug 13 14:59:28 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 13 Aug 1999 09:59:28 -0400 (EDT)
Subject: [XML-SIG] XML 0.5.1 bug: 'amp' character reference not handled correctly by "HtmlBuilder/HtmlWriter"
In-Reply-To: <199908121706.TAA00810@lindm.dm>
References: <199908121706.TAA00810@lindm.dm>
Message-ID: <14260.9408.396713.728418@weyr.cnri.reston.va.us>

Dieter Maurer writes:
 > "HtmlBuilder" translates '&amp;' into an entity reference.
 > This does not follow the DOM spec. It specifies that
 > character references are expected to be expanded by the
 > HTML/XML processor.
 > 
 > "XmlWriter/HtmlWriter" does not output the 'amp' entity reference.
 > This, obviously, is a bug in "XmlWriter/HtmlWriter".

  No, but if & is present as data, it writes out &amp;, so I think
that's OK.

 > By the way, processing instructions are not output, too.

  Are you sure they're in your tree?  What I see is that they are
output, but using the XML-style syntax: <?foo bar?> instead of
<?foo bar>.
  I've checked in a fix that allows HtmlWriter to produce SGML-style
PIs.  This *doesn't* do anything to change the handling of PIs as
(target, value) tuples; this was a concept introduced in some of the
XML APIs (not even XML itself as I understand it).
  The patch to xml/dom/writer.py is attached; it also teaches the
*Lineariser classes to use cStringIO when available.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


Index: writer.py
===================================================================
RCS file: /home/cvsroot/xml/dom/writer.py,v
retrieving revision 1.9
retrieving revision 1.10
diff -c -r1.9 -r1.10
*** writer.py	1999/04/28 02:42:19	1.9
--- writer.py	1999/08/13 13:50:18	1.10
***************
*** 124,131 ****
  class XmlLineariser(XmlWriter):
  
      def __init__(self):
!         import StringIO
!         self.buffer = StringIO.StringIO()
          XmlWriter.__init__(self, self.buffer)
  
      def linearise(self, node):
--- 124,134 ----
  class XmlLineariser(XmlWriter):
  
      def __init__(self):
!         try:
!             from cStringIO import StringIO
!         except ImportError:
!             from StringIO import StringIO
!         self.buffer = StringIO()
          XmlWriter.__init__(self, self.buffer)
  
      def linearise(self, node):
***************
*** 169,180 ****
          
          self._setNewLines(nl_dict)
  
  
  class HtmlLineariser(HtmlWriter):
  
      def __init__(self):
!         import StringIO
!         self.buffer = StringIO.StringIO()
          HtmlWriter.__init__(self, self.buffer)
  
      def linearise(self, node):
--- 172,192 ----
          
          self._setNewLines(nl_dict)
  
+     def doOtherNode(self, node):
+         if node.get_nodeType() == PROCESSING_INSTRUCTION_NODE:
+             self.stream.write("<?%s %s>" % (node.target, node.value))
+         else:
+             XmlWriter.doOtherNode(self, node)
  
+ 
  class HtmlLineariser(HtmlWriter):
  
      def __init__(self):
!         try:
!             from cStringIO import StringIO
!         except ImportError:
!             from StringIO import StringIO
!         self.buffer = StringIO()
          HtmlWriter.__init__(self, self.buffer)
  
      def linearise(self, node):


From akuchlin@mems-exchange.org  Fri Aug 13 17:11:19 1999
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 13 Aug 1999 12:11:19 -0400 (EDT)
Subject: [XML-SIG] CVS tree reorg imminent
Message-ID: <199908131611.MAA13940@amarok.cnri.reston.va.us>

A while back we discussed tidying up the directory structure of the
XML-SIG's code.  I'd like to do the rearrangement sometime this
weekend.  

	  This may disrupt the tree for a bit while things settle down
after the rearrangement, so people who follow the CVS tree should be
aware of the impending changes.  On the bright side, this should make
things much neater and allow us to simplify the installation process.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
My early and invincible love of reading, which I would not exchange for the
treasures of India...
    -- Edward Gibbon


From Jeff.Johnson@icn.siemens.com  Fri Aug 13 17:46:02 1999
From: Jeff.Johnson@icn.siemens.com (Jeff.Johnson@icn.siemens.com)
Date: Fri, 13 Aug 1999 12:46:02 -0400
Subject: [XML-SIG] XML 0.5.1 bug: 'amp' character reference not
 handled correctly by "HtmlBuilder/HtmlWriter"
Message-ID: <852567CC.005BDFE0.00@li01.lm.ssc.siemens.com>


I had similar problems a while back and came up with the following hack (nobody
seemed to think it was a problem, so I had to fix it myself)...  I have no idea
if this is a good fix, but it seemed to fix most of my problems...

class MyHtmlBuilder(HtmlBuilder):
    def handle_charref(self, name):
        #print name
        try:
            n = string.atoi(name)
        except string.atoi_error:
            self.unknown_charref(name)
            return
        # JCJ 1999-06-11: This turns &#181; into chr(181) which when saved
        # back as HTML, is no good.
        #if not 0 <= n <= 255:
        if not 0 <= n <= 127:
            self.unknown_charref(name)
            return
        self.handle_data(chr(n))

    def unknown_charref(self, ref):
        #gLog.Warning('unknown_charref %s' % ref)
        Builder.entityref(self, '#' + ref)

    def unknown_entityref(self, ref):
        gLog.Error('unknown_entityref %s' % ref)
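
The same policy (expand only character references in the ASCII range and
pass all others through untouched) can be sketched with the HTML parser of
the current standard library. This is an illustration under assumptions,
not a port of HtmlBuilder: the class CharrefKeeper and the helper
keep_high_charrefs are hypothetical, and the output is collected as plain
text rather than as DOM nodes.

```python
from html.parser import HTMLParser


class CharrefKeeper(HTMLParser):
    """Expand ASCII character references; keep all others as written."""

    def __init__(self):
        # convert_charrefs=False makes the parser call handle_charref().
        super().__init__(convert_charrefs=False)
        self.pieces = []

    def handle_data(self, data):
        self.pieces.append(data)

    def handle_charref(self, name):
        # name is the text between "&#" and ";", for example "181".
        try:
            n = int(name)
        except ValueError:
            self.unknown_charref(name)
            return
        if not 0 <= n <= 127:
            self.unknown_charref(name)
            return
        self.pieces.append(chr(n))

    def unknown_charref(self, name):
        # Re-emit the reference unchanged so it survives a round trip.
        self.pieces.append("&#%s;" % name)


def keep_high_charrefs(html_text):
    parser = CharrefKeeper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.pieces)
```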



From dieter@handshake.de  Fri Aug 13 17:59:35 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Fri, 13 Aug 1999 18:59:35 +0200 (CEST)
Subject: [XML-SIG] XML 0.5.1 bug: 'amp' character reference not handled correctly by "HtmlBuilder/HtmlWriter"
In-Reply-To: <14260.9408.396713.728418@weyr.cnri.reston.va.us>
References: <199908121706.TAA00810@lindm.dm>
 <14260.9408.396713.728418@weyr.cnri.reston.va.us>
Message-ID: <14260.19671.741803.54779@lindm.dm>

Fred L. Drake, Jr. writes:
 > Dieter Maurer writes:
 >  > "HtmlBuilder" translates '&amp;' into an entity reference.
 >  > This does not follow the DOM spec. It specifies that
 >  > character references are expected to be expanded by the
 >  > HTML/XML processor.
 >  > 
 >  > "XmlWriter/HtmlWriter" does not output the 'amp' entity reference.
 >  > This, obviously, is a bug in "XmlWriter/HtmlWriter".
 > 
 >   No, but if & is present as data, it writes out &amp;, so I think
 > that's OK.
I do not think it is correct.
HTML input files should contain '&amp;' rather than '&', because
'&' may yield invalid HTML code.
"HtmlBuilder" translates "&amp;" into something "HtmlWriter"
ignores. I think this is a bug.

 >  > By the way, processing instructions are not output, too.
 > 
 >   You you sure they're in your tree?  What I see is that they are
 > output, but using the XML-style syntax: <?foo bar?> instead of
 > <?foo bar>.
In fact, I did not test it at all -- sorry!
I looked at the sources and did not see a definition for
Entity and Processing Instruction output.
Private mail with Dierk Hoeppner suggests that some magic
in XMLWriter processes entity references.
You now tell me that processing instructions are
magically processed.

It seems that I must have a closer look at this code.

By the way, my copy of "writer.py" (from the distribution tar)
has an empty "XmlWriter.doOtherNode".

Thank you for your comment
- Dieter


From Chance@hotmail.com  Sun Aug 15 03:01:20 1999
From: Chance@hotmail.com (Chance@hotmail.com)
Date: zo, 15 aug 1999 02:01:20
Subject: [XML-SIG] YOU Can make $50,000 or more in 90 Days!!!
Message-ID: <199908142327.TAA18806@python.org>

 THE PROGRAM 
  $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ 
  
  
  INCREDIBLE $0 to $50,000 in 90 days!!! 
  
  
 Dear Friend, 
  
  
 You can earn $50,000 or more in next the 90 days sending e-mail. Seem 
 impossible? Read on for details. 
  
  
  "AS SEEN ON NATIONAL TV" 
  
  
 
 
From akuchlin@mems-exchange.org  Mon Aug 16 02:04:20 1999
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Sun, 15 Aug 1999 21:04:20 -0400
Subject: [XML-SIG] CVS tree reorganized
Message-ID: <199908160104.VAA03692@207-172-146-60.s60.tnt3.ann.va.dialup.rcn.com>

I've completed the rearrangement of the XML-SIG's CVS tree, though
there are still some things left to tidy up.  (For example, some test
suite failures haven't yet been looked into.)

The goal was to clean out the root directory of the distribution, and
simplify the installation process.  The important points are:

    * You now want to use the '-P' option to CVS to prune empty
directories; otherwise, you'll get lots of obsolete directories that
are all empty.

    * Python modules that need to be installed are now in the
'xml' subdirectory; for example, the 'dom', 'arch', and 'sax'
subdirectories have all moved down into 'xml'.  Python files that
don't get installed, like those in 'demo' and 'test', are still where
they were.
 
    * C extensions are now in the 'extensions' subdirectory.

    * Binaries for Windows and MacOS should go in the 'windows' and
 'mac' directories.

    * Installation has been changed to follow the procedures set by
the Distutils-SIG.  An end user will run a Python script, setup.py.
It can be given one of three arguments: 'build', 'test', and
'install'.  (Note that it doesn't actually use any Distutils code, but
simply tries to present a similar user interface.)

      The 'build' target will create a subdirectory named 'build', and
copy the 'xml/' subdirectory into 'build', and will then copy compiled
C extensions into build/ at the proper locations.  On Unix it will
also compile the C extensions; on Windows and Mac, it should copy
binary files like DLLs and PYDs into the build/ subdirectory.
(Volunteers to implement that are needed.)  'install' is then a simple
matter of copying the build/xml/ tree to the installation location.
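
A rough sketch of the 'build' and 'install' steps described above, in
case it helps whoever picks up the Windows and Mac copying. This is
not the setup.py in CVS: it is written for a current Python 3, the
function names and the 'dest' argument are made up for illustration,
and the C extension step is left out entirely.

```python
import shutil
from pathlib import Path


def build(root: Path) -> Path:
    """Copy the 'xml' package into 'build/xml' and return that path."""
    target = root / "build" / "xml"
    if target.exists():
        shutil.rmtree(target)  # start from a clean build tree
    shutil.copytree(root / "xml", target)
    # Compiling (Unix) or copying prebuilt binaries (Windows, Mac)
    # would happen here; deliberately omitted from this sketch.
    return target


def install(root: Path, dest: Path) -> Path:
    """Copy the finished 'build/xml' tree to 'dest/xml'."""
    installed = dest / "xml"
    shutil.copytree(root / "build" / "xml", installed, dirs_exist_ok=True)
    return installed


def run(command: str, root: Path, dest: Path) -> Path:
    """Dispatch on the command name, as 'setup.py build' would."""
    if command == "build":
        return build(root)
    if command == "install":
        return install(root, dest)
    raise ValueError("expected 'build' or 'install', got %r" % command)
```

With this shape, 'install' never needs a compiler: everything it
copies was already placed under build/ by the 'build' step.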

      The point of the setup.py scheme is to simplify installation on
compilerless systems because the build process is reduced to some file
copying.  However, we need someone to write the relevant copying bits
for Windows and Mac, because I'm not sure what's legal.

      In any case, I'm sure there are inadvertent breakages from this
change; please try out the CVS tree and report problems.

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl.
    -- Anonymous


From paul@prescod.net  Tue Aug 24 21:13:43 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 24 Aug 1999 16:13:43 -0400
Subject: [XML-SIG] CVS tree reorganized
References: <199908160104.VAA03692@207-172-146-60.s60.tnt3.ann.va.dialup.rcn.com>
Message-ID: <37C2FCF7.91398525@prescod.net>

A.M. Kuchling wrote:
> 
>     * Binaries for Windows and MacOS should go in the 'windows' and
>  'mac' directories.

That should include xmlparse.dll and xmltok.dll. In the current
distribution those go into "expat/bin" which is never in anyone's path.

Also drv_expat expects pyexpat to be in xml.parsers which it isn't in
the current distribution.

 Paul Prescod


From fdrake@acm.org  Tue Aug 24 22:33:51 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 24 Aug 1999 17:33:51 -0400 (EDT)
Subject: [XML-SIG] CVS tree reorganized
In-Reply-To: <37C2FCF7.91398525@prescod.net>
References: <199908160104.VAA03692@207-172-146-60.s60.tnt3.ann.va.dialup.rcn.com>
 <37C2FCF7.91398525@prescod.net>
Message-ID: <14275.4031.17751.874877@weyr.cnri.reston.va.us>

Paul Prescod writes:
 > Also drv_expat expects pyexpat to be in xml.parsers which it isn't in
 > the current distribution.

  The pyexpat project files for the Mac development environment
should probably go under extensions as well, instead of a top-level
pyexpat/ directory.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From paul@prescod.net  Tue Aug 24 21:22:49 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 24 Aug 1999 16:22:49 -0400
Subject: [XML-SIG] drv_htmlllib
Message-ID: <37C2FF19.A2B464F8@prescod.net>

In pylibs.py, there is a comment that says:

#handle_starttag is never called!

In accordance with the comment, there is no definition for
handle_starttag. There is a (seemingly correct) definition for
unknown_starttag but it doesn't seem to ever get called. This seems to
fix it:

def handle_starttag( self, tag, method, attributes ):
    self.unknown_starttag( tag, attributes )

I don't know why that wasn't in to begin with. The mysterious comment
probably has something to do with it.
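
To make the dispatch concrete, here is a small self-contained model.
It is not the real sgmllib or htmllib code; the classes below are
stand-ins written from my understanding that the SGML parser looks for
a start_<tag> or do_<tag> method, calls handle_starttag when it finds
one, and only falls back to unknown_starttag when it does not. Since
the HTML parser defines handlers for the usual HTML tags, a driver
that only overrides unknown_starttag would miss them.

```python
class MiniSGMLParser:
    """Stand-in for the tag dispatch; not the real sgmllib class."""

    def finish_starttag(self, tag, attrs):
        method = (getattr(self, "start_" + tag, None)
                  or getattr(self, "do_" + tag, None))
        if method is None:
            # No per-tag handler exists: report the tag as unknown.
            self.unknown_starttag(tag, attrs)
        else:
            # A per-tag handler exists: go through handle_starttag.
            self.handle_starttag(tag, method, attrs)

    def handle_starttag(self, tag, method, attrs):
        method(attrs)

    def unknown_starttag(self, tag, attrs):
        pass


class MiniHTMLParser(MiniSGMLParser):
    """Stand-in for an HTML parser that knows the <p> tag."""

    def do_p(self, attrs):
        pass


class BrokenDriver(MiniHTMLParser):
    """Only overrides unknown_starttag, so known tags are lost."""

    def __init__(self):
        self.events = []

    def unknown_starttag(self, tag, attrs):
        self.events.append(tag)


class FixedDriver(BrokenDriver):
    """Adds the forwarding method so every start tag is reported."""

    def handle_starttag(self, tag, method, attrs):
        self.unknown_starttag(tag, attrs)
```

Feeding <p> and a made-up <foo> tag through both drivers shows the
difference: the broken one only ever sees foo, the fixed one sees both.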

 Paul Prescod


From lmariusg@ifi.uio.no  Wed Aug 25 07:43:07 1999
From: lmariusg@ifi.uio.no (Lars Marius Garshol)
Date: 25 Aug 1999 08:43:07 +0200
Subject: [XML-SIG] drv_htmlllib
In-Reply-To: <37C2FF19.A2B464F8@prescod.net>
References: <37C2FF19.A2B464F8@prescod.net>
Message-ID: <wkvha4jr7o.fsf@ifi.uio.no>

* Paul Prescod
|
| In pylibs.py, there is a comment that says:
| 
| #handle_starttag is never called!

That was put in by me. I seem to recall that when I first wrote the
*mllib drivers that method was for some reason never called, and so I
just left it empty.
 
| In accordance with the comment, there is no definition for
| handle_starttag. There is a (seemingly correct) definition for
| unknown_starttag but it doesn't seem to ever get called.

Hmmm. Maybe something to do with version mismatches?

| This seems to fix it:
| 
| def handle_starttag( self, tag, method, attributes ):
|     self.unknown_starttag( tag, attributes )

Yup, this is correct (and I have it in my CVS tree already). I have to
get my act together soon and put out a new set of releases for SAX.
The time when I can do that is getting much closer, but is not there
yet.
 
--Lars M.


From c.evans@clear.net.nz  Wed Aug 25 12:37:33 1999
From: c.evans@clear.net.nz (Carey Evans)
Date: 25 Aug 1999 23:37:33 +1200
Subject: [XML-SIG] PyDOM performance
Message-ID: <87ogfwm6pu.fsf@psyche.evansnet>

--=-=-=

Hi.

I've been rather disappointed with the speed when trying out the DOM
support in the XML 0.5.1 package.  To construct a tree of the fairly
simple document at

    http://home.clear.net.nz/pages/c.evans/diary/hols199901.xml

took about 45 seconds.  I tried out the CVS tree and got this down to
17.8 seconds, which is quite an impressive improvement by itself, when 
PyDOM doesn't seem to have changed much.

Looking at this with the profiler, dom/core.py spends a *lot* of time
in __getattr__ and __setattr__.  I didn't have anything better to do,
so I rewrote these methods and got the time down to 11.7 seconds.
I've attached the patch to do this below.
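
For anyone who doesn't want to read the whole diff, the idea boils
down to something like the following. This is a cut-down illustration
in current Python 3 syntax, not the code from core.py: the Node and
Text classes here are toys and the '_name'/'_value' fields are
invented for the example. Only the _get_dict/_set_dict lookup mirrors
the patch.

```python
class Node:
    def __init__(self, name, value=None):
        # Write to __dict__ directly so __setattr__ is not involved.
        self.__dict__["_name"] = name
        self.__dict__["_value"] = value

    def get_nodeName(self):
        return self._name

    def get_nodeValue(self):
        return self._value

    def set_nodeValue(self, value):
        self.__dict__["_value"] = value

    # Tables of allowed properties, built once per class. The values
    # are plain functions, so they are called as method(self, ...).
    _get_dict = {"nodeName": get_nodeName, "nodeValue": get_nodeValue}
    _set_dict = {"nodeValue": set_nodeValue}

    def __getattr__(self, key):
        # Only reached when normal attribute lookup has failed.
        method = self._get_dict.get(key)
        if method is None:
            raise AttributeError(key)
        return method(self)

    def __setattr__(self, key, value):
        method = self._set_dict.get(key)
        if method is not None:
            method(self, value)
        else:
            self.__dict__[key] = value


class Text(Node):
    def get_length(self):
        return len(self._value)

    # Subclasses copy the parent's table and add their own entries.
    _get_dict = Node._get_dict.copy()
    _get_dict.update({"length": get_length})
```

The saving comes from replacing string slicing, concatenation and
hasattr/getattr calls on every access with a single dictionary lookup.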

My questions are:

  Is what I'm doing in this patch actually working, or am I on the
  wrong track?

  And, is it worth doing anything to PyDOM, or would I be better off
  looking at 4DOM, for example?

Thanks.

-- 
	 Carey Evans  http://home.clear.net.nz/pages/c.evans/

	       "This is where your sanity gives in..."


--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=dom-core.diff

--- core.py.dist	Fri Aug 13 14:33:42 1999
+++ core.py	Wed Aug 25 23:03:37 1999
@@ -323,16 +323,18 @@
     # to attributes such as .parentNode are redirected into calls to 
     # get_parentNode or set_parentNode.
     def __getattr__(self, key):
-        if key[0:4] == 'get_' or key[0:4] == 'set_':
-            raise AttributeError, repr(key[4:])
-        func = getattr(self, 'get_'+key)
-        return func()
+        method = self._get_dict.get(key)
+        if method is not None:
+            return method(self)
+        else:
+            raise AttributeError, key
 
     def __setattr__(self, key, value):
-        if hasattr(self, 'set_'+key):
-            func = getattr(self, 'set_'+key)
-            func( value )
-        self.__dict__[key] = value
+        method = self._set_dict.get(key)
+        if method is not None:
+            method(self, value)
+        else:
+            self.__dict__[key] = value
 
     def __cmp__(self, other):
 	if isinstance(other, Node):
@@ -637,6 +639,19 @@
                       "%s is an ancestor of %s" % (repr(child), repr(parent) )
             p = p.get_parentNode()
 
+    # Dictionaries of allowed get/set properties.
+    _get_dict = {
+        'nodeName': get_nodeName, 'name': get_name,
+        'nodeValue': get_nodeValue, 'value': get_value,
+        'nodeType': get_nodeType, 'attributes': get_attributes,
+        'childNodes': get_childNodes, 'parentNode': get_parentNode,
+        'firstChild': get_firstChild, 'lastChild': get_lastChild,
+        'previousSibling': get_previousSibling,
+        'nextSibling': get_nextSibling,
+        'ownerDocument': get_ownerDocument,
+        }
+    _set_dict = {}
+
         
 class CharacterData(Node):
     # Attributes
@@ -733,7 +748,14 @@
         d.name = "#text"
         d.value = value
         return Text(d, self._document)
-    
+
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({ 'data': get_data, 'length': get_length })
+    _set_dict = Node._set_dict.copy()
+    _set_dict.update({ 'data': set_data, 'nodeValue': set_nodeValue })
+
+
 class Attr(Node):
     childNodeTypes = [TEXT_NODE, ENTITY_REFERENCE_NODE]
     
@@ -789,7 +811,23 @@
     def get_parentNode(self): return None
     def get_previousSibling(self): return None
     def get_nextSibling(self): return None
-    
+
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({
+        'nodeName': get_nodeName, 'name': get_name,
+        'nodeValue': get_nodeValue, 'value': get_value,
+        'specified': get_specified,
+        'parentNode': get_parentNode,
+        'previousSibling': get_previousSibling,
+        'nextSibling': get_nextSibling,
+        })
+    _set_dict = Node._set_dict.copy()
+    _set_dict.update({
+        'nodeValue': set_nodeValue, 'value': set_value,
+        })
+
+
 class Element(Node):
     childNodeTypes = [ELEMENT_NODE, PROCESSING_INSTRUCTION_NODE, COMMENT_NODE,
                       TEXT_NODE, CDATA_SECTION_NODE, ENTITY_REFERENCE_NODE]
@@ -971,6 +1009,11 @@
             if L[i].type == ELEMENT_NODE:
                 n = NODE_CLASS[ L[i].type ] (L[i], self._document)
                 n.normalize()
+
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({ 'tagName': get_tagName, 'attributes': get_attributes })
+
     
 class Text(CharacterData):
     childNodeTypes = []
@@ -1040,6 +1083,13 @@
 
     def toxml(self):
         return '<!DOCTYPE %s>\n' % (self._node.name,)
+
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({
+        'name': get_name, 'entities': get_entities,
+        'notations': get_notations })
+
         
 class Notation(Node):
     readonly = 1    # This is a read-only class
@@ -1061,7 +1111,11 @@
             return '<!NOTATION %s PUBLIC %s %s>' % (self._node.name,
                                                     self._node.publicId,
                                                     self._node.systemId)
-        
+
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({ 'publicId': get_publicId, 'systemId': get_systemId })
+
         
 class Entity(Node):
     readonly = 1    # This is a read-only class
@@ -1077,6 +1131,14 @@
     def get_notationName(self):
         return self._node.notationName
 
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({
+        'publicId': get_publicId, 'systemId': get_systemId,
+        'notationName': get_notationName
+        })
+
+
 class EntityReference(Node):
     childNodeTypes = [ELEMENT_NODE, PROCESSING_INSTRUCTION_NODE,
                       COMMENT_NODE, TEXT_NODE, CDATA_SECTION_NODE,
@@ -1106,6 +1168,12 @@
             raise NoModificationAllowedException("Read-only object")
         self._node.value = data
 
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({ 'target': get_target, 'data': get_data })
+    _set_dict = Node._set_dict.copy()
+    _set_dict.update({ 'data': set_data })
+
 
 class Document(Node):
     childNodeTypes = [ELEMENT_NODE, PROCESSING_INSTRUCTION_NODE,
@@ -1325,6 +1393,17 @@
 
 	Node.replaceChild(self, newChild, oldChild)
 
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({
+        'doctype': get_doctype,
+        'implementation': get_implementation,
+        'childNodes': get_childNodes,
+        'documentElement': get_documentElement,
+        'ownerDocument': get_ownerDocument,
+        })
+
+
 class DocumentFragment(Node):
     childNodeTypes = [ELEMENT_NODE, PROCESSING_INSTRUCTION_NODE,
                       COMMENT_NODE, TEXT_NODE, CDATA_SECTION_NODE,
@@ -1341,7 +1420,12 @@
             n = NODE_CLASS[ child.type ] (child, self._document)
             L.append(n.toxml())
         return string.join(L, "")
-    
+
+    # Dictionaries of allowed get/set properties.
+    _get_dict = Node._get_dict.copy()
+    _get_dict.update({ 'parentNode': get_parentNode })
+
+
 # Dictionary mapping types to the corresponding class object
 
 NODE_CLASS = {

--=-=-=--
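
The idea behind the patch above, replacing chained attribute tests in
__getattr__ and __setattr__ with a single dictionary lookup per class, can
be sketched in isolation. This is only a minimal illustration in modern
Python, not the PyDOM code: the Node and ProcessingInstruction classes below
are simplified, hypothetical stand-ins, and only the _get_dict/_set_dict
naming follows the patch.

```python
class Node:
    """Simplified stand-in for a DOM node (hypothetical, not PyDOM's class)."""

    def get_nodeName(self):
        return self._name

    # Class-level tables mapping property names to accessor functions.
    _get_dict = {'nodeName': get_nodeName}
    _set_dict = {}

    def __init__(self, name):
        # Write through __dict__ to bypass our own __setattr__.
        self.__dict__['_name'] = name

    def __getattr__(self, key):
        # Only called when normal lookup fails, so one dict probe suffices.
        getter = self._get_dict.get(key)
        if getter is None:
            raise AttributeError(key)
        return getter(self)

    def __setattr__(self, key, value):
        setter = self._set_dict.get(key)
        if setter is None:
            raise AttributeError("read-only or unknown property: %s" % key)
        setter(self, value)


class ProcessingInstruction(Node):
    def __init__(self, target, data):
        Node.__init__(self, target)
        self.__dict__['_data'] = data

    def get_data(self):
        return self._data

    def set_data(self, data):
        self.__dict__['_data'] = data

    # Copy the parent's tables, then extend them, as the patch does.
    _get_dict = Node._get_dict.copy()
    _get_dict.update({'target': Node.get_nodeName, 'data': get_data})
    _set_dict = Node._set_dict.copy()
    _set_dict.update({'data': set_data})


pi = ProcessingInstruction('xml-stylesheet', 'href="a.css"')
pi.data = 'href="b.css"'
print(pi.target, pi.data)
```

Because each subclass copies the parent's table before updating it, adding a
property to a subclass never leaks back into Node.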


From gstein@lyra.org  Wed Aug 25 16:31:21 1999
From: gstein@lyra.org (Greg Stein)
Date: Wed, 25 Aug 1999 08:31:21 -0700 (PDT)
Subject: [XML-SIG] PyDOM performance
In-Reply-To: <87ogfwm6pu.fsf@psyche.evansnet>
Message-ID: <Pine.LNX.3.95.990825083021.30613B-100000@ns1.lyra.org>

If the DOM is not a specific requirement, and you simply need to translate
XML into a usable form for Python, then you may want to look at my qp_xml
module at http://www.lyra.org/greg/python/

Cheers,
-g

--
Greg Stein, http://www.lyra.org/

On 25 Aug 1999, Carey Evans wrote:

> Hi.
> 
> I've been rather disappointed with the speed when trying out the DOM
> support in the XML 0.5.1 package.  To construct a tree of the fairly
> simple document at
> 
>     http://home.clear.net.nz/pages/c.evans/diary/hols199901.xml
> 
> took about 45 seconds.  I tried out the CVS tree and got this down to
> 17.8 seconds, which is quite an impressive improvement by itself, when 
> PyDOM doesn't seem to have changed much.
> 
> Looking at this with the profiler, dom/core.py spends a *lot* of time
> in __getattr__ and __setattr__.  I didn't have anything better to do,
> so I rewrote these methods and got the time down to 11.7 seconds.
> I've attached the patch to do this below.
> 
> My questions are:
> 
>   Is what I'm doing in this patch actually working, or am I on the
>   wrong track?
> 
>   And, is it worth doing anything to PyDOM, or would I be better off
>   looking at 4DOM, for example?
> 
> Thanks.
> 
> -- 
> 	 Carey Evans  http://home.clear.net.nz/pages/c.evans/
> 
> 	       "This is where your sanity gives in..."
> 
> 


From Mike.Olson@FourThought.com  Wed Aug 25 19:07:49 1999
From: Mike.Olson@FourThought.com (Mike Olson)
Date: Wed, 25 Aug 1999 13:07:49 -0500
Subject: [XML-SIG] PyDOM performance
References: <87ogfwm6pu.fsf@psyche.evansnet>
Message-ID: <37C430F5.4488E2E2@FourThought.com>


Carey Evans wrote:

>
>   And, is it worth doing anything to PyDOM, or would I be better off
>   looking at 4DOM, for example?

I don't think you will get much of a speed increase with 4DOM (in fact, it
may be slower).  We wrote 4DOM more concerned with meeting the W3C spec to
the letter than with speed.

One note: we are going to rewrite all of the tree code in 4DOM as a red-black
or AVL tree in C by the end of the month or early next month, which should
give us some speed increases.  At that point we will run some serious
benchmarks between the two and work out a Pythonic interface.

Mike


>
> Thanks.
>
> --
>          Carey Evans http://home.clear.net.nz/pages/c.evans/
>
>                "This is where your sanity gives in..."
>
>   ------------------------------------------------------------------------
>
>    dom-core.diffName: dom-core.diff
>                 Type: text/x-patch

--
----------------
Mike Olson
Consulting Member
FourThought LLC
http://www.fourthought.com  http://opentechnology.org


From Mike.Olson@FourThought.com  Wed Aug 25 19:49:32 1999
From: Mike.Olson@FourThought.com (Mike Olson)
Date: Wed, 25 Aug 1999 13:49:32 -0500
Subject: [XML-SIG] PyDOM performance
References: <87ogfwm6pu.fsf@psyche.evansnet> <37C430F5.4488E2E2@FourThought.com>
Message-ID: <37C43ABB.8230DD30@FourThought.com>

Out of curiosity I did a quick benchmark on the file you referenced and I was
rather surprised by the results.

Both tests were run on a Celeron 366 with 128MB running Linux.  Neither test
performed XML validation, as I did not have a DTD.

4DOM was built with the orbless option and used the Ext.Builder.FromXmlFile
method.  The 4DOM version was from our CVS tree, which will be available by
week's end.

Non validating 4DOM
Start:  935605414.575
End:  935605415.11
Delta:  0.535288095474

pydom is version 0.5.1 from the RPMs and used the utils.FileReader()

Non validating pydom
Start:  935605415.148
End:  935605418.212
Delta:  3.0643119812


I was quite surprised by this.  I don't know enough about the pydom internals
to explain why it is slower.  I just always assumed it was faster.

We will still be looking to speed up 4DOM with the C implementation of the trees.
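
For anyone wanting to repeat this kind of measurement, a timing harness in
the same Start/End/Delta style might look like the sketch below. It is
written in modern Python and uses the standard library's xml.dom.minidom as a
stand-in parser; neither 4DOM's Ext.Builder.FromXmlFile nor pydom's
utils.FileReader is called, the time_parse helper is a hypothetical name, and
the input is synthetic rather than the diary file.

```python
import time
from xml.dom import minidom


def time_parse(parse, source):
    """Run parse(source) once and report wall-clock start, end and delta."""
    start = time.time()
    document = parse(source)
    end = time.time()
    print("Start: ", start)
    print("End: ", end)
    print("Delta: ", end - start)
    return document, end - start


# Synthetic input standing in for the diary file, which is not reproduced here.
SAMPLE = "<diary>" + "<entry day='1'>text</entry>" * 1000 + "</diary>"

document, delta = time_parse(minidom.parseString, SAMPLE)
print(len(document.getElementsByTagName("entry")), "entries parsed")
```

A single run like this is noisy; repeating the call several times and taking
the best time gives a fairer comparison between two builders.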

Later
Mike

--
----------------
Mike Olson
Consulting Member
FourThought LLC
http://www.fourthought.com  http://opentechnology.org


From Fred L. Drake, Jr." <fdrake@acm.org  Wed Aug 25 20:13:59 1999
From: Fred L. Drake, Jr." <fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 25 Aug 1999 15:13:59 -0400 (EDT)
Subject: [XML-SIG] PyDOM performance
In-Reply-To: <37C43ABB.8230DD30@FourThought.com>
References: <87ogfwm6pu.fsf@psyche.evansnet>
 <37C430F5.4488E2E2@FourThought.com>
 <37C43ABB.8230DD30@FourThought.com>
Message-ID: <14276.16503.954788.160580@weyr.cnri.reston.va.us>

Mike Olson writes:
 > I was quite surprised by this.  I don't know enough about the pydom
 > internals to explain why it is slower.  I just always assumed it

  PyDOM pays a *huge* penalty in two places: the proxies used to avoid 
circular references cause a lot of object creation/destruction when
using the document, though I'm not sure it affects construction time
so much.  It also uses instances for the internal data format, where
perhaps only lists, tuples and dictionaries are really needed (at the
expense of making the code more obscure).
  I'd love to see the proxies disappear, and just require explicit
calls to a .destroy() method, but that means another massive code
change.
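
As a rough illustration of the explicit-destroy alternative: each node keeps
an ordinary (strong) reference to its parent, and the caller is responsible
for breaking the resulting cycles when finished with the tree. The Node class
below is a hypothetical, simplified stand-in written in modern Python, not
PyDOM's implementation.

```python
class Node:
    """Tree node with a plain parent reference (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def appendChild(self, child):
        # Parent and child now refer to each other: a reference cycle.
        child.parent = self
        self.children.append(child)
        return child

    def destroy(self):
        # Break every parent/child link below this node so that plain
        # reference counting can reclaim the whole subtree.
        for child in self.children:
            child.destroy()
        self.children = []
        self.parent = None


root = Node("root")
section = root.appendChild(Node("section"))
para = section.appendChild(Node("para"))

root.destroy()
print(para.parent, section.children, root.children)
```

The cost of this design is the one noted above: every user of the tree has to
remember to call destroy().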


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From dieter@handshake.de  Thu Aug 26 19:03:58 1999
From: dieter@handshake.de (Dieter Maurer)
Date: Thu, 26 Aug 1999 20:03:58 +0200 (CEST)
Subject: [XML-SIG] PyDOM performance
In-Reply-To: <14276.16503.954788.160580@weyr.cnri.reston.va.us>
References: <37C43ABB.8230DD30@FourThought.com>
 <14276.16503.954788.160580@weyr.cnri.reston.va.us>
Message-ID: <14277.32729.249378.680126@lindm.dm>

Fred L. Drake, Jr. writes:
 > 
 >   I'd love to see the proxies disappear, and just require explicit
 > calls to a .destroy() method, but that means another massive code
 > change.
Marc-Andre Lemburg recently released a new version of
mxProxy. It supports weak references and thus allows for
circular structures (with somewhat unintuitive behaviour,
like "weakdicts", when the root element is released while
references to internal tree nodes are still held).
I expect the changes to be rather local if mxProxy is
used.
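
The behaviour described here can be sketched with the weakref module from
modern Python's standard library. This is an assumption-laden illustration:
it does not use mxProxy, whose API is not shown in this thread, and the Node
class is a hypothetical stand-in.

```python
import weakref


class Node:
    """Tree node whose parent link is a weak reference (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self._parent = None

    @property
    def parentNode(self):
        # Dereference the weak reference; None if the parent is gone.
        return self._parent() if self._parent is not None else None

    def appendChild(self, child):
        # The child does not keep its parent alive, so no cycle is formed.
        child._parent = weakref.ref(self)
        self.children.append(child)
        return child


root = Node("root")
leaf = root.appendChild(Node("leaf"))
print(leaf.parentNode is root)

# Releasing the root while an inner node is still held: the inner node
# survives, but its parent link can no longer be followed.
del root
print(leaf.parentNode)
```

This is the unintuitive part: holding only an inner node does not keep the
rest of the tree alive.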

- Dieter


From paul@prescod.net  Fri Aug 27 18:43:06 1999
From: paul@prescod.net (Paul Prescod)
Date: Fri, 27 Aug 1999 13:43:06 -0400
Subject: [XML-SIG] Python Tools Make a Strong Showing
Message-ID: <37C6CE2A.BDAB1FD0@prescod.net>

http://www.xml.com/pub/1999/08/excelon/montreal.html#python

 Paul Prescod


From hoel@germanlloyd.org  Mon Aug 30 11:47:52 1999
From: hoel@germanlloyd.org (Berthold Hoellmann)
Date: Mon, 30 Aug 1999 12:47:52 +0200
Subject: [XML-SIG] problem processing XML files
Message-ID: <37CA6158.11623E2D@GermanLloyd.org>

Hello,

I just downloaded and installed "xml-0.5.1.tgz". I want to process the
ScientificPython documentation using this. My first test was a file like

--- snip ---
import sys
from xml.dom.utils import FileReader

class DomDumper(FileReader):
    def __init__(self,filename):
        FileReader.__init__(self,filename)
        print self.document
        print self.getFileType(filename)
        self.document.dump()

d = DomDumper(sys.argv[1])
print d
--- snip ---

Calling this with "ScientificPython.xml" as argument only returns

>python dumper.py ScientificPython.xml 
<DOM Document; root=None >
XML
<DOM Document; root=None >
<__main__.DomDumper instance at bbc00>

but not the XML structure as expected. Does the parser silently ignore
syntax errors? Running

>python dumper.py sample.xml

with "sample.xml" copied from the "xml-0.5.1/demo/quotes" directory
gives the expected result. How do I check a file's syntax using Python?

Thanks

Berthold
-- 
email: hoel@GermanLloyd.org
   )
  (
C[_]  These opinions might be mine, but never those of my employer.


From paul@prescod.net  Mon Aug 30 14:03:55 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 30 Aug 1999 09:03:55 -0400
Subject: [XML-SIG] problem processing XML files
References: <37CA6158.11623E2D@GermanLloyd.org>
Message-ID: <37CA813B.9D1334E3@prescod.net>

Berthold Hoellmann wrote:
> 
> but not the XML structure as expected. Does the parser silently ignore
> syntax errors? 

The problem code is in FileReader:

        p = saxexts.make_parser(parserName)
        dh = SaxBuilder()
        p.setDocumentHandler(dh)
        p.feed(stream.read())
        doc = dh.document
 
It doesn't set up an error handler. We haven't decided what the base SAX
module should do when there is no error handler. It's pretty clear that
it should *either* output error messages to stderr (what XML/SGML tools
have done traditionally) or it should throw an exception.

In this case, though, dom.utils should probably set up an explicit error
handler until we figure out a good default.
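
A minimal sketch of the "throw an exception" option, written against the
xml.sax module in modern Python's standard library rather than the saxexts
API quoted above. The RaisingErrorHandler class and check_well_formed
function are hypothetical names used only for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler, ErrorHandler


class RaisingErrorHandler(ErrorHandler):
    """Report warnings, but turn errors and fatal errors into exceptions."""

    def warning(self, exception):
        print("warning:", exception)

    def error(self, exception):
        raise exception

    def fatalError(self, exception):
        raise exception


def check_well_formed(data):
    """Return None if data parses cleanly, else a one-line error description."""
    try:
        xml.sax.parseString(data, ContentHandler(), RaisingErrorHandler())
    except xml.sax.SAXParseException as exc:
        return "line %d, column %d: %s" % (
            exc.getLineNumber(), exc.getColumnNumber(), exc.getMessage())
    return None


print(check_well_formed(b"<quotes><quote>ok</quote></quotes>"))
print(check_well_formed(b"<quotes><quote>broken</quotes>"))
```

With a handler like this installed, a malformed document can no longer
produce an empty tree without any diagnostic.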

 Paul Prescod


From hinsen@cnrs-orleans.fr  Tue Aug 31 15:55:26 1999
From: hinsen@cnrs-orleans.fr (Konrad Hinsen)
Date: Tue, 31 Aug 1999 16:55:26 +0200
Subject: [XML-SIG] problem processing XML files
In-Reply-To: <199908310505.BAA13359@python.org> (xml-sig-admin@python.org)
References: <199908310505.BAA13359@python.org>
Message-ID: <199908311455.QAA31772@chinon.cnrs-orleans.fr>

Berthold Hoellmann wrote:

> but not the XML structure as expected. Does the parser silently ignore
> syntax errors? Running

Don't know (but I'd be interested in the answer myself!), but I can
tell you what the error in ScientificPython.xml is: there's no
filename (or "system identifier") for the DTD.

This is one of my favourite quarrels with XML, because I find it
highly inconvenient to be forced to put machine-dependent data
into my documents. Especially since the DocBook DTD is not at the
same location on the two machines that I use regularly.

Fortunately nsgmls is more tolerant; it lets me specify the filename
in a catalog entry and continues parsing after reporting the error.
I wish other parsers would do the same, at least optionally.
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------


From lmariusg@ifi.uio.no  Tue Aug 31 20:04:16 1999
From: lmariusg@ifi.uio.no (Lars Marius Garshol)
Date: 31 Aug 1999 21:04:16 +0200
Subject: [XML-SIG] problem processing XML files
In-Reply-To: <199908311455.QAA31772@chinon.cnrs-orleans.fr>
References: <199908310505.BAA13359@python.org> <199908311455.QAA31772@chinon.cnrs-orleans.fr>
Message-ID: <wkwvubdb67.fsf@ifi.uio.no>

* Konrad Hinsen
| 
| Fortunately nsgmls is more tolerant; it lets me specify the filename
| in a catalog entry and continues parsing after reporting the error.
| I wish other parsers would do the same, at least optionally.

xmlproc supports catalog files, both SGML Open ones and XCatalog
ones. You need to specify both pubid and sysid, but if xmlproc can
resolve the former it will use it.
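
The lookup order described above (prefer the public identifier, fall back to
the system identifier) can be sketched as a SAX entity resolver. This uses
the EntityResolver interface from modern Python's xml.sax and is not
xmlproc's catalog code; the CatalogResolver class, the public identifier and
the file paths are all hypothetical.

```python
from xml.sax.handler import EntityResolver


class CatalogResolver(EntityResolver):
    """Map public identifiers to local files via a dict (hypothetical sketch)."""

    def __init__(self, catalog):
        self.catalog = catalog

    def resolveEntity(self, publicId, systemId):
        # Use the catalog entry if the public ID is known; otherwise
        # fall back to whatever system ID the document supplied.
        return self.catalog.get(publicId, systemId)


# Hypothetical catalog: one public ID mapped to a machine-specific path.
CATALOG = {
    "-//Example//DTD Sample 1.0//EN": "/usr/local/share/dtd/sample.dtd",
}

resolver = CatalogResolver(CATALOG)
print(resolver.resolveEntity("-//Example//DTD Sample 1.0//EN", "sample.dtd"))
print(resolver.resolveEntity("-//Unknown//DTD Other//EN", "other.dtd"))
```

A resolver of this kind would be installed with the parser's
setEntityResolver method; whether it is actually consulted depends on how the
parser is configured to handle external entities.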

xmlproc will also continue after errors, but will not pass data to the
application anymore. (This is required by the XML recommendation.)
However, this is optional, and you can change it with the
set_data_after_wf_error method.

<URL:http://www.stud.ifi.uio.no/~lmariusg/download/python/xml/xmlproc-doco.html#Parser>

--Lars M.