[XML-SIG] Pyxie

Paul Prescod paul@prescod.net
Fri, 30 Jun 2000 04:33:42 -0500


[from my vantage point the Internet is doing strange things right now
but I'll give it a try anyhow]

> >You just said that Pyxie can work directly from the output of pyexpat.
> 
> Yes. Internally, in order to avoid the unnecessary overhead
> of forking a subprocess, Pyxie uses Pyexpat to parse
> XML and create a PYX stream.

This is the "double parsing" I mentioned. If Pyxie is parsing a one
gigabyte document (as an extreme example), it needs one gigabyte of extra
disk space for its tempfile. Fredrik's pull parsing technique can
eliminate this, along with the need to use PYX internally. With
pulldom, I can parse a gigabyte document with 0 bytes of free disk space
and as little as 1K of RAM (above and beyond that required by
Python+modules).
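
To make the pull style concrete, here is a minimal sketch using
xml.dom.pulldom as it ships in a current Python 3 standard library (an
assumption on my part; the module discussed in this thread was still
pre-release). The function name count_elements and the sample document
are made up for illustration. The consumer asks for one event at a time,
and no tempfile or intermediate PYX text is involved.

```python
import io
from xml.dom import pulldom


def count_elements(stream, name):
    """Count start tags called `name`, pulling one event at a time."""
    count = 0
    # pulldom.parse() returns an event stream; each iteration reads only
    # as much of `stream` as is needed to produce the next event.
    for event, node in pulldom.parse(stream):
        if event == pulldom.START_ELEMENT and node.tagName == name:
            count += 1
    return count


# Hypothetical sample document, held in memory only to keep the sketch short.
sample = io.BytesIO(b"<log><entry/><entry/><other/></log>")
print(count_elements(sample, "entry"))
```

Because the loop never asks pulldom to expand a node into a subtree, the
sketch does not build a DOM for the whole document.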

Python optimization is a tricky issue, but I think that even in the case
of small files, the fact that you avoid doubling the disk I/O should
make the pulldom approach more efficient. And to the end user there is
no difference.

Also, the pull approach can be used in a streaming environment. You can
download the gigabyte document over a 300 baud modem and get "output"
immediately.
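
A rough way to see the streaming behaviour, again assuming the current
Python 3 xml.dom.pulldom: the hypothetical TrickleStream class below
stands in for a slow connection by recording how many bytes have been
handed out, and bytes_read_at_first_match (also a made-up name) reports
how much of the document had been read when the first matching element
was delivered.

```python
from xml.dom import pulldom


class TrickleStream:
    """File-like object that records how many bytes have been read so far."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, size=-1):
        if size is None or size < 0:
            size = len(self.data) - self.pos
        chunk = self.data[self.pos:self.pos + size]
        self.pos += len(chunk)
        return chunk


def bytes_read_at_first_match(data, name, bufsize=16):
    """Return how many bytes had been read when `name` was first reported."""
    stream = TrickleStream(data)
    # A small bufsize makes pulldom request the input in small pieces.
    for event, node in pulldom.parse(stream, bufsize=bufsize):
        if event == pulldom.START_ELEMENT and node.tagName == name:
            return stream.pos
    return None


# Hypothetical document: one thousand small <item> elements.
doc = b"<feed>" + b"<item>x</item>" * 1000 + b"</feed>"
first = bytes_read_at_first_match(doc, "item")
print(first, "of", len(doc), "bytes read when the first <item> was reported")
```

The exact byte count depends on the parser's internal buffering, so the
sketch only relies on it being smaller than the document size.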

In short, PYX is okay as an XML normalization syntax (though I would
prefer a line-oriented XML subset), but I still do not believe that it
needs to be the core of the Pyxie XML processing library. I bet I could
rewrite Pyxie without using PYX internally, and Pyxie's users would
not know that I had done so (except that they would get less disk IO).

Sometime after Python 1.6 is shipping, I'll implement this to
demonstrate.
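
For readers who have not seen PYX as a normalization syntax, the sketch
below turns expat events into PYX-style lines. It is not the rewrite
promised above and not Pyxie's own code. The line prefixes ("(" for a
start tag, "A" for an attribute, "-" for character data, ")" for an end
tag) and the backslash escapes follow my recollection of the notation and
should be read as assumptions; processing instructions are left out, and
the function name xml_to_pyx is made up.

```python
import xml.parsers.expat


def xml_to_pyx(data):
    """Return a list of PYX-style lines for the XML text in `data`."""
    lines = []

    def escape(text):
        # Keep each PYX record on one line (assumed escaping rules).
        return (text.replace("\\", "\\\\")
                    .replace("\n", "\\n")
                    .replace("\t", "\\t"))

    def start(name, attrs):
        lines.append("(" + name)
        for key, value in attrs.items():
            lines.append("A" + key + " " + escape(value))

    def end(name):
        lines.append(")" + name)

    def chars(text):
        lines.append("-" + escape(text))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver adjacent character data in one call
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(data, True)
    return lines


for line in xml_to_pyx('<doc id="1"><p>Hello</p></doc>'):
    print(line)
```

Each handler here could just as well call application code directly,
which is the sense in which the PYX text is an optional intermediate
step rather than a required one.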

> > It
> >seems that you could have a perfectly useful Pyxie app that doesn't use
> >pyx, right?
>
> No. The tree builder and event dispatcher need to be fed
> data in PYX notation. How you get the PYX data stream
> is up to you. For example, you could generate PYX from
> an Access database using COM and feed it to the event
> dispatcher.

Why would you generate PYX rather than XML? If we start moving PYX
between XML-aware programs, it becomes an XML competitor.
-- 
 Paul Prescod - Not encumbered by corporate consensus
The calculus and the rich body of mathematical analysis to which it 
gave rise made modern science possible, but it was the algorithm that 
made the modern world possible.
	- The Advent of the Algorithm (pending), by David Berlinski