[XML-SIG] Pyx

Paul Prescod paul@prescod.net
Mon, 03 Jul 2000 14:41:34 -0500


I'm catching up on some mail I missed:

http://www.python.org/pipermail/xml-sig/2000-June/004505.html

> Nope. Think pipes. Think os.popen(). [1] You have two choices,
> use pyexpat directly and write the external file. This avoids
> the sub-process but costs more disk space. Alternatively,
> live with the sub-process call (hardly an issue these days)
> and use popen() to create a piped connection to the created
> PYX. This is a very disk efficient way of doing things.

Using pipes and two parsers is only efficient compared to the
alternative of saving the Pyx to disk. It is not efficient compared to
just using Expat in a single pass. Pyxie offers two options and neither
is particularly efficient. One is space inefficient and the other is
time inefficient (and, in my experience, does not work well on some
versions of Windows).
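
A minimal sketch of what "Expat in a single pass" could look like,
assuming Python's standard xml.parsers.expat module. The function name
count_elements is hypothetical and only illustrates that the callbacks
run in-process, with no intermediate file and no sub-process:

```python
import xml.parsers.expat


def count_elements(xml_text):
    """Count start tags per element name in one in-process pass."""
    counts = {}

    def start(name, attrs):
        # Called by Expat for every start tag as the document is parsed.
        counts[name] = counts.get(name, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return counts
```

Nothing is written to disk and nothing is piped, so the space cost and
the second parse both go away.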

Fredrik has invented a way that is portable, and time/space efficient.
It would be trivial to incorporate it into Pyxie and thus relieve Pyxie
of its dependence on Pyx.

> >Why would you generate PYX rather than XML? If we start moving PYX
> >between XML-aware programs, it becomes an XML competitor.
> 
> There is obviously a fundamental misconnect here. I don't
> know what else I can do to explain this to you!
> 
> PYX is *line oriented*, I pass it through line oriented tools
> using the Unix pipe philosophy. I cannot do that with
> XML.

#1. You yourself said that it is possible for an XML subset to be its
own line normalized format.

#2. Let's ignore that and pretend it is not possible. It is entirely
possible to use XML as the interchange format between databases and
applications and so forth and just use Pyx when it is necessary to make
the information available as line-oriented information. Translation to
Pyx can be just the end result of a chain of filters. Therefore you do
not need ODBC->PYX and HTML->PYX and ...->PYX. You need *->XML and
XML->Pyx (which you already have). If you start making Pyx "drivers" for
every data source in the world then you are duplicating all of the work
that has already been done for XML!
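
A rough sketch of XML->Pyx as the last filter in such a chain, again
assuming Python's xml.parsers.expat. The function names are
hypothetical. The line prefixes ("(" start tag, "A" attribute, ")" end
tag, "-" character data) follow the usual description of the PYX
notation, and the backslash escaping of newlines and tabs is an
assumption; neither is spelled out in this message:

```python
import xml.parsers.expat


def _escape(text):
    # Assumed convention: keep every event on one physical line.
    return (text.replace("\\", "\\\\")
                .replace("\n", "\\n")
                .replace("\t", "\\t"))


def xml_to_pyx(xml_text):
    """Return a list of PYX-style lines for an XML document."""
    lines = []

    def start(name, attrs):
        lines.append("(" + name)
        for key, value in attrs.items():
            lines.append("A" + key + " " + _escape(value))

    def end(name):
        lines.append(")" + name)

    def chars(data):
        lines.append("-" + _escape(data))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous text as one chunk
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return lines
```

Any source that can already produce XML gets a line-oriented view this
way, without a separate Pyx driver for that source.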

> Why are you so hostile to it?

I'm not hostile to Pyx. I am hostile to what I see as a very fuzzy
description of what Pyx does and does not do.

#1. You claim that Pyxie is a Pyx processing library but I could
implement the entire Pyxie API without PYX. So let's separate Pyxie and
Pyx so that we can see what is good and bad about each. The first step
is to recognize that Pyx and Pyxie are not at all dependent on each
other -- except according to your current implementation scheme. So
Pyxie's API is great and innovative but that derives not one whit from
Pyx.

#2. You claim that we should make pyx generators for ODBC and various
apps. I claim that the combination of XML and XML->Pyx gives us de facto
such generators. Therefore we should push for *xml generators* first,
because they have a much broader utility than Pyx generators.

#3. You and I agree that an XML subset can serve as its own
normalized-line syntax. If software generated this subset then it would
be automatically compatible with both line-oriented software and with
XML-compatible software. I cannot see an advantage of Pyx over such a
subset.

I've worked with ESIS for years and it never bothered me. It was a
useful hack for the (many) programming languages that didn't have
built-in SGML parsers. It remains that for XML and for tools like sed,
awk and grep that have no built-in XML parser. Fantastic!

What bothers me about it is not that it exists or is used, but merely
that its importance is exaggerated.

On one page you say this: http://www.digitome.com/Download.html

"The entire Pyxie library revolves around a very simple, line-oriented
notation for the information emitted by an XML parser. This notation is
known as PYX. The first character of each line is used to say what type
of information the line contains:"

And yet, on the example pages, I see nothing that requires any knowledge
of the Pyx notation AT ALL: http://www.digitome.com/Examples.html

Why would a Pyxie programmer care about the syntax you describe on the
overview page? In fact, I see the great virtues of Pyx for awk, sed and
grep programmers, but if I am only interested in Python, why would I
care whether the input format was line oriented or not?

-- 
 Paul Prescod - Not encumbered by corporate consensus
The calculus and the rich body of mathematical analysis to which it 
gave rise made modern science possible, but it was the algorithm that 
made the modern world possible.
	- The Advent of the Algorithm (pending), by David Berlinski