[XML-SIG] Re: Pyx

Paul Prescod paul@prescod.net
Mon, 03 Jul 2000 17:16:47 -0500


Sean McGrath wrote:
> 
> ...
> >It would be trivial to incorporate it into Pyxie and thus relieve Pyxie
> >of its dependence on Pyx.
>
> Although doubtless possible, I see no great benefit in doing this.

I guess we are too far apart on this issue. My approach is that if you
can reduce I/O (even piped I/O) then you always do, because it is so
expensive, especially if you have to do splits, attribute bundling
and so forth in Python code.
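
To make the cost concrete, here is a minimal sketch of the per-line work
a Python consumer of PYX has to do: split each line into a type code and
a payload, then bundle the "A" lines into the start tag they follow. The
function name pyx_events and the event tuple layout are hypothetical,
not part of Pyxie, and the unescaping is simplified to newlines only.

```python
def pyx_events(lines):
    """Yield (kind, name, payload) tuples from an iterable of PYX lines.

    Hypothetical helper for illustration; assumes well-formed PYX where
    "(" opens an element, "A" adds an attribute, "-" is character data
    and ")" closes an element.
    """
    pending = None  # (tag name, attribute dict) awaiting its "A" lines
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            if pending is None:
                raise ValueError("attribute line without a start tag")
            # Attribute bundling: the name ends at the first space.
            name, _, value = rest.partition(" ")
            pending[1][name] = value
            continue
        # Any non-attribute line completes the pending start tag.
        if pending is not None:
            yield ("start", pending[0], pending[1])
            pending = None
        if code == "(":
            pending = (rest, {})
        elif code == ")":
            yield ("end", rest, None)
        elif code == "-":
            # Simplified unescaping: only the two-character "\n" sequence.
            yield ("text", None, rest.replace("\\n", "\n"))
    if pending is not None:
        yield ("start", pending[0], pending[1])
```

Every event costs at least one slice and one comparison in interpreted
code, on top of the pipe read that delivered the line.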

> If we subset XML to produce a simplified line oriented version,
> the powers that be will be very vexed because it will be seen
> as fragmenting XML, confusing developers etc. etc. Witness
> the SML experience.

The SML experience is not illustrative. Their approach was: "XML is too
complex. That's a bug. Let's fix it." That is sure to get up people's
hackles. Their "simplification" did not really enable any new
applications, it just scratched their perception of a general messiness
in the XML spec.

Line-ML would (or at least could!) be very different. You would say: "I
need compatibility with these legacy tools. XML does not meet that
requirement. I intend to make an XML subset *complementary to full XML*
that is compatible with these tools." It would be clear when it was
appropriate to use Line-ML and when to use "full XML." If you didn't
care about compatibility with line-oriented tools you would use full
XML.

I am the wrong person to lead such a charge because I use Python for all
of my XML processing needs and do not worry about line breaks. I
understand and respect that some people are more comfortable with awk,
grep and so forth. I'm just not the right person to promote their
world-view.

> Syntactically, PYX is clearly not XML and is in no way
> a threat to XML 1.0 and does not confuse developers.

It would be a threat to XML if people started making ODBC, LaTeX, etc.
drivers in preference to XML drivers. I don't see how you can accuse me
of forking the universe when you are proposing this "line-oriented XML
alternative" that happens to be isomorphic with a subset of XML.

> How would you propose that an XML document conforming
> to your XML-subset would signal this fact to software?

I wouldn't require any signal at all, but I would *allow* a processing
instruction.

> You mean "parse" it into PYX. But you are complaining about
> parsing overhead and here you are introducing it!

I complained about parsing overhead *inside of Pyxie* because it is
unnecessary. It is an implementation detail that could (should!) be
optimized away. On the other hand, using conversion as a means to avoid
doubling the number of data format drivers in the world strikes me as
good sense! That isn't about optimization, it's about the fundamental
interfaces exposed by software.

> >Translation to
> >Pyx can be just the end result of a chain of filters. Therefore you do
> >not need ODBC->PYX and HTML->PYX and ...->PYX. You need *->XML and XML->
> >Pyx (which you already have). If you start making Pyx "drivers" for
> >every data source in the world then you are duplicating all of the work
> >that has already been done for XML!
> 
> This paragraph pre-supposes a line oriented subset of XML which I believe
> to be politically if not technically infeasible at this point.

No it does not. The whole paragraph presumes that pyx is still the
line-oriented format.
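
As a sketch of that last link in the chain, the single XML->Pyx step can
be a small SAX handler, so any existing *->XML converter can feed it.
The names PyxWriter, escape and xml_to_pyx are hypothetical, and the
escaping rule (backslash and newline only) is an assumption rather than
a statement of what Pyxie's own tools emit.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


def escape(text):
    # Assumed escaping: keep every PYX record on one physical line.
    return text.replace("\\", "\\\\").replace("\n", "\\n")


class PyxWriter(ContentHandler):
    """Write SAX events to a file-like object as PYX lines."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def startElement(self, name, attrs):
        self.out.write("(%s\n" % name)
        for key in attrs.getNames():
            self.out.write("A%s %s\n" % (key, escape(attrs.getValue(key))))

    def endElement(self, name):
        self.out.write(")%s\n" % name)

    def characters(self, content):
        # The parser may deliver one text run in several chunks;
        # each chunk becomes its own "-" line in this sketch.
        self.out.write("-%s\n" % escape(content))

    def processingInstruction(self, target, data):
        self.out.write("?%s %s\n" % (target, data))


def xml_to_pyx(xml_text):
    """Convert an XML string to PYX text."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), PyxWriter(out))
    return out.getvalue()
```

With this in place, an ODBC or HTML source only needs its existing XML
exporter; the output of that exporter is passed to xml_to_pyx and the
result is piped to grep, awk and friends.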

> With respect, you are using your not insignificant
> intelligence to read too much into it! 

I feel the opposite is true:

> Yes, I accept that it is feasible to separate PYX from Pyxie
> but they are inextricably linked in my head. Maybe this is
> a bad thing...

There. It is you who are reading too much into it. 

I think that separating pyx from pyxie *at least logically* is necessary
before we can start talking about incorporating pyxie's best parts into
Python. If we DID incorporate both Pyx and Pyxie I would say that they
would be separate modules oriented towards their distinct and
separate strengths and weaknesses. Half of pyx would probably go with
the marshallers and half with the parsers.

You and I are not the only people who are unclear on the Pyx/Pyxie
relationship:

http://www.deja.com/threadmsg_ct.xp?AN=561555955

The poor guy asked a very reasonable question but never got a clear
answer back. The lack of clarity around this situation has been a
long-time annoyance for me in discussion of Pyxie and I think that this
discussion has been productive in clearing it up in my mind.

--

Another annoyance has been the assertion that Pyxie is "Pythonic"
and SAX/DOM are "language independent." I see no evidence of this
dichotomy. Pyxie is innovative. It would be innovative if it had been
invented for Java or Perl too. PyDOM and PySAX are Pythonic. They make a
lot of use of tuples, dictionaries, __getitem__, __getattr__ and other
Python idioms. We just spent last week defending the use of __getattr__.
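
For readers unfamiliar with the idioms in question, here is a minimal
sketch of how __getattr__ and __getitem__ can sit on top of a DOM tree.
It uses xml.dom.minidom from the standard library; the Node wrapper and
the parse helper are hypothetical and do not reproduce PyDOM's actual
API.

```python
from xml.dom import minidom


class Node:
    """Hypothetical wrapper giving a DOM element attribute-style access."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal lookup fails; refuse private names so
        # a missing _element can never trigger infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        for child in self._element.childNodes:
            if child.nodeType == child.ELEMENT_NODE and child.tagName == name:
                return Node(child)
        raise AttributeError(name)

    def __getitem__(self, key):
        # Dictionary-style access to XML attributes.
        if not self._element.hasAttribute(key):
            raise KeyError(key)
        return self._element.getAttribute(key)

    def text(self):
        # Concatenate the element's direct text children.
        return "".join(
            child.data
            for child in self._element.childNodes
            if child.nodeType == child.TEXT_NODE
        )


def parse(xml_text):
    """Parse an XML string and wrap its document element."""
    return Node(minidom.parseString(xml_text).documentElement)
```

With this wrapper, parse('<book lang="en"><title>Pyxie</title></book>')
supports book["lang"] for the attribute and book.title.text() for the
child element's text, while the underlying tree remains an ordinary DOM.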

The Python DOMs have lots of flaws but none of them derive from having
been specified in IDL rather than Python. I would be happy to entertain
criticisms based on real weaknesses like redundancy, performance or API
design. *Evidence* of poor pythonicity would also be welcomed.

-- 
 Paul Prescod - Not encumbered by corporate consensus
The calculus and the rich body of mathematical analysis to which it 
gave rise made modern science possible, but it was the algorithm that 
made the modern world possible.
	- The Advent of the Algorithm (pending), by David Berlinski