[Python-Dev] Request for Opinions

Paul Prescod paul@prescod.net
Thu, 13 Jul 2000 23:06:52 -0500


I don't want to PEP this, I want to discuss it. In fact, I think we are
supposed to discuss it four more times before I PEP it. And Tim has to
participate in each discussion, right?

There is a language called "XPath" for navigating XML trees. It makes
your life a lot easier. XPath is to XML trees as SQL is to relational
databases. Imagine working with relational databases by looping over
records with no query language! I think that some portion of XPath
should go into Python 2.

4Thought has a large, sophisticated implementation of the entire XPath
language. They use Bison (not PyBison!) and FLEX for parsing
performance. I guess there is about 2000 lines of C/yacc. The resulting
binary is XXX. There is also about 2000 lines of Python code. On the one
hand, this strikes me as a lot of code for one subpackage (xpath) of a
package (xml). On the other hand, XPath is pretty stable so once the
code is done, it is pretty stable and also self contained. It wasn't
something we invented and will be tweaking forever. Rather it is a
simple API to something relatively static. (XPath has extension
mechanisms so it can solve new problems without growing syntactically)
Is this an appropriate extension to Python, like pyexpat?

We have three options:

 1. use the relatively large 4XPath as is

 2. use a tiny subset of XPath (analogous SQL with only simple SELECT)
that can be implemented in a couple of hundred lines of Python code
(this code is mostly done already, in a module called TinyXPath)

 3. try to scale 4XPath back by moving its parser to SRE, and making
some of its features "options" that can be added separately (not clear
how easy this is)

What do you think?

-- 
 Paul Prescod - Not encumbered by corporate consensus
Simplicity does not precede complexity, but follows it. 
	- http://www.cs.yale.edu/~perlis-alan/quotes.html