[XML-SIG] XPath in Python 2

Uche Ogbuji uogbuji@fourthought.com
Mon, 10 Jul 2000 12:14:00 -0600


> Python is delayed and we don't know how long it will be so.

Bloody hell!  That's what happens when gurus get married.

Note: I hate smileys, but I guess I'd better throw one in, just in case: %^)

> Why XPath? XPath is the W3C-provided mechanism for navigating XML
> documents in a declarative way. That means that rather than specifying
> an exact path to a node, you describe the relationship between the node
> you are on and the node you want to get to. This makes the creation of
> complex applications much easier and allows for more efficiency "under
> the hood" of the XPath implementation.

I agree that XPath is a big nice-to-have.  Microsoft's GetByQuery (or 
something like that) is a very popular addition to their DOM and IBM et al are 
being forced to imitate, even in advance of DOM Level 3, which might address 
query.

> 4XPath is cleaner from a user's point of view, but it requires a lot bit
> of C/lex code for parsing the XPaths. I don't know if we would have to
> go back to the BDFL to get permission for that code to go into Python.

Would it be good enough for us just to check in ANSI C code from FLEX/Bison?

> We also have the option of creating a new XPath implementation also. The
> primary virtue of doing so would be the opportunity to implement a tiny
> subset of XPath in a much smaller amount of code. The two existing
> implementations probably have more code than the rest of the Python 1.6
> XML package. And in 4XPath's case, a lot of that is C code.
> 
> ---
> 
> My feeling is that implementing 10% of XPath in 10% of the code would
> get us 80% of the benefit. Those that need the rest can download 4XPath.

OK.

> I also think that 4XPath should be part of the pyxml distribution.

Already in motion.

> The 10% that is most interesting:
> 
> 	* a/b/c
> 	* a//b
> 	* ../
> 
> Actually, that's probably not even 10% and it can be "parsed" mostly
> with a "string.split" on "/". Things like positional predicates can be
> implemented with Python sequence syntax. Attribute access can use DOM
> syntax. All in all, this looks like an afternoon's work, if we agree
> that it should go into Python.

I don't know.  I agree that most people don't need XPath's zoo of axes, but I 
think predicates would be sorely missed very quickly.

I should note that other benefits of the mini-xpath -> 4XPath migration would 
be indexing and extension functions.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +01 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python