[XML-SIG] 4XPath and Unicode

uche.ogbuji@fourthought.com
Sun, 10 Dec 2000 12:19:59 -0700


> > I've used it to develop parsers for a couple of different formats and
> > found it very nice to use. It is a "spare time" project of Andrew's,
> > but he is working on it quite often, so it is currently very
> > well-maintained. 
> 
> It seems that this supports only regular expressions, so it can't
> really express an LR(n) language, such as XPath, can it?

I think this rules it out.  XPath is not an enormously complex language, but
it's not a regular language either: predicates and parenthesized expressions
can nest to arbitrary depth, which regular expressions alone cannot track.  I
don't have a formal proof that XPath is LR(k), but I've written enough parsers
that I think I can confidently say so (besides, Martin thinks so as well).
This means that we'll either have to find an LR(k) parser engine for Python,
or just replace the scanner and stick with Bison.  I'm inclined to agree with
Martin in his other post that we should just find a scanner package for Python
that already takes advantage of SRE, and feed its token stream to Bison.
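
To make the division of labor concrete, here is a minimal sketch of the kind
of scanner I mean, written directly on Python's re module (the SRE engine).
The token names, the TOKEN_SPEC table and the tokenize() function are all
made up for illustration; they are not 4XPath's actual token set, and they
cover only a small subset of XPath.  The scanner only produces a flat token
stream; pairing up nested brackets is still left to the LR parser.

```python
import re

# Hypothetical token table for a small XPath subset (illustration only).
# Order matters: earlier alternatives win, so "//" is tried before "/".
TOKEN_SPEC = [
    ("NUMBER",  r"\d+(?:\.\d*)?|\.\d+"),
    ("LITERAL", r"\"[^\"]*\"|'[^']*'"),
    ("DSLASH",  r"//"),
    ("SLASH",   r"/"),
    ("DCOLON",  r"::"),
    ("DOTDOT",  r"\.\."),
    ("DOT",     r"\."),
    ("LBRACK",  r"\["),
    ("RBRACK",  r"\]"),
    ("LPAREN",  r"\("),
    ("RPAREN",  r"\)"),
    ("AT",      r"@"),
    ("COMMA",   r","),
    ("OP",      r"!=|<=|>=|=|<|>|\||\+|-|\*"),
    # \w is Unicode-aware for str patterns, so non-ASCII names are accepted.
    ("NAME",    r"[^\W\d][\w.-]*(?::[^\W\d][\w.-]*)?"),
    ("WS",      r"\s+"),
]

# One master pattern with a named group per token type.
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_type, lexeme) pairs, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(
                "unexpected character %r at offset %d" % (text[pos], pos))
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("//para[@id='x'][2]"):
        print(token)
```

Note that for an expression such as a[b[c]] the scanner happily emits two
LBRACK and two RBRACK tokens without checking that they balance.  That check
is exactly the part that needs a real grammar, which is why the token stream
would be handed to Bison (or another LR engine) rather than parsed with more
regular expressions.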

I'll try to investigate some lexer toolkits today.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python